Azure AI Content Safety is an AI service from Microsoft that detects harmful user-generated and AI-generated content in applications and services. It offers both text and image APIs, allowing developers to identify unwanted material.
The Groundedness detection API within Azure AI Content Safety can determine whether large language model responses are based on user-selected source materials. Since current large language models can produce inaccurate or non-factual information (hallucinations), this API helps developers identify such content in AI outputs.
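For illustration, here is a minimal sketch of calling the Groundedness detection REST API from Python. The resource endpoint, API version, and request fields (`domain`, `task`, `groundingSources`) follow the preview documentation at the time of writing and should be treated as assumptions that may change:

```python
import requests

ENDPOINT = "https://<your-resource>.cognitiveservices.azure.com"  # assumption: your Content Safety resource
API_VERSION = "2024-02-15-preview"  # assumption: preview API version at time of writing

def detect_groundedness(text: str, sources: list[str], key: str) -> dict:
    """Ask the Groundedness detection API whether `text` is grounded in `sources`."""
    body = {
        "domain": "Generic",          # assumption: general-purpose domain
        "task": "Summarization",      # assumption: the scenario being checked
        "text": text,                 # the LLM response to verify
        "groundingSources": sources,  # the user-selected source material
    }
    resp = requests.post(
        f"{ENDPOINT}/contentsafety/text:detectGroundedness",
        params={"api-version": API_VERSION},
        headers={"Ocp-Apim-Subscription-Key": key},
        json=body,
        timeout=30,
    )
    resp.raise_for_status()
    # Expected shape (per preview docs): ungroundedDetected, ungroundedPercentage,
    # and per-sentence ungroundedDetails.
    return resp.json()
```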
Today, Microsoft announced a preview of a correction capability. Developers can now detect and fix hallucinated content in AI outputs in real time, ensuring end-users consistently receive factually accurate AI-generated content.
Here's how the correction feature works (a request sketch follows the list):
- The application developer enables the correction capability.
- When an ungrounded sentence is detected, a new request is sent to the generative AI model for a correction.
- The LLM assesses the ungrounded sentence against the grounding document.
- Sentences without content related to the grounding document may be filtered out completely.
- If content is sourced from the grounding document, the foundation model rewrites the ungrounded sentence to align with the document.
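The sketch below shows how a correction request might look, assuming correction is exposed as an extra flag on the same groundedness endpoint and that the rewrite is delegated to an Azure OpenAI deployment you supply. The `correction` flag, the `llmResource` field names, and the API version are taken from the preview documentation and should be verified before use:

```python
import requests

ENDPOINT = "https://<your-resource>.cognitiveservices.azure.com"  # assumption
API_VERSION = "2024-09-15-preview"  # assumption: correction ships in a later preview

def detect_and_correct(text: str, sources: list[str], key: str) -> dict:
    """Detect ungrounded sentences in `text` and ask the service to rewrite them."""
    body = {
        "domain": "Generic",
        "task": "Summarization",
        "text": text,
        "groundingSources": sources,
        "correction": True,  # assumption: flag that enables the correction step
        # Assumption: the rewrite is performed by an Azure OpenAI deployment
        # that you provide; field names may differ in the released preview.
        "llmResource": {
            "resourceType": "AzureOpenAI",
            "azureOpenAIEndpoint": "https://<your-aoai>.openai.azure.com",
            "azureOpenAIDeploymentName": "gpt-4o",
        },
    }
    resp = requests.post(
        f"{ENDPOINT}/contentsafety/text:detectGroundedness",
        params={"api-version": API_VERSION},
        headers={"Ocp-Apim-Subscription-Key": key},
        json=body,
        timeout=60,
    )
    resp.raise_for_status()
    return resp.json()  # rewritten sentences are expected in the response details
```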
Besides the correction capability, Microsoft announced the public preview of hybrid Azure AI Content Safety (AACS). This allows developers to deploy content safety mechanisms both in the cloud and on-device. AACS's Embedded SDK enables real-time content safety checks directly on devices, even without internet connectivity.
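A hybrid deployment typically routes checks to the on-device model when offline and to the cloud API when connected. The sketch below illustrates that pattern only; both helper functions are hypothetical placeholders, not the actual Embedded SDK or cloud client API:

```python
def embedded_analyze_text(text: str) -> dict:
    """Placeholder (hypothetical): run the on-device content safety model."""
    raise NotImplementedError("wire up the Embedded SDK here")

def cloud_analyze_text(text: str) -> dict:
    """Placeholder (hypothetical): call the Azure AI Content Safety cloud API."""
    raise NotImplementedError("wire up the cloud client here")

def analyze_text_hybrid(text: str, online: bool) -> dict:
    # The embedded check keeps moderation working without connectivity;
    # the cloud API can serve as the richer path when the device is online.
    if not online:
        return embedded_analyze_text(text)
    return cloud_analyze_text(text)
```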
Finally, Microsoft announced the preview of Protected Materials Detection for Code, which can be used with generative AI applications that produce code to detect whether the LLM has output any protected code. This feature was previously available only through the Azure OpenAI Service; Microsoft is now making it available for customers to use with other code-generating AI models.
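A rough sketch of what a check might look like is below. The route name, API version, and request schema are assumptions based on the preview documentation and may differ; verify them against the current API reference:

```python
import requests

ENDPOINT = "https://<your-resource>.cognitiveservices.azure.com"  # assumption
API_VERSION = "2024-09-15-preview"  # assumption: preview version at time of writing

def detect_protected_code(code: str, key: str) -> dict:
    """Check whether generated code matches known protected source material."""
    # Assumption: the preview exposes a dedicated route for code detection.
    resp = requests.post(
        f"{ENDPOINT}/contentsafety/text:detectProtectedMaterialForCode",
        params={"api-version": API_VERSION},
        headers={"Ocp-Apim-Subscription-Key": key},
        json={"code": code},
        timeout=30,
    )
    resp.raise_for_status()
    return resp.json()  # expected to flag matches, e.g. with code citations
```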
These updates enhance the reliability and accessibility of AI content moderation, promoting safer and more trustworthy AI applications across various platforms and environments.