
Microsoft's Azure AI Speech services add more voices and more chat avatars


Over the past year, Microsoft has been adding realistic AI-generated voices for customers of its Azure AI Speech services. Today, the company announced further additions and improvements to those voices.

In a blog post, Microsoft stated that it has added more multilingual voices in Azure AI Speech. It said:

These voices are crafted from a variety of source languages, bringing a rich diversity of personas to enhance your user experience. With their authentic and natural interactions, they promise to transform your chatbot engagement through our technology.

The new AI voices include:

  • en-GB-AdaMultilingualNeural - en-GB (English – United Kingdom) - Female
  • en-GB-OllieMultilingualNeural - en-GB (English – United Kingdom) - Male
  • pt-BR-ThalitaMultilingualNeural - pt-BR (Portuguese – Brazil) - Female
  • es-ES-IsidoraMultilingualNeural - es-ES (Spanish – Spain) - Female
  • es-ES-ArabellaMultilingualNeural - es-ES (Spanish – Spain) - Female
  • it-IT-IsabellaMultilingualNeural - it-IT (Italian – Italy) - Female
  • it-IT-MarcelloMultilingualNeural - it-IT (Italian – Italy) - Male
  • it-IT-AlessioMultilingualNeural - it-IT (Italian – Italy) - Male

In addition, Microsoft has added two more US English voices to Azure AI Speech that are optimized for use in company call centers:

  • en-US-LunaNeural - en-US (English – United States) - Female
  • en-US-KaiNeural - en-US (English – United States) - Male

Microsoft has made all of these voices available as a public preview in the East US, West Europe, and Southeast Asia Azure regions.
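
For readers who want to try one of the voices listed above, requests to Azure AI Speech are typically written in SSML, where the voice is selected by name and a multilingual voice can switch language with a `lang` element. The following Python sketch only builds the SSML string and does not call the service. The helper name `build_ssml`, its parameters, and the sample sentences are illustrative assumptions and do not come from Microsoft's blog post.

```python
from xml.sax.saxutils import escape, quoteattr

# W3C SSML namespace used on the root <speak> element.
SSML_NS = "http://www.w3.org/2001/10/synthesis"


def build_ssml(voice, segments, default_locale="en-GB"):
    """Build an SSML document that speaks each (locale, text) pair with one voice.

    `build_ssml` is a hypothetical helper written for this example;
    it is not part of any Microsoft SDK.
    """
    parts = [
        # escape() protects characters such as < and & inside the spoken text.
        f"<lang xml:lang={quoteattr(locale)}>{escape(text)}</lang>"
        for locale, text in segments
    ]
    return (
        f'<speak version="1.0" xmlns="{SSML_NS}" '
        f"xml:lang={quoteattr(default_locale)}>"
        f"<voice name={quoteattr(voice)}>{''.join(parts)}</voice>"
        "</speak>"
    )


# One multilingual voice reading an English and a Spanish sentence.
ssml = build_ssml(
    "en-GB-AdaMultilingualNeural",
    [("en-GB", "Welcome back."), ("es-ES", "Bienvenido de nuevo.")],
)
print(ssml)
```

The resulting string would then be passed to whichever Speech SDK or REST client the application already uses, against a Speech resource in one of the preview regions.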

The company also revealed today that it has added five new realistic-looking text-to-speech avatars for Azure AI Speech users. It also announced that its live chat avatar application now works with the GPT-4o model:

The Azure OpenAI GPT-4o model is now part of the live chat avatar application in Speech Studio. This allows users to see firsthand the collaborative functioning of the live chat avatar and Azure OpenAI GPT-4o. Additionally, we provide sample code to aid in integrating the text-to-speech avatar with the GPT-4o model.

Finally, Microsoft revealed a new Text Stream API designed to help speed up text-to-speech functions:

The Text Stream API represents a significant leap forward from traditional non-text stream TTS technologies. By accepting input in chunks (as opposed to whole responses), it significantly reduces the latency that typically hinders seamless audio synthesis. The Text Stream API not only minimizes latency but also enhances the fluidity and responsiveness of real-time speech outputs, making it an ideal choice for interactive applications, live events, and responsive AI-driven dialogues.

Developers can check out some sample code for the Text Stream API on GitHub.
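
To see why accepting text in chunks lowers latency, consider a toy timing model in Python. It does not use the Azure SDK or the Text Stream API itself. The chunk arrival times and the synthesis start-up delay are invented numbers, chosen only to illustrate the idea in Microsoft's description, and the function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Chunk:
    text: str
    arrives_at_ms: int  # assumed time at which a text generator emits this chunk


def first_audio_whole_response(chunks, synth_start_ms):
    """Without text streaming, synthesis starts only after the last chunk arrives."""
    return chunks[-1].arrives_at_ms + synth_start_ms


def first_audio_text_stream(chunks, synth_start_ms):
    """With text streaming, synthesis can start once the first chunk arrives."""
    return chunks[0].arrives_at_ms + synth_start_ms


# Invented example: a chat model emitting a reply in four pieces.
chunks = [
    Chunk("Your order ", 200),
    Chunk("has shipped ", 400),
    Chunk("and should arrive ", 600),
    Chunk("on Friday.", 800),
]
SYNTH_START_MS = 150  # assumed fixed delay before the first audio is produced

whole = first_audio_whole_response(chunks, SYNTH_START_MS)
streamed = first_audio_text_stream(chunks, SYNTH_START_MS)
print(f"whole response: first audio after {whole} ms")
print(f"text stream:    first audio after {streamed} ms")
```

In this model the gap between the two figures equals the time the text generator needs to finish its reply, so longer replies benefit more from chunked input.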
