Microsoft announces general availability of Text to Speech Avatar in Azure AI Speech Service

Azure AI Speech service allows developers to build voice-enabled, multilingual, generative AI apps with support for natural-sounding voices. The new Text to Speech Avatar feature in Azure AI Speech service can convert simple text into a video of a photorealistic human speaking with a natural-sounding voice. Developers can use any of the prebuilt avatars available as part of this service or create their own custom avatars.

Today, Microsoft announced the general availability of Text to Speech Avatar. This new capability enables developers to create personalized and engaging content for their users. The service outputs video at 1920 x 1080 resolution and 25 frames per second (FPS).
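To put the output format in concrete terms, the short Python sketch below works out how many frames a clip of a given length contains at the stated 1920 x 1080 resolution and 25 FPS. The function names are illustrative only and are not part of any Azure SDK.

```python
# Output format as stated in the announcement.
WIDTH = 1920
HEIGHT = 1080
FPS = 25


def frame_count(duration_seconds: float) -> int:
    """Number of video frames in a clip of the given length."""
    return round(duration_seconds * FPS)


def raw_pixels_per_second() -> int:
    """Uncompressed pixels rendered for each second of video."""
    return WIDTH * HEIGHT * FPS


if __name__ == "__main__":
    print(frame_count(60))           # frames in a one-minute clip
    print(raw_pixels_per_second())   # pixels per second before encoding
```

A one-minute avatar clip therefore contains 1,500 frames.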

Azure Speech Text to Speech Avatar comes with the following capabilities:

  • Converts text into a digital video of a photorealistic human speaking with natural-sounding voices powered by Azure AI text to speech.
  • Provides a collection of prebuilt avatars.
  • The voice of the avatar is generated by Azure AI text to speech.
  • Synthesizes text to speech avatar video asynchronously with the batch synthesis API or in real time.
  • Provides a content creation tool in Speech Studio for creating video content without coding.
  • Enables real-time avatar conversations through the live chat avatar tool in Speech Studio.

Pricing for the Text to Speech Avatar service has several components. Charges are based on the length of the output video and are billed per second. Any text-to-speech, speech-to-text, Azure OpenAI, or other Azure services used as part of a Text to Speech Avatar solution are charged separately.

The service is now available in the following Azure regions: Southeast Asia, North Europe, West Europe, Sweden Central, South Central US, and West US 2.
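As a rough illustration of per-second billing, the Python sketch below estimates the cost of a video and checks a region against the list above. The per-second rate is a placeholder rather than Microsoft's actual price, rounding partial seconds up is an assumption, and the function names are hypothetical; consult the official pricing page for real figures.

```python
import math
from decimal import Decimal

# Regions listed in the announcement.
SUPPORTED_REGIONS = {
    "southeast asia",
    "north europe",
    "west europe",
    "sweden central",
    "south central us",
    "west us 2",
}


def is_region_supported(region: str) -> bool:
    """Case-insensitive check against the announced region list."""
    return region.strip().lower() in SUPPORTED_REGIONS


def estimate_avatar_cost(
    duration_seconds: float,
    rate_per_second: Decimal,
    other_services: Decimal = Decimal("0"),
) -> Decimal:
    """Estimate total cost for one avatar video.

    ASSUMPTION: partial seconds are rounded up to a whole billable second.
    `other_services` covers separately billed items such as text to speech
    or Azure OpenAI usage.
    """
    billable_seconds = math.ceil(duration_seconds)
    return Decimal(billable_seconds) * rate_per_second + other_services


if __name__ == "__main__":
    placeholder_rate = Decimal("0.01")  # hypothetical rate, not a real price
    print(estimate_avatar_cost(90.2, placeholder_rate))
    print(is_region_supported("West US 2"))
```

Keeping the separately charged services as their own argument mirrors the billing model described above, in which the avatar video is only one line item in the total.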

You can learn more about the Text to Speech Avatar service here.
