Microsoft is going to let businesses and individuals create their own custom AI-based voices that could be used for dubbing in other languages, gaming, and more. Today, as part of the company's Ignite 2023 developer conference, the Azure AI services division announced this new feature, called Personal Voice.
In a blog post, Microsoft says this is something of an extension of its current custom neural voice feature in Azure AI Speech. Personal Voice is different because, as the name implies, it uses a person's own voice to create AI-based audio, which can then be used to make voices in over 100 languages.
The blog states:
Preparing training samples for creating an AI voice could be difficult or costly. With personal voice, users can create a voice that just sound like them with a voice sample, as short as 60 seconds.
This feature could be used to create a voice assistant that sounds just like the person who uses Personal Voice to make the AI chatbot. It could also be used by gamers to voice their characters, to dub an actor's voice into other languages, and much more.
Obviously, this technology could be used to create fake voices of real people for less-than-honorable purposes. Microsoft says that anyone who makes an AI voice with this feature must record a statement acknowledging that the user knows that "the customer will create and use their voice."
In addition, the feature can only be used in certain cases, at least for now. Microsoft says:
- In applications where voice output is constrained and defined by customers who meet Limited Access eligibility criteria, and where the voice does not read user-generated or open-ended content. Voice model usage must remain within the application and output must not be publishable or shareable from the application. Some examples of applications that fit this description are voice assistants in smart devices and customizing a character voice in gaming.
- Dubbing for films, TV, video, and audio for entertainment scenarios only, where customers who meet Limited Access eligibility criteria maintain sole control over the creation of, access to, and use of the voice models and their output.
Users must also follow Microsoft's guidelines for using this technology and its code of conduct. At the moment, this feature will only be available in the West Europe, East US, and Southeast Asia regions. The public preview will go live on December 1.