Microsoft details improvements to 'Read Aloud' in Chromium-based Edge

Microsoft released an update to the preview versions of its Chromium-based Edge browser yesterday, bringing a number of new features, including dark theme improvements and numerous fixes. The latest version also introduces support for cloud-powered text-to-speech voices. The company detailed the improvements in a blog post today.

Cloud-powered voices rely on Microsoft Cognitive Services, a set of APIs, SDKs, and services offered by Microsoft, of which Speech is one. The Dev and Canary branches of the browser now support 24 cloud-powered text-to-speech voices across 21 locales, which are used by the 'Read Aloud' feature. Read Aloud was first available in the company’s EdgeHTML-based Edge browser and was later ported to the new browser.

As the name suggests, Read Aloud is an assistive feature that reads out either the selected text or the entire web page, starting from the top. The company says the improvements were prompted by feedback that the existing implementation made it difficult to install different language packs and that the voices sounded robotic and unnatural.

The cloud-powered voices in Read Aloud are categorized into two styles – Neural Voices and Standard Voices.

Neural voices – Powered by deep neural networks, these voices are the most natural-sounding voices available today.

Standard voices – These voices are the standard online voices offered by Microsoft Cognitive Services. Voices with “24kbps” in their title will sound clearer compared to other standard voices due to their improved audio bitrate.

To try out these voices, you can open any website, select the desired text, right-click and select “Read Aloud Selection”. To have the feature read out the complete webpage, you can head to the ellipsis (…) menu at the top and hit the “Read Aloud” option. While the feature is active, a bar at the top of the page provides options for choosing among the available voices and adjusting the reading speed.

The company adds that the voices “have been exposed to developers through the JavaScript SpeechSynthesis API”. This means that web-based text-to-speech applications can also use these voices when they run in the Edge browser.
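
As a rough illustration of what that could look like for a web developer, the TypeScript sketch below picks a voice for a given locale and speaks a string with it. The calls getVoices(), SpeechSynthesisUtterance and speechSynthesis.speak() are standard Web Speech API members. The helper names pickVoice and readAloud, the VoiceInfo interface, and the idea of matching on a substring of the voice name such as "Neural" or "24kbps" are assumptions made for this example and are not taken from Microsoft's blog post.

```typescript
// Minimal shape of the fields this sketch reads from a SpeechSynthesisVoice.
interface VoiceInfo {
  name: string;
  lang: string;          // BCP 47 tag, e.g. "en-US"
  localService: boolean; // false for voices served remotely
}

// Hypothetical helper: return the first voice for `lang` whose name contains
// `hint` (case-insensitive). Falls back to the first voice for that locale,
// or undefined if the locale has no voice at all.
function pickVoice<T extends VoiceInfo>(
  voices: T[],
  lang: string,
  hint: string
): T | undefined {
  const wantedLang = lang.toLowerCase();
  const needle = hint.toLowerCase();
  let fallback: T | undefined = undefined;
  for (const voice of voices) {
    if (voice.lang.toLowerCase() !== wantedLang) continue;
    if (voice.name.toLowerCase().indexOf(needle) !== -1) return voice;
    if (fallback === undefined) fallback = voice;
  }
  return fallback;
}

// Hypothetical helper: speak `text` in a browser. Returns false when the
// Web Speech API is not available (for example outside a browser).
function readAloud(text: string, lang: string, hint: string): boolean {
  const g = globalThis as any;
  if (!g.speechSynthesis || !g.SpeechSynthesisUtterance) return false;
  const utterance = new g.SpeechSynthesisUtterance(text);
  const voice = pickVoice(g.speechSynthesis.getVoices() as VoiceInfo[], lang, hint);
  if (voice) utterance.voice = voice; // otherwise the browser default is used
  utterance.lang = lang;
  g.speechSynthesis.speak(utterance);
  return true;
}
```

A page might call readAloud("Hello from Edge", "en-US", "Neural"). Browsers can return an empty list from getVoices() until the voiceschanged event has fired, so a real application would typically wait for that event before choosing a voice.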