
Researchers from UC Berkeley and UC San Francisco have taken a major step forward in brain-computer interface (BCI) technology by developing a system that restores near-natural speech for people with severe paralysis. The new method tackles a key challenge in speech neuroprostheses: the delay between when a person tries to speak and when that intended speech is converted into sound.
The work, published in Nature Neuroscience, uses artificial intelligence (AI) to decode brain signals into spoken words almost instantly. By streaming brain activity into audible speech in near-real time, the system produces a smoother, more natural flow of speech and allows continuous expression without long pauses. The study was funded by the National Institute on Deafness and Other Communication Disorders (NIDCD), part of the National Institutes of Health (NIH).
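To make the streaming idea concrete, here is a minimal Python sketch of a chunk-by-chunk decoding loop: short windows of neural activity are decoded into acoustic features and converted to audio as they arrive, rather than waiting for a full sentence. Every name and dimension here (decode_chunk, vocode, the 80-bin mel target) is an illustrative assumption, not the authors' actual implementation.

```python
# Illustrative sketch of streaming decoding: process short windows of neural
# activity as they arrive so audible speech can begin within about a second.
# All functions below are hypothetical stubs standing in for trained models.
import numpy as np

CHUNK_MS = 80  # decode in short windows instead of whole sentences


def decode_chunk(neural_window: np.ndarray) -> np.ndarray:
    """Map a window of motor-cortex features to acoustic features.
    In the real system this would be a trained neural network; here it is a stub."""
    return np.zeros((neural_window.shape[0], 80))  # e.g. an 80-bin mel spectrogram


def vocode(acoustic_features: np.ndarray) -> np.ndarray:
    """Convert acoustic features to an audio waveform (placeholder)."""
    return np.zeros(acoustic_features.shape[0] * 256)


def stream_speech(neural_chunks):
    """Yield audio incrementally, chunk by chunk, as brain activity streams in."""
    for window in neural_chunks:
        yield vocode(decode_chunk(window))
```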
“Our streaming system uses algorithms similar to those in devices like Alexa or Siri to decode brain signals and produce speech nearly as fast as it’s thought,” explained Gopala Anumanchipalli, co-principal investigator and assistant professor at UC Berkeley. “This is the first time we’ve been able to achieve fluent, continuous speech synthesis directly from neural data.”
The new approach also works across a range of recording devices. It supports both non-invasive setups that use sensors on the skin to measure facial muscle activity and more invasive systems with electrodes placed on or inside the brain. According to Kaylo Littlejohn, a Ph.D. student and co-author, the algorithm can adapt to various brain-monitoring setups as long as it has access to a reliable signal.
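That device-agnostic design can be pictured as a simple interface: any signal source that yields a time-aligned feature matrix can feed the same decoder. The class names and channel counts below are hypothetical, offered only to illustrate the point.

```python
# Sketch of a modality-agnostic decoder: each sensor type exposes the same
# feature interface, so the downstream model does not care where the signal
# came from. Class names and feature dimensions are illustrative assumptions.
from typing import Protocol
import numpy as np


class SignalSource(Protocol):
    def read_features(self) -> np.ndarray:  # shape: (time_steps, channels)
        ...


class SurfaceSensorSource:
    """Non-invasive sensors on the skin measuring facial muscle activity."""
    def read_features(self) -> np.ndarray:
        return np.random.randn(100, 32)


class CorticalElectrodeSource:
    """Electrodes placed on or inside the brain."""
    def read_features(self) -> np.ndarray:
        return np.random.randn(100, 253)


def decode(source: SignalSource) -> np.ndarray:
    feats = source.read_features()
    # One decoder consumes any source whose output matches the interface.
    return feats.mean(axis=1)  # placeholder for a trained model
```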
The neuroprosthesis converts neural activity from the brain's motor cortex, the region that controls speech production, into words. It taps the signal after a person has formed the thought and is preparing to move their vocal muscles. To train the system, a participant silently attempted to speak sentences while the researchers recorded the resulting brain activity. AI models then filled in missing details, such as the corresponding sound patterns, to generate spoken output.
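A highly simplified sketch of that training setup, under several assumptions (channel and mel-bin counts, a small feed-forward model, a mean-squared-error loss), might look like the following; the published architecture is certainly more sophisticated.

```python
# Toy training loop: learn a mapping from recorded neural features (captured
# while the participant silently attempts sentences) to target sound patterns
# such as mel spectrogram frames. Shapes, model, and loss are assumptions made
# for illustration, not the study's actual design.
import torch
import torch.nn as nn

N_NEURAL_FEATURES = 253   # assumed electrode channel count
N_MEL_BINS = 80           # assumed acoustic target size

model = nn.Sequential(
    nn.Linear(N_NEURAL_FEATURES, 512),
    nn.ReLU(),
    nn.Linear(512, N_MEL_BINS),
)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.MSELoss()

# Random tensors standing in for (neural activity, target acoustics) pairs.
neural = torch.randn(1000, N_NEURAL_FEATURES)
target_mel = torch.randn(1000, N_MEL_BINS)

for epoch in range(5):
    pred = model(neural)
    loss = loss_fn(pred, target_mel)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```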
Notably, the team used the participant's pre-injury voice as a reference, so the output sounded familiar and personal. Earlier approaches took about eight seconds to decode a full sentence; the new method produces audible speech in under a second. The faster response is paired with high accuracy, showing that real-time streaming is possible without sacrificing quality.
To test flexibility, researchers synthesized rare words that were not part of the system’s training data, such as those from the NATO phonetic alphabet (“Alpha,” “Bravo,” etc.). The technology performed well, indicating its potential for broader vocabulary use.
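Conceptually, such an out-of-vocabulary test amounts to confirming that the probe words were never seen during training and then scoring the synthesized audio for intelligibility. The sketch below uses hypothetical synthesize and score functions to outline that procedure.

```python
# Sketch of an out-of-vocabulary check: ensure probe words are absent from the
# training vocabulary, synthesize them, and score intelligibility. The
# synthesize and score callables are hypothetical placeholders.
NATO_WORDS = ["alpha", "bravo", "charlie", "delta", "echo"]


def evaluate_oov(training_vocab: set[str], synthesize, score) -> dict[str, float]:
    results = {}
    for word in NATO_WORDS:
        assert word not in training_vocab, f"{word} leaked into the training data"
        audio = synthesize(word)            # decode attempted speech of this word
        results[word] = score(audio, word)  # e.g. listener or ASR intelligibility
    return results
```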
Edward Chang, a senior researcher and neurosurgeon at UCSF, emphasized the real-world applications. “This innovation brings us closer to practical BCIs that can greatly improve communication for those with severe speech impairments,” he said.
Future efforts aim to enhance the emotional tone and expressiveness of the speech. The goal is to reflect changes in pitch, volume, and emotion, making the output more lifelike. With further refinement, this technology could significantly improve communication options for people unable to speak.
Source: UC Berkeley
This article was generated with some help from AI and reviewed by an editor.