Stroke, traumatic brain injury, and neurodegenerative diseases can all cause the loss of the ability to speak. Some people with severe speech disabilities learn to spell out their thoughts letter by letter using assistive devices that track very small eye or facial muscle movements. However, producing text or synthesized speech with such devices is laborious, error-prone, and painfully slow. Now, a study from researchers at UCSF engineers artificial intelligence that translates activity in the brain’s speech centers into a synthesized version of a person’s voice. The team states their technology could restore fluent communication in individuals with severe speech disabilities and reproduce some of the musicality of the human voice that conveys a speaker’s emotions and personality. The study is published in the journal Nature.
Recent studies from the lab showed how the human brain’s speech centers choreograph the movements of the lips, jaw, tongue, and other vocal tract components to produce fluent speech. This means vocal tract movements, and not just linguistic features such as phonemes, need to be taken into account when studying speech production. These findings led the team to the sensorimotor cortex, a brain region known to encode vocal tract movements, and to the theory that this cortical activity could be decoded and translated via a speech prosthetic, giving a voice to people with intact neural function who have lost the ability to speak. The current study uses brain signals produced by cortical activity, recorded from epilepsy patients, to program a computer to mimic natural speech.
In the current study, five patients with intact speech, who had electrodes temporarily implanted in their brains to map the source of their seizures in preparation for neurosurgery, were asked to read several hundred sentences aloud while the researchers recorded activity from a brain region known to be involved in language production. Results show the group was able to build maps of how the brain directs the vocal tract, including the lips, tongue, jaw, and vocal cords, to make different sounds; these maps were then applied to a computer program that produces synthetic speech.
The team explains this detailed mapping of sound allows the creation of a realistic virtual vocal tract for each participant, controlled by their brain activity. Data findings show the system comprises two neural network machine learning algorithms: a decoder that transforms the brain activity patterns produced during speech into movements of the virtual vocal tract, and a synthesizer that converts these vocal tract movements into a synthetic approximation of the participant’s voice. Volunteers were then asked to listen to the synthesized sentences and transcribe what they heard; more than half the time, the listeners were able to correctly determine the sentences being spoken by the computer.
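The two-stage data flow described above can be sketched in code. This is a minimal illustration only: the dimensions, the `LinearStage` stand-in, and all variable names are hypothetical assumptions, not the study’s actual models (the real system used trained recurrent neural networks). The point is simply to show the pipeline shape: cortical activity is first decoded into virtual vocal tract movements, which are then converted into acoustic features.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical, illustrative dimensions (not taken from the paper):
N_ELECTRODES = 256   # recording channels over the speech cortex
N_KINEMATIC = 33     # articulatory features: lips, jaw, tongue, vocal cords
N_ACOUSTIC = 32      # acoustic features driving the speech output
T = 100              # time steps in one utterance

class LinearStage:
    """Stand-in for one trained network: a fixed linear map per time step."""
    def __init__(self, n_in, n_out):
        self.W = rng.standard_normal((n_in, n_out)) * 0.1

    def __call__(self, x):
        return x @ self.W

# Stage 1: decoder — brain activity patterns -> virtual vocal tract movements
decoder = LinearStage(N_ELECTRODES, N_KINEMATIC)
# Stage 2: synthesizer — vocal tract movements -> synthetic speech features
synthesizer = LinearStage(N_KINEMATIC, N_ACOUSTIC)

neural_activity = rng.standard_normal((T, N_ELECTRODES))  # recorded signals
kinematics = decoder(neural_activity)    # (T, N_KINEMATIC) trajectory
acoustics = synthesizer(kinematics)      # (T, N_ACOUSTIC) speech features

print(kinematics.shape, acoustics.shape)
```

Splitting the problem into these two stages mirrors the study’s finding that the cortex encodes vocal tract movements rather than sounds directly: the intermediate kinematic representation is what makes the decoding tractable.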
The team surmises they have developed a neural decoder that leverages kinematic and sound representations encoded in human cortical activity to synthesize audible speech. For the future, the researchers state they plan to design a clinical trial involving paralyzed, speech-impaired patients to determine how best to gather brain signal data that can then be applied to the previously trained computer algorithm.
Source: UC San Francisco
Michelle Petersen is the founder of Healthinnovations, having worked in the health and science industry for over 21 years, which includes tenure within the NHS and Oxford University. Healthinnovations is a publication that has reported on, influenced, and researched current and future innovations in health for the past decade.
Michelle has been picked up as an expert writer for Informa publisher’s Clinical Trials community, and is listed as a blog source by the world’s leading medical journals, including the acclaimed Springer Nature journal series.
Healthinnovations is currently indexed by the trusted Altmetric and PlumX metrics systems as a blog source for published research globally. Healthinnovations is also featured in the world-renowned BioPortfolio (BioPortfolio.com), the life science, pharmaceutical and healthcare portal.
Most recently, Texas A&M University covered The Top 10 Healthinnovations series on their site, with distinguished Professor Stephen Maren calling the inclusion of himself and his team on the list a reflection of “the hard work and dedication of my students and trainees”.
Michelle Petersen’s copy was used in the highly successful marketing campaign for the mega-hit film ‘Jumanji: The Next Level’, starring Jack Black, Karen Gillan, Kevin Hart and Dwayne ‘The Rock’ Johnson. Michelle Petersen’s copywriting was part of the film’s coverage by the Republic TV network. Republic TV has been the most-watched English language TV channel in India since its inception in 2017.
An avid campaigner in the fight against child sex abuse and trafficking, Michelle is a passionate humanist striving for a better quality of life for all humans by helping to provide traction for new technologies and techniques within healthcare.