Researchers translate brain signals directly into speech.
A Brain–Computer Interface (BCI) couples the brain to Artificial Intelligence (AI), using signals recorded from the brain to enable communication, or control of a neuroprosthesis, by individuals with impaired function. The technology is now widely used; however, there remains vast room for improvement, with key biological and engineering problems still to be resolved. These include low-quality recordings by home users, slow translation speeds, rudimentary translation accuracy, and the difficulty of adapting applications to the needs of individual users. Now, a study from researchers at Columbia University describes a system which translates thought into intelligible, recognizable speech. The team state that by monitoring a person’s brain activity, their technology can reconstruct the words that person hears with unprecedented clarity. The open-access study is published in the journal Scientific Reports.
Previous studies show that when people speak, or imagine speaking, distinguishable patterns of activity appear in their brains. Distinct patterns of signals also emerge when people listen to someone speak, or imagine listening. Reconstructing speech from activity in the human auditory cortex has been shown to be possible, raising the prospect of a speech neuroprosthetic that establishes direct communication with the brain. However, the low quality of the reconstructed speech has severely limited the utility of this method for BCI applications. The current study combines recent advances in deep learning with the latest innovations in speech synthesis technologies to reconstruct closed-set intelligible speech from the human auditory cortex.
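The article does not detail the signal processing, but studies in this area commonly decode from the high-gamma band of the cortical recording. The sketch below shows how such a power envelope might be extracted from multichannel ECoG; the band edges, filter order, and sampling rate are illustrative assumptions, not figures from the paper.

```python
# Extract a high-gamma (roughly 70-150 Hz) power envelope from ECoG,
# a feature band commonly fed to speech decoders. All parameters here
# are assumptions for illustration, not taken from the study.
import numpy as np
from scipy.signal import butter, filtfilt, hilbert

def high_gamma_envelope(ecog, fs=1000.0, band=(70.0, 150.0)):
    """ecog: (n_channels, n_samples) array of raw ECoG voltages."""
    nyq = fs / 2.0
    b, a = butter(4, [band[0] / nyq, band[1] / nyq], btype="band")
    filtered = filtfilt(b, a, ecog, axis=-1)       # zero-phase band-pass
    envelope = np.abs(hilbert(filtered, axis=-1))  # instantaneous amplitude
    # z-score each channel so electrodes with different gains are comparable
    mu = envelope.mean(axis=-1, keepdims=True)
    sd = envelope.std(axis=-1, keepdims=True)
    return (envelope - mu) / sd
```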
The current study utilizes a vocoder, a computer algorithm which synthesizes speech after being trained on recordings of people talking; the same technology is used by Amazon Echo and Apple Siri. Epilepsy patients already undergoing brain surgery were asked to listen to sentences and numbers spoken by different people while their patterns of brain activity were recorded via invasive electrocorticography and used to train the vocoder. The sound produced by the vocoder in response to each patient’s brain signals was then analyzed and cleaned up by neural networks, a form of AI which mimics the structure of neurons in the biological brain. The output of this BCI is a robotic-sounding voice reciting a sequence of numbers.
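As a rough illustration of the decoding stage described above, the sketch below assumes a simple feedforward network that regresses one frame of vocoder parameters from a window of neural features, which a separately trained vocoder would then turn into audio. The class name, layer sizes, window length, and parameter count are all hypothetical, not the authors’ architecture.

```python
# A minimal sketch: a deep network maps windows of neural features to
# vocoder parameter frames. Sizes below are invented for illustration.
import torch
import torch.nn as nn

N_CHANNELS, WINDOW, N_VOCODER_PARAMS = 128, 30, 32  # hypothetical sizes

class NeuralToVocoder(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(N_CHANNELS * WINDOW, 512),
            nn.ReLU(),
            nn.Linear(512, 256),
            nn.ReLU(),
            nn.Linear(256, N_VOCODER_PARAMS),  # one frame of vocoder params
        )

    def forward(self, x):
        # x: (batch, N_CHANNELS, WINDOW) window of high-gamma features
        return self.net(x.flatten(start_dim=1))

model = NeuralToVocoder()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.MSELoss()  # regress onto the vocoder's parameter frames

def train_step(features, target_params):
    optimizer.zero_grad()
    loss = loss_fn(model(features), target_params)
    loss.backward()
    optimizer.step()
    return loss.item()
```

Training such a regressor requires paired data: neural features recorded while a patient listens, aligned with the vocoder parameters of the audio they heard.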
To test the accuracy of the reconstruction, the group asked participants to listen to the recording and report what they heard. Results show that the patients could understand and repeat the sounds about 75% of the time. The lab state that the sensitive vocoder and powerful neural networks represented the sounds the patients had originally listened to with surprising accuracy. They go on to add that they plan to test more complicated words and sentences next, and hope their system could become part of an implant, similar to those used by some epilepsy patients, which translates the wearer’s thoughts directly into words.
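The intelligibility measure described here reduces to token-level accuracy: the fraction of spoken tokens that listeners report back correctly. A minimal sketch, with invented transcripts purely for illustration:

```python
# Token-level intelligibility: fraction of presented tokens (here, digits)
# that a listener reports back correctly. Transcripts below are invented.
def intelligibility(presented, reported):
    """Accuracy between what was played and what the listener heard."""
    assert len(presented) == len(reported)
    correct = sum(p == r for p, r in zip(presented, reported))
    return correct / len(presented)

played = ["zero", "one", "two", "three", "four", "five", "six", "seven"]
heard  = ["zero", "one", "two", "tree",  "four", "five", "six", "heaven"]
print(f"{intelligibility(played, heard):.0%}")  # -> 75%
```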
The team surmise they have developed a BCI which can translate brain signals directly into speech. For the future, the researchers state this breakthrough, which harnesses the power of speech synthesizers and artificial intelligence, could lead to new ways for computers to communicate directly with the brain.
Source: Columbia Engineering