Google’s DeepMind Will Speak Like A Human
Google is working on a project called DeepMind, an artificial intelligence (AI) system that aims to build learning algorithms using neuroscience and deep learning to create advanced AI machines. The company now seems to have figured out how to make the system talk with remarkable fluency.
With a program called WaveNet, Google says it has finally achieved more human-like speech. The ultimate goal, though, is completely fluent human speech of the kind we see in sci-fi movies.
What Is DeepMind?
Google has been working on this technology ever since it acquired DeepMind, a British company, in 2014. DeepMind's vision is to create AI that not only powers advanced computers but also devices that can work in concert with the human mind.
DeepMind differs from other AI technologies such as Deep Blue or Watson: it is not programmed for one particular task. Instead, it is designed to learn to perform whatever task is required.
For now, the team behind DeepMind is focused on exploring the capabilities of the AI system and its software by setting it thinking challenges, such as playing arcade games and building other computer structures by itself.
DeepMind can teach itself to perform and execute a task without being programmed for that specific task. During trials with a variety of games, it proved capable of learning to play them and reaching high scores with a considerable level of efficiency.
Talk with WaveNet
DeepMind has created WaveNet to give the system a voice, a crucial part of communication between humans and machines.
Earlier, this was done with an artificial voice generator built on text-to-speech (TTS) systems. Google has moved past that approach with WaveNet, which sounds much more like a human.
The system is still in a beta phase and not yet practical for real-life devices. Google has also stated that the technology needs a great deal of research and development before it is viable for commercial distribution, so the integration of WaveNet with other Google devices won't happen anytime soon.
Judging from the audio samples of WaveNet released by DeepMind in U.S. English and Mandarin Chinese, it certainly sounds more like a person than a robot.
A Step Closer to AI Systems
Instead of working from the structure of a language, WaveNet works from the sound waves the language produces. The network is a neural system whose connections use the human brain as a reference.
WaveNet models speech directly, generating the raw waveforms of the audio. Achieving this requires a considerable amount of data and training, and the TTS data Google already had proved crucial to the result.
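The idea of generating raw waveforms one sample at a time can be illustrated with a toy sketch. This is a minimal NumPy illustration of the general technique, not DeepMind's actual architecture: the function names, the tiny fixed weights, and the plain tanh nonlinearity are placeholders standing in for WaveNet's much larger gated network.

```python
# Toy sketch of WaveNet's core idea: a model over raw waveform samples
# built from dilated causal convolutions, so each output sample depends
# only on past samples and the lookback window grows quickly with depth.
import numpy as np

def causal_dilated_conv(x, w0, w1, dilation):
    """2-tap causal convolution: out[t] = w0*x[t] + w1*x[t - dilation].

    Samples before the start of the signal are treated as zeros.
    """
    past = np.concatenate([np.zeros(dilation), x])[: len(x)]
    return w0 * x + w1 * past

def wavenet_stack(x, dilations=(1, 2, 4, 8)):
    """Stack causal layers with a tanh nonlinearity (a stand-in for
    WaveNet's gated units; the real model is far larger)."""
    y = x
    for d in dilations:
        y = np.tanh(causal_dilated_conv(y, 0.6, 0.4, d))
    return y

rng = np.random.default_rng(0)
x = rng.standard_normal(32)   # a stand-in for 32 raw audio samples
y = wavenet_stack(x)

# Causality check: zeroing the "future" (samples 20 onward) leaves all
# earlier outputs untouched, because no layer looks ahead.
x_future = x.copy()
x_future[20:] = 0.0
assert np.allclose(y[:20], wavenet_stack(x_future)[:20])
```

With dilations 1, 2, 4 and 8, each output sample sees at most 1 + 2 + 4 + 8 = 15 samples of past context, so the lookback window doubles with each layer while the amount of computation grows only linearly. The real WaveNet stacks many more such layers and predicts a probability distribution over the next audio sample, which is why it needs the large amount of training data described above.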
Aäron van den Oord, a researcher at DeepMind, said:
Mimicking realistic speech has always been a major challenge, with state-of-the-art systems, composed of a complicated and long pipeline of modules, still lagging behind real human speech. Our research shows that not only can neural networks learn how to generate speech, but they can already close the gap with human performance by over 50%. This is a major breakthrough for text-to-speech systems, with potential uses in everything from smartphones to movies, and we’re excited to publish the details for the wider research community to explore.