DeepMind’s WaveNet Technology Makes Google Assistant’s New Male and Female Voices Sound More Realistic

Zara Ali • Oct 9, 2017 at 05:39pm EDT


Google recently rolled out male and female voice options for Google Assistant in English, a welcome alternative for those who have voice preferences for virtual assistants. The new voices sound more realistic thanks to a deep neural network for sound synthesis from Alphabet’s DeepMind division.

In 2016, the Alphabet lab introduced WaveNet, a deep neural network for “generating raw audio waveforms that is capable of producing better and more realistic-sounding speech than existing techniques.”

In the span of 12 months, the team took this “computationally intensive” research prototype to consumer products, the first being the Google Assistant voices for US English and Japanese. The new model can produce waveforms 1,000 times faster than the original, with better resolution and fidelity.

Computational Approach

Alphabet's computational approach to text-to-speech is a big leap forward compared with previous methods, in which voice artists recorded a huge database of sounds that were then stitched together. The downside of that older method is that the resulting voices are difficult to modify, since the whole database needs reworking whenever changes such as new intonations or emotions are introduced. The computational approach also takes far less time to process sounds than the previous method.
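To make the contrast concrete, the older database-driven method can be pictured roughly as follows. This is only a minimal sketch: the `UNIT_DB` table, its unit names, and its sample values are hypothetical stand-ins for a real recorded database, and `concatenative_tts` is an illustrative helper, not any product's actual code.

```python
# Hypothetical recorded-unit database: each speech unit maps to a short
# list of audio samples captured from a voice artist (values are made up).
UNIT_DB = {
    "hel": [0.0, 0.2, 0.4],
    "lo": [0.3, 0.1, 0.0],
    "wor": [0.0, -0.2, -0.3],
    "ld": [-0.1, 0.0],
}


def concatenative_tts(units, db):
    """Join pre-recorded units end to end to form one waveform.

    Raises KeyError if a requested unit was never recorded, which is
    why changing the voice means changing the whole database.
    """
    waveform = []
    for unit in units:
        if unit not in db:
            raise KeyError(f"unit {unit!r} was never recorded")
        waveform.extend(db[unit])
    return waveform


hello = concatenative_tts(["hel", "lo"], UNIT_DB)
```

Because the output is only ever a rearrangement of stored recordings, a new emotion or intonation cannot be produced unless matching units are added to the database first.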

DeepMind's computational approach, introduced in 2016, included a “deep generative model that can create individual waveforms from scratch.”

It allowed the synthesized speech to include natural-sounding accents and intonation, and even incidental mouth sounds like “lip smacks.”

In its blog post, DeepMind explains:

It was built using a convolutional neural network, which was trained on a large dataset of speech samples. During this training phase, the network determined the underlying structure of the speech, such as which tones followed each other and what waveforms were realistic (and which were not). The trained network then synthesised a voice one sample at a time, with each generated sample taking into account the properties of the previous sample.

The resulting voice contained natural intonation and other features such as lip smacks. Its “accent” depended on the voices it had trained on, opening up the possibility of creating any number of unique voices from blended datasets. As with all text-to-speech systems, WaveNet used a text input to tell it which words it should generate in response to a query.
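The sample-by-sample generation described in the quote can be illustrated with a toy sketch. This is not DeepMind's implementation: the weights below are random and untrained, there is no text conditioning, and names such as `LEVELS`, `DILATIONS`, and `generate` are assumptions made for illustration. It only shows the general idea of stacked causal convolutions predicting each new sample from the samples already generated.

```python
import math
import random

LEVELS = 16               # quantized amplitude levels (hypothetical, kept small)
DILATIONS = [1, 2, 4, 8]  # each layer looks twice as far back as the last
RECEPTIVE_FIELD = 1 + sum(DILATIONS)  # past samples that can affect a prediction


def make_toy_weights(rng):
    # Random, untrained weights: two taps per layer plus an output projection.
    layers = [(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in DILATIONS]
    out_proj = [rng.uniform(-1, 1) for _ in range(LEVELS)]
    out_bias = [rng.uniform(-1, 1) for _ in range(LEVELS)]
    return layers, out_proj, out_bias


def causal_layer(x, taps, dilation):
    # Causal dilated convolution: position t sees x[t] and x[t - dilation],
    # never anything from the future. Missing history is treated as zero.
    w_now, w_past = taps
    return [
        math.tanh(w_now * x[t] + w_past * (x[t - dilation] if t >= dilation else 0.0))
        for t in range(len(x))
    ]


def next_sample_probs(history, weights):
    # Return a probability for each amplitude level of the next sample.
    layers, out_proj, out_bias = weights
    window = history[-RECEPTIVE_FIELD:]
    x = [2.0 * s / (LEVELS - 1) - 1.0 for s in window]  # scale levels to [-1, 1]
    for taps, dilation in zip(layers, DILATIONS):
        x = causal_layer(x, taps, dilation)
    logits = [p * x[-1] + b for p, b in zip(out_proj, out_bias)]
    top = max(logits)
    exps = [math.exp(v - top) for v in logits]  # numerically stable softmax
    total = sum(exps)
    return [e / total for e in exps]


def generate(n_samples, seed=0):
    # Autoregressive loop: every new sample is drawn using the earlier ones.
    rng = random.Random(seed)
    weights = make_toy_weights(rng)
    samples = [LEVELS // 2]  # start from the middle level as "silence"
    for _ in range(n_samples):
        probs = next_sample_probs(samples, weights)
        samples.append(rng.choices(range(LEVELS), weights=probs)[0])
    return samples[1:]


audio = generate(50)
```

With untrained weights the output is just noise; in the system the quote describes, training on speech recordings is what shapes these predictions into realistic waveforms. The one-sample-at-a-time loop also hints at why the original prototype was so computationally intensive.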

You can check out DeepMind's latest blog post on the new approach for male and female voices on Google Assistant.
