Watching a YouTube Video? Hackers Can Infect Your Device Using Hidden Voice Commands

Jul 13, 2016

With malware and ransomware campaigns running left and right, it is not surprising to see security researchers trying to discover what new or futuristic methods attackers could use to launch these campaigns. We typically learn about a new loophole, vulnerability or attack strategy only once it’s already being exploited in the wild. Early knowledge of these, however, helps researchers not only keep the public informed but also devise better security countermeasures. In one such attempt, researchers from the University of California, Berkeley and Georgetown University revealed that hidden voice commands can be issued to hack mobile devices without the user’s knowledge.

How to hack a device using a YouTube video

Researchers have devised a way to use YouTube to hack into mobile devices, where the intended victim has to do nothing but watch a YouTube video, or any other video for that matter. In a paper, the security research team said that the hidden voice commands used in these attacks are unintelligible to humans but are easily interpreted by devices in the vicinity. This means that you don’t even have to watch the video yourself, as long as the specially crafted video is playing on a nearby device or being broadcast in some way. The commands are picked up by personal assistants such as Google Now or Apple’s Siri, which are often set to an always-on mode and accept any command they receive.


“Adversaries with significant knowledge of the speech recognition system can construct hidden voice commands that humans cannot understand at all,” the paper said. If deciphered by voice-based assistants, however, these commands could be executed without the user’s knowledge. They can instruct a mobile device to download malware, post a user’s location, or change system settings, such as activating airplane mode to cause a denial of service, potentially leading to spying, data leaks, and other privacy and security issues.

The researchers described two models for launching this attack: black-box and white-box.

“In the black-box model, an attacker uses the speech recognition system as an opaque oracle. We show that the adversary can produce difficult to understand commands that are effective against existing systems in the black-box model.”

“Under the white-box model, the attacker has full knowledge of the internals of the speech recognition system and uses it to create attack commands that we demonstrate through user testing are not understandable by humans.”
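To make the black-box idea concrete, here is a toy sketch of what using a recognizer as an opaque oracle could look like. It is not the researchers' method: strings stand in for audio clips, `#` characters stand in for distortion, and `toy_oracle` and `search_black_box` are hypothetical names invented for this illustration. The only point it shows is that the attacker never inspects the recognizer and keeps the most distorted candidate that is still transcribed as the target command.

```python
from typing import Callable, Iterable, Optional


def search_black_box(
    target: str,
    candidates: Iterable[str],
    oracle: Callable[[str], str],
) -> Optional[str]:
    """Return the last candidate the oracle still transcribes as `target`.

    `candidates` is assumed to be ordered from least to most distorted,
    so the last accepted one is the hardest for a human to understand.
    The oracle is only queried, never inspected (black-box setting).
    """
    best = None
    for candidate in candidates:
        if oracle(candidate) == target:
            best = candidate
    return best


def toy_oracle(clip: str) -> str:
    """Hypothetical stand-in for a speech recognizer.

    It tolerates up to three '#' distortion marks and otherwise
    returns an empty transcript.
    """
    if clip.count("#") <= 3:
        return clip.replace("#", "")
    return ""


command = "open example.com"
# Candidates with 0..5 distortion marks appended, least distorted first.
clips = [command + "#" * n for n in range(6)]
most_distorted = search_black_box(command, clips, toy_oracle)
print(most_distorted)
```

In a real setting the candidates would be audio signals and the oracle a deployed speech recognition service, so the loop would be far more expensive than this string-based stand-in suggests.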

A similar attack was devised last year by the French information security agency ANSSI, which discovered that a hacker can take full control of a device from 16 feet away. That hack only worked when the target device had headphones plugged in.


This is a novel way to attack a target device, but not an unlikely threat. The research team has suggested a number of countermeasures, including a notification every time a voice command is received or executed. The researchers have also published a PoC video of the black-box attack being carried out in the presence of background noise, with the target device placed 10 feet from the speakers used to play the voice commands. For more technical details, take a look at “Hidden Voice Commands” [PDF].
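As a rough illustration of the notification countermeasure mentioned above, the sketch below wraps a command handler so that the user is told whenever a voice command is received and again when it has been executed. The class `NotifyingAssistant` and its `execute` and `notify` callbacks are assumptions made for this example; they do not correspond to any real assistant API or to the design proposed in the paper.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class NotifyingAssistant:
    """Hypothetical wrapper that announces every voice command."""

    execute: Callable[[str], str]   # runs the command, returns a result
    notify: Callable[[str], None]   # shows a message to the user
    history: List[str] = field(default_factory=list)

    def on_voice_command(self, transcript: str) -> str:
        # Tell the user a command was heard before acting on it.
        self.notify(f"Voice command received: {transcript!r}")
        self.history.append(transcript)
        result = self.execute(transcript)
        # Tell the user again once the command has run.
        self.notify(f"Voice command executed: {transcript!r}")
        return result


# Usage: collect notifications in a list instead of showing a real alert.
shown: List[str] = []
assistant = NotifyingAssistant(
    execute=lambda text: f"done: {text}",
    notify=shown.append,
)
outcome = assistant.on_voice_command("turn on airplane mode")
print(shown)
```

A notification alone only makes a hidden command visible after the fact; a stricter variant could ask for confirmation before calling `execute`, at the cost of making the assistant less convenient.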