Bytes and Beyond
Voice assistants: trick or treat?
First of all: Please don't expect me to provide you with a balanced view of voice assistants. Alexa & Co. have annoyed me far too often for me to stay fair and balanced. I didn't even need my own assistant to reach this point: Visits to friends with more or less Smart Homes were sufficient.
Of course, voice control and home automation aren't always pointless – in certain situations Bixby, Cortana, Google Assistant and Siri can be really helpful. If both hands have to remain on the steering wheel or if your arms are in plaster, it is really practical when an assistant takes you at your word.
Like so many digital inventions, voice assistants can be as much a curse as a blessing. Let me help you weigh the pros and cons.
The other day a friend sent me this joke that I can't find again, forcing me to reconstruct it from memory. "In the 1980s we worried: 'Oh God, we're being eavesdropped on by hidden listening devices!' In 2019, it's: 'Hello listening device, be sure to add toilet paper to my shopping list.'"
Today, you can even choose who you want to do the eavesdropping. The best-known voice assistants are Alexa by Amazon, Siri by Apple, Assistant by Google, Cortana by Microsoft and Bixby by Samsung. However, some corporate helplines also feature voice recognition technology to take down customer data and phone numbers as well as record error reports. And let's not forget my favorite, the voice feature of my car GPS, which is best described as "stubborn as a Southern donkey."
By themselves, assistants aren't much
On their own, voice assistants can provide little more than basic information: What's the weather like, read me the news, what theaters are playing the current Spider-Man movie. To become active, they require additional home automation hardware – enter the buzzword "Smart Home".
Smart Homes are the model railways of the 21st century: On the one hand, I do understand the satisfaction when you just have to say a few words to make the lights go out on the ground floor, instead of having to shuffle down the stairs yet again. On the other hand, this joy is just as childish as when the tiny gates at the miniature railroad crossing automatically lower before the electric locomotive races through.
Voice assistants plus home automation hardware can preheat apartments, tilt windows, lower blinds, irrigate lawns and lock doors. All this from afar or from the bathtub. Chores formerly assigned to the youngest child of the family are now taken care of by a voice assistant – which will never complain. That's progress for you.
Where assistants are useful
In all fairness, assistants can be extremely useful. For a person in a wheelchair, for example, voice-controlled home automation can mean a significant gain in independence. If you can reach the window handle only in agony or not at all, a remote-controlled motor is no longer a childish toy, but a sensible improvement to your quality of life.
If your arm is stuck in a plaster cast after an accident, a mobile phone that listens helps you get through the tedious weeks of healing (get well soon, by the way). With a car GPS, voice control is actually a must – if you are doing 120 km/h on the motorway when the navigator reports a traffic jam and asks whether you want to take a route that is 30 minutes shorter, it must be able to react reliably to a shouted "oh God, please yes."
As long as there are people for whom voice assistants are a real improvement to their lives and not just a geeky gadget, I don't mind voice recognition technology spreading further – provided I can still find a way around it. When the voicemail system of my phone provider becomes too annoying, I start to mumble incomprehensibly until the computer gives up and connects me to a real person.
Is artificial intelligence better than none at all?
Voice assistants are often lumped together with artificial intelligence. Depending on the writer's attitude, this is supposed to evoke coolness or a threat – in either case it's nothing but hot air. In a nutshell, artificial intelligence doesn't exist. "Machine learning" is a far more accurate term: Roughly speaking, it's a matter of computers learning to recognize patterns and react accordingly.
Today's voice assistants don't even learn: Most of the current generation can react only to a very limited word set and even then they can be quite picky. Recently, a dear friend tried to get an Alexa speaker to switch off the lamp next to his sofa. He tried it three times using different intonations, the volume of his voice increasing with each attempt. I sat quietly on the sofa, torn between pitying and mocking him.
Maybe the difference from a stubborn child isn't that big: As a small child I sometimes closed my ears from the inside when my mom yelled my name from the living room. I knew I was needed, but I didn't feel like it.
Alexa probably wasn't faking it, though – she really seems to be somewhat hard of hearing. Smart speakers and mobile phones can't really recognize much by themselves. Instead, they record the command and send it to a big neural network in the cloud, which then tells the dumb hardware how to react. If this connection is broken, the assistant will merely go "Huh?" – much more politely than an indignant child, of course.
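For the curious, here is roughly what that division of labor looks like, as a minimal Python sketch. It is not how any vendor actually does it: the function name handle_utterance and the recognize_in_cloud callback are made up for illustration, and a real device would send audio over the network rather than call a local function.

```python
from typing import Callable


def handle_utterance(audio: bytes, recognize_in_cloud: Callable[[bytes], str]) -> str:
    """Forward recorded audio to a cloud recognizer and return its answer.

    Both names are hypothetical. The "device" does no recognition itself;
    it only ships the recording off and repeats what comes back.
    """
    try:
        # All the actual understanding happens on the other end.
        return recognize_in_cloud(audio)
    except ConnectionError:
        # Without the cloud, the hardware has nothing to fall back on.
        return "Huh?"
```

Pass in a callback that raises ConnectionError and the only thing the device has left to say is "Huh?".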
Without additional human help, the current generation of voice assistants would be a complete failure. Amazon and Google are known to have a phalanx of transcribers in low-wage countries working behind the scenes, busily typing up recorded conversations to improve the assistants' recognition rates. Apple, Microsoft and Samsung are unlikely to handle things differently. That's quite an effort just to make sure Alexa will eventually take down an order for a year's supply of salt and vinegar chips even though your mouth was full.
The standard voice is always female
Too many basic things bother me about voice assistants to want to give them a chance in my household. This starts with the fact that all voice assistants seem to talk with warm female voices.
It's not that I long for voice assistants to talk in a screeching male voice à la Gilbert Gottfried. But I'm still annoyed by the fact that all voice assistants are female by default. To my taste, the role cliché of the submissive female has not been out of fashion for long enough. Only Apple's Siri, Google Assistant and Samsung's Bixby offer the option of switching their gender – Google even offers a choice of four male voices. Amazon's Alexa and Microsoft's Cortana, on the other hand, are locked to the female gender – their warm voices patiently adding toilet paper to your shopping list, provided there's an internet connection.
A deeper issue is the bitter core of truth within the joke I mentioned at the beginning. Time and time again, voice assistants have recorded conversations because they mistook a word from a casual conversation for an activation command. In the USA, Alexa speakers are said to have reacted by the thousands when her name was mentioned on TV. Amazon has recently filed a patent for Alexa to execute an instruction even if the activation word comes after the command itself. This can only work if the system is constantly listening.
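Why "constantly listening"? A small Python sketch makes the logic concrete. It is only an illustration under my own assumptions, not Amazon's patented method: the function name, the wake word constant and the buffer size are all hypothetical, and words stand in for audio. The point is that a command spoken before the activation word can only be recovered if the device has been holding on to what it heard.

```python
from collections import deque
from typing import Deque, Iterable, List

WAKE_WORD = "alexa"  # hypothetical activation word for this sketch


def commands_before_wake_word(words: Iterable[str], buffer_words: int = 8) -> List[str]:
    """Return the phrases that were spoken just before each wake word.

    buffer_words is an assumed setting: how many recent words the device
    keeps in memory at all times, whether or not anyone addressed it.
    """
    recent: Deque[str] = deque(maxlen=buffer_words)  # rolling memory of speech
    commands: List[str] = []
    for word in words:
        if word.lower() == WAKE_WORD:
            if recent:
                # The command is whatever was overheard before the wake word.
                commands.append(" ".join(recent))
            recent.clear()
        else:
            recent.append(word)
    return commands


heard = "turn off the lamp Alexa".split()
print(commands_before_wake_word(heard))                   # ['turn off the lamp']
print(commands_before_wake_word(heard, buffer_words=0))   # []
```

With a buffer of zero words, that is, a device that truly ignores everything until it is addressed, there is simply nothing left to execute.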
In ioco veritas
Such circumstances awaken new desires: German interior ministers have recently voiced the opinion that access to voice recordings from smart home devices should not require a judge's permission for "acoustic surveillance", but merely a search warrant: After all, the recordings already exist and were not made specifically for the investigators.
It took the resulting public debate to make clear to me the basic problems of data protection raised by voice assistants. Ultimately, Amazon, Apple, Google & Co. store and process their assistants' recordings in accordance with their own rules. In a country such as Germany, where a good part of the population lived under systematic state surveillance for decades, this inevitably creates unease.
If Alexa will not even reliably understand that she should turn off a floor lamp, I don't want to imagine what might happen if the assistant misinterprets a lively discussion about terrorist attacks as a specific plan for one. Hold on for a second, there's somebody very insistently knocking at my door right now...