How Amazon's Echo built the future of computing

Amazon's Echo was released in 2014 Credit: AP

In late 2014, Amazon unveiled a peculiar black cylinder with so little warning that nobody quite knew what to make of it.

The Echo, a voice-activated wireless speaker with no touchscreen or buttons to control it, was unlike anything that had come before. Sales were limited to a select group of customers, and the device’s appeal was hard to describe.

Operated entirely by speaking to “Alexa”, its artificial intelligence, the Echo appeared to take the concept of smartphone assistants such as Siri – which many people saw as a novelty extra on their phones – and, curiously, make it the centre of the device.

Early reactions were that it was a Christmas-season marketing gimmick or, worse, a privacy nightmare. At first the Echo’s voice recognition was less than flawless, and, coming off the back of Amazon’s derided Fire Phone, it looked like yet another weird experiment.

But two years later, the Echo has become a cult hit. The £150 speaker, which went on sale in the UK and Germany this year along with the cheaper Echo Dot, has become a fixture in many homes and one of Amazon’s top-selling items.

While mobile phone voice companions often fail to understand speech or are incapable of acting on it, improvements to the Echo mean it now interprets commands with unrivalled accuracy. Barking a command at it – you say “Alexa” to wake the device before issuing an order – has replaced the phone or laptop for many people as the way to control music, set timers or look up information. It has even received marriage proposals – more than 250,000 of them.

Its voice-activated technology, however, may simply be at the beginning of its potential, and some believe that it represents the cusp of the next computing revolution; that reliable speech recognition could be an advance as profound as the arrival of the smartphone.

The next wave of computing

“We think voice is the most natural user interface,” says Mike George, Amazon’s vice president of the Alexa division. “We're taking things that required devices, that in the past were done on apps, and you’re using your language and your voice to do it.

“Where you normally would have pulled out a laptop, fired up a browser, gone to a search engine and asked a question, if you can just do that by speaking you probably will just speak.”

When Amazon was first developing the Echo, the inspiration was the always-listening computer system onboard Star Trek’s spaceship (Amazon’s chief executive Jeff Bezos is a renowned fan of the series). In the show, a question or command from a crew member would instantly be answered by the disembodied voice.

But while many people saw Star Trek before ever using a PC, computers took on a different form – the system of icons and controls on a screen known as the graphical user interface. When voice assistants finally arrived – in the shape of smartphone technology such as Apple’s Siri or Google’s Android equivalent – they were limited and under-used. Turning speech, with all its variations and idiosyncrasies, into computer code and back again has flummoxed computer scientists for years.

Smartphone assistants such as Siri came before Alexa, but the latter has been better received Credit: Getty Images

But done right, voice control is superior to the screen for many things. As something people learn from infancy and understand throughout their lives, it requires no training, and disability is no impediment. Speaking to ask a question is typically quicker than typing or clicking around a website. 

“Alexa has changed how people think of voice as an interface for controlling digital services,” says Werner Vogels, Amazon’s chief technology officer. “Voice is such a more natural way of interacting with these digital services; the best thing we have done with Echo is not putting a screen on it.”

Building on Alexa

At present, the Echo’s capabilities are relatively basic compared to a fully-fledged PC or smartphone. The device still occasionally misunderstands, and is best suited to small interactions – adding items to a shopping list or asking the weather – while more complicated tasks such as buying items from Amazon are a little awkward. Companies can also develop “skills”, the equivalent of apps, which let users call an Uber or order a takeaway, and the Echo can control the growing number of “smart home” gadgets such as internet-connected lightbulbs or thermostats, via voice.
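
What a “skill” looks like under the hood is fairly simple. The sketch below is a rough illustration only, assuming the skill’s backend runs as an AWS Lambda function that receives Alexa’s JSON request and returns the text to be spoken back; the skill and intent names (such as OrderPizzaIntent) are invented for the example rather than taken from any real service.

```python
# Rough sketch of an Alexa "skill" backend, assuming it runs as an AWS Lambda
# function. Alexa sends a JSON request describing what the user said, and the
# function replies with the text Alexa should speak. The skill and intent
# names here (e.g. OrderPizzaIntent) are invented for illustration.

def build_response(speech_text, end_session=True):
    """Wrap plain-text speech in the response envelope Alexa expects."""
    return {
        "version": "1.0",
        "response": {
            "outputSpeech": {"type": "PlainText", "text": speech_text},
            "shouldEndSession": end_session,
        },
    }


def lambda_handler(event, context):
    """Entry point invoked when a user talks to the skill."""
    request = event.get("request", {})

    if request.get("type") == "LaunchRequest":
        # "Alexa, open Pizza Palace" with no further instruction
        return build_response("Welcome. What would you like to order?",
                              end_session=False)

    if request.get("type") == "IntentRequest":
        intent_name = request.get("intent", {}).get("name")
        if intent_name == "OrderPizzaIntent":
            # A real skill would read slot values and call the company's API here
            return build_response("Okay, ordering your usual pizza.")

    return build_response("Sorry, I didn't catch that.")
```

The heavy lifting – recognising the speech and mapping the user’s words to an intent – is done by Amazon’s Alexa service, so the developer only ever deals with structured requests like the one above.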

But the promise of the technology is what excites Amazon’s executives. Earlier this month, the company announced new features to allow developers to build more advanced skills. In future, says Al Lindsay, who runs Amazon’s Alexa speech team, the device will be able to book tables at restaurants or holidays with friends, things that often take a series of emails or phone calls today.

Rohit Prasad, the Alexa division’s chief scientist, says there is still too much “friction” in communicating with the device – users have to activate skills using particular phrases for them to work. For example, you must say “Alexa, ask Uber to order me a car” rather than simply “Alexa, order me a car”.
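
The phrasing matters because, today, the skill has to be named explicitly so Alexa knows where to send the request. The snippet below is a simplified illustration of that routing idea – not Amazon’s actual logic – using a made-up registry of invocation names.

```python
import re

# Hypothetical registry mapping spoken invocation names to skills.
SKILLS = {
    "uber": "Uber ride-hailing skill",
    "just eat": "takeaway ordering skill",
}


def route(command):
    """Route a command of the form 'Alexa, ask <skill> to <request>'."""
    match = re.match(r"alexa,?\s+ask\s+(.+?)\s+to\s+(.+)", command.lower())
    if not match:
        # No skill named: the assistant would have to infer the intent itself,
        # which is the harder problem Mr Prasad wants to solve.
        return None
    invocation_name, request = match.groups()
    skill = SKILLS.get(invocation_name)
    return (skill, request) if skill else None


print(route("Alexa, ask Uber to order me a car"))
# -> ('Uber ride-hailing skill', 'order me a car')
print(route("Alexa, order me a car"))
# -> None
```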

The Echo has become one of Amazon's top-selling gadgets Credit: Rex Features

But Mr Prasad says the technology will become smarter. “We definitely want to solve that problem; there are a few things we’ll do to make it much simpler in the next few months,” he says.

He also promises improvements to other areas of the technology – the fact that each interaction must start with “Alexa”, for example, or that the Echo cannot proactively talk to you with the equivalent of a smartphone push notification. These are challenges, he says, that require the company to stay on the right side of privacy concerns.

What it means for Amazon

As is often the case with Amazon, which is famed for its relentless focus on growth over immediate profits, the business model for the Echo is not yet clear. The company makes almost no money on selling the speakers themselves, although it does have a music streaming service designed for the Echo, as well as allowing customers to place orders through them. Mr George says that one day it could make money from taking a cut from developers selling skills, as Apple does on its iPhone App Store.

Amazon is also reportedly building a version of the Echo with a screen, which would make things such as shopping easier, although the notoriously secretive company has not confirmed any plans. “We know what the screen can offer,” Mr Prasad says. “Shopping, for instance – certain things you will not buy without looking at a screen.”

The company does have competition. The Echo blindsided the rest of the industry, but only for so long. Last month Google – famed for its artificial intelligence pedigree – released a rival device, the Google Home, whose Assistant software, unlike Amazon’s Alexa, is also available on its smartphones (Mr George claims not to have sampled his main rival’s offering).

But Amazon has a two-year head start, not to mention the support of its mammoth online retail operation – displays for the Echo are unmissable when visiting the website. It is also no slouch when it comes to AI: from its early days, the company has used machine learning to develop its online recommendations, detect fraud and train the robots that man its fulfilment centres.

The company’s research bases in Cambridge and Boston, which have hoovered up speech recognition start-ups, are frantically working on making the Star Trek computer a reality. If they get it right, Amazon may have one hand on the future.
