Scotty, beamed down.

The Voice

M.G. Siegler
Published in 500ish
8 min read · Jan 12, 2017

Over the past few years, everyone (investors, developers, consumers) has been looking for the next platform. 2017 is the ten-year anniversary of the iPhone (nine years since it truly became a platform, with the launch of the App Store), and it is in no way an overstatement to say that it fundamentally altered the state of computing. In fact, it’s not even an overstatement to say that it altered the state of the entire world, in many ways. Smartphones are now undoubtedly the most used device for a huge percentage of the world’s population.

So what’s next?

The hunt has been on basically since the launch of the iPhone. The iPad ushered tablet computing into the mainstream, and is a great business, but it’s ultimately an extension of the smartphone and morphing with the laptop. Google Glass was clearly a slightly off approach at the wrong time. Apple Watch is a nice-to-have, but not vital to anyone’s day-to-day. Maybe that changes down the road — in particular if the tracking of vitals becomes the key to the device — but it’s not there right now. The same feels true with AR/VR; getting more interesting by the day, but still a bit too early.

And so, we wait.

But what if the next interesting platform has been under our noses the whole time? Or rather, right around our noses. Our mouths and ears…

Can You Hear Me Now?

I’ve been intrigued by the power of voice with regard to computing for a long time. Back in the 1990s, I bought one of those awful Microsoft dictation microphones for Windows. It was laughably bad. Microsoft Bob bad. So bad, I can’t even seem to find it anywhere on the internet anymore to link to it! It has been erased from history. That bad.

Later, I also owned one of the first Bluetooth headsets. Yes, I was that guy. That cyborg guy. More recently, I’ve “hacked” my iPhone to basically read everything to me (via a more subtle Bluetooth headset).

I consider ‘voice’ to be both talking and listening — vocal and audible computing. Sometimes use cases call for one aspect, sometimes the other. Sometimes both. But they’re clearly tied together, in my mind. To that end, I think a confluence of events has made it possible that this will be the next truly interesting platform. Again, for investors, developers, and consumers.

Everyone knows about Amazon’s Alexa by now. Well, by “everyone” I mean everyone who is likely to read this post. In reality, Amazon has still only sold five million or so Echos so far. Apple has sold a billion iPhones. Perspective matters.

But that’s okay, because the Echo was simply the first vessel in which Alexa entered our lives. Amazon is now pushing Alexa far beyond that first device. That’s easier said than done, of course. But if this year’s CES is any indication, this is happening.

Directionally, the same seems to be true with Google’s Assistant. Google Home is just one outlet for the service. Notably, billions of Android devices are potential conduits. Cortana + Microsoft. Etc.

And then there’s Siri. While Apple had the foresight to acquire Siri and make it a marquee feature of the iPhone — in 2011! — before their competitors knew what was happening, Apple has treated Siri like, well, an Apple product. That is, iterate secretly behind the scenes and focus on new, big functionality only when they deem it ready to ship, usually timed with a new version of iOS. That’s great, but I’m not sure it’s the right way forward for this new computing paradigm — things are changing far too quickly.

This is where I insert buzzwords. AI. Machine Learning. AI. Machine Learning. AI. Machine Learning. AI. Machine Learning. AI. Machine Learning. AI. Machine Learning. AI. Machine Learning. AI. Machine Learning. AI. Machine Learning. AI. Machine Learning…

But really: AI. Machine Learning.

Apple: Ahead or Behind?

I found a recent interview Apple SVP of Marketing Phil Schiller did with Steven Levy rather illuminating on Apple’s thinking. One key part:

“That’s really important,” Schiller says, “and I’m so glad the team years ago set out to create Siri; I think we do more with that conversational interface than anyone else. Personally, I still think the best intelligent assistant is the one that’s with you all the time. Having my iPhone with me as the thing I speak to is better than something stuck in my kitchen or on a wall somewhere.”

Well, I reply, Amazon sees its Alexa voice interface not as something pinned to one device, but a ubiquitous and persistent cloud-based product that can listen to you anywhere.

“People are forgetting the value and importance of the display,” he says. “Some of the greatest innovations on iPhone over the last ten years have been in display. Displays are not going to go away. We still like to take pictures and we need to look at them, and a disembodied voice is not going to show me what the picture is.”

Two dismissals in there worry me about Apple’s approach.

First, that they believe the phone is far more interesting than having a stand-alone product in the space right now. I maybe — maybe¹ — agree with the general sentiment in the long term. But in the short term, it’s folly to dismiss the importance of having a dedicated device. For a thousand reasons.

With the technical hurdles to vocal computing quickly fading away, the key is normalizing it. We’ve been able to talk to our phones for years, but it’s still awkward to do this most of the time. I don’t know about you, but when I do it, I still pretend like I’m talking on the phone. To another human being. Like an animal.

Second, with regard to the screen… The brilliance of the Echo is that the only way to interact with it is through voice. This is a forcing function. While it may be easier to do some things via a screen or keyboard, in many cases, that’s in large part because the functionality to make such actions more robust was built out over time. Further, continuing to use such functions will simply perpetuate those paradigms.

I agree that for many things, those inputs are the better way to do things right now — and presumably this is why Amazon is working on an Echo with a screen — but to enable the future of functionality, we need to break habits. The Echo showcases this perfectly.

Anyway, I suspect Schiller is being a bit disingenuous here. Rumors have Apple working on its own Echo-like device, so this could simply be an old Steve Jobs-like maneuver of dismissing a space until you launch a competitor. But if Apple truly believes that the phone is going to be the real gateway here, they risk ceding the era to Amazon, which is thinking about it the exact opposite way: Alexa, not just on a phone, but everywhere.² An operating system for all things connected.³

But…

This leads to the AirPods. With each passing day, I love these things even more. And while they’re sort of symbolically tied to music right now (thanks to the similar look to Apple’s iconic white EarPods, which became synonymous with iTunes via all those brilliant ads), they are perhaps the perfect product yet for the vocal computing world laid out above.

They’re not just good for music, they’re great for audiobooks, for podcasts, for listening to anything via my voiceover “hack”. And, of course, they include a microphone for the other part of the equation. Yes, EarPods were good for all of this too — the difference, as I’m coming to realize, is that you can almost leave the AirPods in all day without even realizing they’re there. They’re an extension of your body — one that’s always connected to the internet via your iPhone (or iPad, or Mac).⁴

More importantly, the fact that Apple built in the ability to trigger Siri (double-tap) almost negates my concerns about Apple in this space as mentioned above. If it were well done (meaning, if it worked more than a fraction of the time on the first attempt), it might fully negate my concerns.

Of course, such functionality needs to be more fleshed out, presumably through Siri (though not necessarily), to let a thousand apps and services flourish. And not just the obvious ones like podcasts, etc. For this world to really take off, we’re going to need a thousand new types of services.

Think about services that may be voice-first. It’s really hard to do that right now in our current world. But there’s clearly opportunity there. What if this is not just the next platform, but the future of computing, in general…

Again, Amazon has the lead here. A couple nights ago, my wife and I started playing Jeopardy with Alexa. It was brilliant. So seamless, so obvious, so much less overhead than launching a visual-based game (whether on the phone or on a video game console), so much more interactive than watching television (where Alex Trebek won’t tell us specifically how we did).

It’s a silly example. But some of the biggest things start as silly examples. You’ve undoubtedly heard this ad nauseam, but it’s worth repeating until you get it: watch a child talk to Alexa. That’s also silly, until all those children grow up and start to interact with all of their computers this way.

¹ This is the vision of the future that everyone points to in the movie Her, of course. But in that world, Theodore doesn’t interact with Samantha mainly on a smartphone, but rather through an earpiece. The screen is the backup device for certain visual tasks…

² Of course, that’s a lot easier to do when you tried to launch a phone and utterly failed.

³ Wouldn’t it be interesting if after all the wasted words about the “operating system for the home” it wasn’t one of the living room devices like Apple TV or Xbox, but rather Amazon that nailed this via Alexa?

⁴ Yeah, battery life has to be better than 5 hours to make this a true reality, but as long as you’re not constantly listening to music, the effective battery life is much longer.


Writer turned investor turned investor who writes. General Partner at GV. I blog to think.