Computing in Concert

M.G. Siegler
4 min read · Jan 16, 2017

“Voice will never be the interaction model,” they say, out loud, to communicate their thoughts.

Here’s the thing. It’s pretty obvious what’s going to happen here. At least in my mind. And I’m now communicating that to you in written form. But I’m also thinking it in my head in verbal form. But it’s a non-audible verbal form. But I also imagine someone will read at least part of this out loud to someone else. It sort of depends on the circumstance…

And there it is. Your answer. No, vocal computing isn’t going to be the be-all, end-all of computing. Instead, it’s going to be one facet of interaction. I believe it will be a key one. But not the only one. Instead, many forms of interaction are going to work together to form a concert of computing.

Think about those scenes in Minority Report. You know, the ones where Tom Cruise is sifting through digital information while waving his hands about as if he were conducting an orchestra. It’s an over-dramatization, but it provides a visual framework for what I’m thinking about.

Our interaction with computers started with typing.¹ 50 years later, we’re still doing this. Things haven’t fully changed, but they have evolved. Physical keyboards have remained in heavy use, but they were eventually joined by a physical device to manipulate objects (in)directly on the screen: the mouse.

The mouse also still exists, but as laptops have become the more prevalent form of computer, the trackpad/touchpad has supplanted the mouse. And thanks to the rise of the smartphone, neither the keyboard nor the mouse nor the trackpad is now the primary method of interaction with computers. It’s now touch (whether manipulating objects on a screen directly or typing on software keyboards).

Add to this new paradigm of touch the concept of “multi-touch” and you all of a sudden have a whole range of gestures used for computer interaction that weren’t possible before. Some of these gestures have bled back to touchpads. And now, with the new MacBook Pros, we have touch bars to add to the mix.

My point is this: our interaction with computers has evolved and will continue to evolve. But that doesn’t necessarily mean that one form of interaction will “kill” another one. Instead, new ones are likely to augment existing ones. At some point, those new ones may supplant the old ones as the primary method of interaction, but legacy lingers. I’m actually typing this on a physical keyboard attached to my multi-touch iPad, after all.

All of this will be true as vocal computing continues to enter our world. In some cases, it will make sense to say something rather than type it or scroll through touchscreen displays. Finding a particular song you want to play, for example. But if you want to browse your collection of music, that’s obviously easier to do visually than audibly.

And so we’ll use all of these methods of interaction in concert with one another. Typing, touching, multi-touching, speaking, listening. When we eventually add virtual and augmented reality into this mix, things become even more obvious. At times, we really may look like Tom Cruise conducting information.

The question, in my mind, is when voice enters this interaction paradigm for most people. I think devices like Echo are making that a reality already. But it’s still the early days. I can see a world in which an AirPod-like system allows us to do computing via voice while walking down the street, augmented by a smartphone as the backup for needed visualizations.

Is that a computer in your pocket?…

Going back to the movies, for a minute. This is exactly what happens in Her. Everyone focuses on Theodore’s interaction with his AI friend Samantha via voice. But he always has his (folded) smartphone device with him as well for more private moments, taking pictures, etc. It augments his vocal computing reality.

Over to Westworld, we have humans interacting with computers (hosts) mainly through voice. But they always have their (folded) touch tablets with them as well for different types of manipulation.

Movies and television shows are meant to be fantastical views of the future. But they also have to be grounded in some truth in order to be relatable. I think the extrapolated interaction models ground them. They’re different, but familiar. And they all point to various forms of computing interaction working together in concert.

Lots of folded computing love in the future…

¹ Well, I guess it really started with entering punched cards.

Writer turned investor turned investor who writes. General Partner at GV. I blog to think.