SIP Specific: Speaking SIP
The Visual Experience
I love my iPhone. I remember the first time I saw it, and I knew that things were about to change. The iPhone captured a nearly perfect combination of features - a thin form factor, an incredible display and usable touch screen, and a wonderful visual and intuitive interface for interacting with all its goodness.
The visual aspect of this is particularly interesting. The set of telephony features on the iPhone is certainly nothing new - basic calling, caller ID, hold, mute, 3-way conferencing, and voicemail. But what is different about the iPhone is that it focused on perfecting a visual interface for interacting with those features - an interface that leveraged the size and quality of the screen and the touch nature of the controls.
For the longest time, telephones had only two ways of interacting with the user - through speech and through a small number of buttons on the phone. Business phones were similar, differing only in that they had a larger number of buttons, such as the infamous transfer and conference buttons users always have a hard time using. As a consequence of this extremely limited user interface, the industry created and adopted norms for how these features should interact with the user, and those interactions were centered on an interface that only allowed speech and button presses. This is why features like voicemail have - for the longest time - relied on speech prompts to play voicemail to the user, and relied on a fixed set of buttons to control operations like delete and forward.
The problem with speech interfaces is that they are fundamentally slow. They require sequential delivery of content to users. Often this content is not desired, forcing the user to wait and further interact to get what they need. Speech content also places a great burden on the user to remember context - what keys I need to press, what menu option I'll go back to if I press zero. This is one of the reasons why navigation through hierarchical speech interfaces - such as those rendered by interactive voice response systems - is so frustrating, yet navigation through hierarchical visual interfaces - such as a web-page version of the same system - is so much easier.
And so, with the iPhone, Mr. Jobs took a bold step in changing the fundamental way in which users interact with telephony features. With a phone that had a good enough display to support a visual interface, and the desire to break the norms of old, he drove the industry toward redefining the way we interact with telephone features. And this is a really, really good thing.
Let me pick on another oft-maligned feature - call hold. Call hold typically plays music of some sort to the other party. This music is there for a reason - there is a fundamental requirement to alert the other party that they shouldn't hang up, that despite the fact that they cannot be heard, something is happening. If there were no music, a user could not differentiate a dead connection (in which case they should hang up) from hold (in which case they should wait). Given that the only interface available for conveying information to users was speech, designers of this feature decided to play music, and now we've all been trained that hearing music means that I am on hold. However, most folks hate music-on-hold. The music quality is awful, it's often music I don't care to listen to, and it can be incredibly disruptive when played into conference calls.
Visual interfaces can fix this. Instead of playing music, why not instead render a visual cue to the user that lets them know they are on hold? In some cases, I may need audible cues - for example, when my cell phone is in a screen saver mode, but these can be played selectively and never rendered into a conference call. A visual cue would meet the fundamental user experience requirement, but eliminate all of the problems associated with playing music - the non-visual way to meet that requirement.
This is the future. As more and more devices come out with displays and interfaces that are as good as, or better than, the iPhone, we will finally be able to make the jump to visual versions of all of the features we've been poorly interacting with over the last fifty years. Couple that with rich multimedia communications, and telephony will finally make its transition to a high-def experience.
Thank you, Mr. Jobs, for helping push us in the right direction.
Jonathan Rosenberg is a Cisco fellow (www.cisco.com).