[December 6, 2004]

Moving From VoIP to "Trans-Modal Communication"

By Art Rosenberg, The Unified View


One of the big problems the telecommunications industry is facing is how to describe and name the new functionalities that converged communications technology can offer. Unfortunately, the industry has latched on to the lowest common denominator of technology infrastructure for voice and video communications convergence, the move to data network transport, i.e., Voice over Internet Protocol or “VoIP.” With a name like that, person-to-person communications can mean anything or nothing!

This kind of confusion was reflected in our recent enterprise migration study, which showed that fewer than 10% of end users within enterprises are asking for the new communication capabilities that IP-based convergence can offer. This lack of end-user demand can mean either that they don’t recognize the benefits or that they don’t feel the need for them, but VoIP connectivity by itself will never mean anything to end users.

Another reason for the lack of demand is that some of the communication benefits of SIP-based communications will also depend upon the new class of wired and wireless multi-modal communication devices that are just starting to become available. These devices support both real-time voice conversations and all forms of messaging. Because person-to-person communication is a two-way street, the user benefits of exploiting IP infrastructure will be maximized when both parties can dynamically shift from one modality to another as their joint needs dictate. It does take “two to tango!”

Our Legacy -- Blind Communication Silos

We initiate phone calls not knowing whether the recipient can talk or is even at the location of the telephone device and we leave voicemail messages and send email messages not knowing when the message will be retrieved. Responding to messages is fraught with the same lack of awareness for making a successful contact. For the bulk of non-urgent contacts, this is a situation that is perfectly acceptable and we have learned to live with it.

However, busy recipients have to be made aware of more important and urgent contact attempts with minimum disruptiveness and with a choice of options for accepting the contact attempt immediately, in a few minutes, or via alternative modalities. (Wireless pagers used to be carried by mobile users just for this purpose, remember?) Aside from wasted effort, if time is of the essence, person-to-person contacts need to be more efficient for all communicating parties. The way disparate communication channels function, we “can’t see the forest for the trees!”

New Applications or Better Functionality?

The inability to describe the overall benefits of more efficient person-to-person communications has caused pundits and providers alike to refer to any new feature or function that contributes to such effective communications as a “new application.” What’s even worse is that such features are still being treated as disparate, network- or device-dependent functions that contact initiators must blindly choose as the means of communication, rather than being viewed as part of a “seamless” continuum of person-to-person communications accessibility.

A prime reason we face this dilemma is that, in the past, the method and devices of communications contact were always tied to a specific network transport. With converged IP networks able to accommodate all modalities, we can play a more flexible and dynamic communication game both before and even after initiating a person-to-person connection. Now that devices are becoming multi-modal, both for desktop and handheld communications, we have to start thinking in terms of the method of contact initiation being independent of the network transport or the modalities of communication exchange. In particular, as messaging facilities increase in usage, we have to blend such activities seamlessly with the ability to have the ultimate in efficient personal communication, voice conversations. This means that initiating communication contact with a person in any particular mode should not necessarily limit the use of other communication modalities, including the concurrent exchange of visual information in conjunction with voice conversation.

The concept of changing communication modalities is not really new, since telephone answering machines and voice mail systems converted a voice call attempt into a voice message for later retrieval based upon using the same device and network, the telephone. However, this solution did not solve the problem of time-sensitive contact attempts, because the voice messages were not necessarily delivered/retrieved immediately. This real-time association of call attempts with voice messaging still dominates the use of voice mail systems, and, because it is not always “free” as email is, continues to give voice mail messages a perception of higher value than email.

Similarly, once a voice connection is made, can it be changed to other modalities, including multi-party conferencing additions?

The personal communication world is changing rapidly for both consumers and business users with growing adoption of wireless mobility and presence-based instant messaging, expanding real-time accessibility to people. We may now need the reverse of telephone answering, where presence management and messaging may be the first step in enabling increased real-time connections between people. However, we don’t even know what to call such new communication flexibility and that makes it harder to educate the market and its end users to drive awareness and demand.

Communication Modalities and Devices

When email and voice mail were primarily separate asynchronous messaging functions, the focus of messaging convergence was storage and retrieval, reflected in the converged message management capabilities we called “unified messaging.” This is where voice mail systems became integrated with email systems to reduce storage costs and facilitate cross-media message retrieval and replies. To the extent that a voice mail system can also initiate outbound phone calls over the PSTN, unified messaging also provides real-time notification and delivery, as well as cross-modal response to a voice or email message in the form of a telephone callback.

The real payoffs in business communications will be derived from ensuring that time-sensitive contact attempts are successful, and that, if they have to be in messaging mode, they will still be delivered and responded to in a timely manner. Messages that are not important or time sensitive can be dealt with as personal priorities permit, but it is really the time-sensitive situations that must command our priorities for payoff.

Wireless mobility has increased the need for multi-modal device interfaces to satisfy such time-sensitive contacts in a variety of user communication circumstances. We are now seeing handheld devices that exploit both screen/text input and speech user interfaces for unified messaging as well as call management (telephone) conversations. They are also exploiting the flexibility of wireless voice networks, both 3G wide area and local (WiFi). From the users’ perspective, the bottom line of all this convergence is that they now have more flexibility in selectively using the modality of personal contacts and communications that will work for both contact initiators and recipients.

What has long been a source of communication inefficiency is an initial mode of contact that turns out to be inadequate. Exchanging asynchronous emails or voicemails is not always as efficient or effective as a voice conversation or an immediate instant message exchange. When more than two parties are involved, it becomes even less efficient if a multi-party “instant conference” is what is really needed to reach rapid agreement on problem resolution. So, while asynchronous text or voice messaging is still important for basic communications, such exchanges may need to be quickly and easily escalated into real-time contacts when immediacy suddenly becomes necessary.

A new program that has been bundled into the humungous 2005 CES show in Las Vegas is the “Consumer VoIP Summit,” focused upon the consumer market. It has a very heavy emphasis on IP-based multi-modal desktop and handheld devices to support voice conversations and all flavors of messaging exchange, including video and still pictures. These functions are easily understood by users and are indeed distinct communication “applications” for IP-based networks. Being able to dynamically switch to any of these options at any time will not be a new application, but will be a prime example of “transmodal communication” flexibility.

Welcome to “Transmodal Communication”

The advent of a common IP network infrastructure and end-to-end SIP protocols will enable person-to-person voice communications to become dynamically flexible during the course of a single communication contact. What voice mail has been doing with voice messages and “call return” can now be expanded to instant messaging and multi-party voice conferencing.

Ideally, users should be able to efficiently shift from one modality to another without starting from scratch. In other words, once a contact has been made in any form, it can be escalated to more real-time connections as agreed to by the communicating parties. Since such capability involves a dynamic shift of communication modality in real-time, we have decided to describe it as “transmodal communication,” reflecting the fact that it enables communication modality change immediately, easily, and with minimal cost.
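To make the idea concrete, here is a minimal sketch of a single contact that shifts modality without starting over. Everything in it is an assumption made for illustration: the `Modality` ordering, the `Contact` class, and the rule that every party must agree are hypothetical names and choices, not part of SIP or of any product.

```python
from dataclasses import dataclass, field
from enum import IntEnum


class Modality(IntEnum):
    """Hypothetical ordering, from least to most real-time."""
    EMAIL = 1
    VOICE_MESSAGE = 2
    INSTANT_MESSAGE = 3
    VOICE_CALL = 4
    VOICE_CONFERENCE = 5


@dataclass
class Contact:
    """One ongoing contact between a fixed set of parties."""
    parties: set
    modality: Modality
    history: list = field(default_factory=list)

    def escalate(self, target, agreed_by):
        """Shift to a more real-time modality if every party agrees.

        Returns True when the shift happens, False when agreement is missing.
        The same Contact object is kept, so nothing starts from scratch.
        """
        if target <= self.modality:
            raise ValueError("target is not more real-time than the current modality")
        if set(agreed_by) != self.parties:
            return False
        self.history.append(self.modality)
        self.modality = target
        return True


# A chat that both parties agree to turn into a voice call.
contact = Contact(parties={"customer", "agent"}, modality=Modality.INSTANT_MESSAGE)
shifted = contact.escalate(Modality.VOICE_CALL, agreed_by=["customer", "agent"])
```

The point of the sketch is only that the contact persists while its modality changes; a real system would also have to negotiate what each party's device can support.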

There are already scattered examples to be found, albeit in very rudimentary form. Contact center applications that enable an online customer to talk to an agent (“click-to-talk”) might be more easily facilitated by stepping through an Instant Messaging mode (chat) first, because we have not yet established direct contact between a customer and a specific customer-facing agent. Once any connection between the two is made, it will then be appropriate to decide whether a text “chat” is adequate or whether a voice conversation is more appropriate and a callback or PC-based voice connection is needed.

We can start to speculate about the benefits and conveniences of such new communication flexibility, but let’s first give it a descriptive label to better understand what we are talking about. And let’s NOT call it a new “application!”

So, Where’s the Productivity ROI? “Time is Money!”

As usual, enterprise management may see the soft benefits, but, because they are difficult to quantify, will want to know about “hard dollar” ROI, better known as cost savings. There will be significant cost savings to be derived from converging voice communications on IP data networks, and detailed analyses and usage modeling with “ROI tools” will be needed to identify and quantify those benefits. However, because cost savings are not the only reason that organizations will want to drastically change all their business communications, they must also look at what converged communications will do for operational performance and group productivity, and how that will affect revenue and profits.

Individual enterprise user productivity comes from two primary sources:

Easier-to-use technology that takes individual users less time to do something by themselves, anywhere, anytime. This also helps eliminate wasted, non-productive “dead time,” e.g., when traveling or away from the office. We have often referred to this kind of productivity as “micro-productivity.”

Communication technology that enables users to do things faster anywhere, anytime with others, i.e., communicating, coordinating, and collaborating on decision-making and problem resolution tasks. This communication flexibility reduces the “dead time” created by waiting for making contact with and/or getting responses from specific individuals. Such productivity pays off to the group as a whole and we have called this “macro-productivity.” Any technology that helps individual micro-productivity also contributes to increased macro-productivity, but does not replace the need for person-to-person contact efficiencies.

Because collaborative effectiveness and efficiency are often a function of communication modality rather than just personal contact accessibility, we must include multi-modal flexibility as a factor in increasing macro-productivity. If a voice call connection cannot be made, but an urgent message can be delivered immediately and then escalated to a voice conversation or an instant message exchange, then we have effectively accomplished the same kind of real-time communication. However, quantifying the benefits of such faster communications is not simple and will be very dependent on the context of the business activity. Although “time is money,” how much money is a big, unknown variable!

So-called “one-number services” are targeting this need to simplify access to specific individuals by using a single contact “address” and dynamically converting the notification and response modality to suit the recipient’s current communications status. With multi-modal devices available to both parties, the contact initiator will also be able to change modalities to suit the recipient’s needs.
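As a rough illustration of that idea, the sketch below maps one contact address to a delivery modality based on the recipient's current status and the urgency of the contact. The status names, the rule table, and the `route` function are all hypothetical; actual one-number services define their own presence states and policies.

```python
# Hypothetical routing rules: (recipient status, urgent?) -> delivery modality.
ROUTING_RULES = {
    ("available", True): "voice_call",
    ("available", False): "voice_call",
    ("in_meeting", True): "instant_message",
    ("in_meeting", False): "voice_message",
    ("offline", True): "text_message",
    ("offline", False): "email",
}


def route(address, urgent, presence):
    """Pick a delivery modality for a single contact address.

    `presence` is an assumed lookup of address -> current status.
    Unknown addresses are treated as offline.
    """
    status = presence.get(address, "offline")
    return ROUTING_RULES[(status, urgent)]


# The caller uses one address; the modality follows the recipient's status.
presence = {"pat@example.com": "in_meeting"}
urgent_choice = route("pat@example.com", urgent=True, presence=presence)
routine_choice = route("pat@example.com", urgent=False, presence=presence)
```

The initiator never chooses a network or device here; only the address and the urgency are supplied, which is the simplification the one-number approach is after.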

What Do You Think?

Do you agree that we need better terminology to describe the flexibilities of converged communications? Will convergence between consumer use and business use force enterprise communications to support multi-modal devices and transmodal services? Will transmodal service drive demand for SIP-based multi-modal desktop “hard phones” and handheld smartphones as opposed to PC-based “soft-phones?” What role will wireless carrier and IP-Centrex services play in supporting enterprise CPE transmodal capabilities?

Let us know your opinions by sending them to

New White Paper Report: Progress and Direction of Enterprise Migration to Converged Communications

The Unified-View has just completed a new white paper report on the state of the industry and the enterprise market for communications convergence. Entitled “Beyond VoIP: Enterprise Perspectives on Migrating to Multi-modal Communications and Wireless Mobility,” the report was sponsored by the non-profit Unified Communications Consortium and leading providers of enterprise voice telecommunications technologies, including Alcatel, Avaya, Mitel, Nortel Networks and Siemens.

This objective report summarizes the current availability of key converged voice application technology from the provider industry, as well as a realistic assessment of the progress that enterprise organizations are making in migrating to communications convergence. The latter information is based on recent market studies of enterprise organizations from a converged usage perspective. The study provides practical feedback on the readiness of the market for the new IP-based voice technologies.

For a free copy of the new report, go to

Art Rosenberg is a veteran of the computer and communications industry and formed The Unified-View to provide strategic consulting to technology and service providers, as well as to enterprise organizations, in migrating towards converged wired and wireless unified communications. He focuses on practical user requirements, implementation issues, and new benefits of multi-modal communication technologies for individual end users, both as consumers and as members of enterprise working groups. The latter includes identifying new responsibilities for enterprise communications management to support changing operational usage needs most cost-effectively.

Copyright © 2004-2005, The Unified-View, All Rights Reserved Worldwide

