Unified Communication Systems

by Christopher R. Andrews

Introduction

Many people rely on a multitude of devices to communicate with others. Each new device gives its users the convenience of its unique applications. However, each new mode of communication requires another number to be remembered or another service to be checked for messages. In reality, the number of devices a person has access to is irrelevant if none of the devices can understand the format of an important message [4]. Device diversity engenders a second problem: communication networks and applications often route calls, pages, and faxes to specific devices instead of to specific people. A caller, however, is concerned with contacting a specific person, not a specific device. It does not matter which device is contacted, so long as that device is accessible to the person the caller is trying to reach. Moreover, device users cannot always specify who is able to use which device or where a call from a particular person is routed. This lack of control over routing and device specificity can result in a user's personal information, such as current location, being exposed [6].

To address the problems caused by device proliferation, several companies are now releasing Unified Messaging Systems (UMSs). These systems enable users to manipulate all their messages through a single service that can be accessed by several devices. For example, a user can check and reply to e-mail from a cell phone while in the car or can check voice-mail and faxes from the computer lab at school.

To improve the specificity of call routing to devices, Unified Communication Systems (UCSs) are being developed. A UCS offers the typical services of a UMS, while adding the capability of person-to-person communication through dynamic call redirection and forwarding. To accomplish this, these systems must provide any-to-any communication: communication between any device and any other device over some series of networks. Users can customize their service by specifying the conditions (e.g., time, caller, or protocol) that must be satisfied for a given device to be contacted. For instance, phone calls could be directed to an office phone between 9 A.M. and 5 P.M. and to a cell phone at other times. UCSs can also notify a user of an important message. Callers trying to reach a UCS user need only call a single number, regardless of that user's location; the UCS figures out which device to use. Of the known implementations of UCS, the two most notable are the Universal Inbox, by the Iceberg group at UC Berkeley (UCB), and Calliope, by the Mobile People Architecture (MPA) project at Stanford University.

This paper discusses the properties of Unified Communication Systems. First, there is an overview of how UCSs are used from various devices. Then, using Iceberg's Universal Inbox as an example, the paper covers how these systems work, concentrating on the centralized mail store, data conversion, personalization, and routing. Finally, the future direction of these systems is discussed.

Using a Unified Communication System

The success of any system depends, in large part, on its usability. This is especially true for a Unified Communication System, which will be used by people with limited computer literacy to send and receive messages of various types. If such a system is not user-friendly, then impatience and frustration will discourage many from learning how to use its powerful services. Luckily, because a UCS is simply the convergence of many familiar applications, the general interfaces for each device can be reused. The following will first show how UMSs are used from PCs, phones, and other devices. Then, the discussion will turn to customization and call redirection in the more advanced UCS implementations. Throughout this section, Table 1 may be referenced to show which data types can generally be accessed by specific devices in UMSs and UCSs.

Because most UMSs support Internet standard protocols, proprietary client software and hardware are not usually needed. (This is not quite true for MPA's Calliope.) Consequently, users do not have to download, upgrade, or maintain special application software. Users can instead continue using a preferred browser or e-mail client [2]. Ericsson's Unified Messenger is the epitome of this model because messages in its mail store can be accessed and manipulated by any mail client. Other services, such as Onebox.com, stray slightly from this model and require the use of their web-based client for checking mail. However, this client can be accessed from any web browser.

Table 1. Devices and their supported data types.

Device        Text    Voice    Fax      Picture
PSTN phone    No      Yes      No       No
GSM phone     Maybe   Yes      No       Yes
Palm PDA      Yes     Limited  Limited  Limited
Pager         Maybe   No       No       No
Fax Machine   Maybe   No       Yes      Maybe

UMS from a PC

A graphical user interface (GUI), similar to those used to manipulate e-mail, can be used when dealing with any kind of message (see Figure 1). In this scheme, all messages appear as e-mail with the non-text segments stored as attachments. For example, a user can listen to voice-mail by clicking on the attachment; an appropriate application, such as QuickTime, will then start up to play it. Similarly, composing messages is a simple task: writing and sending e-mail is done in the same manner as before. Depending on the particular service, though, other options may also be available. Onebox.com, for instance, allows users to record a message with a PC microphone.

Figure 1. A typical mail GUI. Typical GUIs such as this Onebox.com Inbox can be used by a UMS. Voice-mail and faxes are shown as regular e-mail with attachments and have the phone number of the caller in the "From" field.
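
The "every message is an e-mail with attachments" scheme can be sketched with Python's standard email package. This is only an illustration, not the format used by Onebox.com or any other service named here: the function name, the placeholder address voicemail@ums.example, and the choice of the audio/wav type are all assumptions made for the example.

```python
from email.headerregistry import Address
from email.message import EmailMessage


def voicemail_as_email(caller_number: str, recipient: str, audio: bytes) -> EmailMessage:
    """Wrap a recorded voice message in an ordinary e-mail (hypothetical helper)."""
    msg = EmailMessage()
    # Show the caller's phone number in the "From" field, as in Figure 1.
    # The mailbox address itself is a placeholder for this sketch.
    msg["From"] = Address(display_name=caller_number, addr_spec="voicemail@ums.example")
    msg["To"] = recipient
    msg["Subject"] = "Voice message from " + caller_number
    # Plain-text part that any mail client can display.
    msg.set_content("You have a new voice message. The recording is attached.")
    # The non-text segment travels as a MIME attachment.
    msg.add_attachment(audio, maintype="audio", subtype="wav", filename="voicemail.wav")
    return msg
```

Because the result is a standard MIME message, an unmodified mail client could list it in an inbox and hand the attachment to whatever audio player is installed.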

UMS from a phone

Accessing and manipulating messages via phone is also straightforward. From a user's perspective, there is no difference between the methods used for Public Switched Telephone Network (PSTN, the type in most of our homes) phones and those used in cellular phones. A user simply calls the UMS phone number and can navigate through a menu to play, respond to, compose, save, and delete specific messages at will. Also, more advanced services, such as UCB's Media Manager and Ericsson's Unified Messenger, allow users to listen to text-based messages, using text-to-speech technology. Onebox.com and Yahoo! are also offering services along these lines. Through Onebox.com, users can access their mailstores by phone and listen to any voice messages waiting for them. Yahoo!'s recently released service, Yahoo! By Phone, not only allows users to hear their voice and text messages but also allows them to listen to other information, such as weather reports, sports news, and stock quotes. Thus, UMS via the telephone is as accessible as an ordinary voice-mail service but with enhanced features.

UMS from other devices

Other devices can reuse existing applications and interfaces just as PCs and phones do. For instance, personal digital assistants (PDAs) can continue using the same e-mail GUI and/or Internet browser when accessing a UMS. However, since many PDAs have no audio output and very limited graphics capabilities, it may often be beneficial to convert rich-text or audio messages into plain text, if possible. More advanced PDAs that operate under the Windows CE platform may have greater capabilities in terms of sound and graphics, but their resources are still far more limited than those of the average PC. An even more limited device in this category is the pager. From a pager, users can be notified of messages, but only a few lines of text can be forwarded.

Using redirection in a UCS

For systems such as Calliope and the Universal Inbox, it is possible to have a direct connection between two users set up through redirection. In the case of a phone call, one user, using either a PC or a phone interface, requests a connection to another. The UCS then figures out which device (phone or PC) is the best connection medium and creates the connection. If the called user is unavailable, the caller can leave a message with the UMS. Similarly, an instant messaging session, using an application like AOL Instant Messenger, can be conducted.

How a Unified Communication System works

Though it varies from system to system, a common set of functionalities is shared among most UMS and UCS providers. The basic building block is a centralized mail repository, a place where users go for all their messages. From there, minor to major differences occur depending on the level of sophistication of the system. Ultimately, these systems must be able to support a set of data conversions, such as between text files and sound files. Other common aspects are customizable settings and coordinated routing schemes between heterogeneous networks. The next sections describe in more detail how each of these pieces works.

Centralized Mail Repository

One of the key ingredients of a Unified Messaging system is the centralized mail repository (CMR). It provides clients with a single place to go for all their messages, including e-mail, voice mail, and faxes. A CMR can be designed similarly to a regular mail service application with MIME capabilities. A good example of such a service is the Media Manager, the CMR used by the Universal Inbox (see Figure 2). The Media Manager has a simple interface to the outside world. Through it, a Media Client can send and receive messages, as well as access other properties such as list and folder information, using a single interface, regardless of the underlying protocol (see Appendix B for an explanation of the client/server model). The manipulation of messages in this way results in extensibility and format independence.
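
The idea of one interface in front of several protocol-specific stores can be sketched as follows. This is not the Media Manager's actual API, which the article does not spell out: the class and method names (Header, AccountServer, StaticServer, MediaManagerSketch, check_mail) are hypothetical, and in-memory lists stand in for real POP, voice-mail, and fax servers.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Header:
    account: str   # which account the message lives in
    kind: str      # "email", "voice", or "fax"
    sender: str


class AccountServer(ABC):
    """Anything that can list the messages held for one account."""

    @abstractmethod
    def list_headers(self) -> List[Header]:
        ...


class StaticServer(AccountServer):
    """In-memory stand-in for a POP, voice-mail, or fax server."""

    def __init__(self, headers: List[Header]) -> None:
        self._headers = list(headers)

    def list_headers(self) -> List[Header]:
        return list(self._headers)


class MediaManagerSketch:
    """Single entry point that hides which server holds which message."""

    def __init__(self) -> None:
        self._servers: Dict[str, AccountServer] = {}

    def add_account(self, name: str, server: AccountServer) -> None:
        self._servers[name] = server

    def check_mail(self) -> List[Header]:
        """Contact each account's server in turn and merge the listings."""
        merged: List[Header] = []
        for server in self._servers.values():
            merged.extend(server.list_headers())
        return merged
```

A client written against check_mail never needs to know how many servers sit behind it, which is one way to read the extensibility and format independence described above: supporting a new message source means adding another AccountServer subclass.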

The goal of a UCS is any-to-any communication, which is facilitated by either client or server data conversion. Smart client applications include audio players that recognize and handle both MP3 and WAV files or text readers that understand both rich and simple text. Depending on the complexity of the UMS, however, some conversions might not be available to the client, such as text-to-voice or message summary, so these conversions are provided by the server. The Media Manager uses its Transcoder Service to carry out the required conversions before sending the data to the client. As Figure 2 shows, the Transcoder service can perform many conversions. Text-to-voice programs have been around for many years, and though they still generate playback with little emotion, they have come a long way. Voice-to-text, on the other hand, is a much harder prospect for the Transcoder. The audio file has to be analyzed for pauses and pitches to form words. While reading dictation, this service has about a 90% accuracy rate. When talking in normal conversation voice, however, the accuracy is between 30% and 40%. To obtain a summary from an audio message, pitch detection can be used to infer the important words that should be saved, thus cutting down on storage and playback time. Analyzing audible speech is still a very active area of research and is making tremendous gains [1].

Figure 2. The Media Manager Architecture (Source: Based on B. Hohlt, "Architecture," 30 May 2001, http://www.cs.berkeley.edu/~hohltb/mediamanager.html, (6 June 2001).)

Example of UMS operation

An example of UMS operation will help clarify how a UMS works. Let's assume that a user Joe is at his computer and wants to check his mail. Upon clicking the 'checkmail' button, his Media Client software will connect with the Media Manager and request a list of messages from each of Joe's accounts. For each account, the Media Manager will contact the appropriate server, get the requested listing, and return it to Joe's Media Client. Now Joe can view his messages using the Media Client's GUI. He sees that there is an e-mail and a voice mail from Jane and a fax from his boss. When he clicks on the e-mail message, the Media Manager retrieves it from the POP server on which it resides, and displays it on the GUI according to the e-mail and POP protocols. Similar actions take place for the voice mail and the fax, except that the Media Manager may create a text version of the messages and attach the audio or fax files to those messages.

What if Joe is using his cell phone and not his computer? The sequence of events will be nearly the same. The Media Manager sends his phone a list of messages. This time, however, the phone's Media Client offers a choice between the whole message and just a summary of it when he selects the e-mail from Jane. Because the phone can only display a few words at a time and the message is four kilobytes long, Joe opts for the summary. The Media Manager receives the request for a summary of Jane's e-mail. It then retrieves the message from the POP server, uses the Transcoder to summarize the text, and sends the shortened message back to the Media Client on the cell phone. If the cell phone has software to play audio files, then Joe can download the voice mail and actually listen to it. Otherwise, his Media Client would have the voice mail converted to text so that he could read it.
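
The decisions in this walk-through (full text or summary, play the audio or convert it to text) can be written down as a small dispatch function. Everything here is a hypothetical sketch: the names Device, summarize, transcribe, and prepare are invented for the example, and the two conversion functions are placeholders, since the article does not describe how the Transcoder actually summarizes or recognizes speech.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Device:
    name: str
    can_play_audio: bool


def summarize(text: str, max_words: int = 8) -> str:
    """Placeholder summary: keep the first few words.

    Truncation is used only so the sketch can run; it is not the
    Transcoder's method.
    """
    words = text.split()
    if len(words) <= max_words:
        return text
    return " ".join(words[:max_words]) + " ..."


def transcribe(audio: bytes) -> str:
    """Placeholder for voice-to-text conversion (no real recognition)."""
    return "[transcript of %d-byte voice message]" % len(audio)


def prepare(kind: str, payload, device: Device, want_summary: bool = False) -> Tuple[str, object]:
    """Return (format, content) suited to the requesting device."""
    if kind == "voice":
        if device.can_play_audio:
            return ("voice", payload)          # the client plays the audio itself
        return ("text", transcribe(payload))   # server-side conversion to text
    if kind == "text":
        if want_summary:
            return ("text", summarize(payload))
        return ("text", payload)
    raise ValueError("unsupported message kind: " + kind)
```

In Joe's case, a call such as prepare("text", body, Device("cell phone", can_play_audio=False), want_summary=True) would return the shortened text, while a voice message for the same device would come back as text rather than audio.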

Routing and redirection

Both Iceberg and MPA use routing and redirection to implement person-to-person connectivity under mobility. Iceberg and MPA have slightly different ways of performing this functionality, but only Iceberg's approach will be discussed here for simplicity.

Figure 3. Iceberg Components (Source: B. Raman, J. Wang, J. Shih, A.D. Joseph, R. Katz, The Design of ICEBERG: An Integrated Communication Architecture, http://www.cs.berkeley.edu/~bhaskar/iceberg/iceberg.ps, (6 June 2001).)

Figure 3 shows the components that are used in Iceberg's Universal Inbox. Iceberg Access-Points (IAPs) span the Internet to all other networks and act as gateways. They provide a common interface so that users on one network can use services or make connections through other networks. Both incoming and outgoing connections are established through them [5].

A Preference Registry (PR) holds the rules for how people want to be contacted. A user can add rules to their PR through a GUI and specify conditions such as time of day, day of week, caller, and user location. The PR is a service that takes the necessary input from the caller's and the user's profiles and returns the preferred device. Because only the preferred device is returned, user profiles are not made available to the general public [5].
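
A rule-based registry of this kind can be sketched in a few lines. This is an assumed design, not Iceberg's implementation: the Rule and PreferenceRegistrySketch names, the first-match-wins ordering, and the fallback default device are all choices made for the example. Location-based conditions are left out to keep it short.

```python
from dataclasses import dataclass
from datetime import datetime, time
from typing import FrozenSet, List, Optional


@dataclass(frozen=True)
class Rule:
    device: str
    start: time                      # rule applies from this time of day...
    end: time                        # ...up to, but not including, this time
    weekdays: FrozenSet[int] = frozenset(range(7))   # Monday is 0
    callers: Optional[FrozenSet[str]] = None          # None means any caller

    def matches(self, caller: str, when: datetime) -> bool:
        return (
            self.start <= when.time() < self.end
            and when.weekday() in self.weekdays
            and (self.callers is None or caller in self.callers)
        )


class PreferenceRegistrySketch:
    def __init__(self, rules: List[Rule], default_device: str) -> None:
        self._rules = list(rules)
        self._default = default_device

    def preferred_device(self, caller: str, when: datetime) -> str:
        """Return only the chosen device; the rules themselves stay private."""
        for rule in self._rules:          # first matching rule wins
            if rule.matches(caller, when):
                return rule.device
        return self._default
```

With a single rule Rule("office phone", time(9), time(17), weekdays=frozenset(range(5))) and "cell phone" as the default, a weekday call at 10 A.M. resolves to the office phone and an evening or weekend call resolves to the cell phone. The caller learns nothing beyond the device that was chosen.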

The Naming Server (NS) maps device end-points to users, and the Preference Registry Location Server (PRLS) maps users to PRs. Heterogeneous devices and services have differing name spaces; for example, end-points for local phones in the United States are made up of seven digits, while end-points of IP services consist of 32 bits. The NS returns a user's unique ID when given a phone number. With this information, the location of the user's preference file can be found by using the PRLS. Now the IAP can figure out what device to connect to based on the number that is called [5].
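
The two-step lookup can be illustrated with a pair of dictionaries. The class names, the (name space, address) tuple used as an end-point key, and the sample values are assumptions for this sketch; the real NS and PRLS are distributed services whose interfaces are not given in the article.

```python
from typing import Dict, Tuple

Endpoint = Tuple[str, str]   # (name space, address), e.g. ("pstn", "555-0142")


class NamingServerSketch:
    """Maps device end-points from any name space to a user's unique ID."""

    def __init__(self) -> None:
        self._users: Dict[Endpoint, str] = {}

    def register(self, endpoint: Endpoint, user_id: str) -> None:
        self._users[endpoint] = user_id

    def user_for(self, endpoint: Endpoint) -> str:
        return self._users[endpoint]   # raises KeyError for unknown end-points


class PRLSSketch:
    """Maps a user's unique ID to the location of that user's PR."""

    def __init__(self) -> None:
        self._locations: Dict[str, str] = {}

    def register(self, user_id: str, pr_location: str) -> None:
        self._locations[user_id] = pr_location

    def registry_for(self, user_id: str) -> str:
        return self._locations[user_id]


def locate_preferences(endpoint: Endpoint, ns: NamingServerSketch,
                       prls: PRLSSketch) -> Tuple[str, str]:
    """Two-step lookup: end-point -> user ID -> PR location."""
    user_id = ns.user_for(endpoint)
    return user_id, prls.registry_for(user_id)
```

Keying on the name space as well as the address lets a phone number and an IP address that belong to the same person resolve to the same user ID without the two name spaces colliding.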

The Automatic Path Creation (APC) server encapsulates the data flow from one end-device to the other. There are two aspects to this flow: 1) sending the data from one physical point to the next, and 2) appropriately converting the data to the necessary formats. The APC process is able to compose different conversions. For example, assume the APC can only convert a PCM audio stream into a GSM audio stream (see Appendix A), and it can only convert a GSM stream into a PSTN stream. If the endpoints of a connection are in the PCM and PSTN networks, then the stream can first be converted from PCM into GSM and then into PSTN. Thus, the service is modular and composable. If format A can be converted to format B, then format A can be converted into anything format B can be converted to [5].
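
Composing conversions amounts to finding a path through a graph whose nodes are formats and whose edges are the available converters. The sketch below uses a breadth-first search to find the shortest chain; this is one straightforward way to do it and is not claimed to be how the APC service selects paths. The function name conversion_path is hypothetical.

```python
from collections import deque
from typing import Dict, Iterable, List, Optional, Tuple


def conversion_path(converters: Iterable[Tuple[str, str]],
                    source: str, target: str) -> Optional[List[str]]:
    """Return the shortest chain of formats from source to target, or None.

    Each converter is a one-way (from_format, to_format) pair.
    """
    graph: Dict[str, List[str]] = {}
    for src, dst in converters:
        graph.setdefault(src, []).append(dst)

    queue = deque([[source]])   # each queue entry is a path found so far
    seen = {source}
    while queue:
        path = queue.popleft()
        last = path[-1]
        if last == target:
            return path
        for nxt in graph.get(last, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None
```

With only the two converters from the example, ("PCM", "GSM") and ("GSM", "PSTN"), a request from PCM to PSTN yields the chain PCM, GSM, PSTN. Because the converters are one-way, a request from PSTN back to PCM yields None until converters for that direction are added.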

Example of redirection

As an example, let's follow what happens when Jane from her PSTN phone wants to call Joe, who currently has access to his GSM cellular phone. First, Jane's call is intercepted by an IAP on her PSTN network. The IAP then contacts a known Naming Server node and gets the location of Joe's PR, along with his unique ID to access his information. From the PR, the IAP learns that Joe currently wants to be reached by his GSM phone. Now, all the IAP has to do is send the voice stream (with its routing information) to an APC, which converts it into GSM and redirects it to Joe's cell phone. If Joe answers, then a similar connection is set up in the reverse direction so that Jane can likewise hear his voice [5].
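
The order of the lookups, and the fact that a conversation needs one media leg in each direction, can be traced in a short function. Plain dictionaries stand in for the network services, and every name here (plan_call, the table arguments, the sample numbers) is hypothetical; the sketch records which conversion each leg needs but does not perform any conversion.

```python
from typing import Callable, Dict, Tuple

Endpoint = Tuple[str, str]   # (network, address)


def plan_call(
    caller: Endpoint,
    called_number: str,
    naming: Dict[str, str],
    prls: Dict[str, str],
    registries: Dict[str, Callable[[Endpoint], str]],
    devices: Dict[str, Endpoint],
):
    """Return the forward and reverse media legs for one call."""
    user_id = naming[called_number]         # Naming Server: number -> user ID
    registry = registries[prls[user_id]]    # PRLS: user ID -> that user's PR
    device = registry(caller)               # PR: returns only the preferred device
    callee = devices[device]
    # Each leg is one-way; an APC would convert between the two network formats.
    forward = {"source": caller, "sink": callee, "convert": (caller[0], callee[0])}
    reverse = {"source": callee, "sink": caller, "convert": (callee[0], caller[0])}
    return forward, reverse
```

For Jane's call, the forward leg would need a PSTN-to-GSM conversion and the reverse leg a GSM-to-PSTN conversion, so the two directions may be served by different converter chains.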

The Future of UCS

Unified messaging and person-to-person communications over heterogeneous networks are relatively new applications. Many commercial messaging systems, such as Onebox.com and Ericsson's Unified Messenger, have already begun their journey in this direction. There is much room for growth, however, and many other systems will soon need the capabilities described above just to stay competitive in the market. Also, person-to-person (rather than device-to-device) communication is a very promising area. America Online's Instant Messenger has already been around for several years, and something more sophisticated seems to be right around the corner. Remembering one number to reach somebody is more appealing than remembering several. Finally, the person being called will have more power and flexibility if they can specify who can call them and when they can be reached.

References

[1] Czerwinski, S.E., and Hohlt, B.A., Automatic Content Extraction for Voicemail, http://www.cs.berkeley.edu/~hohltb/papers/multimedia.ps, (2 June 2001).
[2] Ericsson Messaging AB, Unified Messaging Over IP, http://www.ericsson.com/wireless/products/messag/unified.pdf, (1 November 2000).
[3] Hohlt, B.A., Media Manager Service and Transcoder Service, 30 May 2000, http://www.cs.berkeley.edu/~hohltb/mediamanager.html, (1 November 2000).
[4] Lucent Technologies, Inc., Octel Unified Messenger for Microsoft Exchange White Paper, P/N 001-11062-00, Milpitas, CA: Octel Messaging Division, Jan. 1998, p. 2.
[5] Raman, B., Katz, R., and Joseph, A., Universal Inbox: Providing Extensible Personal Mobility and Service Mobility in an Integrated Communication Network, University of California at Berkeley: EECS Department, 2000.
[6] Roussopoulos, M., Maniatis, P., Swierk, E., Lai, K., Appenzeller, G., and Baker, M., "Person-level Routing in the Mobile People Architecture", Proceedings of the USENIX Symposium on Internet Technologies and Systems, October 1999.
[7] Webopedia, http://www.webopedia.com, (1 November 2000).

Appendix A - Acronyms

(Note: The expansions and definitions for these acronyms were found at Webopedia [7].)

GSM (Global System for Mobile Communications) - GSM is one of the leading cellular phone systems and the de facto standard in Europe and Asia.

GUI (Graphical User Interface) - A GUI is a program interface that takes advantage of computer graphics to make the program easier to use. Characteristics of many GUIs under the Microsoft Windows platform are pointers, icons, windows, and menus.

IMAP (Internet Message Access Protocol) - IMAP is a protocol for retrieving e-mail messages. It is similar to POP but has additional features. Like POP, IMAP handles only retrieval; sending messages relies on SMTP.

MIME (Multipurpose Internet Mail Extensions) - MIME is a specification for formatting non-ASCII messages so they can be sent over the Internet.

MPA (Mobile People Architecture) - The MPA is a research project at Stanford University. Their system, Calliope, is a UCS that is being developed.

MP3 - MP3 is the file extension for MPEG audio layer 3. Layer 3 is one of three coding schemes for the compression of audio signals. With this scheme, sound data on a CD can be shrunk by a factor of about 12 with little perceptible loss of sound quality.

PDA (Personal Digital Assistant) - Also known as a palmtop, a PDA is a small computer that literally fits in a palm. The last couple of years have seen palm PDAs that support color, sound, and an Internet connection.

POP (Post Office Protocol) - POP is a protocol used to retrieve e-mail. It is similar to the newer IMAP and likewise relies on SMTP for sending messages.

PSTN (Public Switched Telephone Network) - PSTN refers to the international telephone system based on copper wires carrying analog voice data. Most household phones are linked to a PSTN.

SMTP (Simple Mail Transfer Protocol) - SMTP is a protocol for sending e-mail messages between servers. Most e-mail systems use SMTP over the Internet when sending messages from one server to another. Mail clients that retrieve messages with IMAP or POP typically use SMTP to send them.

WAV - The format for storing sound in files developed jointly by Microsoft and IBM. Support for WAV files is built into the Windows operating systems.

Appendix B - Expanded Definitions

Any-to-any communication refers to the concept that any device can communicate with any other device. This is one of the key ideas behind Unified Communication Systems. For simplicity, let's assume that A and B are different kinds of devices which reside on heterogeneous networks. A couple of things must be able to happen if A and B are to communicate. First of all, there must be a path from A to B (and vice versa) which spans possibly many networks. For example, if A is on GSM and B is on PSTN and both can connect to the Internet, then there is a path from one to the other. Second, the data being exchanged between the two devices must somehow be converted so that the receiver can (1) recognize the format and (2) handle the size of the message. Only under these conditions can A and B communicate.

A client/server relationship occurs between two processes, a client and a server. As the name implies, a server process is dedicated to fulfilling a service, such as managing print sessions or mail repositories. Similarly, a client process is a user application, such as a local mail store or Internet browser, which relies on the services of a server. For example, when I click on a link while surfing the web, my browser (the client) requests that data be sent from the corresponding server. If the server is up and running, then it will review the request and send my browser a file.


Acknowledgements

I would like to thank Anthony Joseph and Barbara Hohlt for giving me the opportunity to work on the Media Manager. Also, thanks to John Hatton for sharpening my technical writing skills. And to the three above and the editors of ACM Crossroads, thanks for your comments and suggestions regarding this paper.

Biography

Chris Andrews will graduate from the University of California, Berkeley, in December, 2001, with a B.S. in Electrical Engineering and Computer Science. He has played trumpet for the Cal Marching Band and enjoys running and playing intramural sports.

Copyright 2004, The Association for Computing Machinery, Inc.