

While graphical user interfaces have gained much popularity in recent years, there are situations in which existing applications clearly need to be used in a nonvisual modality. Examples include the use of applications on hand-held devices with limited screen space (or no screen at all, as in the case of telephones) and the use of applications by people with visual impairments.
We have developed an architecture capable of transforming the graphical interfaces of existing applications into powerful, intuitive nonvisual interfaces. Our system, called Mercator, provides new input and output techniques for working in the nonvisual domain. Navigation is accomplished by traversing a hierarchical tree representation of the interface structure. Output is primarily auditory, although other output modalities (such as tactile) can be used as well. The mouse, an inherently visually oriented device, is replaced by keyboard and voice interaction.
Our system is currently in its third major revision. We have gained insight into both the nonvisual interfaces presented by our system and the architecture necessary to construct such interfaces. This architecture uses several novel techniques to efficiently and flexibly map graphical interfaces into new modalities.
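
The tree-based navigation described above can be illustrated with a minimal sketch. It is not Mercator's actual implementation: the names Node and Navigator, the four direction commands, and the spoken-description format are all hypothetical, chosen only to show how keyboard commands might move a focus point through a hierarchical model of an interface and produce text suitable for auditory output.

```python
from dataclasses import dataclass, field
from typing import List, Optional


# Hypothetical model of one interface object (window, menu, button, ...).
# eq=False keeps identity comparison, so sibling lookup never recurses.
@dataclass(eq=False)
class Node:
    role: str
    label: str
    children: List["Node"] = field(default_factory=list)
    parent: Optional["Node"] = None

    def add(self, child: "Node") -> "Node":
        """Attach a child and return it, so trees can be built inline."""
        child.parent = self
        self.children.append(child)
        return child


class Navigator:
    """Hypothetical keyboard navigator over the interface tree."""

    def __init__(self, root: Node) -> None:
        self.current = root

    def describe(self) -> str:
        # Text that a speech synthesizer could render (assumed format).
        return f"{self.current.label}, {self.current.role}"

    def move(self, direction: str) -> str:
        """Move the focus if possible; stay in place at the tree's edges."""
        node = self.current
        target = None
        if direction == "up":
            target = node.parent
        elif direction == "down":
            target = node.children[0] if node.children else None
        elif direction in ("left", "right") and node.parent is not None:
            siblings = node.parent.children
            i = siblings.index(node) + (1 if direction == "right" else -1)
            if 0 <= i < len(siblings):
                target = siblings[i]
        if target is not None:
            self.current = target
        return self.describe()
```

In this sketch, moving "down" from a window reaches its first child, "left" and "right" step between siblings, and "up" returns to the parent, so the user never needs to know where an object sits on the screen.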

ENO is an audio server designed to make it easy for applications in the Unix environment to incorporate non-speech audio cues. At the physical level, ENO manages a shared resource, namely the audio hardware. At the logical level, it manages a sound space that is shared by various client applications. Instead of dealing with sound in terms of its physical description (i.e., sampled sounds), ENO allows sounds to be presented and controlled in terms of higher-level descriptions of sources, interactions, attributes, and sound space. Using this structure, ENO can facilitate the creation of consistent, rich systems of audio cues. In this paper, we discuss the justification, design, and implementation of ENO.
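
As a rough illustration of what describing sound in terms of sources, interactions, and attributes could look like from a client's point of view, consider the sketch below. It does not reproduce ENO's real protocol or API: the class names SoundSpace and Source, the methods add_source and interact, and the attribute names (material, size, force) are assumptions made for this example, and no audio is actually produced.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Tuple


# Hypothetical description of a sounding object owned by one client.
@dataclass
class Source:
    name: str
    attributes: Dict[str, Any] = field(default_factory=dict)


class SoundSpace:
    """Hypothetical shared sound space: clients register sources and
    request interactions instead of sending sampled audio."""

    def __init__(self) -> None:
        self._sources: Dict[Tuple[str, str], Source] = {}
        # Requests a real server would turn into audio; here only recorded.
        self.log: List[tuple] = []

    def add_source(self, client: str, name: str, **attributes: Any) -> Tuple[str, str]:
        """Register a source for a client and return its key."""
        key = (client, name)
        self._sources[key] = Source(name, dict(attributes))
        return key

    def interact(self, key: Tuple[str, str], interaction: str, **overrides: Any) -> tuple:
        """Request an interaction on a source; per-request attributes
        override the source's defaults without modifying them."""
        source = self._sources[key]
        attrs = {**source.attributes, **overrides}
        request = (key[0], source.name, interaction, attrs)
        self.log.append(request)
        return request
```

A client might register a small wooden "inbox" source once and then request a "hit" interaction each time mail arrives, leaving the choice of how to render that description to the server.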