

Soap is a pointing device based on hardware found in a mouse, yet works in mid-air. Soap consists of an optical sensor device moving freely inside a hull made of fabric. As the user applies pressure from the outside, the optical sensor moves independently of the hull. The optical sensor perceives this relative motion and reports it as position input. Soap offers many of the benefits of optical mice, such as high-accuracy sensing. We describe the design of a soap prototype and report our experiences with four application scenarios, including a wall display, Windows Media Center, slide presentation, and interactive video games.

Scientists use a variety of visualization techniques to help understand computational fluid dynamics (CFD) datasets, but the interfaces to these techniques are generally two-dimensional and therefore are separated from the 3D view. Both rapid interactive exploration of datasets and precise control over the parameters and placement of visualization techniques are required to understand complex phenomena contained in these datasets. In this paper, we present work in progress on a 3D user interface for exploratory visualization of these datasets.

User interfaces are becoming more and more complex. Adaptable and adaptive interfaces have been proposed to address this issue and previous studies have shown that users prefer interfaces that they can adapt to self-adjusting ones. However, most existing systems provide users with little support for adapting their interfaces. Interface customization techniques are still very primitive and usually restricted to particular applications. In this paper, we present User Interface Façades, a system that provides users with simple ways to adapt, reconfigure, and re-combine existing graphical interfaces, through the use of direct manipulation techniques. The paper describes the user's view of the system, provides some technical details, and presents several examples to illustrate its potential.

We propose a new evolutionary method of extracting user preferences from examples shown to an automatic graph layout system. Using stochastic methods such as simulated annealing and genetic algorithms, automatic layout systems can find a good layout using an evaluation function which can calculate how good a given layout is. However, the evaluation function is usually not known beforehand, and it might vary from user to user. In our system, users show the system several pairs of good and bad layout examples, and the system infers the evaluation function from the examples using genetic programming. After the evaluation function evolves to reflect the preferences of the user, it is used as a general evaluation function for laying out graphs. The same technique can be used for a wide range of adaptive user interface systems.

Systems of connected appliances, such as home theaters and presentation rooms, are becoming commonplace in our homes and workplaces. These systems are often difficult to use, in part because users must determine how to split the tasks they wish to perform into sub-tasks for each appliance and then find the particular functions of each appliance to complete their sub-tasks. This paper describes Huddle, a new system that automatically generates task-based interfaces for a system of multiple appliances based on models of the content flow within the multi-appliance system.

Users in ubiquitous computing environments need to be able to make serendipitous use of resources that they did not anticipate and of which they have no prior knowledge. The Speakeasy recombinant computing framework is designed to support such ad hoc use of resources on a network. In addition to other facilities, the framework provides an infrastructure through which device and service user interfaces can be made available to users on multiple platforms. The framework enables UIs to be provided for connections involving multiple entities, allows these UIs to be delivered asynchronously, and allows them to be injected by any party participating in a connection.

We introduce ViewPointer, a wearable eye contact sensor that detects deixis towards ubiquitous computers embedded in real world objects. ViewPointer consists of a small wearable camera no more obtrusive than a common Bluetooth headset. ViewPointer allows any real-world object to be augmented with eye contact sensing capabilities, simply by embedding a small infrared (IR) tag. The headset camera detects when a user is looking at an infrared tag by determining whether the reflection of the tag on the cornea of the user's eye appears sufficiently central to the pupil. ViewPointer not only allows any object to become an eye contact sensing appliance, it also allows identification of users and transmission of data to the user through the object. We present a novel encoding scheme used to uniquely identify ViewPointer tags, as well as a method for transmitting URLs over tags. We present a number of application scenarios as well as an analysis of design principles. We conclude that eye contact sensing input is best utilized to provide context to action.
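The centrality test mentioned in the abstract above can be illustrated with a minimal sketch. It assumes the pupil center, pupil radius, and tag reflection (glint) position have already been located in the camera image; the function name and the `max_offset_ratio` threshold are hypothetical and are not taken from ViewPointer itself.

```python
import math


def is_looking_at_tag(pupil_center, pupil_radius, glint, max_offset_ratio=0.5):
    """Return True if the tag's corneal reflection is close to the pupil center.

    pupil_center and glint are (x, y) image coordinates in pixels.
    max_offset_ratio is a hypothetical threshold: the glint counts as
    "sufficiently central" when its distance from the pupil center is at
    most this fraction of the pupil radius.
    """
    dx = glint[0] - pupil_center[0]
    dy = glint[1] - pupil_center[1]
    return math.hypot(dx, dy) <= max_offset_ratio * pupil_radius
```

For example, with a pupil of radius 20 px centered at (100, 100), a glint at (104, 103) lies 5 px from the center and passes the test, while a glint at (115, 100) does not.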

One of the problems with mobile media devices is that they may distract users during critical everyday tasks, such as navigating the streets of a busy city. We addressed this issue in the design of eyeLook: a platform for attention sensitive mobile computing. eyeLook appliances use embedded low cost eyeCONTACT sensors (ECS) to detect when the user looks at the display. We discuss two eyeLook applications, seeTV and seeTXT, that facilitate courteous media consumption in mobile contexts by using the ECS to respond to user attention. seeTV is an attentive mobile video player that automatically pauses content when the user is not looking. seeTXT is an attentive speed reading application that flashes words on the display, advancing text only when the user is looking. By making mobile media devices sensitive to actual user attention, eyeLook allows applications to gracefully transition users between consuming media, and managing life.

Impromptu is a mobile audio device which uses wireless Internet Protocol (IP) to access novel computer-mediated voice communication channels. These channels show the richness of IP-based communication as compared to conventional mobile telephony, adding audio processing and storage in the network, and flexible, user-centered call control protocols. These channels may be synchronous, asynchronous, or event-triggered, or even change modes as a function of other user activity. The demands of these modes plus the need to navigate with an entirely non-visual user interface are met with a number of audio-oriented user interaction techniques.

While graphical user interfaces have gained much popularity in recent years, there are situations when the need to use existing applications in a nonvisual modality is clear. Examples of such situations include the use of applications on hand-held devices with limited screen space (or even no screen space, as in the case of telephones), or by users with visual impairments.
We have developed an architecture capable of transforming the graphical interfaces of existing applications into powerful intuitive nonvisual interfaces. Our system, called Mercator, provides new input and output techniques for working in the nonvisual domain. Navigation is accomplished by traversing a hierarchical tree representation of the interface structure. Output is primarily auditory, although other output modalities (such as tactile) can be used as well. The mouse, an inherently visually-oriented device, is replaced by keyboard and voice interaction.
Our system is currently in its third major revision. We have gained insight into both the nonvisual interfaces presented by our system and the architecture necessary to construct such interfaces. This architecture uses several novel techniques to efficiently and flexibly map graphical interfaces into new modalities.

ENO is an audio server designed to make it easy for applications in the Unix environment to incorporate non-speech audio cues. At the physical level, ENO manages a shared resource, namely the audio hardware. At the logical level, it manages a sound space that is shared by various client applications. Instead of dealing with sound in terms of its physical description (i.e., sampled sounds), ENO allows sounds to be presented and controlled in terms of higher-level descriptions of sources, interactions, attributes, and sound space. Using this structure, ENO can facilitate the creation of consistent, rich systems of audio cues. In this paper, we discuss the justification, design, and implementation of ENO.

We describe a unique form of hands-free interaction that can be implemented on most commodity computing platforms. Our approach supports blowing at a laptop or computer screen to directly control certain interactive applications. Localization estimates are produced in real-time to determine where on the screen the person is blowing. Our approach relies solely on a single microphone, such as those already embedded in a standard laptop or one placed near a computer monitor, which makes our approach very cost-effective and easy-to-deploy. We show example interaction techniques that leverage this approach.

Modern brain sensing technologies provide a variety of methods for detecting specific forms of brain activity. In this paper, we present an initial step in exploring how these technologies may be used to perform task classification and applied in a relevant manner to HCI research. We describe two experiments showing successful classification between tasks using a low-cost off-the-shelf electroencephalograph (EEG) system. In the first study, we achieved a mean classification accuracy of 84.0% in subjects performing one of three cognitive tasks - rest, mental arithmetic, and mental rotation - while sitting in a controlled posture. In the second study, conducted in a more ecologically valid setting for HCI research, we attained a mean classification accuracy of 92.4% using three tasks that included non-cognitive features: a relaxation task, playing a PC based game without opponents, and engaging opponents within the game. Throughout the paper, we provide lessons learned and discuss how HCI researchers may utilize these technologies in their work.

Current asynchronous voice messaging interfaces, like voicemail, fail to take advantage of our conversational skills. TalkBack restores conversational turn-taking to voicemail retrieval by dividing voice messages into smaller sections based on the most significant silent and filled pauses and pausing after each to record a response. The responses are composed into a reply, alternating with snippets of the original message for context. TalkBack is built into a digital picture frame; the recipient touches a picture of the caller to hear each segment of the message in turn. The minimal interface models synchronous interaction and facilitates asynchronous voice messaging. TalkBack can also present a voice-annotated slide show which it receives over the Internet.
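The segmentation step described above can be sketched roughly as follows. This is only an illustration under stated assumptions: pauses are assumed to be already detected and given as (start, end) times in seconds, "most significant" is approximated by pause duration, and cuts are placed at pause midpoints. The function name and the `max_segments` parameter are hypothetical, and detection of filled pauses is not covered.

```python
def segment_message(duration, pauses, max_segments=4):
    """Split a voice message into sections at its longest pauses.

    duration: message length in seconds.
    pauses: list of (start, end) pause intervals in seconds.
    max_segments: hypothetical cap on the number of sections;
        n sections need n - 1 cut points.
    Returns a list of (section_start, section_end) tuples.
    """
    # Rank pauses by length and keep only the longest ones.
    longest = sorted(pauses, key=lambda p: p[1] - p[0], reverse=True)
    longest = longest[:max(max_segments - 1, 0)]
    # Cut in the middle of each selected pause, in time order.
    cuts = sorted((start + end) / 2 for start, end in longest)
    bounds = [0.0] + cuts + [float(duration)]
    return list(zip(bounds[:-1], bounds[1:]))
```

A playback loop could then play each section in turn and record a response after it, which is the turn-taking behavior the abstract describes.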

With advances in pen-based computing devices, handwriting has become an increasingly popular input modality. Researchers have put considerable effort into building intelligent recognition systems that can translate handwriting to text with increasing accuracy. However, handwritten input is inherently ambiguous, and these systems will always make errors. Unfortunately, work on error recovery mechanisms has mainly focused on interface innovations that allow users to manually transform the erroneous recognition result into the intended one. In our work, we propose a mixed-initiative approach to error correction. We describe CueTIP, a novel correction interface that takes advantage of the recognizer to continually evolve its results using the additional information from user corrections. This significantly reduces the number of actions required to reach the intended result. We present a user study showing that CueTIP is more efficient and better preferred for correcting handwriting recognition errors. Grounded in the discussion of CueTIP, we also present design principles that may be applied to mixed-initiative correction interfaces in other domains.

We introduce CrossY, a simple drawing application developed as a benchmark to demonstrate the feasibility of goal crossing as the basis for a graphical user interface. We show that crossing is not only as expressive as the current point-and-click interface, but also offers more flexibility in interaction design. In particular, crossing encourages the fluid composition of commands which supports the development of more fluid interfaces. While crossing was previously identified as a potential substitute for the classic point-and-click interaction, this work is the first to report on the practical aspects of implementing an interface based on goal crossing as the fundamental building block.

Despite novel interaction techniques proposed for virtual desktops, common yet challenging tasks remain to be investigated. Dragging and dropping between overlapping windows is one of them. The fold-and-drop technique presented here offers a natural and efficient way of performing those tasks. We show how this technique successfully builds upon several interaction paradigms previously described, while shedding new light on them.

Today's generic data management applications such as accounting, CRM or logging and tracking software, rely on form and menu based interfaces. These applications take only marginal advantage of current graphical user interfaces. This is because the data they handle does not have intrinsic visual representations upon which direct manipulation principles can be used. This article presents how we have extended an Information Visualization framework with generic data manipulation functions. These new data editing capabilities are tuned to take advantage of the characteristics of each view. They enable us to generalize the direct manipulation mechanisms to address many abstract data manipulation needs. In this article we present five uses of the features we have implemented and deduce a general workflow applicable to a variety of contexts. The workflow comprises three steps and five editing actions. The steps are: adjust view, select, and edit. The editing actions are: edit a value or group of values, clone objects, remove objects, add attributes, and remove attributes. The workflow provides complete editing access to table and hierarchical data structures using particularly terse interaction methods. It defines a general data editing model that enables powerful data manipulation tasks without requiring end-user programming or scripting.

An action inferring facility for a multimodal interface called Edward is described. Based on the actions the user performs, Edward anticipates future actions and offers to perform them automatically. The system uses inductive inference to anticipate actions. It generalizes over arguments and results, and detects patterns on the basis of a small sequence of user actions, e.g. “copy a lisp file; change extension of original file into .org; put the copy in the backup folder”. Multimodality (particularly the combination of natural language and simulated pointing gestures) and the reuse of patterns are important new features. Some possibilities and problems of action inferring interfaces in general are addressed. Action inferring interfaces are particularly useful for professional users of general-purpose applications. Such users are unable to program repetitive patterns because either the applications do not provide the facilities or the users lack the capabilities.

Conventional interface builders allow the user interface designer to select widgets such as menus, buttons and scroll bars, and lay them out using a mouse. Although these are conceptually simple to use, in practice there are a number of problems. First, a typical widget will have dozens of properties which the designer might change. Ensuring that these properties are consistent across multiple widgets in a dialog box and multiple dialog boxes in an application can be very difficult. Second, if the designer wants to change the properties, each widget must be edited individually. Third, getting the widgets laid out appropriately in a dialog box can be tedious. Grids and alignment commands are not sufficient. This paper describes Graphical Tabs and Graphical Styles in the Gild interface builder which solve all of these problems. A “graphical tab” is an absolute position in a window. A “graphical style” incorporates both property and layout information, and can be defined by example, named, applied to other widgets, edited, saved to a file, and read from a file. If a graphical style is edited, then all widgets defined using that style are modified. In addition, because appropriate styles are inferred, they do not have to be explicitly applied.

Most document or information management systems rely on hierarchies to organise documents (e.g. files, email messages or web bookmarks). However, the rigid structures of hierarchical schemes do not mesh well with the more fluid nature of everyday document practices. This paper describes Presto, a prototype system that allows users to organise their documents entirely in terms of the properties those documents hold for users. Properties provide a uniform mechanism for managing, coding, searching, retrieving and interacting with documents. We concentrate in particular on the challenges that property-based approaches present and the architecture we have developed to tackle them.

This paper introduces a new type of interface for 3D drawings that improves the usability of gestural interfaces and augments typical command-based modeling systems. In our suggestive interface, the user gives hints about a desired operation to the system by highlighting related geometric components in the scene. The system then infers possible operations based on the hints and presents the results of these operations as small thumbnails. The user completes the editing operation simply by clicking on the desired thumbnail. The hinting mechanism lets the user specify geometric relations among graphical components in the scene, and the multiple thumbnail suggestions make it possible to define many operations with relatively few distinct hint patterns. The suggestive interface system is implemented as a set of suggestion engines working in parallel, and is easily extended by adding customized engines. Our prototype 3D drawing system, Chateau, shows that a suggestive interface can effectively support construction of various 3D drawings.

Graphical user interfaces (GUI) provide intuitive and easy means for users to communicate with computers. However, construction of GUI software requires complex programming that is far from being intuitive. Because of the “semantic gap” between the textual application program and its graphical interface, the programmer himself must conceptually maintain the correspondence between the textual programming and the graphical image of the resulting interface. Instead, we propose a programming environment based on the programming by visual example (PBVE) scheme, which allows the GUI designers to “program” visual interfaces for their applications by “drawing” the example visualization of application data with a direct manipulation interface. Our system, TRIP3, realizes this with (1) the bi-directional translation model between the (abstract) application data and the pictorial data of the GUI, and (2) the ability to generate mapping rules for the translation from example application data and its corresponding example visualization. The latter is made possible by the use of generalization of visual examples, where the system is able to automatically generate generalized mapping rules from a given set of examples.

It is generally accepted that it is important to involve the end users of a Graphical User Interface (GUI) in all stages of its design and development. However, traditional GUI development tools typically do not support collaborative design. TelePICTIVE is an experimental software prototype designed to allow computer-naive users to collaborate with experts at possibly remote locations in designing GUIs.
TelePICTIVE is based on the PICTIVE participatory design methodology, and has been prototyped using the RENDEZVOUS system. In this paper we describe TelePICTIVE, and show how it is designed to support collaboration among a group of GUI designers with diverse levels of expertise. We also explore some of the issues that have come up during development and initial usability testing, such as how to coordinate simultaneous access to a shared design surface, and how to engage in the participatory design of GUIs using a Computer-Supported Cooperative Work (CSCW) system.

User interface toolkits and higher-level tools built on top of them play an ever increasing part in developing graphical user interfaces. This paper describes the XIT system, a user interface development tool for the X Window System, based on Common Lisp, comprising user interface toolkits as well as high-level interactive tools organized into a layered architecture. We especially focus on the object-oriented design of the lower-level toolkits and show how advanced features for describing automatic screen layout, visual feedback, application links, complex interaction, and dialog control, usually not included in traditional user interface toolkits, are integrated.

A large proportion of computer-supported tasks---such as design exploration, decision analysis, data presentation, and many kinds of retrieval---can be characterised as user-driven processing of a body of data in search of an outcome that satisfies the user. Clearly such tasks can never be automated fully, but few existing tools offer support for mechanising more than the simplest repetitive aspects of the search. Reconnaissance facilities, in which the computer produces summary reports from exploration in directions suggested by the user, can save the user time and effort by revealing which areas are the most deserving of detailed investigation. The time users are prepared to spend on searching will be more effectively used, improving the likelihood of finding solutions that really meet their needs rather than merely being the first to appear satisfactory. This note describes an implemented example of reconnaissance, based on the parallel coordinates presentation technique.

The construction of application-specific Graphical User Interfaces (GUI) still needs considerable programming partly because the mapping between application data and its visual representation is complicated. This study proposes a system which generates GUIs by generalizing multiple sets of application data and its visualization examples. The most notable characteristic of the system is that programmers can interactively modify the mapping by “correcting” the system-generated visualization examples that represent the system's current notion of programmer's intentions. Conflicting mappings are automatically resolved via the use of constraint hierarchies.

We describe a new type of graphical user interface widget, known as a "tracking menu." A tracking menu consists of a cluster of graphical buttons, and as with traditional menus, the cursor can be moved within the menu to select and interact with items. However, unlike traditional menus, when the cursor hits the edge of the menu, the menu moves to continue tracking the cursor. Thus, the menu always stays under the cursor and close at hand. In this paper we define the behavior of tracking menus, show unique affordances of the widget, present a variety of examples, and discuss design characteristics. We examine one tracking menu design in detail, reporting on usability studies and our experience integrating the technique into a commercial application for the Tablet PC. While user interface issues on the Tablet PC, such as preventing round trips to tool palettes with the pen, inspired tracking menus, the design also works well with a standard mouse and keyboard configuration.
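The edge-tracking behavior described above can be illustrated with a small geometric sketch. It assumes a rectangular menu and ignores button hit-testing and input-device states; the class and method names are hypothetical and do not come from the paper or any toolkit.

```python
from dataclasses import dataclass


@dataclass
class TrackingMenu:
    """Rectangular menu that follows the cursor once it reaches an edge."""
    x: float       # left edge of the menu
    y: float       # top edge of the menu
    width: float
    height: float

    def on_cursor_move(self, cx, cy):
        """Shift the menu just far enough to keep the cursor inside it.

        While the cursor stays within the menu bounds the menu does not
        move, so items can be reached and selected as in a normal menu.
        """
        if cx < self.x:
            self.x = cx
        elif cx > self.x + self.width:
            self.x = cx - self.width
        if cy < self.y:
            self.y = cy
        elif cy > self.y + self.height:
            self.y = cy - self.height
        return self.x, self.y
```

With a 100 by 50 menu at the origin, moving the cursor to (50, 25) leaves the menu in place, while moving it to (120, 25) pushes the menu 20 units to the right so the cursor sits on its right edge.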

This paper details the design and evaluation of the Delphian Desktop, a mechanism for online spatial prediction of cursor movements in a Windows-Icons-Menus-Pointers (WIMP) environment. Interaction with WIMP-based interfaces often becomes a spatially challenging task when the physical interaction mediators are the common mouse and a high resolution, physically large display screen. These spatial challenges are especially evident in overly crowded Windows desktops. The Delphian Desktop integrates simple yet effective predictive spatial tracking and selection paradigms into ordinary WIMP environments in order to simplify and ease pointing tasks. Predictions are calculated by tracking cursor movements and estimating spatial intentions using a computationally inexpensive online algorithm based on estimating the movement direction and peak velocity. In testing, the Delphian Desktop effectively shortened pointing time to faraway icons, and reduced the overall physical distance the mouse (and user hand) had to mechanically traverse.
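One way to picture a prediction based on movement direction and peak velocity is the sketch below. It is not the Delphian Desktop algorithm: it assumes, purely for illustration, that the expected travel distance is a linear function of peak speed, and the function name and the `gain` and `offset` calibration constants are hypothetical.

```python
import math


def predict_target(samples, gain=0.1, offset=0.0):
    """Estimate where a cursor movement will end from its early samples.

    samples: list of (t, x, y) tuples, t in seconds, x and y in pixels,
        starting at the beginning of the movement.
    gain, offset: hypothetical calibration constants mapping peak speed
        (px/s) to expected travel distance (px).
    Returns the predicted (x, y) end point.
    """
    peak_speed = 0.0
    direction = (0.0, 0.0)
    for (t0, x0, y0), (t1, x1, y1) in zip(samples, samples[1:]):
        dt = t1 - t0
        dist = math.hypot(x1 - x0, y1 - y0)
        if dt <= 0 or dist == 0:
            continue  # skip repeated or stationary samples
        speed = dist / dt
        if speed > peak_speed:
            # Remember the unit direction at the fastest point so far.
            peak_speed = speed
            direction = ((x1 - x0) / dist, (y1 - y0) / dist)
    travel = offset + gain * peak_speed
    x_start, y_start = samples[0][1], samples[0][2]
    return (x_start + direction[0] * travel, y_start + direction[1] * travel)
```

Because it only keeps a running maximum over successive samples, a predictor of this shape stays inexpensive enough to run online as the cursor moves.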

This paper presents tangible interaction techniques for fine-tuning one-to-one scale NURBS curves on a large display for automotive design. We developed a new graspable handle with a transparent groove that allows designers to manipulate virtual curves on a display screen directly. The use of the proposed handle leads naturally to a rich vocabulary of terms describing interaction techniques that reflect existing shape styling methods. A user test raised various issues related to the graspable user interface, two-handed input, and large-display interaction.

In this paper, we present a methodology for recognizing seated postures using data from pressure sensors installed on a chair. Information about seated postures could be used to help avoid adverse effects of sitting for long periods of time or to predict seated activities for a human-computer interface. Our system design displays accurate near-real-time classification performance on data from subjects on which the posture recognition system was not trained by using a set of carefully designed, subject-invariant signal features. By using a near-optimal sensor placement strategy, we keep the number of required sensors low thereby reducing cost and computational complexity. We evaluated the performance of our technology using a series of empirical methods including (1) cross-validation (classification accuracy of 87% for ten postures using data from 31 sensors), and (2) a physical deployment of our system (78% classification accuracy using data from 19 sensors).

It is well known that paper is a very fluid, natural, and easy to use medium for manipulating some kinds of information. It is familiar, portable, flexible, inexpensive, and offers good readability properties. Paper also has well known limitations when compared with electronic media. Work in hybrid paper electronic interfaces seeks to bring electronic capabilities to real paper in order to obtain the best properties of each. This paper describes a hybrid paper electronic system --- the Paper PDA --- which is designed to allow electronic capabilities to be employed within a conventional paper notebook, calendar, or organizer. The Paper PDA is based on a simple observation: a paper notebook can be synchronized with a body of electronic information much like an electronic PDA can be synchronized with information hosted on a personal computer. This can be accomplished by scanning, recognizing and processing its contents, then printing a new copy. This paper introduces the Paper PDA concept and considers interaction techniques and applications designed to work within the Paper PDA. The StickerLink technique supports on-paper hyperlinking using removable paper stickers. Two applications are also considered which look at aspects of electronic communications via the Paper PDA.

Ordinary paper offers properties of readability, fluidity, flexibility, cost, and portability that current electronic devices are often hard pressed to match. In fact, a lofty goal for many interactive systems is to be "as easy to use as pencil and paper". However, the static nature of paper does not support a number of capabilities, such as search and hyperlinking, that an electronic device can provide. The Paper PDA project explores ways in which hybrid paper electronic interfaces can bring some of the capabilities of the electronic medium to interactions occurring on real paper. Key to this effort is the invention of on-paper interaction techniques which retain the flexibility and fluidity of normal pen and paper, but which are structured enough to allow robust interpretation and processing in the digital world. This paper considers the design of a class of simple printed templates that allow users to make common marks in a fluid fashion, and allow additional gestures to be invented by the users to meet their needs, but at the same time encourage marks that are quite easy to recognize.

In our previous studies into web design, we found that pens, paper, walls, and tables were often used for explaining, developing, and communicating ideas during the early phases of design. These wall-scale paper-based design practices inspired The Designers' Outpost, a tangible user interface that combines the affordances of paper and large physical workspaces with the advantages of electronic media to support information design. With Outpost, users collaboratively author web site information architectures on an electronic whiteboard using physical media (Post-it notes and images), structuring and annotating that information with electronic pens. This interaction is enabled by a touch-sensitive SMART Board augmented with a robust computer vision system, employing a rear-mounted video camera for capturing movement and a front-mounted high-resolution camera for capturing ink. We conducted a participatory design study with fifteen professional web designers. The study validated that Outpost supports information architecture work practice, and led to our adding support for fluid transitions to other tools.

SketchWizard allows designers to create Wizard of Oz prototypes of pen-based user interfaces in the early stages of design. In the past, designers have been inhibited from participating in the design of pen-based interfaces because of the inadequacy of paper prototypes and the difficulty of developing functional prototypes. In SketchWizard, designers and end users share a drawing canvas between two computers, allowing the designer to simulate the behavior of recognition or other technologies. Special editing features are provided to help designers respond quickly to end-user input. This paper describes the SketchWizard system and presents two evaluations of our approach. The first is an early feasibility study in which Wizard of Oz was used to prototype a pen-based user interface. The second is a laboratory study in which designers used SketchWizard to simulate existing pen-based interfaces. Both showed that end users gave valuable feedback in spite of delays between end-user actions and wizard updates.

Location-enhanced applications use the location of people, places, and things to augment or streamline interaction. Location-enhanced applications are just starting to emerge in several different domains, and many people believe that this type of application will experience tremendous growth in the near future. However, it currently requires a high level of technical expertise to build location-enhanced applications, making it hard to iterate on designs. To address this problem we introduce Topiary, a tool for rapidly prototyping location-enhanced applications. Topiary lets designers create a map that models the location of people, places, and things; use this active map to demonstrate scenarios depicting location contexts; use these scenarios in creating storyboards that describe interaction sequences; and then run these storyboards on mobile devices, with a wizard updating the location of people and things on a separate device. We performed an informal evaluation with seven researchers and interface designers and found that they reacted positively to the concept.

We are building a multimedia conversation system to facilitate information seeking in large and complex data spaces. To provide tailored responses to diverse user queries introduced during a conversation, we automate the generation of a system response. Here we focus on the problem of determining the data content of a response. Specifically, we develop an optimization-based approach to content selection. Compared to existing rule-based or plan-based approaches, our work offers three unique contributions. First, our approach provides a general framework that effectively addresses content selection for various interaction situations by balancing a comprehensive set of constraints (e.g., content quality and quantity constraints). Second, our method is easily extensible, since it uses feature-based metrics to systematically model selection constraints. Third, our method improves selection results by incorporating content organization and media allocation effects, which otherwise are treated separately. Preliminary studies show that our method can handle most of the user situations identified in a Wizard-of-Oz study, and achieves results similar to those produced by human designers.

When users handle large amounts of data, errors are hard to notice. Outlier finding is a new way to reduce errors by directing the user's attention to inconsistent data which may indicate errors. We have implemented an outlier finder for text, which can detect both unusual matches and unusual mismatches to a text pattern. When integrated into the user interface of a PBD text editor and tested in a user study, outlier finding substantially reduced errors.

We present Citrine, a system that extends the widespread copy-and-paste interaction technique with intelligent transformations, making it useful in more situations. Citrine uses text parsing to find the structure in copied text and allows users to paste the structured information, which might have many pieces, in a single paste operation. For example, using Citrine, a user can copy the text of a meeting request and add it to the Outlook calendar with a single paste. In applications such as Excel, users can teach Citrine by example how to copy and paste data by showing it which fields go into which columns, and can use this to copy or paste many items at a time in a user-defined manner. Citrine can be used with a wide variety of applications and types of data and can be easily extended to work with more. It currently includes parsers that recognize contact information, calendar appointments and bibliographic citations. It works with Internet Explorer, Outlook, Excel, Palm Desktop, EndNote and other applications. Citrine is available to download on the internet.

We describe the current status of Pad++, a zooming graphical interface that we are exploring as an alternative to traditional window and icon-based approaches to interface design. We discuss the motivation for Pad++, describe the implementation, and present prototype applications. In addition, we introduce an informational physics strategy for interface design and briefly compare it with metaphor-based design strategies.

We present the Haptic Shading Framework (HSF), a framework for procedurally defining haptic texture. HSF haptic texture shaders are short procedures allowing an application programmer to easily define interesting haptic surface interaction and the parameters that control the surface properties. These shaders provide the illusion of surface characteristics by altering previously calculated forces from object collision in the haptic pipeline. HSF can be used in an existing haptic application with few modifications. The framework consists of user-programmable modules that are dynamically loaded. This framework and all user-defined procedures are written in C++, with a provided library of useful math and geometry functions. These functions are meant to mimic RenderMan functionality, creating a familiar shading environment. As we demonstrate, many procedural shading methods and algorithms can be directly adopted for haptic shading.

We describe a unique form of hands-free interaction that can be implemented on most commodity computing platforms. Our approach supports blowing at a laptop or computer screen to directly control certain interactive applications. Localization estimates are produced in real-time to determine where on the screen the person is blowing. Our approach relies solely on a single microphone, such as those already embedded in a standard laptop or one placed near a computer monitor, which makes our approach very cost-effective and easy-to-deploy. We show example interaction techniques that leverage this approach.

This paper presents a demonstrational interface builder with improved reasoning capabilities. The system is comprised of two major components: an interactive display manager and a rule-based reasoner. The display manager provides facilities to draw the physical appearance of an interface and define interface behavior by graphical demonstration. The behavior is defined using a technique of stimulus-response demonstrations. With this technique, an interface developer first demonstrates a stimulus that represents an action that an end user will perform on the interface. After the stimulus, the developer demonstrates the response(s) that should result from the given stimulus. As the behavior is demonstrated, the reasoner observes the demonstrations and draws inferences to expedite behavior definition. The inferences entail generalizing from specific behavior demonstrations and identifying constraints that define the generalized behavior. Once behavior constraints are identified, the reasoner sends them to the display manager to complete the definition process. When the interface is executed by an end-user, the display manager uses the constraints to implement the run-time behavior of the interface.

Direct-manipulation editors for structured data are increasingly common. While such editors can greatly simplify the creation of structured data, there are few tools to simplify the creation of the editors themselves. This paper presents Citrus, a new programming language and user interface toolkit designed for this purpose. Citrus offers language-level support for constraints, restrictions and change notifications on primitive and aggregate data, mechanisms for automatically creating, removing, and reusing views as data changes, a library of widgets, layouts and behaviors for defining interactive views, and two comprehensive interactive editors as an interface to the language and toolkit itself. Together, these features support the creation of editors for a large class of data and code.

Making effective use of the available display space has long been a fundamental issue in user interface design. We live in a time of rapid advances in available CPU power and memory. However, the common sizes of our computational display spaces have only minimally increased or in some cases, such as hand held devices, actually decreased. In addition, the size and scope of the information spaces we wish to explore are also expanding. Representing vast amounts of information on our relatively small screens has become increasingly problematic and has been associated with problems in navigation, interpretation and recognition. User interface research has proposed several differing presentation approaches to address these problems. These methods create displays that vary considerably, visually and algorithmically. We present a unified framework that provides a way of relating seemingly distinct methods, facilitating the inclusion of more than one presentation method in a single interface. Furthermore, it supports extrapolation between the presentation methods it describes. Of particular interest are the presentation possibilities that exist in the ranges between various distortion presentations, magnified insets and detail-in-context presentations, and between detail-in-context presentations and a full-zooming environment. This unified framework offers a geometric presentation library in which presentation variations are available independently of the mode of graphic representation. The intention is to promote the ease of exploration and experimentation into the use of varied presentation combinations.

This paper presents motivation, design, and algorithms for using and implementing translucent, non-rectangular patches as a substitute for rectangular opaque windows. The underlying metaphor is closer to a mix between the architect's yellow paper and the usage of white boards than to rectangular opaque paper in piles and folders on a desktop.
Translucent patches lead to a unified view of windows, sub-windows and selections, and provide a base from which the tight connection between windows, their content, and applications can be dissolved. It forms one aspect of on-going work to support design activities that involve “marking” media, like paper and white boards, with computers. The central idea of that research is to allow the user to associate structure and meaning dynamically and smoothly to marks on a display surface.

To disentangle and analyze neural pathways estimated from magnetic resonance imaging data, scientists need an interface to select 3D pathways. Broad adoption of such an interface requires the use of commodity input devices such as mice and pens, but these devices offer only two degrees of freedom. CINCH solves this problem by providing a marking interface for 3D pathway selection. CINCH interprets pen strokes as pathway selections in 3D using a marking language designed together with scientists. Its bimanual interface employs a pen and a trackball (see Figure 1), allowing alternating selections and scene rotations without changes of mode. CINCH was evaluated by observing four scientists using the tool over a period of three weeks as part of their normal work activity. Event logs and interviews revealed dramatic improvements in both the speed and quality of scientists' everyday work, and a set of principles that should inform the design of future 3D marking interfaces. More broadly, CINCH demonstrates the value of the iterative, participatory design process that catalyzed its evolution.

This paper investigates the sense of touch as a channel for communicating with miniature handheld devices. We embedded a PDA with a TouchEngine™, a thin, miniature lower-power tactile actuator that we have designed specifically to use in mobile interfaces (Figure 1). Unlike previous tactile actuators, the TouchEngine is a universal tactile display that can produce a wide variety of tactile feelings from simple clicks to complex vibrotactile patterns. Using the TouchEngine, we began exploring the design space of interactive tactile feedback for handheld computers. Here, we investigated only a subset of this space: using touch as the ambient, background channel of interaction. We proposed a general approach to design such tactile interfaces and described several implemented prototypes. Finally, our user studies demonstrated 22% faster task completion when we enhanced handheld tilting interfaces with tactile feedback.

Intrabody communication (IBC) is a wireless communications technology that uses a person's body as the transmission medium for imperceptible electrical signals. Because communication is limited to the vicinity of a person's body, ambiguities arising from communication between personal devices and environmental devices when multiple people are present can, in theory, be solved simply. Intrabody communication also potentially allows data to be transferred when a person touches an IBC-enabled device. We have designed and constructed an intrabody communication system, modeled after Zimmerman's original design, and extended it to operate up to 38.4 Kbps and to calculate signal strength. In this paper, we present quantitative measurements of data error rates and signal strength while varying hand distance to transceiver plate, electrode location on the body, touch plate size and shape, and several other factors. We find that plate size and shape have only minor effects, but that the distance to plate and the coupling mechanism significantly affect signal strength. We also find that portable devices, with poor ground coupling, suffer more significant signal attenuation. Our goal is to promote design guidelines for this technology and identify the best contexts for its effective deployment.

The creation of most models used in computer animation and computer games requires the assignment of texture coordinates, texture painting, and texture editing. We present a novel approach for texture placement and editing based on direct manipulation of textures on the surface. Compared to conventional tools for surface texturing, our system combines UV-coordinate specification and texture editing into one seamless process, reducing the need for careful initial design of parameterization and providing a natural interface for working with textures directly on 3D surfaces. A combination of efficient techniques for interactive constrained parameterization and advanced input devices makes it possible to realize a set of natural interaction paradigms. The texture is regarded as a piece of stretchable material, which the user can position and deform on the surface, selecting arbitrary sets of constraints and mapping texture points to the surface; in addition, the multi-touch input makes it possible to specify natural handles for texture manipulation using point constraints associated with different fingers. Pressure can be used as a direct interface for texture combination operations. The 3D position of the object and its texture can be manipulated simultaneously using two-hand input.

This research explores distributed sensing techniques for mobile devices using synchronous gestures. These are patterns of activity, contributed by multiple users (or one user with multiple devices), which take on a new meaning when they occur together in time, or in a specific sequence in time. To explore this new area of inquiry, this work uses tablet computers augmented with touch sensors and two-axis linear accelerometers (tilt sensors). The devices are connected via an 802.11 wireless network and synchronize their time-stamped sensor data. This paper describes a few practical examples of interaction techniques using synchronous gestures such as dynamically tiling together displays by physically bumping them together, discusses implementation issues, and speculates on further possibilities for synchronous gestures.
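
One way to picture the bump example is as a matching problem over the synchronized, time-stamped sensor data: two different devices each report an acceleration spike at nearly the same moment. The following is a minimal sketch under that assumption, not the authors' implementation; the `find_bumps` function, the event format, and the 50 ms window are all hypothetical.

```python
from typing import List, Tuple

# Hypothetical event format: (timestamp in milliseconds, device id).
# Each event stands for an accelerometer spike already detected on one device.
SpikeEvent = Tuple[int, str]


def find_bumps(events: List[SpikeEvent], window_ms: int = 50) -> List[Tuple[str, str, int]]:
    """Pair up spikes from two different devices that occur within window_ms.

    Returns a list of (first_device, second_device, time_of_first_spike).
    The window size is an illustrative assumption.
    """
    ordered = sorted(events)
    bumps = []
    i = 0
    while i < len(ordered) - 1:
        (t1, d1), (t2, d2) = ordered[i], ordered[i + 1]
        if d1 != d2 and t2 - t1 <= window_ms:
            bumps.append((d1, d2, t1))
            i += 2  # both spikes are consumed by this bump
        else:
            i += 1
    return bumps
```

For instance, spikes from tablets "A" and "B" that are 20 ms apart would be paired as one bump, while isolated spikes several seconds apart would be ignored.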

In this paper we propose a novel way of supporting occasional meetings that take place in unfamiliar public places, which promotes lightweight, visible and fluid collaboration. Our central idea is that the sharing and exchange of information occurs across public surfaces that users can easily access and interact with. To this end, we designed and implemented Dynamo, a communal multi-user interactive surface. The surface supports the cooperative sharing and exchange of a wide range of media that can be brought to the surface by users that are remote from their familiar organizational settings.

In this paper we propose a new model for a class of rapid serial visual presentation (RSVP) interfaces [16] in the context of consumer video devices. The basic spatial layout "explodes" a sequence of image frames into a 3D trail in order to provide more context for a spatial/temporal presentation. As the user plays forward or back, the trail advances or recedes while the image in the foreground focus position is replaced. The design is able to incorporate a variety of methods for analyzing or highlighting images in the trail. Our hypotheses are that users can navigate more quickly and precisely to points of interest when compared to conventional consumer-based browsing, channel flipping, or fast-forwarding techniques. We report on an experiment testing our hypotheses in which we found that subjects were more accurate but not faster in browsing to a target of interest in recorded television content with a TV remote.

The new media types used in advanced user interfaces and interactive systems introduce time as a significant variable. This paper addresses the architectural support and programming tools that should be provided to the programmer to manage the time dependencies. The approach considers that the basic models and programming paradigms adopted in the manipulation and management of time should be isomorphic with the spatial models used in existing graphical user interfaces.
The paper describes the architectural principles of a toolkit designed to support the construction of user interfaces with temporal characteristics. The Ttoolkit is an extension of an existing graphical user interface toolkit, the Xt toolkit. Its design is presented and a sample application is described.

An action inferring facility for a multimodal interface called Edward is described. Based on the actions the user performs, Edward anticipates future actions and offers to perform them automatically. The system uses inductive inference to anticipate actions. It generalizes over arguments and results, and detects patterns on the basis of a small sequence of user actions, e.g. “copy a lisp file; change extension of original file into .org; put the copy in the backup folder”. Multimodality (particularly the combination of natural language and simulated pointing gestures) and the reuse of patterns are important new features. Some possibilities and problems of action inferring interfaces in general are addressed. Action inferring interfaces are particularly useful for professional users of general-purpose applications. Such users are unable to program repetitive patterns because either the applications do not provide the facilities or the users lack the capabilities.

While graphical user interfaces have gained much popularity in recent years, there are situations when the need to use existing applications in a nonvisual modality is clear. Examples of such situations include the use of applications on hand-held devices with limited screen space (or even no screen space, as in the case of telephones), or users with visual impairments.
We have developed an architecture capable of transforming the graphical interfaces of existing applications into powerful intuitive nonvisual interfaces. Our system, called Mercator, provides new input and output techniques for working in the nonvisual domain. Navigation is accomplished by traversing a hierarchical tree representation of the interface structure. Output is primarily auditory, although other output modalities (such as tactile) can be used as well. The mouse, an inherently visually-oriented device, is replaced by keyboard and voice interaction.
Our system is currently in its third major revision. We have gained insight into both the nonvisual interfaces presented by our system and the architecture necessary to construct such interfaces. This architecture uses several novel techniques to efficiently and flexibly map graphical interfaces into new modalities.

ENO is an audio server designed to make it easy for applications in the Unix environment to incorporate non-speech audio cues. At the physical level, ENO manages a shared resource, namely the audio hardware. At the logical level, it manages a sound space that is shared by various client applications. Instead of dealing with sound in terms of its physical description (i.e., sampled sounds), ENO allows sounds to be presented and controlled in terms of higher-level descriptions of sources, interactions, attributes, and sound space. Using this structure, ENO can facilitate the creation of consistent, rich systems of audio cues. In this paper, we discuss the justification, design, and implementation of ENO.

In this paper, we describe a multimodal interface prototype system based on Dynamical Dialogue Model. This system not only integrates information of speech and gestures, but also controls the response timing in order to realize a smooth interaction between user and computer. Our approach consists of human-human dialogue analysis, and computational modeling of dialogue.

This paper describes a new music-playback interface for trial listening, SmartMusicKIOSK. In music stores, short trial listening of CD music is not usually a passive experience -- customers often search out the chorus or "hook" of a song using the fast-forward button. Listening of this type, however, has not been traditionally supported. This research achieves a function for jumping to the chorus section and other key parts of a song plus a function for visualizing song structure. These functions make it easier for a listener to find desired parts of a song and thereby facilitate an active listening experience. The proposed functions are achieved by an automatic chorus-section detecting method, and the results of implementing them as a listening station have demonstrated their usefulness.

Nested User Interface Components combine the concepts of Zooming User Interfaces (ZUIs) with recursive nesting of active graphical user interface widgets. The resulting system of recursively nesting interface components has a number of desirable properties. The level of detail of the view of any widget component and its children, as well as the responsiveness of that component to the user's actions, can be tuned to the current visible size of that component on the screen.
We distinguish between the interaction style of a component and the semantic result that it produces. Only the latter is used to determine the geographic parameters for that component. In this way, very large and layered control problems can be presented to the user as a cohesive and readily navigable visual surface. It becomes straightforward to lay out interaction semantics that are best handled by recursion, such as filters composed of nested expressions.

The medium of collage supports the visualization of meaningful event summaries using photographs. It can however be rather tedious to author a collage from a large collection of photographs. In this work we present an approach that supports efficient construction of a collage by assisting the user with an automatic layout procedure that can be controlled at a high level. Our layout method utilizes a pre-designed template which consists of cells for photos and annotations applied to these cells. The layout is then filled by matching the metadata of photos to the annotations in the cells using an optimization algorithm. The user exercises flexibility in the authoring process by (a) maintaining high-level control through the types of constraints applied and (b) leveraging visual emphases supported by the layout algorithm. The user can of course provide fine-grained control of the final collage through direct manipulation. Off-loading the tedium of collage construction to a user controlled yet automated process clears the way for rapidly generating different views of the same album and could also support the increased sharing of digital photos in the form of compact collages.

Paper Augmented Digital Documents (PADDs) are digital documents that can be manipulated either on a computer screen or on paper. PADDs, and the infrastructure supporting them, can be seen as a bridge between the digital and the paper worlds. As digital documents, PADDs are easy to edit, distribute and archive; as paper documents, PADDs are easy to navigate and annotate, and well accepted in social settings. The chimeric nature of PADDs makes them well suited for many tasks such as proofreading, editing, and annotation of large-format documents like blueprints. We present an architecture that supports the seamless manipulation of PADDs using today's technologies and report on the lessons we learned while implementing the first PADD system.

Current paper-based interfaces, such as PapierCraft, provide very little feedback, and this limits the scope of possible interactions. So far, there has been little systematic exploration of the structure, constraints, and contingencies of feedback mechanisms in paper-based interaction systems for paper-only environments. We identify three levels of feedback: discovery feedback (e.g., to aid with menu learning), status-indication feedback (e.g., for error detection), and task feedback (e.g., to aid in a search task). Using three modalities (visual, tactile, and auditory) which can be easily implemented on a pen-sized computer, we introduce a conceptual matrix to guide systematic research on pen-top feedback for paper-based interfaces. Using this matrix, we implemented a multimodal pen prototype demonstrating the potential of our approach. We conducted an experiment that confirmed the efficacy of our design in helping users discover a new interface and identify and correct their errors.

We describe a new type of graphical user interface widget, known as a "tracking menu." A tracking menu consists of a cluster of graphical buttons, and as with traditional menus, the cursor can be moved within the menu to select and interact with items. However, unlike traditional menus, when the cursor hits the edge of the menu, the menu moves to continue tracking the cursor. Thus, the menu always stays under the cursor and close at hand. In this paper we define the behavior of tracking menus, show unique affordances of the widget, present a variety of examples, and discuss design characteristics. We examine one tracking menu design in detail, reporting on usability studies and our experience integrating the technique into a commercial application for the Tablet PC. While user interface issues on the Tablet PC, such as preventing round trips to tool palettes with the pen, inspired tracking menus, the design also works well with a standard mouse and keyboard configuration.
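
The edge-tracking behavior described above can be illustrated with a small sketch: while the cursor stays inside the menu rectangle the menu does not move, and when the cursor crosses an edge the menu is shifted just far enough to keep the cursor on that edge. This is a simplified illustration under assumed names (`TrackingMenu`, `track`), assuming a rectangular menu; it is not the widget's actual implementation and omits button selection.

```python
from dataclasses import dataclass


@dataclass
class TrackingMenu:
    """Hypothetical rectangular menu; (x, y) is its top-left corner."""
    x: float
    y: float
    width: float
    height: float

    def track(self, cx: float, cy: float) -> None:
        """Shift the menu the minimum distance needed to contain the cursor."""
        if cx < self.x:                      # cursor pushed past the left edge
            self.x = cx
        elif cx > self.x + self.width:       # cursor pushed past the right edge
            self.x = cx - self.width
        if cy < self.y:                      # cursor pushed past the top edge
            self.y = cy
        elif cy > self.y + self.height:      # cursor pushed past the bottom edge
            self.y = cy - self.height
```

With a 100-by-50 menu at the origin, a cursor move to (50, 25) leaves the menu in place, while a move to (120, 25) drags the menu's left edge to x = 20 so the cursor sits on its right edge.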

Conventional scrolling methods for small-sized displays in PDAs or mobile phones are difficult to use when frequent switching between scrolling and editing operations is required, for example, when browsing and operating large WWW pages. In this paper, we propose a new user-interface method to provide seamless switching between scrolling and other operations such as editing, based on the "Paperweight Metaphor". A sheet of paper that has been placed on a slippery table is difficult to draw on. Therefore, in order to write or draw something on the sheet of paper, a person must secure the paper with his/her palm to keep the paper from moving. This is a good metaphor for designing the switching operation between scroll and editing modes. We have made prototype systems by placing a touch sensor under each PDA display where the user's palm will touch. Three application programs (a map browser, a WWW browser, and a photograph browser) that switch between scrolling and other operation modes depending on sensor output have been developed. We have carried out user tests on this mode-switching method and have received favorable feedback.
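
The mode switch suggested by the metaphor can be sketched as follows: while the palm sensor reports contact, the "paper" is held in place and pen drags are treated as editing; otherwise the paper slips and the same drags scroll the view. This reading of the sensor-to-mode mapping is an assumption drawn from the metaphor, and the `PaperweightView` class and its fields are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class PaperweightView:
    """Hypothetical view whose drag handling depends on a palm touch sensor."""
    offset_x: int = 0
    offset_y: int = 0
    strokes: List[Tuple[int, int]] = field(default_factory=list)

    def drag(self, dx: int, dy: int, palm_down: bool) -> str:
        """Apply one pen drag and return the mode that handled it."""
        if palm_down:
            # Palm holds the paper still: the drag becomes an edit stroke.
            self.strokes.append((dx, dy))
            return "edit"
        # No palm contact: the paper slides, so the drag scrolls the view.
        self.offset_x += dx
        self.offset_y += dy
        return "scroll"
```

A drag of (5, -3) without palm contact moves the view offset to (5, -3); the same drag with the palm down leaves the offset unchanged and records a stroke instead.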

We describe a new widget and interaction technique, known as a "Frisbee," for interacting with areas of a large display that are difficult or impossible to access directly. A frisbee is simply a portal to another part of the display. It consists of a local "telescope" and a remote "target". The remote data surrounded by the target is drawn in the telescope and interactions performed within it are applied on the remote data. In this paper we define the behavior of frisbees, show unique affordances of the widget, and discuss design characteristics. We have implemented a test application and report on an experiment that shows the benefit of using the frisbee on a large display. Our results suggest that the frisbee is preferred over walking back and forth to the local and remote spaces at a distance of 4.5 feet.
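
The portal behavior described above can be illustrated with a small coordinate-mapping sketch. It is only one plausible reading of the abstract: the `Rect` type, the `telescope_to_target` function, and the proportional scaling between differently sized regions are assumptions made for illustration, not details taken from the paper.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Rect:
    """Axis-aligned rectangle in display coordinates (hypothetical helper)."""
    x: float
    y: float
    width: float
    height: float

    def contains(self, px: float, py: float) -> bool:
        return (self.x <= px <= self.x + self.width
                and self.y <= py <= self.y + self.height)


def telescope_to_target(telescope: Rect, target: Rect,
                        px: float, py: float) -> Optional[Tuple[float, float]]:
    """Map a point inside the local telescope to the remote target region.

    Returns None when the point lies outside the telescope, so the caller
    can treat the event as an ordinary local interaction.
    """
    if not telescope.contains(px, py):
        return None
    # Normalized position inside the telescope (0..1 on each axis).
    u = (px - telescope.x) / telescope.width
    v = (py - telescope.y) / telescope.height
    # Same relative position inside the target.
    return (target.x + u * target.width, target.y + v * target.height)


telescope = Rect(0, 0, 100, 100)
target = Rect(1000, 200, 200, 200)
print(telescope_to_target(telescope, target, 50, 25))  # (1100.0, 250.0)
```

An input event at the returned coordinates would then be delivered to whatever lies under the target, which is how interactions in the telescope could be applied to the remote data.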

We explore a variety of interaction and visualization techniques for fluid navigation, segmentation, linking, and annotation of digital videos. These techniques are developed within a concept prototype called LEAN that is designed for use with pressure-sensitive digitizer tablets. These techniques include a transient position+velocity widget that allows users not only to move around a point of interest on a video, but also to rewind or fast forward at a controlled variable speed. We also present a new variation of fish-eye views called twist-lens, and incorporate this into a position control slider designed for the effective navigation and viewing of large sequences of video frames. We also explore a new style of widgets that exploit the use of the pen's pressure-sensing capability, increasing the input vocabulary available to the user. Finally, we elaborate on how annotations referring to objects that are temporal in nature, such as video, may be thought of as links, and fluidly constructed, visualized and navigated.

High precision parameter manipulation tasks typically require adjustment of the scale of manipulation in addition to the parameter itself. This paper introduces the notion of Zoom Sliding, or Zliding, for fluid integrated manipulation of scale (zooming) via pressure input while parameter manipulation within that scale is achieved via x-y cursor movement (sliding). We also present the Zlider (Figure 1), a widget that instantiates the Zliding concept. We experimentally evaluate three different input techniques for use with the Zlider in conjunction with a stylus for x-y cursor positioning, in a high accuracy zoom and select task. Our results marginally favor the stylus with integrated isometric pressure sensing tip over bimanual techniques which separate zooming and sliding controls over the two hands. We discuss the implications of our results and present further designs that make use of Zliding.

To disentangle and analyze neural pathways estimated from magnetic resonance imaging data, scientists need an interface to select 3D pathways. Broad adoption of such an interface requires the use of commodity input devices such as mice and pens, but these devices offer only two degrees of freedom. CINCH solves this problem by providing a marking interface for 3D pathway selection. CINCH interprets pen strokes as pathway selections in 3D using a marking language designed together with scientists. Its bimanual interface employs a pen and a trackball (see Figure 1), allowing alternating selections and scene rotations without changes of mode. CINCH was evaluated by observing four scientists using the tool over a period of three weeks as part of their normal work activity. Event logs and interviews revealed dramatic improvements in both the speed and quality of scientists' everyday work, and a set of principles that should inform the design of future 3D marking interfaces. More broadly, CINCH demonstrates the value of the iterative, participatory design process that catalyzed its evolution.

As technical as we have become, modern computing has not permeated many important areas of our lives, including mathematics education which still involves pencil and paper. In the present study, twenty high school geometry students varying in ability from low to high participated in a comparative assessment of math problem solving using existing pencil and paper work practice (PP), and three different interfaces: an Anoto-based digital stylus and paper interface (DP), pen tablet interface (PT), and graphical tablet interface (GT). Cognitive Load Theory correctly predicted that as interfaces departed more from familiar work practice (GT > PT > DP), students would experience greater cognitive load such that performance would deteriorate in speed, attentional focus, meta-cognitive control, correctness of problem solutions, and memory. In addition, low-performing students experienced elevated cognitive load, with the more challenging interfaces (GT, PT) disrupting their performance disproportionately more than higher performers. The present results indicate that Cognitive Load Theory provides a coherent and powerful basis for predicting the rank ordering of users' performance by type of interface. In the future, new interfaces for areas like education and mobile computing could benefit from designs that minimize users' load so performance is more adequately supported.

Informal prototyping tools have shown great potential in facilitating the early stage design of user interfaces. However, continuous interactions, an important constituent of highly interactive interfaces, have not been well supported by previous tools. These interactions give continuous visual feedback, such as geometric changes of a graphical object, in response to continuous user input, such as the movement of a mouse. We built Monet, a sketch-based tool for prototyping continuous interactions by demonstration. In Monet, designers can prototype continuous widgets and their states of interest using examples. They can also demonstrate compound behaviors involving multiple widgets by direct manipulation. Monet allows continuous interactions to be easily integrated with event-based, discrete interactions. Continuous widgets can be embedded into storyboards and their states can condition or trigger storyboard transitions. Monet achieves these features by employing continuous function approximation and statistical classification techniques, without using any domain specific knowledge or assuming any application semantics. Informal feedback showed that Monet is a promising approach to enabling more complete tool support for early stage UI design.

SketchWizard allows designers to create Wizard of Oz prototypes of pen-based user interfaces in the early stages of design. In the past, designers have been inhibited from participating in the design of pen-based interfaces because of the inadequacy of paper prototypes and the difficulty of developing functional prototypes. In SketchWizard, designers and end users share a drawing canvas between two computers, allowing the designer to simulate the behavior of recognition or other technologies. Special editing features are provided to help designers respond quickly to end-user input. This paper describes the SketchWizard system and presents two evaluations of our approach. The first is an early feasibility study in which Wizard of Oz was used to prototype a pen-based user interface. The second is a laboratory study in which designers used SketchWizard to simulate existing pen-based interfaces. Both showed that end users gave valuable feedback in spite of delays between end-user actions and wizard updates.

Multi-display environments compose displays that can be at different locations from and different angles to the user; as a result, it can become very difficult to manage windows, read text, and manipulate objects. We investigate the idea of perspective as a way to solve these problems in multi-display environments. We first identify basic display and control factors that are affected by perspective, such as visibility, fracture, and sharing. We then present the design and implementation of E-conic, a multi-display multi-user environment that uses location data about displays and users to dynamically correct perspective. We carried out a controlled experiment to test the benefits of perspective correction in basic interaction tasks like targeting, steering, aligning, pattern-matching and reading. Our results show that perspective correction significantly and substantially improves user performance in all these tasks.

In this paper, we show how traditional physical interface components such as switches, levers, knobs and touch screens can be easily modified to identify who is activating each control. This allows us to change the function performed by the control, and the sensory feedback provided by the control itself, dependent upon the user. An auditing function is also available that logs each user's actions. We describe a number of example usage scenarios for our technique, and present two sample implementations.

Although graphical user interfaces started as imitations of the physical world, many interaction techniques have since been invented that are not available in the real world. This paper focuses on one of these, "previewing", and on how a sensory enhanced input device called "PreSense Keypad" can provide a preview for users before they actually execute the commands. Preview is important in the real world because it is often not possible to undo an action. This previewable feature helps users to see what will occur next. It is also helpful when the command assignment of the keypad dynamically changes, such as for universal commanders. We present several interaction techniques based on this input device, including menu and map browsing systems and a text input system. We also discuss finger gesture recognition for the PreSense Keypad.

Rapid, early, but rough system prototypes are becoming a standard and valued part of the user interface design process. Pen, paper, and tools like Flash™ and Director™ are well suited to creating such prototypes. However, in the case of physical forms with embedded technology, there is a lack of tools for developing rapid, early prototypes. Instead, the process tends to be fragmented into prototypes exploring forms that look like the intended product or explorations of functioning interactions that work like the intended product - bringing these aspects together into full design concepts only later in the design process. To help alleviate this problem, we present a simple tool for very rapidly creating functioning, rough physical prototypes early in the design process - supporting what amounts to interactive physical sketching. Our tool allows a designer to combine exploration of form and interactive function, using objects constructed from materials such as thumbtacks, foil, cardboard and masking tape, enhanced with a small electronic sensor board. By means of a simple and fluid tool for delivering events to "screen clippings," these physical sketches can then be easily connected to any existing (or new) program running on a PC to provide real or Wizard of Oz supported functionality.

We introduce an inexpensive position input device called the FieldMouse, with which a computer can tell the position of the device on paper or any flat surface without using special input tablets or position detection devices. A FieldMouse is a combination of an ID recognizer like a barcode reader and a mouse which detects relative movement of the device. Using a FieldMouse, a user first detects an ID on paper by using the barcode reader, and then drags it from the ID using the mouse. If the location of the ID is known, the location of the dragged FieldMouse can also be calculated by adding the amount of movement from the ID to the position of the ID. Using a FieldMouse in this way, any flat surface can work as a pointing device that supports absolute position input, just by putting an ID tag somewhere on the surface. A FieldMouse can also be used for enabling a graphical user interface (GUI) on paper or on any flat surface by analyzing the direction and the amount of mouse movement after detecting an ID. In this paper, we show how a FieldMouse can be used in various situations to enable computing in real-world environments.
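
The position calculation described above amounts to dead reckoning from a known anchor. The following is a minimal sketch under stated assumptions: the `IdTag` record, the `absolute_position` function, and the representation of mouse movement as a list of `(dx, dy)` deltas are hypothetical names chosen for illustration, and units and sensor details are left out.

```python
from dataclasses import dataclass
from typing import Iterable, Tuple


@dataclass(frozen=True)
class IdTag:
    """An ID tag whose location on the surface is known (hypothetical record)."""
    tag_id: str
    x: float
    y: float


def absolute_position(tag: IdTag,
                      deltas: Iterable[Tuple[float, float]]) -> Tuple[float, float]:
    """Add the relative mouse movement since the ID scan to the tag's location."""
    x, y = tag.x, tag.y
    for dx, dy in deltas:
        x += dx
        y += dy
    return (x, y)


tag = IdTag("menu-anchor", 100.0, 50.0)
movement = [(3.0, 0.0), (2.0, -1.0), (0.0, 4.0)]
print(absolute_position(tag, movement))  # (105.0, 53.0)
```

Because only relative motion is sensed after the scan, any tracking error accumulates until the next ID is read, which is one reason a real device would likely rescan tags from time to time.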

This paper describes a physically embodied and animated user interface to an interactive call handling agent, consisting of a small wireless animatronic device in the form of a squirrel, bunny, or parrot. A software tool creates movement primitives, composes these primitives into complex behaviors, and triggers these behaviors dynamically at state changes in the conversational agent's finite state machine. Gaze and gestural cues from the animatronics alert both the user and co-located third parties of incoming phone calls, and data suggests that such alerting is less intrusive than conventional telephones.

Location information can be used to enhance interaction with mobile devices. While many location systems require instrumentation of the environment, we present a system that allows devices to measure their spatial relations in a true peer-to-peer fashion. The system is based on custom sensor hardware implemented as a USB dongle, and computes spatial relations in real time. As an extension of this system, we propose a set of spatialized widgets for incorporating spatial relations in the user interface. The use of these widgets is illustrated in a number of applications, showing how spatial relations can be employed to support and streamline interaction with mobile devices.

Impromptu is a mobile audio device which uses wireless Internet Protocol (IP) to access novel computer-mediated voice communication channels. These channels show the richness of IP-based communication as compared to conventional mobile telephony, adding audio processing and storage in the network, and flexible, user-centered call control protocols. These channels may be synchronous, asynchronous, or event-triggered, or even change modes as a function of other user activity. The demands of these modes plus the need to navigate with an entirely non-visual user interface are met with a number of audio-oriented user interaction techniques.

In this paper, we explore the concept of dual-purpose speech: speech that is socially appropriate in the context of a human-to-human conversation which also provides meaningful input to a computer. We motivate the use of dual-purpose speech and explore issues of privacy and technological challenges related to mobile speech recognition. We present three applications that utilize dual-purpose speech to assist a user in conversational tasks: the Calendar Navigator Agent, DialogTabs, and Speech Courier. The Calendar Navigator Agent navigates a user's calendar based on socially appropriate speech used while scheduling appointments. DialogTabs allows a user to postpone cognitive processing of conversational material by providing short-term capture of transient information. Finally, Speech Courier allows asynchronous delivery of relevant conversational information to a third party.

Spoken language interfaces provide highly mobile, small form-factor, hands-free, eyes-free interaction with information. Uniform access to large lists of information using spoken interfaces is highly desirable, but problematic due to inherent limitations of speech. A speech widget for lists of attributed objects is described that provides for approximate queries to retrieve desired items. User tests demonstrate that this is an effective technique for accessing information using speech.

Many tasks require users to extract information from diverse sources, to edit or process this information locally, and to explore how the end results are affected by changes in the information or in its processing. We present the RecipeSheet, a general-purpose tool for assisting users in such tasks. The RecipeSheet lets users create information processors, called recipes, which may take input in a variety of forms such as text, Web pages, or XML, and produce results in a similar variety of forms. The processing carried out by a recipe may be specified using a macro or query language, of which we currently support Rexx, Smalltalk and XQuery, or by capturing the behaviour of a Web application or Web service. In the RecipeSheet's spreadsheet-inspired user interface, information appears in cells, with inter-cell dependencies defined by recipes rather than formulas. Users can also intervene manually to control which information flows through the dependency connections. Through a series of examples we illustrate how tasks that would be challenging in existing environments are supported by the RecipeSheet.

Most current interface designs require that the user focus their attention on them in order to be of value. However, as the price of computation falls, and computational capabilities make their way into many everyday objects, the demand for attention from many different directions may begin to seriously reduce the usefulness of these computational objects. Ambient information displays are intended to fit in a part of the interface design space that does not have this property. They are designed to convey background or context information that the user may or may not wish to attend to at any given time. Ambient Displays are designed to work primarily in the periphery of a user's awareness, moving to the center of attention only when appropriate and desirable. This paper describes a new ambient information display that is designed to give a rich medium of expression placed within an aesthetically pleasing decorative object. This display --- the Information Percolator --- is formed by air bubbles rising up tubes of water. By properly controlling the release of air, a set of pixels which scroll up the display is created. This allows a rendition of any (small, black and white) image to be displayed. The detailed design and construction of this display device will be considered, along with several applications.

In our previous studies into web design, we found that pens, paper, walls, and tables were often used for explaining, developing, and communicating ideas during the early phases of design. These wall-scale paper-based design practices inspired The Designers' Outpost, a tangible user interface that combines the affordances of paper and large physical workspaces with the advantages of electronic media to support information design. With Outpost, users collaboratively author web site information architectures on an electronic whiteboard using physical media (Post-it notes and images), structuring and annotating that information with electronic pens. This interaction is enabled by a touch-sensitive SMART Board augmented with a robust computer vision system, employing a rear-mounted video camera for capturing movement and a front-mounted high-resolution camera for capturing ink. We conducted a participatory design study with fifteen professional web designers. The study validated that Outpost supports information architecture work practice, and led to our adding support for fluid transitions to other tools.

We introduce a set of techniques for haptically manipulating digital media such as video, audio, voicemail and computer graphics, utilizing virtual mediating dynamic models based on intuitive physical metaphors. For example, a video sequence can be modeled by linking its motion to a heavy spinning virtual wheel: the user browses by grasping a physical force-feedback knob and engaging the virtual wheel through a simulated clutch to spin or brake it, while feeling the passage of individual frames. These systems were implemented on a collection of single axis actuated displays (knobs and sliders), equipped with orthogonal force sensing to enhance their expressive potential. We demonstrate how continuous interaction through a haptically actuated device rather than discrete button and key presses can produce simple yet powerful tools that leverage physical intuition.

This paper describes a novel physical icon [3] (“phicon”) based system that can be programmed to issue a range of commands about what the user wishes to do with hand-drawn whiteboard content. Through the phicon's UI, a command to process whiteboard context is issued using infrared signaling in combination with image processing and a ceiling-mounted camera system. We leverage camera systems that are already used for capturing whiteboard content [4] by further augmenting these systems to detect the presence and location of IR beacons within an image. An HDLC-based protocol and a built-in IR transmitter are used to send these signals.

We describe a tangible interface for building virtual structures using physical building blocks. We demonstrate two applications of our system. In one version, the blocks are used to construct geometric models of objects and structures for a popular game, Quake II™. In another version, buildings created with our blocks are rendered in different styles, using intelligent decoration of the building model.

The Actuated Workbench is a device that uses magnetic forces to move objects on a table in two dimensions. It is intended for use with existing tabletop tangible interfaces, providing an additional feedback loop for computer output, and helping to resolve inconsistencies that otherwise arise from the computer's inability to move objects on the table. We describe the Actuated Workbench in detail as an enabling technology, and then propose several applications in which this technology could be useful.

We have previously developed a collaborative infrastructure called SCAPE - an acronym for Stereoscopic Collaboration in Augmented and Projective Environments - that integrates the traditionally separate paradigms of virtual and augmented reality. In this paper, we extend SCAPE by formalizing its underlying mathematical framework and detailing three augmented Widgets constructed via this framework: CoCylinder, Magnifier, and CoCube. These devices promote intuitive ways of selecting, examining, and sharing synthetic objects, and retrieving associated documentary text. Finally we present a testbed application to showcase SCAPE's capabilities for interaction in large, augmented virtual environments.

Classroom BRIDGE supports activity awareness by facilitating planning and goal revision in collaborative, project-based middle school science. It integrates large-screen and desktop views of project times to support incidental creation of awareness information through routine document transactions, integrated presentation of awareness information as part of workspace views, and public access to subgroup activity. It demonstrates and develops an object replication approach to integrating synchronous and asynchronous distributed work for a platform incorporating both desktop and large-screen devices. This paper describes an implementation of these concepts with preliminary evaluation data, using timeline-based user interfaces.

In this paper we propose a new model for a class of rapid serial visual presentation (RSVP) interfaces [16] in the context of consumer video devices. The basic spatial layout "explodes" a sequence of image frames into a 3D trail in order to provide more context for a spatial/temporal presentation. As the user plays forward or back, the trail advances or recedes while the image in the foreground focus position is replaced. The design is able to incorporate a variety of methods for analyzing or highlighting images in the trail. Our hypotheses are that users can navigate more quickly and precisely to points of interest when compared to conventional consumer-based browsing, channel flipping, or fast-forwarding techniques. We report on an experiment testing our hypotheses in which we found that subjects were more accurate but not faster in browsing to a target of interest in recorded television content with a TV remote.

This paper describes a Computer Aided Design system for sketching free-form polygonal surfaces such as terrains and other natural objects. The user manipulates two 3D position and orientation trackers with three buttons, one for each hand. Each hand has a distinct role to play, with the dominant hand being responsible for picking and manipulation, and the less-dominant hand being responsible for context setting of various kinds. The less-dominant hand holds the workpiece, sets which refinement level can be picked by the dominant hand, and generally acts as a counterpoint to the dominant hand. In this paper, the architecture of the system is outlined, and a simple surface is shown.

The personal universal controller (PUC) is an approach for improving the interfaces to complex appliances by introducing an intermediary graphical or speech interface. A PUC engages in two-way communication with everyday appliances, first downloading a specification of the appliance's functions, and then automatically creating an interface for controlling that appliance. The specification of each appliance includes a high-level description of every function, a hierarchical grouping of those functions, and dependency information, which relates the availability of each function to the appliance's state. Dependency information makes it easier for designers to create specifications and helps the automatic interface generators produce a higher quality result. We describe the architecture that supports the PUC, and the interface generators that use our specification language to build high-quality graphical and speech interfaces.

MediaMosaic is an editing environment developed to provide several features that are either unavailable or not adequately addressed in current editing systems. First, it is a multimedia editor with an open architecture. General media are inserted in documents via embedded virtual screens. Second, it allows users to do markup editing in context. The marked comments are overlapped and attached to the commented areas. Third, it provides a mechanism to allow users to bring data from more than one source to a single document. The views of the included data can be tailored. Fourth, users can work on an included medium through its embedded view or through another complete and duplicated view. It isolates and simplifies the interface design of individual media editors.

Communication is about people, not machines. But as firms and families alike spread out geographically, we rely increasingly on telecommunications tools to keep us “connected”. The challenge of such systems is to enable conversation between individuals without computational infrastructure getting in the way. This paper compares two speech-based communication systems, Phoneshell and Chatter, in how they deal with the keys to communication: proper names. Chatter, a conversational system using speech-recognition, improves upon the hierarchical nature of the touch-tone based Phoneshell by maintaining context and enabling use of anaphora. Proper names can present particular problems for speech recognizers, so an interface algorithm for reliable name specification by spelling is offered. Since individual letter recognition is non-robust, Chatter implicitly disambiguates strings of letters based on context. We hypothesize that the right interface can make faulty speech recognition as usable as TouchTones---even more so.

This paper presents interaction techniques (and the underlying implementations) for putting clothes on a 3D character and manipulating them. The user paints freeform marks on the clothes and corresponding marks on the 3D character; the system then puts the clothes around the body so that corresponding marks match. Internally, the system grows the clothes on the body surface around the marks while maintaining basic cloth constraints via simple relaxation steps. The entire computation takes a few seconds. After that, the user can adjust the placement of the clothes by an enhanced dragging operation. Unlike standard dragging where the user moves a set of vertices in a single direction in 3D space, our dragging operation moves the cloth along the body surface to make possible more flexible operations. The user can apply pushpins to fix certain cloth points during dragging. The techniques are ideal for specifying an initial cloth configuration before applying a more sophisticated cloth simulation.

Sometimes users fail to notice a change that just took place on their display. For example, the user may have accidentally deleted an icon or a remote collaborator may have changed settings in a control panel. Animated transitions can help, but they force users to wait for the animation to complete. This can be cumbersome, especially in situations where users did not need an explanation. We propose a different approach. Phosphor objects show the outcome of their transition instantly; at the same time they explain their change in retrospect. Manipulating a phosphor slider, for example, leaves an afterglow that illustrates how the knob moved. The parallelism of instant outcome and explanation supports both types of users. Users who already understood the transition can continue interacting without delay, while those who are inexperienced or may have been distracted can take time to view the effects at their own pace. We present a framework of transition designs for widgets, icons, and objects in drawing programs. We evaluate phosphor objects in two user studies and report significant performance benefits for phosphor objects.

Temporal events, while often discrete, also have interesting relationships within and across times: larger events are often collections of smaller more discrete events (battles within wars; artists' works within a form); events at one point also have correlations with events at other points (a play written in one period is related to its performance over a period of time). Most temporal visualisations, however, only represent discrete data points or single data types along a single timeline: this event started here and ended there; this work was published at this time; this tag was popular for this period. In order to represent richer, faceted attributes of temporal events, we present Continuum. Continuum enables hierarchical relationships in temporal data to be represented and explored; it enables relationships between events across periods to be expressed, and in particular it enables user-determined control over the level of detail of any facet of interest so that the person using the system can determine a focus point, no matter the level of zoom over the temporal space. We present the factors motivating our approach, our evaluation and implementation of this new visualisation which makes it easy for anyone to apply this interface to rich, large-scale datasets with temporal data.

Progress bars are prevalent in modern user interfaces. Typically, a linear function is employed such that the progress of the bar is directly proportional to how much work has been completed. However, numerous factors cause progress bars to proceed at non-linear rates. Additionally, humans perceive time in a non-linear way. This paper explores the impact of various progress bar behaviors on user perception of process duration. The results are used to suggest several design considerations that can make progress bars appear faster and ultimately improve users' computing experience.
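
To make the distinction between linear and non-linear behavior concrete, here is a small sketch. The linear mapping follows the description above; the power-law shaping function and its `exponent` parameter are purely illustrative assumptions and are not claimed to be one of the behaviors studied in the paper.

```python
def linear_progress(work_done: float, work_total: float) -> float:
    """Fraction of the bar to fill, directly proportional to completed work."""
    if work_total <= 0:
        raise ValueError("work_total must be positive")
    return min(max(work_done / work_total, 0.0), 1.0)


def shaped_progress(fraction: float, exponent: float = 2.0) -> float:
    """Hypothetical non-linear display mapping of a 0..1 work fraction.

    With exponent > 1 the bar starts slowly and speeds up toward the end;
    with exponent < 1 it starts quickly and slows down. Both end at 1.0.
    """
    return fraction ** exponent


for done in (0, 25, 50, 75, 100):
    f = linear_progress(done, 100)
    print(f"{done:3d}% work -> linear {f:.2f}, shaped {shaped_progress(f):.2f}")
```

Any such display mapping should stay monotonic and map 0 to 0 and 1 to 1, so that the bar never moves backwards and still completes exactly when the work does.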

Although mobile, tablet, large display, and tabletop computers increasingly present opportunities for using pen, finger, and wand gestures in user interfaces, implementing gesture recognition largely has been the privilege of pattern matching experts, not user interface prototypers. Although some user interface libraries and toolkits offer gesture recognizers, such infrastructure is often unavailable in design-oriented environments like Flash, scripting environments like JavaScript, or brand new off-desktop prototyping environments. To enable novice programmers to incorporate gestures into their UI prototypes, we present a "$1 recognizer" that is easy, cheap, and usable almost anywhere in about 100 lines of code. In a study comparing our $1 recognizer, Dynamic Time Warping, and the Rubine classifier on user-supplied gestures, we found that $1 obtains over 97% accuracy with only 1 loaded template and 99% accuracy with 3+ loaded templates. These results were nearly identical to DTW and superior to Rubine. In addition, we found that medium-speed gestures, in which users balanced speed and accuracy, were recognized better than slow or fast gestures for all three recognizers. We also discuss the effect that the number of templates or training examples has on recognition, the score falloff along recognizers' N-best lists, and results for individual gestures. We include detailed pseudocode of the $1 recognizer to aid development, inspection, extension, and testing.
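
The general flavor of template-based gesture matching can be sketched in a few dozen lines. What follows is a deliberately simplified matcher in the spirit of the approach, not the pseudocode from the paper: it resamples each stroke to a fixed number of points, scales and translates it, and picks the template with the smallest mean point-to-point distance. Rotation handling is omitted entirely, and the function names, the choice of 32 points, and the zero-size guard are assumptions made for this illustration.

```python
import math
from typing import Dict, List, Sequence, Tuple

Point = Tuple[float, float]


def resample(points: Sequence[Point], n: int = 32) -> List[Point]:
    """Resample a stroke into n points spaced evenly along its path."""
    pts = list(points)
    total = sum(math.dist(pts[i - 1], pts[i]) for i in range(1, len(pts)))
    interval = total / (n - 1)
    out, accumulated, i = [pts[0]], 0.0, 1
    while i < len(pts):
        d = math.dist(pts[i - 1], pts[i])
        if d > 0 and accumulated + d >= interval:
            t = (interval - accumulated) / d
            q = (pts[i - 1][0] + t * (pts[i][0] - pts[i - 1][0]),
                 pts[i - 1][1] + t * (pts[i][1] - pts[i - 1][1]))
            out.append(q)
            pts.insert(i, q)  # q becomes the start of the next segment
            accumulated = 0.0
        else:
            accumulated += d
        i += 1
    while len(out) < n:  # pad if rounding left the path one point short
        out.append(pts[-1])
    return out[:n]


def normalize(points: Sequence[Point]) -> List[Point]:
    """Scale the bounding box to a unit square, then center on the centroid."""
    xs, ys = [p[0] for p in points], [p[1] for p in points]
    w = max(max(xs) - min(xs), 1e-9)  # guard against zero-width strokes
    h = max(max(ys) - min(ys), 1e-9)
    scaled = [(x / w, y / h) for x, y in points]
    cx = sum(p[0] for p in scaled) / len(scaled)
    cy = sum(p[1] for p in scaled) / len(scaled)
    return [(x - cx, y - cy) for x, y in scaled]


def recognize(stroke: Sequence[Point],
              templates: Dict[str, Sequence[Point]], n: int = 32) -> Tuple[str, float]:
    """Return the best-matching template name and its mean point distance."""
    candidate = normalize(resample(stroke, n))
    best_name, best_dist = "", math.inf
    for name, template in templates.items():
        reference = normalize(resample(template, n))
        d = sum(math.dist(p, q) for p, q in zip(candidate, reference)) / n
        if d < best_dist:
            best_name, best_dist = name, d
    return best_name, best_dist


templates = {
    "vee": [(0, 0), (5, 10), (10, 0)],
    "caret": [(0, 10), (5, 0), (10, 10)],
}
name, dist = recognize([(0, 0), (10, 21), (20, 1)], templates)
print(name)  # vee
```

Because the stroke is normalized for size and position before comparison, the larger and slightly uneven input above still matches the "vee" template. Nearly straight strokes are a known weak spot for this kind of non-uniform scaling, which the guard only papers over.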

Conventional interface builders allow the user interface designer to select widgets such as menus, buttons and scroll bars, and lay them out using a mouse. Although these are conceptually simple to use, in practice there are a number of problems. First, a typical widget will have dozens of properties which the designer might change. Ensuring that these properties are consistent across multiple widgets in a dialog box and multiple dialog boxes in an application can be very difficult. Second, if the designer wants to change the properties, each widget must be edited individually. Third, getting the widgets laid out appropriately in a dialog box can be tedious. Grids and alignment commands are not sufficient. This paper describes Graphical Tabs and Graphical Styles in the Gild interface builder, which solve all of these problems. A “graphical tab” is an absolute position in a window. A “graphical style” incorporates both property and layout information, and can be defined by example, named, applied to other widgets, edited, saved to a file, and read from a file. If a graphical style is edited, then all widgets defined using that style are modified. In addition, because appropriate styles are inferred, they do not have to be explicitly applied.

Conventional windowing environments provide separate classes of objects for user interface components, or “widgets,” and graphical objects. Widgets negotiate layout and can be resized as rectangles, while graphics may be shared, transformed, transparent, and overlaid. This presents a major obstacle to applications like user interface builders and compound document editors where the manipulated objects need to behave both like graphics and widgets.
Fresco[1] blends graphics and widgets into a single class of objects. We have an implementation of Fresco and an editor called Fdraw that allows graphical objects to be composed like widgets, and widgets to be transformed and shared like graphics. Performance measurements of Fdraw show that sharing reduces memory usage without slowing down redisplay.

This paper introduces a new type of interface for 3D drawings that improves the usability of gestural interfaces and augments typical command-based modeling systems. In our suggestive interface, the user gives hints about a desired operation to the system by highlighting related geometric components in the scene. The system then infers possible operations based on the hints and presents the results of these operations as small thumbnails. The user completes the editing operation simply by clicking on the desired thumbnail. The hinting mechanism lets the user specify geometric relations among graphical components in the scene, and the multiple thumbnail suggestions make it possible to define many operations with relatively few distinct hint patterns. The suggestive interface system is implemented as a set of suggestion engines working in parallel, and is easily extended by adding customized engines. Our prototype 3D drawing system, Chateau, shows that a suggestive interface can effectively support construction of various 3D drawings.

We introduce a set of techniques for haptically manipulating digital media such as video, audio, voicemail and computer graphics, utilizing virtual mediating dynamic models based on intuitive physical metaphors. For example, a video sequence can be modeled by linking its motion to a heavy spinning virtual wheel: the user browses by grasping a physical force-feedback knob and engaging the virtual wheel through a simulated clutch to spin or brake it, while feeling the passage of individual frames. These systems were implemented on a collection of single axis actuated displays (knobs and sliders), equipped with orthogonal force sensing to enhance their expressive potential. We demonstrate how continuous interaction through a haptically actuated device rather than discrete button and key presses can produce simple yet powerful tools that leverage physical intuition.

User interface toolkits and higher-level tools built on top of them play an ever increasing part in developing graphical user interfaces. This paper describes the XIT system, a user interface development tool for the X Window System, based on Common Lisp, comprising user interface toolkits as well as high-level interactive tools organized into a layered architecture. We especially focus on the object-oriented design of the lower-level toolkits and show how advanced features for describing automatic screen layout, visual feedback, application links, complex interaction, and dialog control, usually not included in traditional user interface toolkits, are integrated.

Many user interface toolkits use constraint solvers to maintain geometric relationships between graphic objects, or to connect the graphics to the application data structures. One efficient and flexible technique for maintaining constraints is multi-way local propagation, where constraints are represented by sets of method procedures. To satisfy a set of constraints, a local propagation solver executes one method from each constraint.
SkyBlue is an incremental constraint solver that uses local propagation to maintain a set of constraints as individual constraints are added and removed. If not all of the constraints can be satisfied, SkyBlue leaves weaker constraints unsatisfied in order to satisfy stronger constraints (maintaining a constraint hierarchy). SkyBlue is a more general successor to the DeltaBlue algorithm that satisfies cycles of methods by calling external cycle solvers and supports multi-output methods. These features make SkyBlue more useful for constructing user interfaces, since cycles of constraints can occur frequently in user interface applications and multi-output methods are necessary to represent some useful constraints. This paper discusses some of the applications that use SkyBlue, presents timings for some user interface benchmarks, and describes the SkyBlue algorithm in detail.
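To make multi-way local propagation concrete, here is a small Python sketch. It is not the SkyBlue algorithm: it handles only single-output methods, has no constraint strengths, and selects methods by brute-force enumeration rather than incrementally. The names `Method`, `sum_constraint`, `plan`, and `solve`, and the `fixed` set standing in for variables that must not be overwritten, are all assumptions made for this illustration.

```python
from itertools import product

class Method:
    """One way to satisfy a constraint: compute `output` from `inputs`."""
    def __init__(self, output, inputs, fn):
        self.output, self.inputs, self.fn = output, inputs, fn

def sum_constraint(a, b, c):
    """The constraint a + b = c as three alternative methods."""
    return [Method(c, (a, b), lambda x, y: x + y),
            Method(a, (c, b), lambda z, y: z - y),
            Method(b, (c, a), lambda z, x: z - x)]

def topo_order(methods):
    """Order methods so inputs are computed first; None if they form a cycle."""
    produced_by = {m.output: m for m in methods}
    order, state = [], {}

    def visit(m):
        if state.get(m.output) == "done":
            return True
        if state.get(m.output) == "active":
            return False  # method cycle
        state[m.output] = "active"
        for v in m.inputs:
            if v in produced_by and not visit(produced_by[v]):
                return False
        state[m.output] = "done"
        order.append(m)
        return True

    return order if all(visit(m) for m in methods) else None

def plan(constraints, fixed):
    """Pick one method per constraint: distinct outputs, none fixed, acyclic."""
    for choice in product(*constraints):
        outputs = [m.output for m in choice]
        if len(set(outputs)) < len(outputs) or set(outputs) & set(fixed):
            continue
        order = topo_order(choice)
        if order is not None:
            return order
    return None

def solve(values, constraints, fixed):
    """Execute the selected methods in order, updating `values` in place."""
    order = plan(constraints, fixed)
    if order is None:
        raise ValueError("no acyclic method selection satisfies all constraints")
    for m in order:
        values[m.output] = m.fn(*(values[v] for v in m.inputs))
    return values

# Two boxes in a row: left + width = right, and right + gap = next_left.
constraints = [sum_constraint("left", "width", "right"),
               sum_constraint("right", "gap", "next_left")]
values = {"left": 10, "width": 50, "right": 60, "gap": 5, "next_left": 65}

values["right"] = 100  # the user drags the right edge
result = solve(values, constraints, fixed={"right", "width", "gap"})
```

With `right`, `width`, and `gap` held fixed, the only valid selection recomputes `left` from the first constraint and `next_left` from the second, which is the "one method from each constraint" idea in miniature.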

We describe a set of application frameworks designed especially to support information-intensive applications in complex domains, where the visual organization of an application's information is critical. Our frameworks, called visual formalisms, provide the semantic structures and editing operations, as well as the visual layout algorithms, needed to create a complete application. Examples of visual formalisms include tables, panels, graphs, and outlines. They are designed to be extended both by programmers, through subclassing, and by end users, through an integrated extension language.

Conventional interface builders allow the user interface designer to select widgets such as menus, buttons and scroll bars, and lay them out using a mouse. Although these are conceptually simple to use, in practice there are a number of problems. First, a typical widget will have dozens of properties which the designer might change. Ensuring that these properties are consistent across multiple widgets in a dialog box and multiple dialog boxes in an application can be very difficult. Second, if the designer wants to change the properties, each widget must be edited individually. Third, getting the widgets laid out appropriately in a dialog box can be tedious. Grids and alignment commands are not sufficient. This paper describes Graphical Tabs and Graphical Styles in the Gild interface builder, which solve all of these problems. A “graphical tab” is an absolute position in a window. A “graphical sty