Robert B. Terwilliger and Peter G. Polson
Institute of Cognitive Science
University of Colorado, Boulder, CO 80309-0345
telephone: (303) 492-4574
email: terwilli@psych.colorado.edu
A study measured the time experienced Macintosh users took to create a graph from pre-existing data, including the assignment of variables to axes in a dialog box. The study revealed that the task took less time when the items in the dialog box were labeled in terms of one problem representation, even when the instructions were written in terms of another. The Kitajima and Polson model explains this as resulting from the problem representation being elaborated with task-specific schemata during the instruction comprehension process.
Keywords: empirical studies, cognitive models.
Consider the following situation: An experienced user sits in front of a computer, attempting to complete a familiar task using an unfamiliar application. For each step in the process, they first read some instructions from the manual open at their side and then turn to the computer to actually accomplish the actions about which they have just read. What are the cognitive processes involved in this situation? And, how should the manual and interface be designed so as to minimize the difficulty of the task? The Kitajima and Polson model [2,3] can tell at least two stories.
In the task elaboration story, the user has a complex, pre-existing
representation of the task to be accomplished; therefore, when
they read the instructions, they translate them into specific
task goals using pre-existing schemata. They then attempt to achieve
these goals using the interface. Because of this, the task is
easiest when the interface matches their pre-existing representation,
regardless of the way in which the instructions are written.
On the other hand, in the label following story, the user constructs their task goals from a superficial representation of the instructions. Loosely, they remember keywords and phrases more or less verbatim, and look for menu items and other labels that match what they remember. Therefore, the task is easiest when the manual and interface use identical terminology.
Which story is correct? Or, are both stories correct, but at different
times? There is empirical support for label following [1]; however,
it is unclear if this account is universal. Therefore, we performed
a study to determine if, at least at times, subjects elaborate
their task goals using pre-existing schemata. In our opinion,
the results show that they do.
In this study, we examined the graph creation task previously
discussed by Polson et al. [1-3]. In our version of the task,
the user first reads a single sentence of instructions, then creates
a graph from pre-existing data by pulling down a menu, releasing
on a menu item, and then assigning variables to axes in a dialog
box. The specific question we asked was: Are users faster at this
task when the labels in the dialog box match the wording of the
instructions, or when they match their pre-existing representation
of the problem?
As suggested by Kitajima and Polson [2], we considered two versions
of both the instructions and the dialog box. The variables in
the data to be graphed were "absences" and "month;"
therefore, the "XY" instructions read "create a
graph with absences on the X axis and month on the Y axis,"
and the "FN" instructions read "create a graph
of absences as a function of month." Similarly, the axes
assignment dialog box had two selection lists: In the "XY"
version, the left selection list was labeled "X Axis:"
and the right selection list was labeled "Y Axis:".
In the "FN" version, the left list was labeled "Plot:"
and the right list was labeled "As a Function of:".
We exposed different subjects to all combinations of instructions and dialog box type. We assumed that the "XY" version of the dialog box would match subjects' pre-existing representation of the task; therefore, if the task elaboration story were correct, subjects would perform the task faster with the "XY" dialog box, regardless of the type of instructions. On the other hand, if the label following story better described their behavior, then they would complete the task faster when the instructions matched the dialog box, regardless of its type.
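The two sets of predictions can be laid out explicitly. The following is a minimal sketch, not part of the original study materials: the function names are hypothetical, and the "XY" prior representation is the assumption stated in the paragraph above.

```python
from itertools import product

VERSIONS = ("XY", "FN")

# Assumption taken from the text: subjects' pre-existing representation is "XY".
ASSUMED_PRIOR_REPRESENTATION = "XY"


def task_elaboration_predicts_fast(instructions: str, dialog_box: str) -> bool:
    """Task elaboration: fast whenever the dialog box matches the prior representation."""
    return dialog_box == ASSUMED_PRIOR_REPRESENTATION


def label_following_predicts_fast(instructions: str, dialog_box: str) -> bool:
    """Label following: fast whenever the dialog box labels match the instruction wording."""
    return instructions == dialog_box


def diagnostic_conditions() -> list[tuple[str, str]]:
    """Return the (instructions, dialog box) conditions where the two stories disagree."""
    return [
        (instructions, dialog_box)
        for instructions, dialog_box in product(VERSIONS, repeat=2)
        if task_elaboration_predicts_fast(instructions, dialog_box)
        != label_following_predicts_fast(instructions, dialog_box)
    ]


if __name__ == "__main__":
    for instructions, dialog_box in product(VERSIONS, repeat=2):
        print(
            f"instructions={instructions} dialog={dialog_box} "
            f"elaboration_fast={task_elaboration_predicts_fast(instructions, dialog_box)} "
            f"label_following_fast={label_following_predicts_fast(instructions, dialog_box)}"
        )
```

Under these assumptions the stories agree when the instructions are "XY" and diverge only when the instructions are "FN", so the two "FN" instruction conditions are the ones that discriminate between the accounts.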
The basic design of the experiment was a 2 X 2 factorial with both instructions and dialog box type as between-subjects variables. Both the apparatus and the procedure were designed to be as similar as possible to those used by Franzke [1].
Sixteen subjects were drawn from the introductory psychology subject pool at the University of Colorado and received course credit for their participation in the experiment. On average, they were 19 years old, had been using 3.1 different applications on the Macintosh for 3 years, and had made about 50 graphs before beginning the experiment. The data for four additional subjects were not used: two due to equipment failures, and two because they could not complete the task unaided.
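As a rough check on the design, the cells and degrees of freedom of a 2 X 2 between-subjects layout with sixteen analyzed subjects can be enumerated. This is only an illustrative sketch with hypothetical names; it uses the standard degrees-of-freedom bookkeeping for a factorial between-subjects design and makes no assumption about how subjects were distributed across cells.

```python
from itertools import product
from math import prod

# Factors and levels as described in the text.
FACTORS = {
    "instructions": ("XY", "FN"),
    "dialog_box": ("XY", "FN"),
}
N_SUBJECTS = 16  # subjects whose data were analyzed


def design_cells(factors: dict[str, tuple[str, ...]]) -> list[tuple[str, ...]]:
    """Enumerate every combination of factor levels (one cell per combination)."""
    return list(product(*factors.values()))


def degrees_of_freedom(factors: dict[str, tuple[str, ...]], n_subjects: int) -> dict[str, int]:
    """Degrees of freedom for main effects, the full interaction, and error."""
    df = {name: len(levels) - 1 for name, levels in factors.items()}
    df["interaction"] = prod(len(levels) - 1 for levels in factors.values())
    df["error"] = n_subjects - len(design_cells(factors))
    return df


if __name__ == "__main__":
    print(design_cells(FACTORS))
    print(degrees_of_freedom(FACTORS, N_SUBJECTS))
```

With four cells and sixteen subjects, each effect has 1 degree of freedom and the error term has 12.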
All subjects performed their tasks on custom interfaces constructed using Visual Basic (Applications Edition) on top of Microsoft Excel version 5.0. Four versions of the system were created, one for each combination of "XY" or "FN" instructions with "XY" or "FN" dialog box. Subjects read the instructions on one sheet of a workbook, then switched to a different sheet to perform the necessary actions. The time to complete each task was recorded automatically by calling from Visual Basic to an external C procedure and then invoking the Ticks routine in the operating system.
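To illustrate how tick counts become completion times, the sketch below converts two tick readings into elapsed seconds. It is not the study's Visual Basic or C code. It assumes that one tick on the classic Macintosh operating system is approximately 1/60 of a second, and the constant and function names are hypothetical.

```python
# Assumption: one classic Mac OS tick is approximately 1/60 of a second.
TICKS_PER_SECOND = 60


def elapsed_seconds(start_ticks: int, end_ticks: int) -> float:
    """Convert a pair of tick readings into elapsed time in seconds."""
    if end_ticks < start_ticks:
        raise ValueError("end_ticks must not be earlier than start_ticks")
    return (end_ticks - start_ticks) / TICKS_PER_SECOND


if __name__ == "__main__":
    # Hypothetical readings taken when the task sheet appears and when the graph is done.
    print(elapsed_seconds(start_ticks=12_000, end_ticks=13_800))
```

Under this assumption the timing resolution is roughly 17 milliseconds, which is small relative to task times measured in seconds.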
Upon arriving at the location of the experiment, subjects were first given a questionnaire assessing their computer and graphing experience. They were then given think aloud instructions before performing three warm up tasks. The first introduced them to unusual features of the experimental interfaces, the second had them sort a table of data, and the third had them do a number of simple formatting tasks. After completing the warm ups, the subjects performed the graph creation task and were then allowed to leave.
The total time to create the graph was recorded for each subject. The average times for each condition are shown in Table 1. An ANOVA with two between-subjects variables revealed that, on average, the task took significantly longer when the dialog box had the "FN" labels than when it had the "XY" labels, F(1,12) = 14.31, p = .0026, partial r² = .544. There were no other significant effects or interactions. A set of planned comparisons revealed that the average times for the two versions of the dialog box were significantly different for each version of the instructions, but that the times for the two versions of the instructions were not significantly different for either version of the dialog box.
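The reported effect size can be recovered from the F statistic and its degrees of freedom. The sketch below uses the standard identity relating a partial proportion of variance to F, namely F·df_effect / (F·df_effect + df_error); the function name is hypothetical, and this is offered as a consistency check rather than as the authors' own computation.

```python
def partial_variance_explained(f_value: float, df_effect: int, df_error: int) -> float:
    """Partial proportion of variance explained, computed from an F statistic."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


if __name__ == "__main__":
    # Values reported for the dialog box main effect: F(1,12) = 14.31.
    effect = partial_variance_explained(14.31, df_effect=1, df_error=12)
    print(round(effect, 3))  # 0.544
```

The result agrees with the reported value of .544, meaning that dialog box type accounts for a little over half of the variance not attributable to the other effects.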
The shorter completion times for conditions with the "XY" dialog box, regardless of the type of instructions, strongly support the elaboration of goals using pre-existing task schemata, as suggested by Kitajima and Polson [3]. Further support is provided by the lack of any significant difference between conditions in which the instructions match the dialog box and those in which they don't. At least in this situation, subjects do not seem to be label following.
Table 1. Average time in seconds to create graph by instructions and dialog box type.
However, we should be careful not to overgeneralize from this
experience. The results of the current study suggest not only
that subjects elaborate the instructions to create specific
task goals, but also that they are unable to hold both the instructions
and the goals in short-term memory simultaneously. Otherwise,
subjects would also perform well in the "FN" instructions,
"FN" dialog box condition. It is quite unlikely that
this limitation will hold for all users in all situations.
Despite its small scope, the above study does have some implications
for designers. Our results suggest that it is important to know
what, if any, pre-existing conceptions of a task the users of
a system may have. If the terminology used in the manuals, menus,
and dialog boxes does not match their expectations, users may
perform poorly despite the overall elegance of a design.
The study reported here is preliminary. For the present we are content to say that, at least in some instances, users do seem to elaborate goals using pre-existing task schemata, and that this produces changes in their behavior significant enough to be measured with available technology. We look forward to exploring the issue further.
Support has been provided by NSF grant IRI 9116640.