Web users are suffering from information overload. In 1996 Jakob Nielsen foresaw the end of Web surfing and the rise of relationships between sites and their users. He sees users as having relationships with a small number of key websites, and argues that website developers will have to start treating their users as individuals. This concept is even more evident in intranet development. Users will occasionally peruse sub-sites other than their own, but by and large they will access the organization's overall site to find one type of information, and then access their own sub-site to obtain more specific information.
As individuals continue to home in on specific sites, developers must begin to see them as individuals rather than as faceless members of the web-surfing crowd. A developer planning an effective website (one that meets its users' needs) should therefore place the users at the center of the design process. This requires the developer to become well-versed in aspects of cognitive psychology, such as how individuals organize and remember information. Philosophies such as "less is more" and the design of simple, elegant pages are being embraced.
This interdisciplinary and pluralistic approach to design is challenging and exciting. But how can we ensure that we are adhering to these principles? One method is to conduct usability testing. I propose that good usability tests do not have to be elaborate, expensive, or labor-intensive. While conducting a single test can provide useful information, we discovered that using a set of related tests was considerably better at identifying problematic interfaces. The combined test results were also more convincing to developers and management, especially when attempting to promote usability testing as a standard component of the development lifecycle.
The Bureau of Labor Statistics (BLS) launched an Internet Web site in September 1995. A primary factor in its resounding success was the usability evaluation conducted by two resident HCI specialists. Encouraged by this experience, BLS decided to use the new Web technology to develop an improved procedure for distributing internal information to its employees.
A small intranet design team was established in August 1996. The team was given two weeks to design an approach for developing an intranet and to present its recommendations to upper management. The proposal recommended that a prototype be developed and subjected to usability testing.
The prototype was completed in October. Management approached the usability test team and requested that usability testing be conducted. The test team, which consisted of myself and two usability specialists, was given two and a half weeks to design the evaluation, conduct the tests, evaluate the results, and present our findings. We met with the prototype's management team to discuss their requirements, and identified the overall organizational issues as being the most critical (rather than the individual leaf pages). The testing thus focused on the prototype's high-level structure.
Given this focus, we decided to use a Card Sort exercise, an "Icon Mix-and-Match" test, and a "Category Membership Expectations" test. The Card Sort exercise was designed to determine what mental hierarchy users construct when given a set of anticipated leaf pages from the intranet site. The Icon Mix-and-Match test was designed to find associations and/or interference between button pictures and the associated button text. The Category Membership Expectations test was designed to elicit users' understanding of a set of categories and their associated labels. The tests were conducted in one afternoon, in two separate sessions (first the Card Sort, then the Icon Mix-and-Match and the Category Membership Expectations test).
Seventeen test subjects were identified and recruited. None of the participants were involved in designing or implementing the prototype. They were distributed over as many BLS offices as possible. All were expected to have worked in the Bureau long enough to have a reasonably firm grasp of the BLS organizational structure and the type of work performed at BLS. Some Web experience was preferred. Since this series of tests focused on the prototype's overall design, the tests did not address individual page design.
Each participant was given one set of randomly ordered peach index cards that contained the individual items (in this case, anticipated leaf pages from the intranet site), rubber bands, and blank white index cards. They were asked to arrange the items into logical groupings and place a rubber band around each group. If the banded groups could be further aggregated, they were asked to band those, place a blank index card on top, and label each grouping with a title that best described its content.
Sample items were:
We used the Statistical Package for the Social Sciences (SPSS) to perform a hierarchical cluster analysis on the data. The resultant dendrogram was useful in aggregating multiple respondents' hierarchies, but was not the final word on an optimal site structure. Comparing the results to the proposed site structure showed how well that structure reflected users' mental organization of the site's information, which made the analysis very useful for testing the site's overall structure. In this card sort the participants could have grouped leaf nodes by function. Some respondents, for example, did group all conference room reservations together, but the majority grouped leaf nodes by BLS organizational structure (by office or program). Because thought patterns differ among human beings, we did not disregard a user's specific interpretation of the information space. Instead, we attempted to determine how or why the user may have devised that arrangement.
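The aggregation step can be illustrated with a minimal sketch. It is not the SPSS procedure used in the study: it assumes that the distance between two cards is the share of participants who did not place them in the same group, and that clusters are merged by average linkage (the source does not say which linkage method was used). The item names and the three participants' sorts are hypothetical.

```python
from itertools import combinations

def cooccurrence_distance(sorts, items):
    """Distance between two items = share of participants who did NOT group them together."""
    n = len(sorts)
    dist = {}
    for a, b in combinations(items, 2):
        together = sum(
            any(a in group and b in group for group in sort) for sort in sorts
        )
        dist[frozenset((a, b))] = 1.0 - together / n
    return dist

def average_linkage(items, dist):
    """Naive agglomerative clustering; returns the merge history (a textual dendrogram)."""
    def avg(c1, c2):
        pairs = [dist[frozenset((a, b))] for a in c1 for b in c2]
        return sum(pairs) / len(pairs)

    clusters = [frozenset([item]) for item in items]
    merges = []
    while len(clusters) > 1:
        # Merge the two clusters with the smallest average pairwise distance.
        c1, c2 = min(combinations(clusters, 2), key=lambda pair: avg(*pair))
        merges.append((sorted(c1), sorted(c2), round(avg(c1, c2), 2)))
        clusters = [c for c in clusters if c not in (c1, c2)] + [c1 | c2]
    return merges

# Hypothetical cards: two leaf pages for each of two offices.
items = ["A-rooms", "A-staff", "B-rooms", "B-staff"]

# Hypothetical sorts: two participants group by office, one groups by function.
sorts = [
    [{"A-rooms", "A-staff"}, {"B-rooms", "B-staff"}],
    [{"A-rooms", "A-staff"}, {"B-rooms", "B-staff"}],
    [{"A-rooms", "B-rooms"}, {"A-staff", "B-staff"}],
]

merges = average_linkage(items, cooccurrence_distance(sorts, items))
for left, right, height in merges:
    print(left, "+", right, "at distance", height)
```

In this made-up data the office-based pairs merge first at a low distance, and the two offices join only at a much greater distance, which is the kind of pattern a dendrogram makes visible when most respondents sort by organizational structure.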
The expenses associated with card sorting were the time it took to make the cards (one day), conduct the test (an afternoon), and compile the results (two days). Conducting this test would have been less useful if the test participants had been new employees, and consequently less familiar with BLS's organizational structure. Card sorting can be used whenever information needs to be "chunked". When used early in the design phase, however, it avoids wasteful activities such as designing graphics that will later be thrown out and creating links that will be broken.
In this test, participants selected an icon they felt best represented a category. "What's New," for example, might be matched to a tiny picture of a newspaper by a participant choosing from a matrix of pictographic representations. If several other participants made the same choice and did not pick the newspaper to represent something else, then there is a good chance that the wider user population will make that association as well. The test team looked at how participants matched a textual category label to an icon, as well as any possible interference between icons and category labels. We established a threshold of 70% agreement for an icon/label pair to be considered successful.
Participants were asked to match 16 icons with six categories. Participants were given a spreadsheet with the 16 icons placed in six rows. They were instructed to select the icon that best represented the corresponding category, and place an X in the cell. If more than one icon corresponded to that category, or none did, participants could place an X in multiple cells, or in none. The icons and the categories were placed in random order to minimize bias.
The strongest match for any category was 100%; the weakest was 10%. The criterion for selection was a category with a match equal to or greater than 70%. The test team also looked for any possible interference between icons and categories, that is, an icon that matched multiple categories. The criterion for such interference was a match greater than 40% for a single icon in more than one category. No such interference was found.
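The two criteria above can be expressed as a short tallying sketch. The thresholds (70% for a successful match, more than 40% in more than one category for interference) come from the text; the function name, the icon names, the participant count of ten, and the tallies themselves are hypothetical and do not reproduce the study's data.

```python
def evaluate_icon_matches(marks, n_participants, success=0.70, interference=0.40):
    """Apply the selection and interference criteria to a tally of X marks.

    marks maps each category to {icon: number of participants who marked it}.
    """
    rates = {
        category: {icon: count / n_participants for icon, count in icons.items()}
        for category, icons in marks.items()
    }

    # A category succeeds if its best icon reaches the success threshold.
    successful = {}
    for category, icons in rates.items():
        best_icon = max(icons, key=icons.get)
        if icons[best_icon] >= success:
            successful[category] = best_icon

    # An icon interferes if it exceeds the interference threshold
    # in more than one category.
    categories_by_icon = {}
    for category, icons in rates.items():
        for icon, rate in icons.items():
            if rate > interference:
                categories_by_icon.setdefault(icon, []).append(category)
    interfering = {
        icon: cats for icon, cats in categories_by_icon.items() if len(cats) > 1
    }
    return successful, interfering

# Hypothetical tallies from ten participants.
marks = {
    "What's New": {"newspaper": 9, "book": 1},
    "Reference": {"book": 6, "newspaper": 3},
}

successful, interfering = evaluate_icon_matches(marks, n_participants=10)
print("Successful pairs:", successful)
print("Interfering icons:", interfering)
```

With these invented numbers the newspaper passes for "What's New", the book falls short for "Reference" at 60%, and no icon crosses the 40% line in two categories.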
Research in human-computer interaction has found that the benefit of an icon/label pair is that the two different formats reinforce one another, and users can focus on either the picture or the text, whichever is more efficient.
Minimal resources were required to conduct the test. We designed the spreadsheet in less than one day; participants took approximately 20 minutes to complete it, and we needed about an hour to tally the results. Because this test supplied both the icons and the candidate categories, it could not reveal whether users would have provided other textual category labels for the icons. This type of test is best used early in the development process.
A category membership expectation test gives an idea of what users expect to find under each category, and thus of how usable the organizational scheme is. The test is designed to elicit users' understanding of a set of categories and their associated labels. We looked for thematic agreement among the participants as to what they believed would appear in a certain category, and whether or not it conformed to the intranet designers' view. Responses were tallied and consolidated.
Participants were given a form which listed the six prototype categories. They were asked to list the kind of material they would expect to find in those categories when they were on the BLS home page, and when they were on their organization's home page. To preserve context, the categories were listed in the same order as they appear in the prototype.
The categories were:
Two categories clearly worked: 'New' and 'Reference'. Three categories clearly did not: 'Tools', 'Services', and 'Map'. The remaining category, 'Org View', was a partial success: users expected an organization chart, but would not look for significant links to sub-sites here.
Minimal resources were required for this test. We designed the form in less than one day; participants took approximately 30 minutes to complete it, and we needed about three hours to tally the results. This test method should be used early in the development process.
What made the tests especially useful was their interrelationship: they all asked users to group the same type of information (categories), but through different instruments. The Card Sort asked participants to group predetermined items into hierarchical categories, while the Category Membership Expectation test asked users to place their own explicit items in the categories. The Icon Mix-and-Match asked participants to match icons with categories.
The test results provided valuable information about the proposed categories. The Card Sort exercise validated the prototype designers' fundamental approach: follow the BLS organizational structure. The Category Membership Expectation test showed that the specific instantiation chosen by the designers was flawed. Our interpretation of the results led to a high-level split between information of general interest to most BLS employees and information relevant to specific offices within the Bureau. Office-specific information is then further divided between programs.
This series of tests did not address page design at all. We nevertheless noticed many areas in different pages and sub-sites that we considered sub-optimal (gratuitous moving and flashing elements, for instance). Hence, we strongly recommended that page-level and sub-site-level usability testing be carried out in the future.


