After you have incorporated the results of the “Wizard of Oz” testing,
you will want to code and test a working prototype of the application.
During this phase, be sure to analyze the behavior of both new and,
if applicable, expert users.
Identifying recognition problems
As you proceed with the Test phase, note any consistent recognition
problems.
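One way to spot consistent problems is to tally misrecognitions from your test logs. The following is a minimal sketch, assuming you can export each test utterance as a pair of what the user said and what the recognizer returned. The log format, the function name `consistent_confusions`, and the `min_count` threshold are illustrative assumptions, not part of any particular voice platform.

```python
from collections import Counter

def consistent_confusions(log, min_count=2):
    """Return ((intended, recognized), count) for pairs misrecognized
    at least min_count times, most frequent first."""
    counts = Counter(
        (intended, recognized)
        for intended, recognized in log
        if intended != recognized  # count only misrecognitions
    )
    return [(pair, n) for pair, n in counts.most_common() if n >= min_count]

# Hypothetical test log: (what the user said, what the recognizer returned)
test_log = [
    ("Madison", "Addison"),
    ("Madison", "Madison"),
    ("Madison", "Addison"),
    ("Newark", "New York"),
    ("Los Angeles", "Los Angeles"),
]

for (intended, recognized), n in consistent_confusions(test_log):
    print(f"{intended!r} heard as {recognized!r}: {n} times")
```

A pair that appears only once may be noise; a pair that repeats across users is a candidate for one of the corrections described in this section.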
The most common cause of recognition problems is
acoustic confusability among the currently active phrases. For example,
both Madison and Addison are US airports. Thus, these potential user
inputs to a travel application are highly confusable:
User: Flying from Madison
User: Flying from Addison
Sometimes nothing can be done about a confusion of this kind.
Other times you can try to correct the problem by:
- Using a synonym for one of the terms. For example, if the system
is confusing “no” and “new,” you might be able to replace “new” with
“recent,” depending on the application's context.
- Adding a word to one or more of the choices. For the Madison/Addison
airport confusion, you could make states optional in the grammar for
most cities, but require the state for low-traffic airports that have
acoustic confusability with higher-traffic airports.
- Planning for disambiguation by writing code that includes or accesses
data about typical acoustic confusions. For example:
System: Flying from?
User: Los Angeles <not flagged as confusable>
System: Flying to?
User: Newark <flagged as confusable with New York>
System: Newark, New Jersey or New York, New York?
User: Newark, New Jersey
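The disambiguation dialog above could be driven by a small lookup table. This is a sketch under stated assumptions: the `CONFUSABLE` and `FULL_NAME` tables and the `next_prompt` function are hypothetical names, and in a real application the confusability data would come from your own test results rather than being hard-coded.

```python
# Hypothetical confusability data gathered during testing.
CONFUSABLE = {
    "Newark": ["New York"],
    "New York": ["Newark"],
}

# Hypothetical fully qualified names used in the confirmation prompt.
FULL_NAME = {
    "Newark": "Newark, New Jersey",
    "New York": "New York, New York",
    "Los Angeles": "Los Angeles, California",
}

def next_prompt(recognized):
    """Return a disambiguation question if the recognized city is
    flagged as confusable; otherwise return None."""
    rivals = CONFUSABLE.get(recognized)
    if not rivals:
        return None  # not flagged: accept the result as-is
    choices = [FULL_NAME[city] for city in [recognized, *rivals]]
    return " or ".join(choices) + "?"

print(next_prompt("Los Angeles"))  # None
print(next_prompt("Newark"))       # Newark, New Jersey or New York, New York?
```

Keeping the confusion data separate from the dialog logic means new pairs found in later test rounds can be added without changing code.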
Identifying any user interface breakdowns
The Test phase is also where you will identify potential user interface
breakdowns. Some factors you may want to analyze include:
- Percentage of users who did not successfully complete your test
scenarios
- Percentage of users who transferred to a human operator, when
this was not the desired outcome
- Points in the application where users experienced the most difficulty
- Unexpected user behaviors
- Effectiveness of error recovery mechanisms
- Time to complete typical transactions
- Self-reported level of user satisfaction
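Several of the quantitative factors listed above can be computed from per-session test records. The sketch below assumes a hypothetical `Session` record with the fields shown and a 1 to 5 satisfaction scale; your own test harness will likely capture different fields.

```python
from dataclasses import dataclass
from statistics import mean, median

@dataclass
class Session:
    completed: bool          # user finished the test scenario
    unwanted_transfer: bool  # transferred to an operator when not desired
    seconds: float           # duration of the session
    satisfaction: int        # self-reported, assumed 1 (low) to 5 (high)

def summarize(sessions):
    """Compute summary usability metrics for a non-empty list of sessions."""
    n = len(sessions)
    completed_times = [s.seconds for s in sessions if s.completed]
    return {
        "pct_not_completed": 100 * sum(not s.completed for s in sessions) / n,
        "pct_unwanted_transfer": 100 * sum(s.unwanted_transfer for s in sessions) / n,
        # Time to complete is measured over successful sessions only.
        "median_seconds_completed": median(completed_times) if completed_times else None,
        "mean_satisfaction": mean(s.satisfaction for s in sessions),
    }

# Hypothetical results from one round of user testing.
sessions = [
    Session(True, False, 62.0, 4),
    Session(True, False, 75.0, 5),
    Session(False, True, 140.0, 2),
    Session(True, False, 58.0, 4),
]
print(summarize(sessions))
```

Comparing these numbers between test rounds gives a rough indication of whether prompt or grammar changes actually improved usability.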
The first round of user testing typically reveals places
where the system's response needs to be rephrased to improve usability.
For this reason, system prompts and other messages should be left
flexible for as long as possible, at least until after the first round
of user testing.