Audio formatting is a useful technique that increases the bandwidth of spoken communication by using speech and non-speech cues to overlay structural and contextual information on spoken output. Audio formatting is analogous to the well-understood notion of visual formatting, where designers use font changes, indenting, and other aspects of visual layout to cue the reader to the underlying structure of the information being presented.
A disadvantage of audio formatting is that there are not yet standard sounds for specific purposes. If you plan to use audio formatting, you may want to work with an audio designer (analogous to a graphic designer for graphical user interfaces) to establish a set of pleasing and easily discriminated sounds for these purposes.
When designing audio formatting, keep the tones short: typically no longer than 0.5–1.0 seconds, and possibly as short as 75 ms. Shorter tones are generally less obtrusive, so users are more likely to perceive them as useful rather than distracting.
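
The duration guidance above can be tried out with a small tone generator. The following is a minimal sketch, not a recommended sound design: the 8 kHz sample rate, the 440 Hz pitch, the 5 ms fade, and the function names `make_tone` and `write_wav` are all assumptions chosen for illustration, using only the Python standard library.

```python
import math
import struct
import wave

SAMPLE_RATE = 8000  # Hz; telephony-grade rate, an assumption for this sketch


def make_tone(freq_hz=440.0, duration_s=0.075, amplitude=0.5, fade_s=0.005):
    """Return 16-bit mono PCM samples for a sine tone with linear fade-in/out."""
    n = round(SAMPLE_RATE * duration_s)
    fade_n = max(1, round(SAMPLE_RATE * fade_s))
    samples = []
    for i in range(n):
        # A linear ramp at both ends avoids audible clicks on short tones.
        envelope = min(1.0, i / fade_n, (n - 1 - i) / fade_n)
        value = amplitude * envelope * math.sin(2 * math.pi * freq_hz * i / SAMPLE_RATE)
        samples.append(int(value * 32767))
    return samples


def write_wav(path, samples):
    """Write the samples as a mono 16-bit WAV file."""
    with wave.open(path, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)
        wav.setframerate(SAMPLE_RATE)
        wav.writeframes(struct.pack("<%dh" % len(samples), *samples))
```

Calling `write_wav("turn_tone.wav", make_tone())` (the file name is hypothetical) would produce a 75 ms tone; passing `duration_s=0.5` gives a tone at the lower end of the typical upper limit, which makes it easy to compare how obtrusive each length sounds.
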
You can use non-speech cues to indicate dialog state, exceptions to normal system behavior, and content formatting, as described in Table 1.
Purpose | Recommendations |
---|---|
Turn-taking tone (barge-in disabled only) | If you disable barge-in, you might want to use audio formatting to indicate when it is the user's turn to speak, as described in Table 1. An effective turn-taking tone will generally have the following characteristics: |
Barge-in temporarily disabled | When barge-in is temporarily disabled (for example, when legal notices are read), you may want to play a unique background sound or use a special tone or prompt as an indicator. |
Audio cue for bulleted list | Consider using a short sound snippet as an auditory icon. |
Audio cue for emphasis (akin to visual bold and italics) | Consider using an auditory inflection technique, such as changing volume or pitch. |
Audio cue for secure transactions | For secure transactions, you may want to play a unique background sound or use a special tone or prompt. See the recommendations for Barge-in temporarily disabled above. |
Audio cue for “system busy” (akin to visual hourglass) | You can use the fetchaudio attribute to play an audio file when the system is busy fetching documents. The audio file stops playing as soon as the document is retrieved. If you use a ticking tone for “system busy,” use a fairly slow ticking rate (about 1–2 seconds between ticks). Avoid rates that are faster than 1 second per tick. Alternatively, consider playing music when the system is busy. Note: When users are asked to wait, research has shown that they will follow the instruction for at least 7 seconds of silence. |
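
The ticking-rate recommendation for the “system busy” cue can be sketched as a small generator that builds a ticking audio file suitable for referencing from a fetchaudio attribute. This is an illustrative sketch under stated assumptions: the 8 kHz sample rate, the 50 ms decaying 1 kHz tick, the default 1.5-second interval, and the function names are all hypothetical choices, not values taken from the table.

```python
import math
import struct
import wave

SAMPLE_RATE = 8000  # Hz; assumed telephony-grade rate


def make_tick(duration_s=0.05, freq_hz=1000.0, amplitude=0.4):
    """Return PCM samples for one short tick with a linearly decaying envelope."""
    n = round(SAMPLE_RATE * duration_s)
    return [
        int(amplitude * (1 - i / n) * math.sin(2 * math.pi * freq_hz * i / SAMPLE_RATE) * 32767)
        for i in range(n)
    ]


def make_ticking_track(interval_s=1.5, ticks=4):
    """Return samples with one tick at the start of every interval_s seconds."""
    if interval_s < 1.0:
        # Rates faster than 1 second per tick are discouraged.
        raise ValueError("tick interval should be at least 1 second")
    tick = make_tick()
    period = round(SAMPLE_RATE * interval_s)
    silence = [0] * (period - len(tick))
    return (tick + silence) * ticks


def write_wav(path, samples):
    """Write the samples as a mono 16-bit WAV file."""
    with wave.open(path, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)
        wav.setframerate(SAMPLE_RATE)
        wav.writeframes(struct.pack("<%dh" % len(samples), *samples))
```

For example, `write_wav("busy_ticks.wav", make_ticking_track())` (a hypothetical file name) writes six seconds of audio with a tick every 1.5 seconds. Rejecting intervals under one second in the generator keeps the asset itself within the recommended range, whichever document ends up playing it.
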