While you could certainly build voice applications without a voice markup language and a speech browser (for example, by writing your applications directly to a speech API), using VoiceXML and a VoiceXML browser provides several important capabilities:
- VoiceXML is a markup language that makes building voice applications easier,
in the same way that HTML simplifies building visual applications. VoiceXML
also reduces the amount of speech expertise that developers need.
- VoiceXML applications can use the same existing back-end business logic
as their visual counterparts, enabling voice solutions to be introduced to
new markets quickly. Both current and long-term development and maintenance
costs are reduced by reusing the Web design skills and infrastructure already
present in the enterprise. Customers benefit from a consistent experience
across voice and visual applications.
- VoiceXML implements a client/server paradigm, where a Web server provides
VoiceXML documents that contain dialogs to be interpreted and presented to
a user. The user's responses are submitted to the Web server, which responds
by providing additional VoiceXML documents, as appropriate. VoiceXML allows
you to request documents and submit data to server scripts using Uniform
Resource Identifiers (URIs). VoiceXML documents can be static, or they
can be dynamically generated by CGI scripts, JavaBeans, ASPs, JSPs, Java
servlets, or other server-side logic.
- Unlike a proprietary Interactive Voice Response (IVR) system, VoiceXML
provides an open application development environment that generates portable
applications. This makes VoiceXML a cost-effective alternative for providing
voice access services.
- Most installed IVR systems today accept input from the telephone keypad
only. In contrast, VoiceXML is designed predominantly to accept spoken input,
but it can also accept DTMF input, if desired. As a result, VoiceXML helps
speed up customer interactions by providing a more natural interface that
replaces the traditional, hierarchical IVR menu tree with a streamlined dialog
using a flattened command structure.
- VoiceXML directly supports networked and Web-based applications, meaning
that a user at one location can access information or an application provided
by a server at another geographically or organizationally distant location.
This capitalizes on the connectivity and commerce potential of the World Wide
Web.
- Using a single VoiceXML browser to interpret streams of markup language
originating from multiple locations provides the user with a seamless conversational
experience across independent applications. For example, a voice portal application
might allow a user to temporarily suspend an airline purchase transaction
to interact with a banking application on a different server to check an account
balance.
- VoiceXML supports local processing and validation of user input.
- VoiceXML supports playback of prerecorded audio files.
- VoiceXML supports recording of user input. The resulting audio can be
played back locally or uploaded to the server for storage, processing, or
playback at a later time.
- VoiceXML defines a set of events corresponding to such activities as a
user request for help, the failure of a user to respond within a timeout period,
and an unrecognized user response. A VoiceXML application can provide catch
elements that respond appropriately to a given event for a particular context.
- VoiceXML supports context-specific and tapered help using a system of
events and catch elements. Help can be tapered by specifying a count for each
event handler, so that different event handlers are executed depending on
the number of times that the event has occurred in the specified context.
This can be used to provide increasingly detailed messages each time
the user asks for help. For more information, see Choosing help mode or self-revealing help.
- VoiceXML supports subdialogs, which are roughly the equivalent of function
or method calls. Subdialogs can be used to provide a disambiguation or confirmation
dialog, and to create reusable dialog components. For more information, see Subdialogs.
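
The tapered help behavior and the server-side generation of VoiceXML described in the list above can be sketched in a few lines of Python. This is a minimal illustration, not a browser implementation: the function names (select_handler, build_field), the field name, and the help messages are hypothetical, and the selection rule is simplified to "use the handler with the highest count that does not exceed the number of times the event has occurred", ignoring conditions and scoping.

```python
from xml.etree import ElementTree as ET

# Hypothetical tapered help messages: (count, message) pairs in document order.
HELP_HANDLERS = [
    (1, "Say a city name."),
    (2, "Say the name of the city you are flying to, for example, Boston."),
    (3, "You can say any city served by the airline, or say operator."),
]


def select_handler(handlers, occurrence):
    """Pick the message for the given occurrence of an event.

    Simplified rule (an assumption for this sketch): among handlers whose
    count is <= occurrence, choose the highest count; ties go to the
    first handler in document order. Returns None if no handler applies.
    """
    eligible = [(count, msg) for count, msg in handlers if count <= occurrence]
    if not eligible:
        return None
    best = max(count for count, _ in eligible)
    for count, msg in eligible:
        if count == best:
            return msg


def build_field(name, prompt, handlers):
    """Generate a VoiceXML <field> with one <help count="n"> per handler.

    Stands in for the server-side logic that produces VoiceXML dynamically.
    """
    field = ET.Element("field", {"name": name})
    ET.SubElement(field, "prompt").text = prompt
    for count, msg in handlers:
        help_el = ET.SubElement(field, "help", {"count": str(count)})
        help_el.text = msg
    return ET.tostring(field, encoding="unicode")


if __name__ == "__main__":
    print(build_field("city", "Which city?", HELP_HANDLERS))
    # Simulate a user asking for help five times in the same field.
    for occurrence in range(1, 6):
        print(occurrence, select_handler(HELP_HANDLERS, occurrence))
```

In this sketch, the third and every later request for help reuses the most detailed message, because no handler with a higher count exists. A real VoiceXML browser applies the equivalent selection itself; the application only supplies the help elements and their count attributes.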