VoiceXML elements and attributes

VoiceXML is an XML-based application, meaning that it is defined as a set of XML tags or elements. The VoiceXML elements are listed here, together with implementation details specific to the VoiceXML browser in the Blueworx Voice Response product.

For additional information on grammar support in Blueworx Voice Response, see Grammars.

Table 1. Summary of VoiceXML elements and attributes
Element Description Implementation details
<assign> Assigns a value to a variable. Supported as documented in VoiceXML 2.0.
<audio> Plays an audio file within a prompt or synthesizes speech from a text string if the audio file cannot be found.

Supported as documented in VoiceXML 2.0.

You can specify an audio file using a URI or you can use the expr attribute to play audio recorded using the <record> element.

The supported audio formats are:
  • an 8 kHz 8-bit mu-law .au file
  • an 8 kHz 8-bit mu-law .wav file
  • an 8 kHz 8-bit a-law .au file
  • an 8 kHz 8-bit a-law .wav file
  • an 8 kHz 16-bit linear .au file
  • an 8 kHz 16-bit linear .wav file
Use an 8-bit format whenever possible; 16-bit linear files take twice as much storage and require twice as long to download.
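As a minimal sketch of the two ways of naming an audio source described above, the following document plays a file by URI with fallback text, then plays back a recording through the expr attribute. The file name greeting.wav, the form and variable names, and the prompt wording are illustrative assumptions.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <form id="audio_demo">
    <!-- Record a short message so that there is something to play back -->
    <record name="msg" beep="true" maxtime="10s">
      <prompt>Please say your name after the beep.</prompt>
    </record>
    <block>
      <prompt>
        <!-- URI form: the enclosed text is synthesized if the file cannot be found -->
        <audio src="greeting.wav">Thank you for calling.</audio>
        <!-- expr form: plays the audio captured by the record element above -->
        <audio expr="msg"/>
      </prompt>
    </block>
  </form>
</vxml>
```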
<block> Specifies a form item containing non-interactive executable content. Supported as documented in VoiceXML 2.0.
<break> Inserts a pause in TTS output.

VoiceXML 2.0 inherits this element from SSML.

VoiceXML 2.1 added the strength attribute. The time attribute now specifies the duration of a pause to be inserted in the output in seconds or milliseconds, for example, time="250ms" or time="3s".

For more information on programming for TTS, see Speech Synthesis Markup Language (SSML).
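The time and strength attributes might be used as in the following sketch. The prompt wording is an assumption; the time values are the ones quoted above, and medium is one of the strength values defined by SSML.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<vxml version="2.1" xmlns="http://www.w3.org/2001/vxml">
  <form id="break_demo">
    <block>
      <prompt>
        Your balance is ten dollars.
        <!-- Pause for 250 milliseconds -->
        <break time="250ms"/>
        Your next payment is due on Friday.
        <!-- Pause for 3 seconds -->
        <break time="3s"/>
        <!-- Pause whose length is chosen by the TTS engine for this strength -->
        <break strength="medium"/>
        Goodbye.
      </prompt>
    </block>
  </form>
</vxml>
```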

<catch> Catches an event. Supported as documented in VoiceXML 2.0.
<choice> Specifies a menu item. Supported as documented in VoiceXML 2.0.
<clear> Clears one or more form item variables. Supported as documented in VoiceXML 2.0.
<data> Allows a VoiceXML application to fetch arbitrary XML data from a document server without transitioning to a new VoiceXML document. Supported as documented in VoiceXML 2.1.
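A minimal sketch of fetching XML with the <data> element and reading it through the DOM, in the style of the VoiceXML 2.1 specification. The URI quote.xml and its price element are hypothetical; the fetched document is assumed to look like <quote><price>42</price></quote>.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<vxml version="2.1" xmlns="http://www.w3.org/2001/vxml">
  <form id="quote_form">
    <block>
      <!-- Fetch the XML document and expose it as a DOM object named quote -->
      <data name="quote" src="quote.xml"/>
      <!-- Read the text content of the first price element -->
      <var name="price"
           expr="quote.documentElement.getElementsByTagName('price').item(0).firstChild.data"/>
      <prompt>The current price is <value expr="price"/>.</prompt>
    </block>
  </form>
</vxml>
```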
<disconnect> Causes the VoiceXML browser to disconnect from a user telephone session. VoiceXML 2.1 added the namelist attribute to this element. Otherwise, supported as documented in VoiceXML 2.0.
<dtmf> Specifies a DTMF grammar.

This element is not part of VoiceXML 2.0 and is not supported by Blueworx Voice Response.

You can, however, specify a DTMF grammar using the <grammar> element with the mode="dtmf" attribute.

Note: DTMF grammars used with Blueworx Voice Response do not support the use of hotword barge-in unless Blueworx Voice Response is configured to use DTNA and VoIP/SIP, and is also configured so that the DTMF detection is passed to an MRCP server rather than allowing the default behavior for Blueworx Voice Response to detect DTMFs. See Remote DTMF grammars for details of how to configure Blueworx Voice Response to do this.

Otherwise, if a prompt is played with bargeintype="hotword" and a DTMF grammar is active, the prompt will stop as soon as any DTMF key is detected, and the prompt will be played as though it was set to bargeintype="speech". This occurs regardless of whether or not any speech grammars are also active.
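In place of the <dtmf> element, a DTMF grammar can be written with the <grammar> element, as in this sketch. The field name, rule name, and prompt wording are illustrative assumptions.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <form id="dtmf_demo">
    <field name="choice">
      <prompt>Press 1 for sales or 2 for support.</prompt>
      <!-- mode="dtmf" makes this an inline DTMF grammar -->
      <grammar mode="dtmf" version="1.0" root="digit" type="application/srgs+xml">
        <rule id="digit" scope="public">
          <one-of>
            <item>1</item>
            <item>2</item>
          </one-of>
        </rule>
      </grammar>
      <filled>
        <prompt>You pressed <value expr="choice"/>.</prompt>
      </filled>
    </field>
  </form>
</vxml>
```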

<else> Conditional statement used with the <if> element. Supported as documented in VoiceXML 2.0.
<elseif> Conditional statement used with the <if> element. Supported as documented in VoiceXML 2.0.
<emp> Specifies the emphasis for TTS output.

This element has been replaced by the SSML element <emphasis>. For more information, see Speech Synthesis Markup Language (SSML).

<enumerate> Shorthand construct that causes the VoiceXML browser to speak the text of each <choice> element when presenting the list of menu selections to the user. When using the <enumerate> element to play menu choices with the TTS engine, you do not have to add punctuation to control the length of the pauses between <choice> elements. The VoiceXML browser will automatically add the appropriate pauses and intonations when speaking the prompts.
<error> Catches an error event. Supported as documented in VoiceXML 2.0.
<exit> Exits a VoiceXML browser session. As described in VoiceXML 2.0, the <exit> element returns control to the interpreter, which determines what action to take. The namelist attribute is ignored. Blueworx Voice Response supports the use of the expr attribute to associate data with the call.
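As a sketch of associating data with the call on exit, the expr attribute can carry an ECMAScript value. The variable name outcome and its value are assumptions made for illustration.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <form id="exit_demo">
    <var name="outcome" expr="'completed'"/>
    <block>
      <prompt>Goodbye.</prompt>
      <!-- expr is supported; a namelist attribute would be ignored -->
      <exit expr="outcome"/>
    </block>
  </form>
</vxml>
```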
<field> Defines an input field in a form. Supported as documented in VoiceXML 2.0.
<filled> Specifies an action to execute when a field is filled. Supported as documented in VoiceXML 2.0.
<foreach> Allows a VoiceXML application to iterate through an ECMAScript array and to execute the content contained within the element for each item in the array. Supported as documented in VoiceXML 2.1.
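A minimal sketch of iterating over an ECMAScript array with <foreach>. The array name cities and its contents are assumptions.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<vxml version="2.1" xmlns="http://www.w3.org/2001/vxml">
  <!-- ECMAScript array to iterate over -->
  <var name="cities" expr="['London', 'Paris', 'Rome']"/>
  <form id="foreach_demo">
    <block>
      <prompt>
        We fly to the following cities.
        <!-- The content is executed once for each item in the array -->
        <foreach item="city" array="cities">
          <value expr="city"/>
          <break time="250ms"/>
        </foreach>
      </prompt>
    </block>
  </form>
</vxml>
```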
<form> Specifies a dialog for presenting and gathering information. Supported as documented in VoiceXML 2.0. The VoiceXML browser supports mixed initiative dialogs using SISR. See Mixed-initiative application and form-level grammars for details.
<goto> Specifies a transition to another dialog or document. Supported as documented in VoiceXML 2.0.
<grammar> Defines a speech recognition grammar.

VoiceXML 2.1 added the srcexpr attribute to this element.

Supported values for the type attribute are:
Grammar      Type
SRGS ABNF    application/srgs
SRGS XML     application/srgs+xml
See Grammars for additional information on grammar support in WebSphere Voice Server.
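The two supported type values, and the srcexpr attribute, might be used as in the following sketch. The grammar file names, the grammarBase variable, and the field names are hypothetical.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<vxml version="2.1" xmlns="http://www.w3.org/2001/vxml">
  <var name="grammarBase" expr="'grammars/'"/>
  <form id="grammar_demo">
    <field name="city">
      <prompt>Which city?</prompt>
      <!-- XML form grammar referenced by a fixed URI -->
      <grammar type="application/srgs+xml" src="grammars/city.grxml"/>
    </field>
    <field name="color">
      <prompt>Which color?</prompt>
      <!-- ABNF form grammar whose URI is computed at run time with srcexpr -->
      <grammar type="application/srgs" srcexpr="grammarBase + 'color.gram'"/>
    </field>
  </form>
</vxml>
```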
<help> Catches a help event. Supported as documented in VoiceXML 2.0.
<if> Defines a conditional statement. Supported as documented in VoiceXML 2.0.
<initial> Prompts for form-wide information in a mixed-initiative form. Supported as documented in VoiceXML 2.0.
<link> Specifies a transition to a new document or throws an event, when activated. Supported as documented in VoiceXML 2.0.
<log> Generates a debug message. Supported as documented in VoiceXML 2.0. For details about where data is logged or how data is accessed, refer to the sections “dtjflog script” and “dtjuserlog script” in the Deploying and Managing VoiceXML and Java Applications information.
<mark> Places a marker into the text/tag sequence. Supported as documented in VoiceXML 2.1.
<menu> Specifies a dialog for selecting from a list of choices. Supported as documented in VoiceXML 2.0.
<meta> Specifies meta data about the document. The http-equiv attribute is supported for the values date, expires and lastModified. However, <meta> declarations in Speech Recognition Grammars are ignored. See Table 1 for more information.

The VoiceXML browser ignores the name attribute and any content specified with it. You can use these attributes to identify and assign values to the properties of a document, as defined by the HTML 4.0 specification (http://www.w3.org/TR/REC-html40/) and the HTTP 1.1 specification (http://www.ietf.org/rfc/rfc2616.txt).

<metadata> Defines metadata information using a metadata schema. This is new for VoiceXML 2.0, but it is ignored by the VoiceXML browser if it is CDATA; otherwise an error is generated.
<noinput> Catches a noinput event. Supported as documented in VoiceXML 2.0.
<nomatch> Catches a nomatch event. Supported as documented in VoiceXML 2.0.
<object> Specifies platform-specific objects. For Blueworx Voice Response, some attributes are supported in specific ways:
classid
  The following URI is supported: method://<fully qualified java classname>/<java method name>
codetype
  The value javacode is supported for the codetype attribute.
archive
  The archives can be arbitrary Jar files containing Java classes referenced by the classid attribute. The archives must be specified in a fully qualified URL format. If archives are not supplied, the system class loader (local class path) is used to resolve classid references.

Fetching and caching attributes are not supported.

All ECMAScript integers are passed to Java objects as java.lang.Double. If the VoiceXML application is to pass an integer value to a Java object, this must be done through a method which accepts that value as a java.lang.Double parameter object.
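The attributes above might be combined as in the following sketch. The class com.example.voice.Inventory, its reserve method, the Jar URL, and the parameter name quantity are all hypothetical, and the way <param> values are bound to the Java method's arguments is assumed here rather than taken from this table.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <form id="object_demo">
    <var name="count" expr="3"/>
    <!-- classid names a Java class and method; archive gives a fully qualified Jar URL -->
    <object name="reservation"
            classid="method://com.example.voice.Inventory/reserve"
            codetype="javacode"
            archive="http://www.example.com/apps/inventory.jar">
      <!-- count is an ECMAScript integer, so the hypothetical Java method
           must accept it as a java.lang.Double -->
      <param name="quantity" expr="count"/>
    </object>
    <block>
      <prompt>Your reservation request has been processed.</prompt>
    </block>
  </form>
</vxml>
```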

A sample program that uses the <object> element is in Calling a Java application.

You can use the <object> element to use invokeGCTI advanced CTI functionality. See Using advanced CTI features for further information.

You can also use the <object> element to close a speech recognition or speech synthesis session directly from a VoiceXML document. See Closing a speech recognition or TTS session from VoiceXML for further information.

<option> Specifies a field option. Supported as documented in VoiceXML 2.0.
<param> Specifies a parameter in an object or subdialog. Supported as documented in VoiceXML 2.0.
<prompt> Plays TTS or recorded audio output. Supported as documented in VoiceXML 2.0, with one exception: an additional value, dtmf_only, is supported for the bargein attribute. With bargein="dtmf_only", audio output stops as soon as the system determines that the user has pressed any DTMF key.
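A minimal sketch of a prompt that is interrupted by a key press, using the dtmf_only value described above. The field name, grammar, and prompt wording are assumptions.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <form id="bargein_demo">
    <field name="option">
      <!-- Pressing any DTMF key stops this prompt -->
      <prompt bargein="dtmf_only">
        For opening hours, press 1. For directions, press 2.
      </prompt>
      <grammar mode="dtmf" version="1.0" root="key" type="application/srgs+xml">
        <rule id="key" scope="public">
          <one-of>
            <item>1</item>
            <item>2</item>
          </one-of>
        </rule>
      </grammar>
    </field>
  </form>
</vxml>
```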
<property> Controls aspects of the behavior of the implementation platform.
The attributes:
  • objectfetchhint
  • objectmaxage
  • objectmaxstale
are ignored by the VoiceXML browser.
The names:
  • sensitivity
  • speedvsaccuracy
are supported in this release.
The names:
  • com.ibm.speech.asr.saveaudio
  • com.ibm.speech.asr.saveaudiotype
  • com.ibm.speech.asr.endpointed
  • com.ibm.dtmf.useexternaldetection
  • com.ibm.gcti.usewvrtransfer
are supported in addition to those documented in VoiceXML 2.0.

See Recording user input during speech recognition for more information on the com.ibm.speech.asr… properties, Remote DTMF grammars for more information on the com.ibm.dtmf.useexternaldetection property, and Re-routing Genesys CTI call transfers through Blueworx Voice Response for information on the com.ibm.gcti.usewvrtransfer property.

In addition to the names listed above, VoiceXML 2.1 adds the recordutterance and recordutterancetype names to this element. If the value for recordutterance is set to "true", three shadow variables may be used. Use the recordutterancetype property name to specify the media format for utterance recordings. For more details on both names, see the VoiceXML 2.1 specification.

The confidencelevel property specifies a threshold for determining whether recognition results, or scores, should be accepted by the VoiceXML application. When scores are above the confidence-level threshold, the VoiceXML browser considers the recognized words acceptable; the appropriate handlers are called, and the array of the shadow variable application.lastresult$ is filled with the scores up to the maxnbest value. For a <field>, for example, the <filled> handler would be called. If no scores are above the confidence level, appropriate <nomatch> handlers are executed.

The maxnbest property returns a maximum of 100 n-best results to the application. A sample program using maxnbest is shown in Using n-best.
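The confidencelevel and maxnbest properties might be set as in the following sketch, which also logs the first entry of application.lastresult$. The threshold 0.6, the n-best limit 3, and the grammar file city.grxml are illustrative assumptions.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <form id="nbest_demo">
    <!-- Accept only results scoring above 0.6, and return at most 3 of them -->
    <property name="confidencelevel" value="0.6"/>
    <property name="maxnbest" value="3"/>
    <field name="city">
      <prompt>Which city?</prompt>
      <grammar type="application/srgs+xml" src="city.grxml"/>
      <filled>
        <!-- application.lastresult$ is an array; entry 0 is the best result -->
        <log>
          Best: <value expr="application.lastresult$[0].utterance"/>,
          confidence: <value expr="application.lastresult$[0].confidence"/>,
          results returned: <value expr="application.lastresult$.length"/>
        </log>
      </filled>
      <nomatch>
        <prompt>Sorry, I did not understand.</prompt>
        <reprompt/>
      </nomatch>
    </field>
  </form>
</vxml>
```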

Some vendor-specific VoiceXML properties are also supported. Properties that match a given pattern can be passed through from a VoiceXML document and then sent to a speech server in an MRCP SET-PARAMS message. Currently, Nuance speech recognition property names that begin with swiep or swirec are supported.

<pros> Controls the prosody of TTS output.

This element has been replaced by the SSML element <prosody>. For more information, see Speech Synthesis Markup Language (SSML).

<record> Records spoken user input.

Any grammars active during <record> are ignored.

The timeout attribute specified for a prompt is ignored during <record>; no noinput event is generated.

The type attribute may take the following values:
audio/basic
  Creates a .au file of 8 kHz, 8-bit µ-law encoding
audio/x-alaw-basic
  Creates a .au file of 8 kHz, 8-bit a-law encoding
audio/x-wav
  Creates a Microsoft wav file of 8 kHz, 16-bit, linear PCM encoding

The beep attribute defaults to false (a beep is not played). The maxtime attribute defaults to 60 seconds, but is overridden by the Blueworx Voice Response Record Voice Maximum system parameter. You must set this parameter in Blueworx Voice Response to a value, in seconds, that is greater than or equal to the largest maxtime attribute used in your VoiceXML application. Refer to “Controlling messages” in the Blueworx Voice Response for AIX: Designing and Managing State Table Applications information.

The finalsilence attribute is set to a default of 12 seconds by Blueworx Voice Response. This value can be changed only from Blueworx Voice Response. Other attributes are not user-definable.

Only the duration, size and termchar shadow variables have been implemented for <record>. The size variable contains the internal size of the raw audio data (8-bit headerless, µ-law or a-law encoding).
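A minimal sketch of a recording that sets the type, beep, and maxtime attributes and then logs the three implemented shadow variables. The variable name msg, the 30-second limit, and the prompt wording are assumptions.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <form id="record_demo">
    <!-- maxtime must not exceed the Record Voice Maximum system parameter -->
    <record name="msg" type="audio/x-wav" beep="true" maxtime="30s">
      <prompt>Please leave your message after the beep.</prompt>
      <filled>
        <!-- Only duration, size and termchar are implemented for record -->
        <log>
          duration: <value expr="msg$.duration"/>,
          size: <value expr="msg$.size"/>,
          termchar: <value expr="msg$.termchar"/>
        </log>
      </filled>
    </record>
  </form>
</vxml>
```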

<reprompt> Causes the form interpretation algorithm to queue and play a prompt when entering a form after an event. Supported as documented in VoiceXML 2.0.
<return> Returns from a subdialog. Supported as documented in VoiceXML 2.0.
<sayas> Controls pronunciation of words or phrases in TTS output.

This element has been replaced by the SSML element <say-as>. For more information, see Speech Synthesis Markup Language (SSML).

<script> Specifies ECMAScript code.

VoiceXML 2.1 adds the srcexpr attribute to this element. Otherwise, supported as documented in VoiceXML 2.0.

<subdialog> Invokes a new dialog as a subdialog of the current one, in a new execution context. Supported as documented in VoiceXML 2.0.
<submit> Submits a list of variables to the document server. Supported as documented in VoiceXML 2.0.

To submit the results of a <record> element, you must use enctype="multipart/form-data" as shown in Playing and storing recorded user input.
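Submitting a recording might look like the following sketch. The servlet URL and the variable name msg are hypothetical.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <form id="submit_demo">
    <record name="msg" beep="true" maxtime="30s">
      <prompt>Please leave your message after the beep.</prompt>
    </record>
    <block>
      <!-- Recorded audio must be posted as multipart/form-data -->
      <submit next="http://www.example.com/servlet/saveMessage"
              method="post"
              enctype="multipart/form-data"
              namelist="msg"/>
    </block>
  </form>
</vxml>
```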

<throw> Throws an event. Supported as documented in VoiceXML 2.0.
<transfer> Connects the telephone caller to a third party.

VoiceXML 2.1 adds the type attribute to this element.

You can use the <transfer> element to transfer VoIP/SIP telephone calls, for example:

<transfer
 name="blindTransfer"
 dest="sip:444@9.20.49.62:5067"
 type="blind" connecttimeout="3000ms"/>
The following attributes are not supported in this release:
  • aaiexpr
  • connecttimeout
  • maxtime
  • transferaudio

The attribute aai is ignored by the VoiceXML browser.

The attribute bridge="true" is supported only if Blueworx Voice Response is configured with a suitable telephony service. See Blueworx Voice Response: Deploying and Managing VoiceXML and Java Applications for more information.

For bridged transfer (when bridge="true"), a conference call is set up to include the person or application specified by the dest attribute.

<value> Embeds a variable in a prompt. Supported as documented in VoiceXML 2.0.
<var> Declares a variable. Refer to Table 1 for information about variable scope and Using shadow variables for information about shadow variables.
<vxml> Top-level container for all other VoiceXML elements in a document.

All VoiceXML documents must specify either version="2.0" or version="2.1", and xmlns="http://www.w3.org/2001/vxml"; if these attributes are missing, the VoiceXML browser will throw an error.badfetch event.

The xml:lang attribute defaults to the language specified by the locale of the JVM in which the VoiceXML browser runs.
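A minimal document that carries the required version and xmlns attributes might look like this. The explicit xml:lang value and the prompt wording are assumptions; omitting xml:lang would leave the JVM locale default in effect.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- version and xmlns are required; without them error.badfetch is thrown -->
<vxml version="2.1" xmlns="http://www.w3.org/2001/vxml" xml:lang="en-US">
  <form id="hello">
    <block>
      <prompt>Hello, and welcome.</prompt>
    </block>
  </form>
</vxml>
```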

Check for language-specific limitations and considerations in the language appendixes.