SSML elements and attributes

Table 1. Summary of SSML elements and attributes
Element Description Implementation details
<audio> Plays an audio file within a prompt, or synthesizes speech from a text string if the audio file cannot be found. Supported as documented in VoiceXML 2.0, except that the <audio> element cannot be contained within other SSML elements. For example, the following is not supported: <emphasis level="strong"> <audio .../> </emphasis>
<break> Inserts a pause in TTS output. Supported as documented in VoiceXML 2.1.
<desc> Describes the content of an audio source. Supported as documented in VoiceXML 2.0.
<emphasis> Specifies the emphasis for TTS output. Supported as documented in VoiceXML 2.0.
<lexicon> References external pronunciation definitions. This element is not supported in Blueworx Voice Response.
<mark> Inserts a reference point in a document. This element is not supported in VoiceXML 2.0, but is supported in VoiceXML 2.1 as documented.
<metadata> Specifies general information about the document. This element is ignored by the VoiceXML browser.
<p> Specifies text structure in the absence of an end-of-sentence punctuation character. Supported as documented in VoiceXML 2.1.
<paragraph> Specifies text structure in the absence of an end-of-sentence punctuation character. Supported as documented in VoiceXML 2.0.
<phoneme> Specifies the pronunciation and phonology for a TTS output. Supported as documented in VoiceXML 2.0.
<prosody> Controls the pitch, range, rate and volume of TTS output. The contour and duration attributes are not supported.
<say-as> Specifies the type of text, for example, date, telephone number, or currency. The following attribute values are not supported:
  • interpret-as="letters" with details="strict"
  • details="dictate"
  • interpret-as="words"
Note: The details="punctuation" attribute value is not supported in Simplified Chinese.
The <say-as> element can be used with built-in grammar types.
<s> Specifies text structure in the absence of an end-of-sentence punctuation character. Supported as documented in VoiceXML 2.1.
<sentence> Specifies text structure in the absence of an end-of-sentence punctuation character. Supported as documented in VoiceXML 2.0.
<sub> Specifies substitute text for TTS output in place of a given input string. Supported as documented in VoiceXML 2.0.
<voice> Specifies the speaking voice used for TTS output. Supported as documented in VoiceXML 2.0.

Language-specific limitations and considerations are given in the language appendixes.
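
The following is a minimal sketch of a VoiceXML prompt that combines several of the elements summarized in Table 1. It is illustrative only: the file name welcome.wav, the form id, and the interpret-as and format values on <say-as> are assumptions, and the set of <say-as> values that is actually accepted depends on the installed TTS engine and language. Note that the <audio> element is placed directly inside <prompt> rather than inside another SSML element, in line with the restriction described in the table.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<vxml version="2.1" xmlns="http://www.w3.org/2001/vxml" xml:lang="en-US">
  <form id="ssml_demo">
    <block>
      <prompt>
        <!-- <audio> is a direct child of <prompt>, not nested in another
             SSML element. The text content is synthesized if the
             (hypothetical) file welcome.wav cannot be found. -->
        <audio src="welcome.wav">Welcome to the service.</audio>

        <!-- Half-second pause in the TTS output. -->
        <break time="500ms"/>

        <!-- <say-as> hints at how the enclosed text should be read.
             The interpret-as and format values here are assumed. -->
        <s>Your appointment is on
          <say-as interpret-as="date" format="mdy">10/12/2025</say-as>.
        </s>

        <!-- Emphasis on its own, with no <audio> inside it. -->
        <emphasis level="strong">Please listen carefully.</emphasis>

        <!-- Only rate and volume are set; the contour and duration
             attributes are not used. -->
        <prosody rate="slow" volume="loud">This sentence is spoken slowly.</prosody>

        <!-- The alias text is spoken in place of the written string. -->
        <sub alias="World Wide Web Consortium">W3C</sub>
      </prompt>
    </block>
  </form>
</vxml>
```

The sketch deliberately avoids the elements and attribute values that the table lists as unsupported or ignored, such as <lexicon>, <metadata>, the contour and duration attributes of <prosody>, and interpret-as="words".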