SSML elements and attributes

Table 1. Summary of SSML elements and attributes
Element	Description	Implementation details
<audio>	Plays an audio file within a prompt or synthesizes speech from a text string if the audio file cannot be found.	Supported as documented in VoiceXML 2.0, except that the <audio> element cannot be contained within other SSML elements. For example, the following is not supported: <emphasis level="strong"> <audio .../> </emphasis>
<break>	Inserts a pause in TTS output.	Supported as documented in VoiceXML 2.1.
<desc>	Describes the content of an audio source.	Supported as documented in VoiceXML 2.0.
<emphasis>	Specifies the emphasis for TTS output.	Supported as documented in VoiceXML 2.0.
<lexicon>	References external pronunciation definitions.	This element is not supported in Blueworx Voice Response.
<mark>	Inserts a reference point in a document.	This element is not supported in VoiceXML 2.0, but is supported in VoiceXML 2.1 as documented.
<metadata>	Specifies general information about the document.	This element is ignored by the VoiceXML browser.
<p>	Specifies text structure in the absence of an end of sentence punctuation character.	Supported as documented in VoiceXML 2.1.
<paragraph>	Specifies text structure in the absence of an end of sentence punctuation character.	Supported as documented in VoiceXML 2.0.
<phoneme>	Specifies the pronunciation and phonology for TTS output.	Supported as documented in VoiceXML 2.0.
<prosody>	Controls the pitch, range, rate and volume of TTS output.	The contour and duration attributes are not supported.
<say-as>	Specifies the type of text, for example, date, telephone number or currency.	The following attribute values are not supported: interpret-as="letters" with details="strict"; details="dictate"; interpret-as="words". Note: The details="punctuation" attribute value is not supported in Simplified Chinese. <say-as> can be used with built-in grammar types.
<s>	Specifies text structure in the absence of an end of sentence punctuation character.	Supported as documented in VoiceXML 2.1.
<sentence>	Specifies text structure in the absence of an end of sentence punctuation character.	Supported as documented in VoiceXML 2.0.
<sub>	Specifies substitute text for TTS output in place of a given input string.	Supported as documented in VoiceXML 2.0.
<voice>	Specifies the speaking voice used for TTS output.	Supported as documented in VoiceXML 2.0.

Language-specific limitations and considerations are given in the language appendixes.
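
The following prompt is a minimal sketch of how several of the elements in Table 1 might be combined while staying within the restrictions listed there. The audio file name, the form id, and the prompt wording are hypothetical and are used only for illustration. The sketch assumes a VoiceXML 2.1 document and standard SSML 1.0 attribute values. Actual rendering depends on the TTS engine and language in use.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<vxml version="2.1" xmlns="http://www.w3.org/2001/vxml" xml:lang="en-US">
  <form id="ssml_demo">
    <block>
      <prompt>
        <!-- <audio> is a direct child of <prompt>, not nested inside
             another SSML element such as <emphasis>. The enclosed text
             is spoken if the (hypothetical) file cannot be found. -->
        <audio src="welcome.wav">Welcome to the service.</audio>

        <!-- <s> marks a sentence explicitly. -->
        <s>Your payment is <emphasis level="strong">overdue</emphasis></s>

        <!-- Insert a half-second pause in the TTS output. -->
        <break time="500ms"/>

        <!-- Only rate, volume, pitch and range are used here, because
             the contour and duration attributes are not supported. -->
        <prosody rate="slow" volume="loud">Please listen carefully.</prosody>

        <!-- Speak the alias text in place of the written abbreviation. -->
        <s>This standard is published by the
           <sub alias="World Wide Web Consortium">W3C</sub>.</s>
      </prompt>
    </block>
  </form>
</vxml>
```

If emphasis is needed around recorded audio, one option is to apply <emphasis> only to the fallback text inside <audio>, or to record the audio with the desired emphasis, rather than wrapping <audio> in <emphasis>.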