Adding speech technology

In the WebSphere Voice Response Java and VoiceXML Environment, speech technology can be used without hard-coding it into the speech application, making it easier to switch from one technology to another. In addition, different technologies can be used for different languages, again, without hard-coding anything in the application. (Java voice applications are fully language-independent.)

To use speech technology, you must install and configure several components. The following steps summarize what you need to do to use the speech technology available in WebSphere Voice Server.
  1. Install the required speech recognition technology on your base system. If you are using WebSphere Voice Server V5.1, or another MRCP V1.0-compliant speech product, see section three of the Blueworx Voice Response for AIX: Installation information for details of how to install the required WebSphere Voice Server Connector fileset.
  2. Then, on each voice response node, install the required plug-in for the speech technology you want to use, as follows (note that these zip files contain the plug-ins for both speech recognition and text-to-speech, so you only have to install the required file once):
    1. Change directory to /var/dirTalk/DTBE/plugins and locate dtjmrcp.zip, the plug-in for the WebSphere Voice Server Version 5.1 speech technology.
    2. Ensure that Blueworx Voice Response is running, but that the host and the Java and VoiceXML environment node are not running.
    3. Enter the following command:
      dtjplgin dtjmrcp.zip

      This command also installs the required custom server components.

    4. When you restart the node, the required speech recognition or text-to-speech plug-in is available for use.
  3. Next, either use the dtjit configuration tool as described in Updating the configuration database, or manually edit the configuration file default.cff for either speech recognition or text-to-speech capability, as described in How speech recognition is configured and How text-to-speech is configured.
  4. When you have finished amending default.cff, update the configuration database by running the dtjconf command, as described in Updating the configuration database.
  5. Edit file /var/dirTalk/DTBE/dtj.ini to specify the appropriate confidence score range for your application. MRCP returns a confidence score in the range 0 - 100 whereas VXML2 expects a score in the range 0 - 1.
    To use the range 0 - 1, set the wvr.use.vxml2.confidencerange parameter to true:
    wvr.use.vxml2.confidencerange=true
    To use the range 0 - 100, set the parameter to false, which is the default value.
  6. If you are using WebSphere Voice Server Version 5.1.1, 5.1.2, or 5.1.3 to provide your speech technology, run the configureWVR utility (.sh on Linux or .bat on Windows) on the WebSphere Voice Server system. This configuration utility is available as part of an interim fix that can be downloaded. For further information about downloading and installing the interim fix, refer to the relevant WebSphere Voice Server support Technote at http://www.ibm.com/software/pervasive/voice_server/support/
  7. Start the MRCP and the MRCP_Log custom servers, using the Blueworx Voice Response Custom Server Manager window (at the Welcome window, select Operations -> Custom Server Manager).
    Note: The custom server uses AIX communications ports to send voice data to, and receive voice data from, the voice server, and it allocates more ports as needed while it is running. The default allocation at startup is 120 pairs of ports, which is sufficient for 120 active trunk channels, each using a single-language voice recognition or text-to-speech application. If you need more than this, you can preallocate ports before starting the custom servers, to ensure that they are available, by specifying the p (lowercase) parameter on the custom server command line. This parameter does not set the total number of ports that can be allocated; it only preallocates them.

    To set this parameter, select Applications -> Custom Servers -> Server -> Open -> File -> Properties, then enter p<n>, where <n> is the number of pairs of ports to preallocate. There is no space between the p and the number, and <n> can be in the range 0 to 480. For example, p240 preallocates 240 pairs of ports (sufficient for 240 active trunk channels, each using a single language).
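
The effect of the confidence score setting in step 5 can be illustrated with a short Python sketch. It is not part of Blueworx Voice Response: the function names are hypothetical, and the simple division by 100 is an assumption about how the two ranges relate, not a description of how the product converts scores. The sketch reads the wvr.use.vxml2.confidencerange setting from dtj.ini-style text and shows which value an application would then see.

```python
PARAMETER = "wvr.use.vxml2.confidencerange"


def uses_vxml2_range(ini_text):
    """Return True if the parameter is set to true in dtj.ini-style text.

    When the parameter is absent, the default (false) applies, as
    stated in step 5.
    """
    for line in ini_text.splitlines():
        key, sep, value = line.partition("=")
        if sep and key.strip() == PARAMETER:
            return value.strip().lower() == "true"
    return False


def reported_confidence(mrcp_score, use_vxml2_range):
    """Map an MRCP score (0 - 100) to the range the application sees."""
    if not 0 <= mrcp_score <= 100:
        raise ValueError("MRCP confidence must be in the range 0 - 100")
    # Assumption: the two ranges differ only by a linear factor of 100.
    if use_vxml2_range:
        return mrcp_score / 100
    return mrcp_score
```

Under that assumption, an MRCP score of 75 would be seen as 0.75 when the parameter is true and as 75 with the default setting, so any confidence thresholds in the application need to match the range that is configured.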
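
The rules for the port preallocation parameter in the note to step 7 can be sketched in Python as follows. The helper names are hypothetical and the code only illustrates what the note says: the value is the letter p followed immediately by the number of port pairs, the number must be in the range 0 to 480, and 120 pairs are allocated at startup by default. It does not call or configure the custom server itself.

```python
MAX_PORT_PAIRS = 480      # upper limit given in the note
DEFAULT_PORT_PAIRS = 120  # default allocation at startup


def preallocation_parameter(pairs):
    """Build the custom server parameter string, for example 'p240'."""
    if not isinstance(pairs, int) or isinstance(pairs, bool):
        raise TypeError("number of port pairs must be an integer")
    if not 0 <= pairs <= MAX_PORT_PAIRS:
        raise ValueError("number of port pairs must be in the range 0 to 480")
    # No space between the letter p and the number.
    return f"p{pairs}"


def exceeds_default_allocation(single_language_channels):
    """True when the channel count is above the default 120 port pairs.

    Covers only the single-language case described in the note, where
    one pair of ports serves one active trunk channel.
    """
    return single_language_channels > DEFAULT_PORT_PAIRS
```

For instance, a system with 240 active single-language trunk channels exceeds the default allocation, and preallocation_parameter(240) gives the string p240 to enter in the Properties window.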