IBM Cloud Speech To Text (STT) supported parameters

These are the supported parameters that are sent to the IBM Cloud STT engine. Details on each parameter can be found in the IBM documentation: https://cloud.ibm.com/apidocs/speech-to-text#recognize

Note that all of these parameters are optional.

Variable	Type	Default	Description
model	string	en-US_BroadbandModel	Model for processing audio.
customization_weight	float	0.3	How much weight to give to customization words.
inactivity_timeout	integer	30	Number of seconds of silence before the STT stops.
interim_results	boolean	false	Returns results as they are generated.
keywords	list		Array of keyword strings to spot in the audio.
keywords_threshold	float		Lower bound on the confidence level for recognizing a keyword. By default, no keyword spotting is performed.
max_alternatives	integer	1	Maximum number of alternatives to return. By default this is a single result.
word_alternatives_threshold	float		Lower bound on the confidence level for identifying alternatives. By default, no alternative words are returned.
word_confidence	boolean	true	If true, returns the confidence of each word.
timestamps	boolean	true	If true, returns timestamps for each word.
profanity_filter	boolean	true	If true, replaces all inappropriate words with asterisks.
smart_formatting	boolean	false	If true, the STT engine converts various values into a more readable form.
speaker_labels	boolean	false	If true, the response includes labels identifying the individual speakers.
grammar_name	string		Name of the grammar for the recognition.
redaction	boolean	false	If true, all numerical data is redacted from the result.
processing_metrics	boolean	false	If true, processing metrics are returned in the result.
processing_metrics_interval	float	1.0	Interval, in seconds, to return processing metrics.
audio_metrics	boolean	false	If true, detailed information about the signal characteristics of the audio is returned.
end_of_phrase_silence_time	float	0.8	Duration of the pause interval for splitting a transcript into multiple results.
split_transcript_at_phrase_end	boolean	false	If true, directs the STT engine to split the transcript into multiple final results based on semantic features of the input.
speech_detector_sensitivity	float	0.5	Sensitivity of speech activity detection.
background_audio_suppression	float	0.0	Level at which to suppress background audio.
low_latency	boolean	false	If true, attempts to return results more quickly.
character_insertion_bias	float	0.0	Bias toward shorter or longer strings when generating the results.
skip_zero_len_words	boolean	false	Undocumented feature.
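
As an illustration of how the table above might be used, the following Python sketch merges caller-supplied overrides with the listed defaults and checks each value against the type given in the table. It is only a sketch under stated assumptions: the names DEFAULTS, NO_DEFAULT_TYPES and build_stt_params are hypothetical and are not part of any IBM or product API. Only the parameter names, types and defaults are taken from the table.

```python
from typing import Any, Dict, Optional

# Defaults copied from the table above. Parameters listed without a default
# are left out so that they are only included when set explicitly.
DEFAULTS: Dict[str, Any] = {
    "model": "en-US_BroadbandModel",
    "customization_weight": 0.3,
    "inactivity_timeout": 30,
    "interim_results": False,
    "max_alternatives": 1,
    "word_confidence": True,
    "timestamps": True,
    "profanity_filter": True,
    "smart_formatting": False,
    "speaker_labels": False,
    "redaction": False,
    "processing_metrics": False,
    "processing_metrics_interval": 1.0,
    "audio_metrics": False,
    "end_of_phrase_silence_time": 0.8,
    "split_transcript_at_phrase_end": False,
    "speech_detector_sensitivity": 0.5,
    "background_audio_suppression": 0.0,
    "low_latency": False,
    "character_insertion_bias": 0.0,
    "skip_zero_len_words": False,
}

# Types for the parameters that the table lists without a default.
NO_DEFAULT_TYPES: Dict[str, type] = {
    "keywords": list,
    "keywords_threshold": float,
    "word_alternatives_threshold": float,
    "grammar_name": str,
}


def _matches(value: Any, expected: type) -> bool:
    """Return True if value is acceptable for the expected table type."""
    if expected is bool:
        return isinstance(value, bool)
    if isinstance(value, bool):
        # bool is a subclass of int in Python; reject it for other types.
        return False
    if expected is float:
        # Accept whole numbers where the table says float.
        return isinstance(value, (int, float))
    return isinstance(value, expected)


def build_stt_params(overrides: Optional[Dict[str, Any]] = None) -> Dict[str, Any]:
    """Merge overrides into the table defaults, rejecting unknown names and wrong types."""
    params = dict(DEFAULTS)
    for name, value in (overrides or {}).items():
        if name in DEFAULTS:
            expected = type(DEFAULTS[name])
        elif name in NO_DEFAULT_TYPES:
            expected = NO_DEFAULT_TYPES[name]
        else:
            raise KeyError(f"unsupported parameter: {name}")
        if not _matches(value, expected):
            raise TypeError(f"{name} expects {expected.__name__}, got {type(value).__name__}")
        params[name] = value
    return params
```

Because every parameter is optional, calling build_stt_params() with no arguments simply returns the defaults. A call such as build_stt_params({"keywords": ["refund"], "keywords_threshold": 0.5}) adds keyword spotting on top of them.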
