HMMDecoderMDSoft

The HMMDecoderMDSoft is a version of HMMDecoderMD that has been generalised to accept a 'soft mask'. Whereas the mask input for HMMDecoderMD should contain only 0s and 1s, the mask input for HMMDecoderMDSoft may contain values in the inclusive range 0.0 to 1.0. Each value in the mask is interpreted as the probability that the corresponding data point is reliable.

For full details of how soft masks are employed in missing data speech recognition, see:

J. Barker, M.P. Cooke, L. Josifovski and P.D. Green (2000). Soft decisions in missing data techniques for robust automatic speech recognition, Proc. ICSLP-00, Beijing, China.

As part of its probability calculation the HMMDecoderMDSoft needs to perform a bounded-marginalisation (see HMMDecoderMD). The block therefore requires lower and upper bounds for the missing data values. These bounds are supplied via the block inputs in3 and in4. The lower bound input (in3) is optional, and if not supplied the lower bound will default to 0.
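The role of the mask and the bounds in the probability calculation can be illustrated with a minimal Python sketch. It is not the block's actual code: it assumes a single diagonal-covariance Gaussian per state, and it assumes that each feature's contribution interpolates between the ordinary 'present' density and a bounded-marginal term averaged over the interval between the bounds. The function names, the argument layout and the 1/(upper - lower) scaling are assumptions made for illustration; consult the paper cited above for the exact form.

```python
import math

def gauss_pdf(x, mean, var):
    """Density of a 1-D Gaussian at x."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)

def gauss_cdf(x, mean, var):
    """Cumulative probability of a 1-D Gaussian up to x."""
    return 0.5 * (1.0 + math.erf((x - mean) / math.sqrt(2.0 * var)))

def soft_mask_likelihood(x, mask, upper, mean, var, lower=None):
    """Hypothetical per-frame likelihood for one diagonal Gaussian.

    x, mask, upper, mean, var and lower are equal-length sequences.
    If lower is omitted it defaults to 0 for every feature, mirroring
    the optional in3 input described in the text.
    """
    if lower is None:
        lower = [0.0] * len(x)
    likelihood = 1.0
    for xi, wi, lo, hi, mu, v in zip(x, mask, lower, upper, mean, var):
        # Term used when the data point is reliable.
        present = gauss_pdf(xi, mu, v)
        # Term used when the data point is unreliable: probability mass
        # between the bounds, scaled by the interval width (assumed form).
        if hi > lo:
            bounded = (gauss_cdf(hi, mu, v) - gauss_cdf(lo, mu, v)) / (hi - lo)
        else:
            bounded = gauss_pdf(lo, mu, v)  # limit of a zero-width interval
        # The soft mask value weights the two interpretations.
        likelihood *= wi * present + (1.0 - wi) * bounded
    return likelihood
```

In this sketch a mask value of 1.0 reduces a feature's contribution to the ordinary Gaussian density, a value of 0.0 leaves only the bounded term, and intermediate values blend the two.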

It is difficult to approximate good missing data bounds for delta features, so, by default, the HMMDecoderMDSoft does not apply the soft mask technique to the delta features. Instead, delta features are treated in the same manner as they are by the discrete mask HMMDecoderMD decoder, i.e. they are either taken to be fully present or fully missing, and missing values are considered unbounded. Consequently, the elements of the missing data mask corresponding to the delta features should be 0 or 1. However, the decoder can be forced to treat delta features using the same soft mask technique as applied to the static features by setting USE_DELTA_BOUNDS to true. This may give better results if good lower and upper delta feature bounds are available.

The HMMDecoderMDSoft has the same parameters as the HMMDecoderMD, except: i) there is no USE_BOUNDS parameter (bounded-marginalisation is an integral part of the soft mask technique and bounds for the static features are always used), and ii) there is an additional parameter, ONE_ZERO_ROUNDING, which is described further below.

ONE_ZERO_ROUNDING
This parameter specifies a tolerance (which should be in the range 0.0 to 0.5) within which values in the soft mask are rounded to the discrete values 0.0 and 1.0. For example, if the parameter is set to 0.1, values in the range [0.0,0.1] are rounded down to 0.0 and values in the range [0.9,1.0] are rounded up to 1.0.
Typically only a small proportion of the points in a soft mask are effectively soft; the others are very close to the discrete missing (0.0) and present (1.0) values. Approximating the points close to 0 and 1 as exactly 0 and 1 can speed up the probability calculation, so setting ONE_ZERO_ROUNDING to a small number, e.g. 0.01, may result in faster decoding without any loss in performance. [However, in practice I've found that performance tends to deteriorate even with very small tolerances, and the increase in decoding speed is marginal. I've left the feature in for future experimentation.]
By default ONE_ZERO_ROUNDING is set to 0.0 - this effectively turns the feature off.
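The rounding rule can be sketched in a few lines of Python. This only illustrates the behaviour described for ONE_ZERO_ROUNDING and is not taken from the CTK source; the function name `round_mask` and the list-based mask are hypothetical.

```python
def round_mask(mask, tolerance):
    """Round soft mask values within `tolerance` of 0 or 1 to exactly 0.0 or 1.0."""
    if not 0.0 <= tolerance <= 0.5:
        raise ValueError("tolerance should be in the range 0.0 to 0.5")
    rounded = []
    for w in mask:
        if w <= tolerance:
            rounded.append(0.0)   # close enough to 'missing'
        elif w >= 1.0 - tolerance:
            rounded.append(1.0)   # close enough to 'present'
        else:
            rounded.append(w)     # genuinely soft value, left unchanged
    return rounded

# With a tolerance of 0.1, only the values near the extremes change.
print(round_mask([0.03, 0.5, 0.95, 0.2], 0.1))
```

With a tolerance of 0.0 only values that are already exactly 0.0 or 1.0 match either test, so the mask passes through unchanged, which is consistent with the default turning the feature off.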

Note that if a discrete mask consisting of only 0s and 1s is passed to HMMDecoderMDSoft, it should produce exactly the same results as HMMDecoderMD. However, due to the way in which the probability calculation has been generalised, it is more efficient to use HMMDecoderMD for discrete masks.

As with the other decoders, HMMDecoderMDSoft outputs a stream of state-likelihood frames. Each frame consists of the likelihood of each model state having generated the corresponding input feature frame. Within these frames the state likelihoods occur in the same order in which the states are defined in the HMM definition file. (The mixture label frames on out2 give the integer label of the winning mixture for each state.)
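The relationship between the two output frames can be sketched as follows. This is an illustrative Python fragment, not the decoder's implementation: it assumes each state's weighted per-mixture likelihoods are already available, that the winning mixture is the one with the largest weighted likelihood, and that labels count from 0. The function name `output_frames` and its arguments are hypothetical.

```python
def output_frames(mixture_scores, max_approx=False):
    """Build one out1 frame and one out2 frame from per-state mixture scores.

    mixture_scores holds one list per state, in HMM definition order;
    each inner list contains the weighted likelihood of every mixture.
    """
    likelihoods = []  # sketch of an out1 frame: one likelihood per state
    labels = []       # sketch of an out2 frame: winning mixture per state
    for scores in mixture_scores:
        best = max(range(len(scores)), key=lambda m: scores[m])
        labels.append(best)
        if max_approx:
            # Max mixture approximation: keep only the winning mixture.
            likelihoods.append(scores[best])
        else:
            # Otherwise sum the contributions of all mixtures.
            likelihoods.append(sum(scores))
    return likelihoods, labels
```

Both lists are indexed by state, so entry i of the label frame refers to the same state as entry i of the likelihood frame.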

Inputs	Meaning	Sample	1-D frame	$\ge$ 2-D frame
`in1`	feature vectors	No	Yes	No
`in2`	fuzzy data mask	No	Yes	No
`(in3)`	lower bound	No	Yes	No
`in4`	upper bound	No	Yes	No

Outputs	Meaning
`out1`	state likelihoods
`out2`	state max mixture label

Parameters	Type	Default	Meaning
`LOG_FILE`	String	-	Name of an optional log file
`LOG_FILE_2`	String	-	Name of additional detailed log file
`WORD_PENALTY`	Float	0.0	The creation penalty
`HMM_FILE`	String	-	Name of the HMM file list
`GRAMMAR_FILE`	String	-	File storing the grammar
`LABEL_FILE`	String	-	File storing HMM NAME-> HMM LABEL mapping
`FIRST_TOKEN`	String	-	Label of a fixed first token
`FINAL_TOKEN`	String	-	Label of a fixed final token
`TRANSCRIPTION`	String	-	The correct transcription
`SILENCE`	String	""	The silence label(s)
`MAX_APPROX`	Boolean	False	Use max mixture approximation
`NBEST`	Int	1	Return best N hypotheses
`STATE_PATH`	Boolean	False	Record HMM state path
`HAS_DELTAS`	Boolean	0	Models have delta parameters
`USE_DELTAS`	Boolean	-	Models have delta parameters
`HYPOTHESIS FILTER`	String	""	Regular expression for filtering hypotheses
`OUTPUT_CONFUSIONS`	Boolean	0	Output confusion matrix
`DUMP_PARAMETERS`	Boolean	0	Write parameters to log file
`USE_DELTA_BOUNDS`	Boolean	False	Apply the soft mask technique to delta features
`ONE_ZERO_ROUNDING`	Float	0.0	Tolerance within which to round mask to 0 or 1

Documentation for CTKv1.1.4 - Last modified: Tue Jul 3 12:09:42 BST 2001