HMMDecoderMultisource

[Note, this is an experimental decoder that is still under development.]

The `multisource' decoder accepts a 1/0 missing data mask and a mask of integer labels that define grouped regions (`fragments') in the 1/0 mask. Each fragment is assumed to be a region of the representation that is due to a single source. Rather than using a single fixed present data mask, the decoder attempts to test every mask that can be generated from a subset of the mask fragments. As there is potentially a very large number of mask hypotheses, a limited search is employed. This search proceeds by generating new complementary hypotheses when a fragment starts and merging partial complementary hypotheses every time a fragment ends. The algorithm is described in more detail in:

J. Barker, M.P. Cooke and D.P.W. Ellis (2000). Decoding speech in the presence of other sound sources, Proc. ICSLP-00, Beijing, China.
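
The hypothesis space described above can be illustrated with a small Python sketch. It is not the decoder's implementation: the function names, the list-of-rows layout (one row per time frame), the use of label 0 for "no fragment", and the assumption that every present cell carries a non-zero label are all chosen here for illustration only. The first function enumerates every mask that a subset of fragments can produce; the second counts how many parallel hypotheses would be alive at each frame if hypotheses are doubled when a fragment starts and merged when it ends.

```python
from itertools import combinations

def candidate_masks(mask, labels):
    """Yield (kept_fragments, hypothesis_mask) for every subset of fragments.

    ASSUMPTION: label 0 means "no fragment"; a cell is present in a
    hypothesis only if it is 1 in the mask and its fragment is kept.
    """
    fragment_ids = sorted({l for row in labels for l in row if l != 0})
    for k in range(len(fragment_ids) + 1):
        for kept in combinations(fragment_ids, k):
            kept_set = set(kept)
            hyp = [[1 if m == 1 and l in kept_set else 0
                    for m, l in zip(mrow, lrow)]
                   for mrow, lrow in zip(mask, labels)]
            yield kept, hyp

def live_hypotheses_per_frame(labels):
    """Count parallel mask hypotheses per frame: 2 ** (fragments spanning it).

    ASSUMPTION: a fragment is active from the first to the last frame
    in which its label appears.
    """
    spans = {}
    for t, row in enumerate(labels):
        for l in row:
            if l != 0:
                first, last = spans.get(l, (t, t))
                spans[l] = (min(first, t), max(last, t))
    return [2 ** sum(first <= t <= last for first, last in spans.values())
            for t in range(len(labels))]

# Hypothetical example: 4 frames, 3 channels, 3 fragments.
mask   = [[1, 0, 0], [1, 1, 0], [0, 1, 0], [0, 1, 1]]
labels = [[1, 0, 0], [1, 2, 0], [0, 2, 0], [0, 2, 3]]

all_hyps = list(candidate_masks(mask, labels))
print(len(all_hyps))                      # 8 masks in total (2 ** 3)
print(live_hypotheses_per_frame(labels))  # [2, 4, 2, 4]
```

In this toy case the exhaustive enumeration yields 8 masks, while at most 4 hypotheses are alive in any one frame. This is one reading of why splitting at fragment onsets and merging at fragment offsets keeps the search limited.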

The multisource decoder has five inputs: the data, the mask, the group labellings, and the lower and upper missing data bounds.

The parameters are similar to those of HMMDecoderMD but with the following additions:

DISPLAY_GROUPS
A boolean parameter that, if set to TRUE, causes the mask that was used for the winning hypothesis to be displayed.
MD_WEIGHT
A parameter that controls the ratio of present/missing data in the mask. The exact interpretation of this parameter depends on the normalisation technique the decoder employs. The normalisation technique is set using the NORMALISE_MODE parameter.
NORMALISE_MODE
The current implementation can employ two different techniques for balancing the amount of present data against the amount of missing data. The user can select between these by setting the NORMALISE_MODE parameter to either MODE_1 or MODE_2:
- MODE_1
  When operating in MODE_1 all missing data probabilities are scaled by the parameter MD_WEIGHT. Present data probabilities are not altered. So, with a higher value of MD_WEIGHT the decoding will be biased towards mask hypotheses in which more data is missing, and with a lower value it will favour hypotheses in which less data is missing.
- MODE_2
  In this mode, individual present and missing data probabilities are employed unscaled, but entire mask hypothesis probabilities are adjusted at the point where a fragment ends and complementary hypotheses are compared. The hypothesis score adjustment is based on the size of the fragment that is under consideration. The score for the mask hypothesis in which the fragment is considered to be present remains unaltered, and the score for the hypothesis in which the fragment has been dropped is reduced. The reduction in the log probability score is computed as the size of the fragment multiplied by the MD_WEIGHT parameter. Hence, a large value of MD_WEIGHT implies a larger cost for labelling fragments as missing.
MD_WEIGHT has very different interpretations in modes 1 and 2. Note, a larger MD_WEIGHT biases towards more missing data in MODE_1 but towards less missing data in MODE_2.
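
The two modes can be contrasted with a minimal Python sketch. The function names and argument layout are hypothetical and are not part of the toolkit. The MODE_2 arithmetic follows the description above directly. For MODE_1 the text does not say whether the scaling is applied in the linear or the log domain; the sketch assumes MD_WEIGHT is added to each missing-data log probability, an assumption chosen only because the default of 0.0 then leaves scores unchanged.

```python
def mode_1_score(present_logps, missing_logps, md_weight):
    """MODE_1: scale every missing-data term; leave present-data terms alone.

    ASSUMPTION: "scaled by MD_WEIGHT" is modelled as adding md_weight to
    each missing-data log probability.
    """
    return sum(present_logps) + sum(lp + md_weight for lp in missing_logps)

def mode_2_compare(score_present, score_dropped, fragment_size, md_weight):
    """MODE_2: when a fragment ends, penalise the hypothesis that dropped it.

    The dropped-fragment log score is reduced by fragment_size * md_weight;
    the present-fragment score is left unaltered.
    """
    adjusted_dropped = score_dropped - fragment_size * md_weight
    winner = "present" if score_present >= adjusted_dropped else "dropped"
    return winner, adjusted_dropped

# MODE_1: a larger weight raises the score of hypotheses with more missing data.
print(mode_1_score([-1.0, -2.0], [-3.0], md_weight=0.5))   # -5.5

# MODE_2: a larger weight makes dropping a fragment more expensive.
print(mode_2_compare(-120.0, -118.0, fragment_size=10, md_weight=0.0))
# ('dropped', -118.0)
print(mode_2_compare(-120.0, -118.0, fragment_size=10, md_weight=0.5))
# ('present', -123.0)
```

The last two calls show the MODE_2 behaviour: with a weight of 0.0 the dropped-fragment hypothesis wins, and raising the weight to 0.5 flips the decision in favour of keeping the fragment.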

Note: the block has two outputs, but these should not be used and will be removed in future releases.

Inputs	Meaning	Sample	1-D frame	$\ge$ 2-D frame
`in1`	feature vectors		Yes
`in2`	fuzzy data mask		Yes
`in3`	group labels		Yes
`(in4)`	lower bound		Yes
`in5`	upper bound		Yes

Outputs	Meaning
out1	rubbish!
out2	rubbish!

Parameters	Type	Default	Meaning
`LOG_FILE`	String	-	Name of an optional log file
`LOG_FILE_2`	String	-	Name of additional detailed log file
`WORD_PENALTY`	Float	0.0	The creation penalty
`HMM_FILE`	String	-	Name of the HMM file list
`GRAMMAR_FILE`	String	-	File storing the grammar
`LABEL_FILE`	String	-	File storing HMM NAME-> HMM LABEL mapping
`FIRST_TOKEN`	String	-	Label of a fixed first token
`FINAL_TOKEN`	String	-	Label of a fixed final token
`TRANSCRIPTION`	String	-	The correct transcription
`SILENCE`	String	""	The silence label(s)
`MAX_APPROX`	Boolean	False	Use max mixture approximation
`NBEST`	Int	1	Return best n hypotheses
`STATE_PATH`	Boolean	False	Record HMM state path
`HAS_DELTAS`	Boolean	0	Models have delta parameters
`USE_DELTAS`	Boolean	-	Models have delta parameters
`HYPOTHESIS FILTER`	String	""	Regular expression for filtering hypotheses
`OUTPUT_CONFUSIONS`	Boolean	0	Output confusion matrix
`DUMP_PARAMETERS`	Boolean	0	Write parameters to log file
`DISPLAY_GROUPS`	Boolean	0	Display mask backtrace (requires MATLAB)
`USE_DELTA_BOUNDS`	Boolean	False	Use bounded marginalisation (delta features)
`MD_WEIGHT`	Float	0.0	Balances amount of present/missing data
`NORMALISE_MODE`	{MODE_1, MODE_2}	MODE_1	The technique used to balance the amount of present/missing data

Documentation for CTKv1.1.4 - Last modified: Tue Jul 3 13:02:32 BST 2001