The CTK Development Page
Notes for HMM Decoders CTK v1.1.0
The decoders in CTKv1.1.xx are a little more sophisticated than those in CTKv1.0.xx, and as a result a few backward-incompatible changes have had to be made to the block parameters. Specifically, LABELS has been replaced by LABEL_FILE, the format of HMM_FILE has changed, and the USE_OZ parameter has been superseded by the more general parameter HYPOTHESIS_FILTER.
These changes are described in the notes that follow.
HMM_FILE
The HMM_FILE parameter specifies the name of a file that associates HMM definitions with HMM NAMEs. This file can have one of two possible formats, depending on whether the HMMs are stored in a single file or are stored separately:
- Single file: HMM_FILE refers to an HTK MMF (Multiple Model File). This is a single file containing the definitions of all the HMMs. For each HMM there is a quoted NAME.
- Separate files: HMM_FILE refers to a file list, i.e. a file listing the names of the individual files that each define one HMM in HTK format. Each line of the list consists of the HMM file name, optionally followed by the NAME to be assigned to the HMM. If no NAME is supplied, the HMM NAME is taken to be the HMM's HTK file name without the path. For example, the HMM_FILE may contain lines like the following:
/home/jon/hmms/one.3mix one
/home/jon/hmms/two.3mix two
/home/jon/hmms/three.3mix three
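The naming rule for the file-list format can be illustrated with a short Python sketch. This is not CTK code: the helper name parse_hmm_list is hypothetical, and the sketch assumes that fields are separated by whitespace and that file paths contain no spaces. It rejects duplicate NAMEs because each HMM must have a unique NAME.

```python
import os

def parse_hmm_list(lines):
    """Map each HMM NAME to its definition file (hypothetical helper, not part of CTK)."""
    hmms = {}
    for line in lines:
        fields = line.split()  # assumption: whitespace-separated fields
        if not fields:
            continue  # skip blank lines
        path = fields[0]
        # Use the explicit NAME if one is given, else the file name without its path.
        name = fields[1] if len(fields) > 1 else os.path.basename(path)
        if name in hmms:
            raise ValueError(f"duplicate HMM NAME: {name}")
        hmms[name] = path
    return hmms

example = [
    "/home/jon/hmms/one.3mix one",
    "/home/jon/hmms/two.3mix",  # no NAME given, so the NAME becomes "two.3mix"
]
print(parse_hmm_list(example))
```

Note that when the NAME is omitted, the default keeps the file extension, since only the path is stripped.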
LABEL_FILE
This parameter specifies the name of a file which associates HMM NAMEs with HMM LABELs. Whereas each HMM must have a unique NAME, several HMMs can share the same LABEL. For example, there may be both a male and a female version of the digit one, with NAMEs "one_m" and "one_f", both having the LABEL "1".
Each line of the file defines a separate LABEL. The LABEL occurs as the first character on the line and is followed by the NAME of each HMM that shares this LABEL. For example:
1 one_m one_f
2 two_m two_f
S sil sp
etc.
This parameter supersedes the "LABELS" parameter that was used in CTKv1.0.xx.
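As an illustration of the LABEL_FILE layout, the following Python sketch builds a NAME-to-LABEL lookup from lines in that format. It is not CTK code: parse_label_file is a hypothetical helper, and it assumes the LABEL and the NAMEs are separated by whitespace, treating the first field on each line as the LABEL.

```python
def parse_label_file(lines):
    """Return a dict mapping each HMM NAME to its LABEL (hypothetical helper, not part of CTK)."""
    name_to_label = {}
    for line in lines:
        fields = line.split()  # assumption: whitespace-separated fields
        if len(fields) < 2:
            continue  # skip blank lines and lines that list no HMM NAMEs
        label, names = fields[0], fields[1:]
        for name in names:
            name_to_label[name] = label
    return name_to_label

labels = parse_label_file(["1 one_m one_f", "2 two_m two_f", "S sil sp"])
print(labels["one_f"], labels["sp"])
```

Mapping from NAME to LABEL, rather than the reverse, reflects how the information would typically be used: a decoded sequence of HMM NAMEs is converted into the LABELs to be output.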
GRAMMAR_FILE
This parameter specifies the name of a file containing the grammar to be applied to the set of models.
The GRAMMAR_FILE specifies the grammar in terms of the NAMEs of the individual HMMs. The format is the same as that used in version 1.x of HTK. For more details see here.
If no GRAMMAR_FILE is specified, then all the models are placed in a simple loop grammar, i.e. any model can follow any other model.
HYPOTHESIS_FILTER
This is a string parameter that takes the form of a regular expression. If supplied, this expression is matched against the winning hypothesis; if it matches, the hypothesis is rejected in favour of the next-best hypothesis that does not match the filter. (If no suitable result can be found in the 50-best list, the output reverts to the overall best hypothesis regardless of the filter.)
For example, with a digit recognition task like AURORA, if HYPOTHESIS_FILTER="11" then the decoder will try to reject recognition hypotheses containing the substring "11".
The USE_OZ parameter no longer exists; its effect can be achieved by setting HYPOTHESIS_FILTER="(O.*Z)|(Z.*O)".
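The filtering behaviour described in this section can be sketched in Python as follows. This is an illustration only, not the CTK implementation: apply_hypothesis_filter is a hypothetical helper, hypotheses are assumed to be strings of LABELs ordered best first, and Python's re module is assumed to be close enough to the decoder's regular expression dialect for these simple patterns.

```python
import re

def apply_hypothesis_filter(nbest, pattern, max_candidates=50):
    """Return the best hypothesis that does not match `pattern`.

    `nbest` is a list of hypothesis strings ordered best first
    (hypothetical representation, not the CTK data structure).
    """
    if not nbest:
        raise ValueError("empty N-best list")
    if not pattern:
        return nbest[0]  # no filter supplied
    regex = re.compile(pattern)
    for hyp in nbest[:max_candidates]:
        # re.search looks for a match anywhere in the string.
        if regex.search(hyp) is None:
            return hyp
    return nbest[0]  # nothing suitable: revert to the overall best hypothesis

print(apply_hypothesis_filter(["311", "31", "3"], "11"))
print(apply_hypothesis_filter(["1O2Z", "1O2", "12"], "(O.*Z)|(Z.*O)"))
```

In the first call the top hypothesis contains "11" and is skipped in favour of "31". In the second, the USE_OZ replacement pattern rejects "1O2Z" because it contains both O and Z, so "1O2" is returned.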