
Brill Tagger [4]:

a rule-based part-of-speech tagger that has been extensively trained on the Penn TreeBank corpus of manually tagged Wall Street Journal texts [22]. It therefore uses the 48 part-of-speech tags that make up the Penn TreeBank tag set.

The original tagger has been modified slightly for use in MUC-6. The modifications include the introduction of new tags for dates, SGML markup, and punctuation symbols, and the addition of several new lexical and contextual rules to the original rule base.
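To make the role of lexical and contextual rules concrete, the following is a minimal Python sketch of the general transformation-based approach behind a Brill-style tagger: each word first receives its most likely tag from a lexicon, and contextual rules then rewrite tags according to the preceding tag. The lexicon entries, the two rules, and the function names are hypothetical and chosen only for illustration; they are not taken from the LaSIE rule base, and the real tagger supports a much richer set of rule templates.

```python
from typing import Dict, List, Tuple

# Hypothetical lexicon: the most likely Penn TreeBank tag for each known word.
LEXICON: Dict[str, str] = {
    "the": "DT",
    "can": "MD",
    "rusted": "VBD",
    "to": "TO",
    "race": "NN",
}

# Hypothetical contextual rules of the form (from_tag, to_tag, previous_tag):
# "change from_tag to to_tag when the preceding word is tagged previous_tag".
RULES: List[Tuple[str, str, str]] = [
    ("MD", "NN", "DT"),  # modal after a determiner is really a noun
    ("NN", "VB", "TO"),  # noun after "to" is really a base-form verb
]


def initial_tags(words: List[str]) -> List[str]:
    """Assign each word its most likely tag; unknown words default to NN."""
    return [LEXICON.get(word.lower(), "NN") for word in words]


def apply_rules(tags: List[str], rules: List[Tuple[str, str, str]]) -> List[str]:
    """Apply each contextual rule in order, scanning the sentence left to right."""
    tags = list(tags)  # work on a copy
    for from_tag, to_tag, previous_tag in rules:
        for i in range(1, len(tags)):
            if tags[i] == from_tag and tags[i - 1] == previous_tag:
                tags[i] = to_tag
    return tags


def tag(words: List[str]) -> List[Tuple[str, str]]:
    """Tag a tokenised sentence: initial lexicon lookup, then contextual rules."""
    return list(zip(words, apply_rules(initial_tags(words), RULES)))


if __name__ == "__main__":
    print(tag(["The", "can", "rusted"]))
    print(tag(["to", "race"]))
```

In this sketch "can" is first tagged MD from the lexicon and then corrected to NN because it follows a determiner. Extending a rule base, as described above, amounts to appending further entries to the lexicon and the rule list.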

The SemanTag Tagger module is an alternative implementation of the tagging method used by the Brill tagger (http://www.rt66.com/gcooke/SemanTag/). Despite its name, it currently carries out only part-of-speech tagging.



Gillian Callaghan 2000-03-29