Sentence Splitter: based on the sentence-splitting
algorithm used in the Sussex MUC-5 system, POETIC
[13], the module identifies the start and end byte
offsets of each sentence, making use of SGML sentence markup if present.
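
As a rough illustration of what "sentence start and end byte offsets" means in practice, the following Python sketch returns a list of (start, end) byte spans for a document. It is not the POETIC algorithm and not the LaSIE implementation: the function name split_sentences, the assumption that SGML sentence markup takes the form of <s>...</s> elements, and the naive punctuation rule used when no markup is found are all hypothetical choices made for this example.

```python
import re
from typing import List, Tuple

# Assumed form of SGML sentence markup: <s ...> ... </s>.
# The source does not name the tag, so this is a guess for illustration.
SGML_SENTENCE = re.compile(rb"<s\b[^>]*>(.*?)</s>", re.IGNORECASE | re.DOTALL)

# Naive fallback rule (an assumption, not the POETIC algorithm):
# a run of . ! or ? ends a sentence when it is followed by whitespace
# and an upper-case letter, or by the end of the text.
BOUNDARY = re.compile(rb"[.!?]+(?=\s+[A-Z]|\s*$)")


def _skip_space(data: bytes, pos: int, limit: int) -> int:
    """Advance pos past whitespace bytes, stopping at limit."""
    while pos < limit and data[pos:pos + 1].isspace():
        pos += 1
    return pos


def split_sentences(data: bytes) -> List[Tuple[int, int]]:
    """Return (start, end) byte offsets of each sentence in data."""
    # Prefer existing sentence markup if any is present.
    marked = [m.span(1) for m in SGML_SENTENCE.finditer(data)]
    if marked:
        return marked

    spans: List[Tuple[int, int]] = []
    start = 0
    for m in BOUNDARY.finditer(data):
        end = m.end()
        start = _skip_space(data, start, end)
        if start < end:
            spans.append((start, end))
        start = end

    # Keep any trailing text that lacks closing punctuation.
    tail_end = len(data.rstrip())
    start = _skip_space(data, start, tail_end)
    if start < tail_end:
        spans.append((start, tail_end))
    return spans


if __name__ == "__main__":
    text = b"The cat sat. It purred!  Then it left"
    for a, b in split_sentences(text):
        print(a, b, text[a:b])
```

Working on bytes rather than decoded strings keeps the offsets meaningful as byte positions in the original file, which is what the module is described as producing. A real splitter would also need to handle abbreviations, numbers and quotations, which this sketch ignores.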
Gillian Callaghan
2000-03-29