Next: Unsupervised template learning Up: UDIE: what would it Previous: UDIE: what would it

Supervised template learning

Brill-style transformation-based learning is one of the few ML methods in NLP to have been applied beyond part-of-speech tagging, the task in which virtually all ML in NLP originated. Brill's original application triggered only on POS tags; he later [8] added the possibility of lexical triggers. Since then the method has been extended successfully to, e.g., speech act determination [39], and a template learning application was designed by Vilain [54].

A fast implementation based on the compilation of Brill-style rules to deterministic automata was developed at Mitsubishi labs [51] (see also [20]). The quality of the transformation rules learned depends on factors such as:

  1. the accuracy and quantity of the training data;
  2. the types of pattern available in the transformation rules;
  3. the feature set available for use in the pattern side of the transformation rules.
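
To make these three factors concrete, the following is a minimal sketch of greedy transformation-based learning on a toy tagging task. It is an illustration under stated assumptions, not Brill's implementation or any of the cited systems: the names `Rule`, `candidate_rules` and `learn` are hypothetical, only two rule templates are used (one POS trigger on the preceding tag, one lexical trigger on the current word), and the six-word training sample is invented for the example.

```python
from typing import List, NamedTuple, Optional, Set, Tuple


class Rule(NamedTuple):
    """Hypothetical rule format: change from_tag to to_tag when the triggers hold."""
    from_tag: str
    to_tag: str
    prev_tag: Optional[str] = None  # POS trigger: tag of the preceding token
    word: Optional[str] = None      # lexical trigger: the current word


def matches(rule: Rule, words: List[str], tags: List[str], i: int) -> bool:
    if tags[i] != rule.from_tag:
        return False
    if rule.prev_tag is not None and (i == 0 or tags[i - 1] != rule.prev_tag):
        return False
    if rule.word is not None and words[i] != rule.word:
        return False
    return True


def apply_rule(rule: Rule, words: List[str], tags: List[str]) -> List[str]:
    # Triggers are evaluated against the tags as they were before the rule fired.
    return [rule.to_tag if matches(rule, words, tags, i) else t
            for i, t in enumerate(tags)]


def candidate_rules(words: List[str], tags: List[str], gold: List[str]) -> Set[Rule]:
    """Instantiate both templates at every position that is currently wrong."""
    rules: Set[Rule] = set()
    for i, (t, g) in enumerate(zip(tags, gold)):
        if t != g:
            if i > 0:
                rules.add(Rule(t, g, prev_tag=tags[i - 1]))
            rules.add(Rule(t, g, word=words[i]))
    return rules


def score(rule: Rule, words: List[str], tags: List[str], gold: List[str]) -> int:
    """Net gain: number of correct tags after applying the rule minus before."""
    after = apply_rule(rule, words, tags)
    return (sum(a == g for a, g in zip(after, gold))
            - sum(t == g for t, g in zip(tags, gold)))


def learn(words: List[str], initial: List[str], gold: List[str],
          max_rules: int = 10) -> Tuple[List[Rule], List[str]]:
    """Repeatedly pick the rule with the best net gain until none helps."""
    tags, learned = list(initial), []
    for _ in range(max_rules):
        # Sort by repr so that ties are broken deterministically.
        cands = sorted(candidate_rules(words, tags, gold), key=repr)
        if not cands:
            break
        best = max(cands, key=lambda r: score(r, words, tags, gold))
        if score(best, words, tags, gold) <= 0:
            break
        learned.append(best)
        tags = apply_rule(best, words, tags)
    return learned, tags


# Invented toy data: the initial tagger always labels "can" as a modal (MD).
words = ["the", "can", "rusts", "they", "can", "swim"]
initial = ["DT", "MD", "VBZ", "PRP", "MD", "VB"]
gold = ["DT", "NN", "VBZ", "PRP", "MD", "VB"]

rules, tagged = learn(words, initial, gold)
```

On this sample the lexical rule "retag `can` from MD to NN" repairs one error but introduces another, so the learner prefers the contextual rule "MD becomes NN after DT". The outcome depends directly on the training data, the templates offered and the features those templates may inspect, which is the point of the list above.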

The accepted wisdom of the machine learning community is that it is very hard to predict which learning algorithm will produce optimal performance, so it is advisable to experiment with a range of algorithms running on real data. There have as yet been no systematic comparisons between these initial efforts and other conventional machine learning algorithms applied to learning extraction rules for IE data structures (e.g. example-based systems such as TiMBL [23] and ILP [44]).

Such experiments should be considered as strongly interacting with the issues discussed below (section 3 on the lexicon), where we propose extensions to earlier work done by us and others [4] on unsupervised learning of the surface forms (subcategorization patterns) of a set of root template verbs. That work sought to cover the range of corpus forms under which a significant verb's NEs might appear in text. Such information might or might not be available in a given set of <document, template> pairs; it would not be, for example, if the verbs appeared in sentences only in canonical forms. Investigation is still needed into the trade-off between the corpus-intensive method and the <document, filled template> pair method when templates have not been pre-provided for a very large corpus selection (if they had been, the methodology above could subsume the subcategorization work below). In practice, it will be a matter of training sample size and richness.


Gillian Callaghan 2000-03-29