Yorick Wilks and Roberta Catizone
The University of Sheffield, Department of Computer Science
Regent Court, 211 Portobello Street, Sheffield, UK
In this paper we discuss some methods that have been tried to ease this problem, and to create something more rapid than the bench-mark one-month figure, which was roughly what ARPA teams in IE needed to adapt an existing system by hand to a new domain of corpora and templates. An important distinction in discussing the issue is the degree to which a user can be assumed to know what is wanted, to have pre-existing templates ready to hand, as opposed to a user who has a vague idea of what is needed from a corpus.
We shall discuss attempts to derive templates directly from corpora; to derive knowledge structures and lexicons directly from corpora, including discussion of the recent LE project ECRAN which attempted to tune existing lexicons to new corpora. An important issue is how far established methods in Information Retrieval of tuning to a user's needs with feedback at an interface can be transferred to IE.