Most of my work is in the field of Natural Language
Processing, although it has also been influenced by related
areas such as Information Retrieval and Machine Learning.
Areas I have worked on recently include:
- Lexical Semantics (analysis of word meaning), including
word sense disambiguation, lexical similarity and hybrid
- Information Extraction (identification of structured knowledge from text), particularly distant supervision approaches to relation extraction.
- Biomedical and Medical Text Processing (supporting access to medical literature), including word sense disambiguation, relation extraction, literature-based discovery/data mining and contradiction identification.
- Document analysis, including identification of text
reuse/plagiarism and author identification.
Projects I have been involved with include:
- 2017-: Institute of Coding, HEFCE/Office for Students
- 2016: HypoGen: Hypothesis Generation and Visualisation from Large Corpora, DSTL
- 2013-6: Advanced Computing Research Center, HEFCE
- 2014-5: Sensemaking: Exploratory search for document collections, DSTL
- 2012-5: KDisc: Language Processing for Literature Based Discovery in Medicine, EPSRC
- 2011-3: PATHS: Personalised access to cultural heritage, EU
- 2011-2: Scaling-up WSD for the Life Sciences, EPSRC Knowledge Transfer Account
- 2011-4: RE-COST: REducing the Cost of Oracles for Software Testing, EPSRC
- 2007-10: Lexical Disambiguation for the Biomedical Domain, EPSRC
- 2006-11: CASTLE: Providing Lexical Adaptation Techniques for Language Processing, EPSRC
- 2004-7: RESULT: Relation Extraction with Semi-sUpervised Learning Techniques, EPSRC