Most of my work is in the field of Natural Language
Processing, although it has also been influenced by related
areas such as Information Retrieval.
Areas I have worked on recently include:
- Lexical Semantics (analysis of word meaning), including
word sense disambiguation, lexical similarity and hybrid
- Information Extraction (identification of structured knowledge from text), particularly distant supervision approaches to relation extraction.
- Biomedical and Medical Text Processing (supporting access to medical literature), including word sense disambiguation, relation extraction, literature-based discovery/data mining and contradiction identification.
- Document analysis, including identification of text
reuse/plagiarism and author identification.
Previous Members and Visitors
Muhammad Adeel (PhD thesis: Mono-lingual Paraphrased Text Reuse and Plagiarism Detection)
Aletras (PhD thesis: Interpreting Document Collections using Topic Models)
Abdulaziz Alamri (PhD thesis: The Detection of Contradictory Claims in Biomedical Abstracts)
Rachel Cotterill (PhD thesis: Identifying Stylometric Correlates of Social Power)
Samuel Fernando (PhD thesis: Enriching Lexical Knowledge Bases with Encyclopedia Relations)
Aitor Gonzalez Agirre
Roland Roller (Thesis: Detection Biomedical Relations using Distant Supervision)
Lucia Specia (Thesis: Uma abordagem hibrida relacional para a desambiguacao lexical de sentido na traducao automatica)
Kumutha Swampillai (Thesis: Information Extraction Across Sentences)
Projects I have been involved with include:
- 2017-: Institute of Coding, HEFCE
- 2016: HypoGen: Hypothesis Generation and Visualisation from Large Corpora, DSTL
- 2013-6: Advanced Computing Research Center, HEFCE
- 2014-5: Sensemaking: Exploratory search for document collections, DSTL
- 2012-5: KDisc: Language Processing for Literature Based Discovery in Medicine, EPSRC
- 2011-3: PATHS: Personalised access to cultural heritage, EU
- 2011-2: Scaling-up WSD for the Life Sciences, EPSRC Knowledge Transfer Account
- 2011-4: RE-COST: REducing the Cost of Oracles for Software Testing, EPSRC
- 2007-10: Lexical Disambiguation for the Biomedical Domain, EPSRC
- 2006-11: CASTLE: Providing Lexical Adaptation Techniques for Language Processing, EPSRC
- 2004-7: RESULT: Relation Extraction with Semi-sUpervised Learning Techniques, EPSRC