Rob Gaizauskas : MSc Projects 2016-17

Email: Rob Gaizauskas

Project Titles:

RJG-MSc-1: Automatic Face Detection in Images
RJG-MSc-2: Clustering and Cluster Labelling of Reader Comments in On-Line News
RJG-MSc-3: Gathering Visually Descriptive Language From Corpora
RJG-MSc-4: Size Matters: Acquiring Vague Spatial Size Information from Textual Sources
RJG-MSc-5: Machine Learning Approaches to Coreference Resolution in Conversational Data
RJG-MSc-6: Mobile Robot Navigation
RJG-MSc-7: Temporal Information Extraction from Clinical Text
RJG-MSc-8: Building an Information Seeking Dialogue Agent



RJG-MSc-1:   Automatic Face Detection in Images

Suitable for: ACS/ASE

Background

Automatic face detection algorithms are now to be found in many digital cameras, a striking application of computer vision research that has been realised largely in the past 10 years. Face detection and identification has a broad set of uses ranging from helping to organize personal photo collections to applications in security and criminal investigation.

While the algorithms that have been developed take advantage of specific aspects of the task (e.g. the nature of human faces), the task, and many aspects of the algorithms developed to address it, are examples of the wider class of computer vision problems/approaches referred to as object detection. Thus, face detection is a good problem to start with in learning basic computer vision techniques for object detection. It is also a good choice of problem because there are various face detection challenge datasets available that allow one to compare one's own approach with others on standard datasets for which the performance of a wide range of approaches is known. For example, the NIST Face Recognition Challenges and Evaluation datasets, the 300 Faces-in-the-Wild Challenge dataset or the Facial Keypoints Detection dataset are candidate datasets for consideration in the project.

Project Description

This project will begin by reviewing existing approaches to face detection in the computer vision literature. Following this review, one or more approaches to face detection will be chosen and implemented, building on existing computer vision resources where it is sensible to do so. The resulting algorithm(s) will be evaluated against community standard benchmarks and refinements made to the algorithms, as far as time permits.
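
As a concrete starting point, the sketch below shows how an off-the-shelf detector can be run over an image using OpenCV's pre-trained Haar cascade face detector (the input filename is an illustrative assumption; any approach chosen for the project would then be evaluated against the benchmark datasets above):

    import cv2

    # Load OpenCV's pre-trained frontal-face Haar cascade (shipped with opencv-python)
    cascade = cv2.CascadeClassifier(
        cv2.data.haarcascades + "haarcascade_frontalface_default.xml")

    image = cv2.imread("group_photo.jpg")           # hypothetical input image
    grey = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)  # the detector works on greyscale

    # Returns a list of (x, y, w, h) bounding boxes for detected faces
    faces = cascade.detectMultiScale(grey, scaleFactor=1.1, minNeighbors=5)

    for (x, y, w, h) in faces:
        cv2.rectangle(image, (x, y), (x + w, y + h), (0, 255, 0), 2)
    cv2.imwrite("faces_detected.jpg", image)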

Prerequisites

An interest in computer vision, Python programming skills and a solid mathematical background are the only prerequisites for the project.

Initial Reading and Useful Links

Contact supervisor for further references.

RJG-MSc-2:   Clustering and Cluster Labelling of Reader Comments in On-Line News

Suitable for: ACS/ASE

Background

In a news publisher website such as The Guardian, journalists publish articles on different topics from politics and civil rights to health, sports and celebrity news. The website design supports the publication and reading of original news articles and at the same time facilitates user-involvement via reader comments.

Increasingly, in a period of disruptive change for the traditional media, newspapers see their future as lying in such conversations with and between readers, and new technologies to support these conversations are becoming essential. In this scenario there are a number of potential users: news readers and the originating journalist want to gain a structured overview of the mass of comments, in terms of the sub-topics they address and the opinions (polarity and strength) the commenters hold about these topics; editors or media analysts may need an analysis of wider scope.

At present none of these users can effectively exploit the mass of comment data -- frequently hundreds of comments per article -- as there are no tools to support them in doing so. What they need is new tools to help them make sense of the large volumes of comment data. Of particular potential here are NLP technologies such as: automatic summarisation, topic detection and clustering, sentiment detection and conversation/dialogue analysis.

The SENSEI Project is a recently completed EU-funded research project in which Sheffield was a partner. SENSEI's goal was to develop new technologies to assist in making sense of large volumes of conversational data, such as reader comments in on-line news, as described above. To do this we developed tools to group reader comments into topically related clusters and to generate relevant labels for these clusters so that a graphical summary in the form of a pie chart could be created. Using reference data created by the SENSEI project, this student project will explore techniques to improve the clustering and cluster labelling approaches developed in SENSEI.

Project Description

This project will begin by reviewing the language processing literature in the areas of text clustering and cluster labelling. One or more techniques for clustering and cluster labelling will be selected for exploration.

The second part of the project will involve the detailed design, implementation and evaluation of the methods selected in the first part of the project.
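
By way of illustration, one candidate baseline is to represent comments as TF-IDF vectors, cluster them with k-means and label each cluster with the terms nearest its centroid. A minimal sketch using scikit-learn (the four comments and the choice of two clusters are toy assumptions):

    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.cluster import KMeans

    comments = ["The referee ruined the match", "Terrible refereeing decision",
                "Ticket prices are far too high", "Fans can't afford these prices"]

    vectorizer = TfidfVectorizer(stop_words="english")
    X = vectorizer.fit_transform(comments)

    km = KMeans(n_clusters=2, random_state=0).fit(X)

    # Label each cluster with the terms weighted most heavily in its centroid
    terms = vectorizer.get_feature_names_out()
    for i, centroid in enumerate(km.cluster_centers_):
        top = centroid.argsort()[::-1][:3]
        print(i, [terms[j] for j in top])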

Prerequisites

Interest in language and/or text processing. No mandatory module prerequisites, but any or all of Machine Learning, Text Processing, Natural Language Processing are useful.

Initial Reading and Useful Links



RJG-MSc-3:   Gathering Visually Descriptive Language From Corpora

Suitable for: ACS/ASE

Background

An area of recent growing academic and commercial interest has been the automatic generation of image descriptions, i.e. generating natural language descriptions of what is going on in an image. Being able to do this would be of benefit for applications like image retrieval (images could be indexed with potentially rich natural language descriptions and then retrieved using standard search engines) or in robotics, where a robot could describe what it is seeing to a human.

While computer vision analysis of image content is a key part of generating adequate natural language descriptions of images, many approaches to image description generation require models of what people say about visual scenes, what sorts of objects tend to co-occur in specific scene types, etc. This kind of information needs to be gathered from text. To date most researchers seeking visually descriptive language have worked with caption datasets, but this data is limited in several ways. First, captions may not describe the literal contents of the image (e.g. a caption like "Another bad day for our hero" says little about what is going on in the image). Secondly, captions may include information not in the picture ("Taken with my new Nikon D7200 at f3.6 on day 2 of the holiday"). Thirdly, there is a relatively limited amount of caption data available.

One way to address these problems is to seek to gather visually descriptive data from sources other than image captions. Novelists, and to some extent journalists or travel writers, often describe scenes for us. Can we gather such data automatically? To do this one could formulate a clear definition of visually descriptive language (VDL), then annotate a corpus according to this definition so that examples of visually descriptive language are marked up, and finally use supervised learning on the annotated corpus to classify text sequences as visually descriptive or not. Fortunately the first two steps of this process have been done by researchers in Computer Science at Sheffield (see reading below). This project aims to address the third task.

Project Description

After reviewing relevant background literature, this project will begin by gathering a corpus of texts potentially rich in visually descriptive language, e.g. part or all of the literary texts held in Project Gutenberg. Then, a text classifier will be trained on the existing corpus of VDL-annotated texts: various machine learning algorithms and feature sets will be explored to develop the most accurate classifier possible. Once the classifier is developed it will be run across the corpus of texts acquired in the first stage of the project. If time permits, potential applications of the automatically acquired VDL will be considered.
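
For instance, the classification step might be framed as a standard supervised text classification pipeline. A minimal sketch using scikit-learn (the two training sentences and their labels are invented stand-ins for the VDL-annotated corpus):

    from sklearn.pipeline import make_pipeline
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.linear_model import LogisticRegression

    # Placeholder training data standing in for the VDL-annotated corpus
    sentences = ["The red cliffs rose sheer above a turquoise bay.",
                 "He wondered whether she would ever reply."]
    labels = [1, 0]  # 1 = visually descriptive, 0 = not

    clf = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)),
                        LogisticRegression())
    clf.fit(sentences, labels)

    # Run the trained classifier over new text from the gathered corpus
    print(clf.predict(["A narrow lane wound between whitewashed houses."]))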

Prerequisites

Interest in language and/or text processing. No mandatory module prerequisites, but any or all of Machine Learning, Text Processing, Natural Language Processing are useful. The project will likely be carried out in Python using scikit-learn, so knowledge of Python would be advantageous.

Initial Reading and Useful Links



RJG-MSc-4:   Size Matters: Acquiring Vague Spatial Size Information from Textual Sources

Suitable for: ACS/ASE

Background

How big is an adult elephant? Or a toaster? We may not know exactly, but we have some ideas: an adult elephant is more than a meter tall and less than 10 meters; a toaster is more than 10cm tall and less than a meter. In particular we have lots of knowledge about the relative sizes of objects: a toothbrush is smaller than a toaster; a toaster is smaller than a refrigerator.

We use this sort of "commonsense" knowledge all the time in reasoning about the world: we can bring a toaster home in our car, but not an elephant. An autonomous robot moving about in the world and interacting with objects would need lots of this kind of knowledge. Moreover, we appear to use knowledge about typical object sizes, along with other knowledge, in interpreting visual scenes, especially in 2D images, where depth must be inferred by the observer. For example, if, when viewing an image, our knowledge about the relative sizes of cars and elephants is violated under the assumption that they are in the same depth plane, then we re-interpret the image so that the car or elephant moves forward or backward relative to the other, so that the assumption is no longer violated. Thus, information about relative or absolute object size is useful in computer image interpretation. It is also useful in image generation: if I want to generate a realistic scene containing cars and elephants then I must respect their relative size constraints. Various computer graphics applications could exploit such information.

Manually coding this kind of information is a tedious and potentially never-ending task, as new types of objects are constantly appearing. Is there a way of automating the acquisition of such information? The hypothesis of this project is that there is: we can mine this information from textual sources on the web that make claims about the absolute or relative sizes of objects.

Project Description

The project will explore ways of mining information about the absolute and relative size of objects from web sources, such as Wikipedia. The general task of acquiring structured information from free or unstructured text is called text mining or information extraction and is a well-studied application in the broader area of Natural Language Processing (NLP).

The project will begin by reviewing the general area of information extraction with NLP, with specific attention to tasks like named entity extraction (which has been used, e.g., to identify entities like persons, locations and organisations as well as times, dates and monetary amounts, and could be adapted to identify object types and lengths) and relation extraction (which has been used to recognise relations between entities, such as works-for, and attributes of entities, such as has-location, and could be adapted to recognise the size-of attribute).

Information extraction systems in general are either rule-based, where the system relies on manually-authored patterns to recognise entities and relations, or supervised learning-based, where the system relies on learning patterns from manually annotated examples. Following initial reading and analysis of the data, a decision will be made about which approach to take.
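
To illustrate the rule-based option, a minimal sketch of a single hand-authored pattern for absolute size expressions of the form "<number> [to <number>] <unit>" (the unit list and pattern are deliberate simplifications, not a full grammar):

    import re

    # Simplified pattern for "<number> [to <number>] <unit>" size expressions
    SIZE = re.compile(
        r"(\d+(?:\.\d+)?)\s*(?:to\s*(\d+(?:\.\d+)?)\s*)?(mm|cm|m|in|inches|feet|ft)\b")

    text = ("Commercial growers aim to produce an apple that is "
            "7.0 to 8.3 cm in diameter, due to market preference.")

    for low, high, unit in SIZE.findall(text):
        print(low, high or low, unit)   # -> 7.0 8.3 cm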

In addition to identifying absolute size information (e.g. the Wikipedia article on "apple" says: "Commercial growers aim to produce an apple that is 7.0 to 8.3 cm (2.75 to 3.25 in) in diameter, due to market preference.") the project will also investigate how to acquire relative size information. For example, sentences indicating the topological relation inside ("You can help apples keep fresh longer by storing them in your pantry or refrigerator drawer") allow us to extract the information that apples are smaller than refrigerators. By building up a set of facts expressing relative sizes, together with absolute size information, we can infer upper and lower bounds on the size of various objects. Of course, differing, even conflicting, information may be acquired from different sources. Some means of merging/reconciling this information will need to be devised.

The final element of the project will be to couple the acquired facts to a simple inference engine that will allow us to infer "best estimate" bounds on the size of objects from whatever facts we have acquired. For example, if all we know is that apples fit inside refrigerators and that refrigerators are no more than 3m tall, then we need to be able to infer that apples are less than 3m tall.
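
A minimal sketch of this kind of bound propagation, iterating over "fits inside" facts until no bound changes (the facts are toy assumptions, and a real system would also need the conflict handling noted above):

    # Known upper bounds in metres, and "x fits inside y" facts
    upper = {"refrigerator": 3.0}
    inside = [("apple", "refrigerator"), ("jar", "refrigerator")]

    # Propagate bounds to a fixed point: if x fits inside y,
    # then x's upper bound can be no larger than y's
    changed = True
    while changed:
        changed = False
        for small, big in inside:
            if big in upper and upper.get(small, float("inf")) > upper[big]:
                upper[small] = upper[big]
                changed = True

    print(upper)  # {'refrigerator': 3.0, 'apple': 3.0, 'jar': 3.0}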

Of course, in addition to implementing the above capabilities the project will need to investigate ways of evaluating the various outputs. For example, the accuracy of identifying size information ("15 cm", "5-6 feet"), size-of information ( size-of(apple,7.0-8.3cm)) and relative size information ("apple <= refrigerator") needs to be assessed, as does the correctness of the inference procedure.

Prerequisites

Interest in language and/or text processing and in image analysis. No mandatory module prerequisites, but any or all of Machine Learning, Text Processing and Natural Language Processing are useful.

Initial Reading and Useful Links



RJG-MSc-5:   Machine Learning Approaches to Coreference Resolution in Conversational Data

Suitable for: ACS/ASE

Background

Anaphora or anaphors are natural language expressions which "carry one back" to something introduced earlier in a discourse or conversation. For example, a news article might begin by introducing David Cameron and later refer to him as "the Prime Minister" or "he". In this example, "the Prime Minister" and "he" are anaphors and "David Cameron" is called the antecedent. Anaphors are said to corefer with their antecedents, since they refer to the same things in the "real" (i.e. non-textual) world. Determining which expressions corefer, the problem of coreference resolution, is an important part of understanding the meaning of a text. Easy for humans, it is a significant challenge for computers.

The task of building coreference resolution algorithms has been studied in Natural Language Processing (NLP) for some time. To develop systems and assess their performance the coreference community has built annotated training and test datasets that contain manually annotated coreference links between coreferring phrases. The best known recent datasets are those produced for the CoNLL 2011 and CoNLL 2012 Shared Task Challenges, which were devoted to coreference resolution. Not only do such datasets allow competing algorithms to be compared by comparing their scores on the same data, they also permit the development of algorithms based on supervised machine learning techniques that learn from annotated examples.

Different text genres (e.g. news articles, fiction books, academic journal papers, transcribed TV programs) exhibit different linguistic features, and in particular different patterns of coreference. While the CoNLL 2011 dataset contains a mix of genres, including some conversational data (transcribed telephone conversations), it does not contain any data drawn from the sort of social media conversations one finds in, e.g., reader comments on on-line news (see the BBC, The Guardian or The Independent websites for examples). This genre of text is being studied in the SENSEI Project, an EU-funded research project in which Sheffield is a partner. One of SENSEI's aims is to produce summaries of reader comment sets consisting of potentially hundreds of comments. Detecting coreference within this data is likely to be critical to generating good summaries.

Project Description

This project has two clear objectives. The first is to develop and assess a general supervised learning-based coreference resolution algorithm using the CoNLL 2011 and 2012 Shared Task Challenges and datasets as a basis. The second is to explore how well the developed algorithm performs on reader comment data. Some data for evaluating coreference in reader comments is being developed by one of the SENSEI partners and should be available for use by the project. Some additional data may need to be annotated for evaluation.

The project will begin by reviewing background and related work. Several freely available coreference systems, specifically the Stanford and BART coreference systems, will be investigated and run on the CoNLL data. A system design, based on a supervised learning approach and possibly an extension of one of the freely available systems, will be selected and a system developed and evaluated on the CoNLL data. The system will then be tested on the SENSEI reader comment data, the results analysed and improvements proposed and tested.
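
To make the supervised set-up concrete, a common framing is the mention-pair model: each (candidate antecedent, anaphor) pair is mapped to a feature vector and a binary coreferent/not-coreferent classifier is trained. A minimal sketch (the three features and the toy pairs are illustrative assumptions, not the feature set of any of the systems above):

    from sklearn.linear_model import LogisticRegression

    def features(antecedent, anaphor, distance):
        """Simple mention-pair features: string match, pronoun, sentence distance."""
        return [int(antecedent.lower() == anaphor.lower()),
                int(anaphor.lower() in {"he", "she", "it", "they"}),
                distance]

    # Toy training pairs: (antecedent, anaphor, sentence distance, coreferent?)
    pairs = [("David Cameron", "he", 1, 1),
             ("David Cameron", "the Prime Minister", 2, 1),
             ("the economy", "he", 1, 0)]

    X = [features(a, b, d) for a, b, d, _ in pairs]
    y = [label for *_, label in pairs]

    clf = LogisticRegression().fit(X, y)
    print(clf.predict([features("David Cameron", "he", 2)]))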

Prerequisites

Interest in language and/or text processing. No mandatory module prerequisites, but any or all of Machine Learning, Text Processing and Natural Language Processing are useful.

Initial Reading and Useful Links



RJG-MSc-6:   Mobile Robot Navigation

Suitable for: ACS/ASE

Background

Mobile robot navigation is an intellectually challenging and practically important problem. Key sub-problems include self-localisation (estimating where the robot is from its sensor readings and a map), mapping (building a representation of the environment) and path planning (computing a route to a goal location).

Project Description

The aim of this project is to understand, implement and test the Monte Carlo localization (MCL) algorithm and perhaps to extend this to a particle filter-based approach to simultaneous localization and mapping (SLAM). In addition, a basic path planning algorithm will be implemented, such as a cell decomposition approach.

The algorithms will be implemented in Python and deployed on the NAO robots recently acquired by the University. The goal is to test the system's ability to navigate amongst obstacles (e.g. cardboard boxes) to carry out some instruction (e.g. to retrieve an object at a given location).
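
For orientation, a minimal sketch of a single MCL update cycle for a robot in a one-dimensional corridor (the map, motion noise and sensor model are toy assumptions; on the NAO these would be replaced by real odometry and sensing):

    import random

    DOORS = [2.0, 5.0, 8.0]  # toy 1-D corridor map: door positions in metres

    def sense_likelihood(x, saw_door):
        """Crude sensor model: high weight if the observation matches the map."""
        near = any(abs(x - d) < 0.5 for d in DOORS)
        return 0.9 if near == saw_door else 0.1

    particles = [random.uniform(0, 10) for _ in range(500)]  # uniform prior

    # One MCL cycle: move 1 m with noise, weight by the sensor model, resample
    particles = [x + 1.0 + random.gauss(0, 0.1) for x in particles]
    weights = [sense_likelihood(x, saw_door=True) for x in particles]
    particles = random.choices(particles, weights=weights, k=len(particles))

    estimate = sum(particles) / len(particles)
    print(f"estimated position: {estimate:.2f} m")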

At the end of the project the student should have a solid understanding of the basics of robot self-localisation and path planning.

Prerequisites

Interest in robotics and sufficient mathematical knowledge and confidence to work with relatively sophisticated probabilistic reasoning techniques (e.g. Dynamic Bayesian Nets and inexact inference over them).

Initial Reading and Useful Links



RJG-MSc-7:   Temporal Information Extraction from Clinical Text

Suitable for: ACS/ASE

Background

Automatically extracting information from clinical notes, such as patient discharge summaries, is a challenging and important task. Much information about patient conditions and treatment is available only in textual form in clinical notes. The ability to extract this automatically would enable a wide range of applications in areas such as tracking disease status, monitoring treatment outcomes and complications and discovering medication side effects (Sun et al., 2013). Clinical events (diagnosis events, scan/test events, treatment events) and their temporal ordering and duration are essential information that any information extraction system working in this domain must be able to identify. The importance of this task was recognised in the 2012 I2B2 Shared Task Challenge (Sun et al., 2013), which defined a set of tasks relevant to temporal information extraction in clinical domains and supplied a corpus of annotated discharge summaries to facilitate research in the area. Eighteen teams from around the world took part to see how well their systems could perform.

Project Description

The aim of this project is to design, implement and evaluate a temporal relation extraction system to carry out the I2B2 temporal relation extraction tasks. This will involve reviewing and assessing existing approaches, choosing and implementing an approach (likely a supervised machine learning-based approach that will be trained on the training data supplied by the I2B2 organisers), and evaluating the results using the I2B2 test data. Results can be compared against the best performances obtained in the original 2012 challenge.
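
For illustration, such systems are often framed as pair classifiers: each (event, event) or (event, time) pair is mapped to features and assigned a relation label such as BEFORE, AFTER or OVERLAP. A minimal sketch (the features, training pairs and label set are invented placeholders, not the I2B2 annotation scheme):

    from sklearn.tree import DecisionTreeClassifier

    # Placeholder features: verb tense of each event plus token distance
    def features(tense1, tense2, distance):
        tenses = {"past": 0, "present": 1, "future": 2}
        return [tenses[tense1], tenses[tense2], distance]

    X = [features("past", "present", 5),   # e.g. "was admitted ... is improving"
         features("present", "past", 3),
         features("past", "past", 2)]
    y = ["BEFORE", "AFTER", "OVERLAP"]

    clf = DecisionTreeClassifier().fit(X, y)
    print(clf.predict([features("past", "present", 4)]))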

Prerequisites

Interest in language and/or text processing. No mandatory module prerequisites, but any or all of Machine Learning, Text Processing and Natural Language Processing are useful. Python is likely to be the language used to implement the system.

Initial Reading and Useful Links



RJG-MSc-8:   Building an Information Seeking Dialogue Agent

Suitable for: ACS/ASE

Background

Given recent advances in speech recognition, there is increasing interest in voice-driven information provision systems. Siri, Cortana, Amazon's Alexa and Google's Assistant are all examples of this. The architecture of such systems typically consists of speech recognition and speech generation components at the front and back ends, with components in between to analyse the user input (the natural language understanding or NLU component), manage the dialogue between the user and the system (the dialogue manager), carry out any information seeking task required to answer a user query (the task manager) and formulate a natural language response to the user (the natural language generation or NLG component).
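
For a text-based agent the same architecture applies with the speech components removed. A minimal sketch of how the remaining components might be chained (the class names and the trivial stub logic are placeholder assumptions; each stub would become a substantive module in the project):

    class NLU:
        def parse(self, text):
            # Placeholder: a real NLU component would produce intents and slots
            return {"intent": "ask_actor", "text": text}

    class DialogueManager:
        def next_action(self, parse, history):
            history.append(parse)  # track dialogue state across turns
            return {"action": "lookup", "query": parse["text"]}

    class TaskManager:
        def execute(self, action):
            return "<answer from knowledge base>"  # stub knowledge base lookup

    class NLG:
        def realise(self, result):
            return f"I believe the answer is {result}."

    nlu, dm, tm, nlg = NLU(), DialogueManager(), TaskManager(), NLG()
    history = []
    parse = nlu.parse("Who played XX in the recent TV series YY?")
    answer = tm.execute(dm.next_action(parse, history))
    print(nlg.realise(answer))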

Any information provision system must draw on some source of information to supply answers to user queries. This could be the Web, but the Web is huge and unstructured, and any attempt to answer questions using the Web must either content itself with a search engine-like approach in which whole documents are retrieved (not suitable for speaking an answer to a user) or address the problems of question answering or information extraction from arbitrary natural language. Another possibility is to use a large, multi-domain structured or semi-structured knowledge base. One such knowledge base is Google's Knowledge Graph; another is Wikidata.

Project Description

The aim of this project is to design, implement and evaluate an information seeking dialogue agent that carries out a fluent dialogue with a user to answer user questions using Google's Knowledge Graph or Wikidata as an information source. To constrain the scope of the project, the agent will be text-based, i.e. it will not include speech recognition and generation components, though these could in principle be straightforwardly added. The system should be able to carry out naturalistic dialogues on topics covered by the Knowledge Graph (e.g. Who played XX in the recent TV series YY? ... What other TV series or movies has he starred in?). The project will begin by reviewing relevant literature on natural language front ends to databases and dialogue management and by exploring the Knowledge Graph API. Then an appropriate design will be chosen and implemented and the resulting system evaluated.
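
As an illustration of the information seeking step, a minimal sketch of querying the public Wikidata SPARQL endpoint with the requests library (Q42 is Douglas Adams and P106 is the "occupation" property; this fixed query is a simple example, not the dialogue-driven lookup the project would build):

    import requests

    # Wikidata's public SPARQL endpoint
    ENDPOINT = "https://query.wikidata.org/sparql"
    QUERY = """
    SELECT ?occLabel WHERE {
      wd:Q42 wdt:P106 ?occ .
      SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
    }
    """

    resp = requests.get(ENDPOINT, params={"query": QUERY, "format": "json"})
    for row in resp.json()["results"]["bindings"]:
        print(row["occLabel"]["value"])  # occupations, e.g. "novelist"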

Prerequisites

Interest in language and/or text processing. No mandatory module prerequisites, but any or all of Machine Learning, Text Processing and Natural Language Processing are useful. Python is likely to be the language used to implement the system.

Initial Reading and Useful Links



Last modified Nov 20 2016 by Rob Gaizauskas