Rob Gaizauskas : 3rd Year Projects 2016-17

Email: Rob Gaizauskas

Project Titles:

RJG-UG-1: Automatic Face Detection in Images
RJG-UG-2: Analysing/Summarising Reader Comments in On-Line News
RJG-UG-3: Generating Image Descriptions from Labelled Bounding Boxes
RJG-UG-4: Size Matters: Acquiring Vague Spatial Size Information from Textual Sources
RJG-UG-5: Machine Learning Approaches to Coreference Resolution in Conversational Data
RJG-UG-6: Building Named Entity Recognizers from Wikipedia
RJG-UG-7: Mobile Robot Navigation



RJG-UG-1:   Automatic Face Detection in Images

Suitable for: CS/SE/AI/CS+Maths

Student numbers: this project is for 1 student only

Background

Automatic face detection algorithms are now to be found in many digital cameras, a striking application of computer vision research that has been realised largely in the past 10 years. Face detection and identification have a broad set of uses, ranging from helping to organize personal photo collections to applications in security and criminal investigation.

While the algorithms that have been developed take advantage of specific aspects of the task (e.g. the nature of human faces), the task and the many aspects of the algorithms developed to address it are examples of the wider class of computer vision problems/approaches referred to as object detection. Thus, face detection is a good problem to start with in learning basic computer vision techniques for object detection. It is also a good choice of problem because various face detection challenge datasets are available that allow one to compare one's own approach with others on standard datasets for which the performance of a wide range of approaches is known. For example, the NIST Face Recognition Challenges and Evaluation datasets, the 300 Faces-in-the-Wild Challenge dataset or the Facial Keypoints Detection dataset are candidate datasets for consideration in the project.

Project Description

This project will begin by reviewing the computer vision literature on face detection. Following this review, one or more approaches to face detection will be chosen and implemented, building on existing computer vision resources where it is sensible to do so. The resulting algorithm(s) will be evaluated against community standard benchmarks and refinements made to the algorithms, as far as time permits.
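As a concrete starting point, a baseline detector can be built on OpenCV's pretrained Haar cascades. The following is a minimal sketch, assuming the opencv-python package is installed; the image file name is a placeholder:

    import cv2

    # Load the pretrained frontal-face Haar cascade shipped with OpenCV.
    cascade = cv2.CascadeClassifier(
        cv2.data.haarcascades + "haarcascade_frontalface_default.xml")

    def detect_faces(image_path):
        """Return (x, y, w, h) bounding boxes for faces found in the image."""
        image = cv2.imread(image_path)
        gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
        # scaleFactor and minNeighbors trade recall against false positives.
        return cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)

    for (x, y, w, h) in detect_faces("group_photo.jpg"):  # placeholder file
        print("face at (%d, %d), size %dx%d" % (x, y, w, h))

Evaluating such a baseline on one of the benchmark datasets above would give a reference point for any more sophisticated approach developed in the project.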

Prerequisites

An interest in computer vision, Python programming skills and a solid mathematical background are the only prerequisites for the project.

Initial Reading and Useful Links

Contact supervisor for further references.

RJG-UG-2:   Analysing/Summarising Reader Comments in On-Line News

Suitable for: CS/SE/AI/CS+Maths

Student numbers: this project is for up to 2 students.

Background

In a news publisher website such as The Independent or The Guardian, journalists publish articles on different topics from politics and civil rights to health, sports and celebrity news. The website design supports the publication and reading of original news articles and at the same time facilitates user-involvement via reader comments.

Increasingly, in a period of disruptive change for the traditional media, newspapers see their future as lying in such conversations with and between readers, and new technologies to support these conversations are becoming essential. In this scenario there are a number of potential users: news readers and the originating journalist want to gain a structured overview of the mass of comments, in terms of the sub-topics they address and the opinions (polarity and strength) the commenters hold about these topics; editors or media analysts may need a more widely scoped analysis.

At present none of these users can effectively exploit the mass of comment data -- frequently hundreds of comments per article -- as there are no tools to support them in doing so. What they need is new tools to help them make sense of the large volumes of comment data. Of particular potential here are NLP technologies such as: automatic summarisation, topic detection and clustering, sentiment detection and conversation/dialogue analysis.

The SENSEI Project is an EU-funded research project in which Sheffield is a partner. SENSEI's goal is to develop new technologies to assist in making sense of large volumes of conversational data, such as reader comments in on-line news, as described above. Researchers in Computer Science and Journalism here at Sheffield are currently addressing this challenge, using data from The Independent and The Guardian, who are supporting the project. This student project will be carried out in conjunction with SENSEI, sharing data, tools and expertise.

Project Description

This project will begin by reviewing the language processing literature in one or more of the areas of automatic text summarisation, topic detection and clustering, sentiment detection and conversation/dialogue analysis, as well as the relevant literature in journalism and media studies about reader comment in news. Reader comment data will also be analysed and, based on this analysis and an understanding of relevant language processing techniques, a specification will be developed of a tool for analysing and re-presenting reader comment conversations along the dimensions of topic, sentiment or conversational structure.
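To make the topic detection and clustering strand concrete, the sketch below groups comments by TF-IDF similarity using scikit-learn; the library choice and the comment strings are assumptions for illustration only:

    from sklearn.cluster import KMeans
    from sklearn.feature_extraction.text import TfidfVectorizer

    # Placeholder comments; real input would be reader-comment data.
    comments = [
        "This policy will hurt small businesses badly.",
        "Small firms cannot absorb these new costs.",
        "Great result for the team last night!",
        "Best match of the season so far.",
    ]

    # Represent each comment as a TF-IDF vector over its words.
    vectors = TfidfVectorizer(stop_words="english").fit_transform(comments)

    # Group the comments into k sub-topics (k fixed by hand here).
    km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(vectors)
    for label, comment in zip(km.labels_, comments):
        print(label, comment)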

The second part of the project will involve the detailed design, implementation and evaluation of a prototype for the tool specified in the first part of the project. Of particular interest will be the evaluation of the tool: this may require the manual creation of some annotated data for use as a "gold standard" against which to evaluate the tool, or the involvement of human subjects to make judgements about the utility of the tool.

Prerequisites

Interest in language and/or text processing. No mandatory module prerequisites, but any or all of Machine Learning, Text Processing and Natural Language Processing are useful.

Initial Reading and Useful Links



RJG-UG-3:   Generating Image Descriptions from Labelled Bounding Boxes

Suitable for: CS/SE/AI/CS+Maths

Student numbers: this project is for 1 student only

Background

The ever-growing volume of photographic images and videos on the web and in personal photographic collections motivates research into techniques to automatically recognize and annotate the content, i.e., subject matter of images, using either keywords or, ideally, fuller natural language descriptions. Such annotations will support better retrieval of images given a user query.

Advances in computer vision techniques mean that a wide range of visual features can now be extracted for use in image content analysis. These features can be used for object detection -- the task of identifying instances of objects in an image, labelling them with their class and localising them in the image by drawing a bounding box around them. The accuracy of object detection by modern vision processing systems has significantly advanced in the past few years. As a consequence, it is now worth thinking of how to select which objects to include in an image description and what to say about them in terms of their spatial relations and the activities or processes we can infer they may be involved in. This is the “content selection” part of the task of image description generation. The other part of the description task is choosing the linguistic form, i.e. the words and their order, to use to express the selected content.

The project will work in the framework of ImageCLEF, an on-going series of evaluation challenges run by the international information retrieval community. Starting in 2003, ImageCLEF has run an annual challenge consisting of one or more "tracks" – defined tasks with supplied development and test data which participants must address. Participants design and train their systems on the development data and then submit results of running their system on the test data. The results are evaluated by the organizers and then published at a workshop which participants attend to present their approaches to the task(s).

Project Description

The aim of this project is to build a prototype system for the generation of image descriptions. ImageCLEF 2015 is for the first time including, as part of the Image Annotation task, a subtask on the "Generation of Textual Descriptions of Images". There will be two conditions in this subtask: (a) participants are given "clean" object detection results for images (i.e. images with correctly labelled bounding boxes around objects of specific classes) and from this must generate an image description, and (b) participants use the output of noisy visual recognizers in generating an image description. The quality of the resulting image descriptions is then assessed against image descriptions generated by humans for the same set of images. This project will address the "clean" condition in the image description generation subtask and use ImageCLEF evaluation measures and data to assess results.
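To illustrate the "clean" condition, the sketch below turns labelled bounding boxes into a one-sentence description via a fixed template: select the most salient objects (here, simply the largest) and realise a simple spatial relation between them. The input format and the left/right rule are invented for illustration and are not the ImageCLEF formats:

    # Assumed detection format: (class_label, x, y, width, height).
    detections = [("person", 40, 60, 80, 200), ("dog", 150, 180, 90, 70)]

    def describe(detections):
        """Generate a one-sentence description from labelled boxes."""
        # Content selection: order objects by box area, largest first.
        objs = sorted(detections, key=lambda d: d[3] * d[4], reverse=True)
        if len(objs) == 1:
            return "There is a %s in the image." % objs[0][0]
        # Surface realisation: fixed template plus a spatial relation
        # read off the x coordinates of the two most salient boxes.
        a, b = objs[0], objs[1]
        relation = "to the left of" if a[1] < b[1] else "to the right of"
        return "There is a %s %s a %s." % (a[0], relation, b[0])

    print(describe(detections))  # "There is a person to the left of a dog."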

Prerequisites

Interest in language and/or text processing and in image analysis. No mandatory module prerequisites, but any or all of Machine Learning, Text Processing and Natural Language Processing are useful.

Initial Reading and Useful Links



RJG-UG-4:   Size Matters: Acquiring Vague Spatial Size Information from Textual Sources

Suitable for: CS/SE/AI/CS+Maths

Student numbers: this project is for 1 student only

Background

How big is an adult elephant? Or a toaster? We may not know exactly, but we have some ideas: an adult elephant is more than a meter tall and less than 10 meters; a toaster is more than 10cm tall and less than a meter. In particular we have lots of knowledge about the relative sizes of objects: a toothbrush is smaller than a toaster; a toaster is smaller than a refrigerator.

We use this sort of "commonsense" knowledge all the time in reasoning about the world: we can bring a toaster home in our car, but not an elephant. An autonomous robot moving about in the world and interacting with objects would need lots of this kind of knowledge. Moreover, we appear to use knowledge about typical object sizes, along with other knowledge, in interpreting visual scenes, especially in 2D images, where depth must be inferred by the observer. For example, if, when viewing an image, our knowledge about the relative sizes of cars and elephants is violated under the assumption that they are in the same depth plane, then we re-interpret the image so that the car or elephant moves forward or backward relative to the other, so that the assumption is no longer violated. Thus, information about relative or absolute object size is useful in computer image interpretation. It is also useful in image generation: if I want to generate a realistic scene containing cars and elephants then I must respect their relative size constraints. Various computer graphics applications could exploit such information.

Manually coding this kind of information is a tedious and potentially never ending task, as new types of objects are constantly appearing. Is there a way of automating the acquisition of such information? The hypothesis of this project is that there is: we mine this information from textual sources on the web that make claims about the absolute or relative sizes of objects.

Project Description

The project will explore ways of mining information about the absolute and relative size of objects from web sources, such as Wikipedia. The general task of acquiring structured information from free or unstructured text is called text mining or information extraction and is a well-studied application in the broader area of Natural Language Processing (NLP).

The project will begin by reviewing the general area of information extraction with NLP, with specific attention to tasks like named entity extraction (which has been used, e.g., to identify entities like persons, locations and organisations, as well as times, dates and monetary amounts, and could be adapted to identify object types and lengths) and relation extraction (which has been used to recognise relations between entities, such as works-for, and attributes of entities, such as has-location, and could be adapted to recognise the size-of attribute).

Information extraction systems in general are either rule-based, where the system relies on manually-authored patterns to recognise entities and relations, or supervised learning-based where the system relies on learning patterns from manually annotated examples. Following initial reading and analysis of the data, a decision will be made about which approach to take.

In addition to identifying absolute size information (e.g. the Wikipedia article on "apple" says: "Commercial growers aim to produce an apple that is 7.0 to 8.3 cm (2.75 to 3.25 in) in diameter, due to market preference.") the project will also investigate how to acquire relative size information. For example, sentences indicating the topological relation inside ("You can help apples keep fresh longer by storing them in your pantry or refrigerator drawer") allow us to extract the information that apples are smaller than refrigerators. By building up a set of facts expressing relative sizes, together with absolute size information, we can infer upper and lower bounds on the size of various objects. Of course, differing, even conflicting, information may be acquired from different sources. Some means of merging/reconciling this information will need to be devised.
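A rule-based extractor for absolute sizes could start from a regular expression over number-unit patterns. The sketch below is a deliberately simplified illustration; the unit list and the pattern itself are assumptions:

    import re

    # Matches e.g. "15 cm", "7.0 to 8.3 cm", "5-6 feet" (simplified).
    SIZE_RE = re.compile(
        r"(\d+(?:\.\d+)?)\s*(?:(?:to|-)\s*(\d+(?:\.\d+)?))?\s*"
        r"(cm|mm|m|in|inches|feet|ft)\b")

    def extract_sizes(sentence):
        """Return (low, high, unit) tuples for size mentions in a sentence."""
        sizes = []
        for m in SIZE_RE.finditer(sentence):
            low = float(m.group(1))
            high = float(m.group(2)) if m.group(2) else low
            sizes.append((low, high, m.group(3)))
        return sizes

    print(extract_sizes("Commercial growers aim to produce an apple that "
                        "is 7.0 to 8.3 cm in diameter."))  # [(7.0, 8.3, 'cm')]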

The final element of the project will be to couple the acquired facts to a simple inference engine that will allow us to infer "best estimate" bounds on the size of objects from whatever facts we have acquired. E.g. if all we know is that apples fit inside refrigerators and that refrigerators are no more than 3m tall, then we need to be able to infer that apples are less than 3m tall.
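A minimal sketch of that inference step, assuming the acquired facts have already been reduced to fits-inside pairs and absolute upper bounds in metres (both invented representations):

    # Invented toy fact base.
    upper_bound = {"refrigerator": 3.0}          # absolute bounds, metres
    fits_inside = {"apple": "refrigerator"}      # relative-size facts

    def infer_upper_bound(obj):
        """Follow fits-inside links until an absolute bound is found."""
        seen = set()
        while obj not in upper_bound:
            if obj in seen or obj not in fits_inside:
                return None  # no bound derivable from the facts we hold
            seen.add(obj)
            obj = fits_inside[obj]
        return upper_bound[obj]

    print(infer_upper_bound("apple"))  # -> 3.0: an apple is under 3m tall

A fuller system would propagate both upper and lower bounds and reconcile conflicting facts, as discussed above.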

Of course, in addition to implementing the above capabilities the project will need to investigate ways of evaluating the various outputs. For example, the accuracy of identifying size information ("15 cm", "5-6 feet"), size-of information ( size-of(apple,7.0-8.3cm)) and relative size information ("apple <= refrigerator") needs to be assessed, as does the correctness of the inference procedure.

Prerequisites

Interest in language and/or text processing and in image analysis. No mandatory module prerequisites, but any or all of Machine Learning, Text Processing and Natural Language Processing are useful.

Initial Reading and Useful Links



RJG-UG-5:   Machine Learning Approaches to Coreference Resolution in Conversational Data

Suitable for: CS/SE/AI/CS+Maths

Student numbers: this project is for 1 student only

Background

Anaphora or anaphors are natural language expressions which "carry one back" to something introduced earlier in a discourse or conversation. For example, a news article might begin by introducing David Cameron and later refer to him as "the Prime Minister" or "he". In this example, "the Prime Minister" and "he" are anaphors and "David Cameron" is called the antecedent. Anaphors are said to corefer with their antecedents, since they refer to the same things in the "real" (i.e. non-textual) world. Determining which expressions corefer, the problem of coreference resolution, is an important part of understanding the meaning of a text. Easy for humans, it is a significant challenge for computers.

The task of building coreference resolution algorithms has been studied in Natural Language Processing (NLP) for some time. To develop systems and assess their performance the coreference community has built annotated training and test datasets that contain manually annotated coreference links between coreferring phrases. The best known recent datasets are those produced for the CONLL 2011 and CONLL 2012 Shared Task Challenges, which were devoted to coreference resolution. Not only do such datasets allow competing algorithms to be compared by comparing their scores on the same data, they also permit the development of algorithms based on supervised machine learning techniques that learn from annotated examples.

Different text genres (e.g. news articles, fiction books, academic journal papers, transcribed TV programs) exhibit different linguistic features, and in particular different patterns of coreference. While the CONLL 2011 dataset contains a mix of genres, including some conversational data (transcribed telephone conversations), it does not contain any data drawn from the sort of social media conversations one finds in, e.g., reader comments on on-line news (see the BBC, The Guardian or The Independent websites for examples). This genre of text is being studied in the SENSEI Project, an EU-funded research project in which Sheffield is a partner. One of SENSEI's aims is to produce summaries of reader comment sets consisting of potentially hundreds of comments. Detecting coreferences within this data is likely to be critical to generating good summaries.

Project Description

This project has two clear objectives. The first is to develop and assess a general supervised learning-based coreference resolution algorithm using the CONLL 2011 and 2012 Shared Task Challenges and datasets as a basis. The second is to explore how well the developed algorithm performs on reader comment data. Some data for evaluating coreference in reader comments is being developed by one of the SENSEI partners and should be available for use by the project. Some additional data may need to be annotated for evaluation.

The project will begin with a review of background and related work. Several freely available coreference systems, specifically the Stanford and BART coreference systems, will be investigated and run on the CONLL data. A system design, based on a supervised learning approach and possibly an extension of one of the freely available systems, will be selected, and a system developed and evaluated on the CONLL data. The system will then be tested on the SENSEI reader comment data, the results analysed and improvements proposed and tested.
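For orientation, the core of a mention-pair model, one common supervised formulation of coreference resolution, can be sketched with scikit-learn: every candidate (anaphor, antecedent) pair becomes a feature vector with a binary coreferent/not label. The features and training pairs below are invented placeholders; in the project they would be extracted from the CONLL annotations:

    from sklearn.linear_model import LogisticRegression

    # Invented toy features per mention pair:
    # [head words match?, sentence distance, gender agreement?]
    X_train = [[1, 0, 1], [0, 3, 0], [1, 1, 1], [0, 5, 0]]
    y_train = [1, 0, 1, 0]  # 1 = coreferent, 0 = not coreferent

    clf = LogisticRegression().fit(X_train, y_train)

    # At resolution time, each anaphor is typically linked to its
    # highest-scoring preceding candidate antecedent.
    print(clf.predict_proba([[1, 2, 1]])[0][1])  # P(coreferent) for a new pair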

Prerequisites

Interest in language and/or text processing. No mandatory module prerequisites, but any or all of Machine Learning, Text Processing and Natural Language Processing are useful.

Initial Reading and Useful Links



RJG-UG-6:   Building Named Entity Recognizers from Wikipedia

Suitable for: CS/SE/AI/CS+Maths

Student numbers: this project is for 1 student only

Background

"Named Entities" are real world entities which possess a name in natural language. Common examples are persons (e.g. "David Cameron", "Michelle Roberts"), organizations ("The University of Sheffield", "Google Inc.") and locations (e.g. "Sheffield", "Ghana").

Named entity recognizers are software applications that identify and classify mentions of named entities in running text. That is, these programs determine the extent or boundaries of named entities in text, typically by marking them up using XML, and then classify the marked named entities into classes such as "person", "organization", "location", etc.
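For instance, using the MUC-style ENAMEX markup (shown here purely as an illustration of such XML mark-up), the sentence "David Cameron visited Sheffield" might come out as:

    <ENAMEX TYPE="PERSON">David Cameron</ENAMEX> visited <ENAMEX TYPE="LOCATION">Sheffield</ENAMEX>.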

Named entity recognizers are important software components used in many language processing systems such as search engines, text mining systems, machine translation systems and so on. They have been studied for many years and relatively high accuracy systems exist for English and for a few other widely spoken languages. However, for many languages such systems do not exist and there is a need to develop them if these languages are to participate fully in the newest advances in language technology applications.

Currently the most popular approach to building named entity recognizers is to use supervised learning: a selection of training texts is annotated by hand to mark up all occurrences of named entities in them and to assign a class to each annotated mention. Standard machine learning algorithms are then used to train classifiers on this manually prepared data to annotate new texts automatically.

This approach works quite well, but has the drawback that a considerable amount of text needs to be annotated. This poses a limitation for minority languages, which may not be able to muster the resources to carry out the annotation. Thus the question arises: can we use existing structured or semi-structured data to create training data automatically? One way this might be done is using Wikipedia. Wikipedia contains lots of text in many languages with many mentions of named entities. Its ancillary resource, DBPedia, contains typing information about many of these entities (e.g. person, location, etc.). Can we exploit these resources to create training data to train named entity recognizers for a wide range of languages? If so, how well will they work? These are the questions this project will address.

Project Description

This project will design, implement and evaluate a prototype system consisting of two components. The first component will generate training data for a named entity recognizer by annotating text in Wikipedia articles with named entity types determined by using DBPedia and/or surface analysis of the initial sentences in Wikipedia entries, which typically indicate the entries' semantic types (person, company, etc.) in a straightforward way. The second component will use annotated data generated by the first component to train and evaluate a named entity recogniser, using an existing trainable named entity recogniser such as those in the Stanford CoreNLP tool suite or in OpenNLP.
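A sketch of the heart of the first component, assuming we already have a tokenised Wikipedia sentence together with its outgoing links resolved to DBPedia types (both input structures below are invented), emitting the token-level BIO labels commonly used as NER training data:

    # Invented inputs: tokens of one sentence, plus linked spans with the
    # entity types looked up in DBPedia (start index, end index exclusive).
    tokens = ["David", "Cameron", "visited", "Sheffield", "yesterday", "."]
    linked_spans = {(0, 2): "PERSON", (3, 4): "LOCATION"}

    def to_bio(tokens, linked_spans):
        """Produce (token, B-TYPE/I-TYPE/O) pairs for NER training."""
        labels = ["O"] * len(tokens)
        for (start, end), etype in linked_spans.items():
            labels[start] = "B-" + etype
            for i in range(start + 1, end):
                labels[i] = "I-" + etype
        return list(zip(tokens, labels))

    for token, label in to_bio(tokens, linked_spans):
        print(token, label)

Lines like these, accumulated over many articles, would form the automatically generated training set consumed by the second component.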

Prerequisites

Interest in language and/or text processing. No mandatory module prerequisites, but any or all of Machine Learning, Text Processing and Natural Language Processing are useful.

Initial Reading and Useful Links

Contact supervisor for further references.

RJG-UG-7:   Mobile Robot Navigation

Suitable for: CS/SE/AI/CS+Maths

Student numbers: this project is for 1 student only

Background

Mobile robot navigation is an intellectually challenging and practically important problem. Key sub-problems include self-localisation (estimating where the robot is), mapping the environment and path planning.

Project Description

The aim of this project is to understand, implement and test the Monte Carlo localization (MCL) algorithm and perhaps to extend this to a particle filter-based approach to simultaneous localization and mapping (SLAM). In addition a basic path planning algorithm will be implemented, such as a cell decomposition approach.
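In outline, MCL maintains a set of pose hypotheses ("particles") and repeatedly applies a motion update, a sensor weighting step and resampling. The sketch below shows that loop for a robot moving along a 1-D corridor towards a wall, with a single range sensor; the world model and noise values are assumptions for illustration only:

    import math
    import random

    N = 100  # number of particles
    particles = [random.uniform(0.0, 10.0) for _ in range(N)]  # 1-D poses

    def sensor_likelihood(pose, measured, wall=10.0, sigma=0.5):
        """Weight a particle by how well it predicts the range reading."""
        predicted = wall - pose  # expected distance to the wall from here
        return math.exp(-((predicted - measured) ** 2) / (2 * sigma ** 2))

    def mcl_step(particles, moved, measured):
        # 1. Motion update: shift each particle, adding motion noise.
        particles = [p + moved + random.gauss(0.0, 0.1) for p in particles]
        # 2. Sensor update: weight each particle by measurement likelihood.
        weights = [sensor_likelihood(p, measured) for p in particles]
        # 3. Resample particles in proportion to their weights.
        return random.choices(particles, weights=weights, k=len(particles))

    particles = mcl_step(particles, moved=1.0, measured=7.0)
    print(sum(particles) / len(particles))  # rough pose estimate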

The algorithms will be implemented in Python and deployed on the NAO robots recently acquired by the University. The goal is to test the system's ability to navigate amongst obstacles (e.g. cardboard boxes) to carry out some instruction (e.g. to retrieve an object at a given location).

At the end of the project the student should have a solid understanding of the basics of robot self-localisation and path planning.

Prerequisites

Interest in robotics and sufficient mathematical knowledge and confidence to work with relatively sophisticated probabilistic reasoning techniques (e.g. Dynamic Bayesian Nets and inexact inference over them).

Initial Reading and Useful Links

