Pgt 2003

JPB-MSc-1: Automatic Photo Rotator (Zhou Su)
JPB-MSc-2: Video Analysis and Indexing (Jingsheng Du)
JPB-MSc-3: Facial Spot Removal (Oluwatosin Adenaike)
JPB-MSc-4: Face Detection for Meeting Room Video Data (Farhat Aisha)

The project descriptions below are only intended as starting points. If you wish to discuss possibilities in greater detail I encourage you to email me to arrange a meeting.

JPB-MSc-1: Automatic Photo Rotator

Description

This project is suitable for ACS.

When using a camera people will often choose to hold the camera on its side to take a ‘portrait’ format picture (i.e. tall and thin rather than short and wide). With digital cameras this leads to the annoyance that when the pictures are downloaded to a PC they are by default assumed to be landscape format and portrait shots appear on their sides. The user has to identify these pictures and manually rotate them. This project aims to develop a system that applies this rotation automatically.

This is a research-style project. The key to the system will be the application of content-based image orientation classification. The system needs to be able to decide from the image’s appearance alone whether it is the right way up or turned on its side. A classifier needs to be trained to make this decision as accurately as possible. Previous work in the area has used a variety of classification techniques (see reading list below). A main concern of the project will be finding suitable image features on which to base the decision.
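As a rough illustration of the train-then-classify idea, the sketch below uses two coarse brightness features (top minus bottom, left minus right) and a nearest-centroid classifier. The features, the classifier, the synthetic "sky over ground" images and all function names are assumptions made for this example only; they are not taken from the previous work mentioned above, and a real system would need far richer features and real photographs.

```python
# Toy sketch: classify image orientation from coarse brightness features.
# Images are plain 2D lists of grey levels (0-255); all names are illustrative.

def rotate90(img):
    """Rotate a 2D list image by 90 degrees clockwise."""
    return [list(row) for row in zip(*img[::-1])]

def mean(values):
    values = list(values)
    return sum(values) / len(values)

def features(img):
    """Return (top - bottom, left - right) mean brightness differences."""
    h, w = len(img), len(img[0])
    top = mean(p for row in img[: h // 2] for p in row)
    bottom = mean(p for row in img[h - h // 2:] for p in row)
    left = mean(p for row in img for p in row[: w // 2])
    right = mean(p for row in img for p in row[w - w // 2:])
    return (top - bottom, left - right)

def train_centroids(examples):
    """examples: list of (image, label). Returns {label: mean feature vector}."""
    grouped = {}
    for img, label in examples:
        grouped.setdefault(label, []).append(features(img))
    return {
        label: tuple(mean(f[i] for f in feats) for i in range(2))
        for label, feats in grouped.items()
    }

def classify(img, centroids):
    """Assign the label of the nearest centroid (squared Euclidean distance)."""
    f = features(img)
    return min(
        centroids,
        key=lambda label: sum((a - b) ** 2 for a, b in zip(f, centroids[label])),
    )

def synthetic_scene(sky, ground, size=8):
    """Hypothetical 'photo': bright sky rows above darker ground rows."""
    half = size // 2
    return [[sky] * size for _ in range(half)] + [[ground] * size for _ in range(half)]

# Build a tiny labelled training set from upright scenes and rotated copies.
training = []
for sky, ground in [(220, 60), (200, 90), (240, 40)]:
    upright = synthetic_scene(sky, ground)
    training.append((upright, "upright"))
    training.append((rotate90(upright), "on_side"))

centroids = train_centroids(training)
test_image = rotate90(synthetic_scene(210, 70))
predicted = classify(test_image, centroids)
print(predicted)
```

An image classified as "on_side" would then be rotated back before display. The brightness cue only works for scenes with a bright region at the top, which is one reason feature selection is a main concern of the project.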

The project may use Intel’s OpenCV (http://www.intel.com/research/mrl/research/opencv/), a C++ image processing library, to extract visual features from the photographs. Suitable training data can be collected by gathering photographs from the web.

Prerequisites

Good maths and programming skills, preferably C/C++.

Initial reading

Gonzalez and Woods, Digital Image Processing, Addison-Wesley Pub. Co, Reading, Massachusetts, 1992. (or any other similar textbook)

JPB-MSc-2: Video Analysis and Indexing

Description

This project is suitable for ACS.

Ever more multimedia data is becoming available over the Internet, and finding ways to analyse and search this data is becoming increasingly important. A growing set of techniques is being applied to help structure this data through analysis of its visual component. These techniques include, among others:

shot and scene segmentation,
object and camera motion analysis,
text detection.

For a comprehensive review see Brunelli et al (1999).
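To give a flavour of the first technique in the list, shot segmentation, here is a minimal sketch that declares a cut wherever the grey-level histogram changes sharply between consecutive frames. Histogram differencing is one common approach but is chosen here only for illustration. The bin count, the threshold, the frame representation (2D lists of grey levels, at least one frame) and the function names are all assumptions of the example.

```python
# Toy sketch: detect hard cuts by comparing histograms of consecutive frames.

def histogram(frame, bins=8, max_level=256):
    """Normalised grey-level histogram of a frame given as a 2D list."""
    counts = [0] * bins
    total = 0
    for row in frame:
        for pixel in row:
            counts[pixel * bins // max_level] += 1
            total += 1
    return [c / total for c in counts]

def histogram_distance(h1, h2):
    """L1 distance between two normalised histograms (range 0 to 2)."""
    return sum(abs(a - b) for a, b in zip(h1, h2))

def detect_cuts(frames, threshold=0.5):
    """Return indices i where a cut is declared between frame i-1 and frame i."""
    cuts = []
    previous = histogram(frames[0])
    for i in range(1, len(frames)):
        current = histogram(frames[i])
        if histogram_distance(previous, current) > threshold:
            cuts.append(i)
        previous = current
    return cuts

def flat_frame(level, size=4):
    """Hypothetical frame filled with a single grey level."""
    return [[level] * size for _ in range(size)]

# Three dark frames, two bright frames, then one mid-grey frame.
video = [flat_frame(20), flat_frame(22), flat_frame(25),
         flat_frame(200), flat_frame(198), flat_frame(90)]
cuts = detect_cuts(video)
print(cuts)
```

A fixed threshold like this is fragile for gradual transitions such as fades and dissolves, which is part of what makes the problem interesting.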

The project will make use of Intel’s recently released computer vision library, OpenCV (http://www.intel.com/research/mrl/research/opencv/). This will provide a springboard enabling the project work to be developed quickly and efficiently.

I am happy to supervise a student wishing to work in this broad area, and I am willing to let the student choose a topic according to his or her specific interest. If you are considering this project please come and see me and we can discuss ideas in more detail.

Prerequisites

C/C++ programming skills,
a laptop capable of running Linux will be an advantage.

Initial reading

Brunelli, Mich and Modena (1999) A Survey of video indexing, J. of Visual Communication and Image Representation
Smith and Kanade (1995) Video skimming for quick browsing based on audio and image characterization, Carnegie Mellon University technical report CMU-CS-95-186
Gonzalez and Woods, Digital Image Processing, Addison-Wesley Pub. Co, Reading, Massachusetts, 1992. (or any other similar textbook)

JPB-MSc-3: Facial Spot Removal

Description

This project is suitable for SST (or possibly ACS).

There has been a recent growth of interest in automatic audio-visual speech recognition. Recognition systems are sometimes helped during training by using videos of speakers who have visible markers attached to their faces (e.g. small coloured stickers). However, these markers may be distracting for humans, which means the training data cannot readily be used in human perceptual studies. The aim of this project is therefore to digitally remove the markers by extrapolating the colour and texture of the surrounding skin. Note that the technique also has application as a means of removing blemishes (e.g. moles, spots) from the faces of actors in still photographs and film.

The system can work in a semi-automatic mode, where an operator tells the system where the marker is in each frame. More ambitiously, the system may try to track the markers from frame to frame, allowing the operation to be performed with minimal human intervention.
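For the semi-automatic mode, the removal step could be sketched as follows: given a mask marking the marker pixels in a frame, fill them from the surrounding unmasked pixels, working from the edge of the marker inwards. This is only a simple neighbour-averaging fill on grey levels, assumed here for illustration. It extrapolates colour but not texture, and the function name and data layout (2D lists) are hypothetical.

```python
# Toy sketch: fill marker pixels by averaging already-known neighbours.

def fill_marker(image, mask):
    """Fill masked pixels from surrounding unmasked pixels, boundary inwards.

    image: 2D list of grey levels; mask: 2D list of booleans (True = marker).
    Returns a new image; the inputs are left unchanged.
    """
    h, w = len(image), len(image[0])
    result = [list(row) for row in image]
    unknown = [list(row) for row in mask]
    remaining = sum(cell for row in unknown for cell in row)
    while remaining:
        updates = []
        for y in range(h):
            for x in range(w):
                if not unknown[y][x]:
                    continue
                # Use only 4-neighbours whose values are already known.
                neighbours = [
                    result[ny][nx]
                    for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1))
                    if 0 <= ny < h and 0 <= nx < w and not unknown[ny][nx]
                ]
                if neighbours:
                    updates.append((y, x, sum(neighbours) / len(neighbours)))
        if not updates:
            break  # the whole image is masked; nothing to extrapolate from
        for y, x, value in updates:
            result[y][x] = value
            unknown[y][x] = False
        remaining -= len(updates)
    return result

# A uniform patch of 'skin' (level 150) with a bright 2x2 marker in the middle.
frame = [[150] * 6 for _ in range(6)]
marker_mask = [[False] * 6 for _ in range(6)]
for y in (2, 3):
    for x in (2, 3):
        frame[y][x] = 255
        marker_mask[y][x] = True

cleaned = fill_marker(frame, marker_mask)
print(cleaned[2][2], cleaned[3][3])
```

For colour video the same fill could be applied to each channel separately. Reproducing skin texture rather than a smooth patch is the harder part of the project.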

Prerequisites

none

Initial reading

Guenter et al. (1998) Making Faces, Proc. 25th Annual Conference on Computer Graphics and Interactive Techniques, pages 55-66 (section 5)
Gonzalez and Woods, Digital Image Processing, Addison-Wesley Pub. Co, Reading, Massachusetts, 1992. (or any other similar textbook)

JPB-MSc-4: Face Detection for Meeting Room Video Data

Description

This project is suitable for ACS.

Face detection is one of the key tasks in a range of applications. These include video surveillance, interactive robotics, audio-visual speech recognition, and image and video indexing and retrieval. Sheffield is involved in a number of major projects (e.g. M4) that involve the collection and analysis of audio-video data captured from meeting room scenarios. This project would aim to implement and possibly extend an existing technique for face detection, and to evaluate it using the M4 data.
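One way the evaluation could be set up is sketched below. It assumes that faces in each frame have been hand-labelled as rectangles, and counts a detection as correct when it overlaps a labelled face sufficiently. The box format (x1, y1, x2, y2), the overlap measure (intersection over union), the 0.5 threshold and the function names are assumptions of this example and say nothing about how the M4 data is actually annotated.

```python
# Toy sketch: score detected face boxes against hand-labelled boxes.

def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix = max(0, min(a[2], b[2]) - max(a[0], b[0]))
    iy = max(0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = ix * iy
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union else 0.0

def score_frame(detections, truths, min_iou=0.5):
    """Greedy one-to-one matching; returns (true_pos, false_pos, false_neg)."""
    unmatched = list(truths)
    true_pos = 0
    for det in detections:
        best = max(unmatched, key=lambda t: iou(det, t), default=None)
        if best is not None and iou(det, best) >= min_iou:
            unmatched.remove(best)
            true_pos += 1
    return true_pos, len(detections) - true_pos, len(unmatched)

# Hypothetical frame: two labelled faces, one found, plus one false alarm.
labelled = [(10, 10, 50, 50), (100, 20, 140, 60)]
detected = [(12, 12, 52, 52), (200, 200, 230, 230)]

tp, fp, fn = score_frame(detected, labelled)
precision = tp / (tp + fp)
recall = tp / (tp + fn)
print(tp, fp, fn, precision, recall)
```

Summing the three counts over all frames before computing precision and recall gives a single figure for comparing detectors or parameter settings.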

Prerequisites

COM6070 AVSP ART,
good maths,
either MATLAB or C/C++ programming experience.

Initial reading

Hjelmas and Low (2001) Face Detection: A Survey
The Machine Perception Lab at UCSD has a collection of recent face detection papers
Rowley, Baluja and Kanade (1996) Neural Network Based Face Detector
Turk and Pentland (1991) Eigenfaces for recognition

This project is suitable for ACS.

Description

When we listen to someone speaking, there are often many other background noises present that can make it difficult to hear what is being said. It has been shown that in very noisy conditions we subconsciously make use of the speaker’s lip movements in order to separate the speech from the other sound sources.

One approach to the speech separation problem is described by Sodoyer et al. (2002). Their technique separates speech signals by maximising the coherence between the target speech signal and the visual speech parameters. The project will implement (and possibly extend) the algorithm described by Sodoyer et al. The system will be tested using mixtures of connected digits constructed using the CUAVE audio-visual speech corpus.
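The general idea of choosing a separation that maximises audio-visual coherence can be illustrated with a toy example. The sketch below is not the algorithm of Sodoyer et al. It assumes two synthetic sources mixed instantaneously into two channels, a single visual parameter (a hypothetical lip-opening value per video frame) that is proportional to the target's amplitude, and a brute-force search over demixing angles that maximises the correlation between the output's per-frame RMS and that parameter. All names and values are invented for the example.

```python
# Toy sketch: pick the demixing direction most coherent with a lip parameter.
import math

FRAME = 40  # audio samples per video frame (illustrative)

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

def frame_rms(signal):
    """RMS amplitude of each consecutive block of FRAME samples."""
    return [
        math.sqrt(sum(s * s for s in signal[i:i + FRAME]) / FRAME)
        for i in range(0, len(signal), FRAME)
    ]

def modulated_tone(amplitudes, cycles):
    """A tone whose amplitude is set frame by frame (stand-in for a source)."""
    return [
        a * math.sin(2 * math.pi * cycles * n / FRAME)
        for a in amplitudes
        for n in range(FRAME)
    ]

def demix(x1, x2, theta):
    """Project the two mixtures onto the direction given by angle theta."""
    c, s = math.cos(theta), math.sin(theta)
    return [c * a + s * b for a, b in zip(x1, x2)]

def best_angle(x1, x2, lip, steps=180):
    """Grid search for the angle whose output envelope best follows the lips."""
    candidates = [math.pi * k / steps for k in range(steps)]
    return max(candidates, key=lambda t: pearson(frame_rms(demix(x1, x2, t)), lip))

lip = [0.2, 0.9, 0.4, 1.0, 0.3, 0.8, 0.1, 0.7]    # hypothetical lip opening
other = [0.9, 0.3, 0.8, 0.2, 1.0, 0.5, 0.6, 0.4]  # competing source envelope
target = modulated_tone(lip, cycles=3)
noise = modulated_tone(other, cycles=7)

# Two instantaneous mixtures of the target and the competing source.
x1 = [1.0 * t + 0.6 * n for t, n in zip(target, noise)]
x2 = [0.5 * t + 1.0 * n for t, n in zip(target, noise)]

theta = best_angle(x1, x2, lip)
estimate = demix(x1, x2, theta)
print(abs(pearson(estimate, target)))
```

In this idealised setting the search lands on the direction that cancels the competing source. Real speech, real lip parameters and realistic mixing are much less well behaved, which is why the project is a difficult one.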

Prerequisites

This is a difficult project and requires strong mathematical ability.
COM3130

Initial reading

Sodoyer et al. (2003) Further experiments on audio-visual speech source separation, Proc AVSP 2003
Bregman, A.S. (1990) Auditory Scene Analysis, MIT Press, Cambridge, MA
Cooke and Ellis (2001) The auditory organization of speech and other sources in listeners and computational models. Speech Communication
