- JPB-MSc-1: Acoustic scene classification challenge
- JPB-MSc-2: Acoustic event detection challenge
- JPB-MSc-3: Data Visualisation for CHiME Domestic Audio Data [Group Project]
JPB-MSc-1: Acoustic scene classification challenge
Description
Consider the problem of taking an audio recording and trying to tell where it was made. For example, is it indoors or outdoors? Is it a recording from a city street or from a country park? Is it a supermarket or a restaurant? This task is known as ‘acoustic scene classification’. Humans find it very easy, but it is very difficult to produce automatic systems that operate reliably. However, it is an important task, as it occurs as a component in many larger applications.
This project will attempt to build an acoustic scene classification system that will be evaluated using data from the recent D-CASE scene classification challenge described here: The D-CASE Challenge
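As a concrete starting point, one common recipe is to summarise each recording with MFCC statistics and train a standard classifier on them. The sketch below is a minimal illustration of that recipe, assuming librosa and scikit-learn are available; the filenames and scene labels are hypothetical, and this is not the D-CASE baseline system.

```python
# Minimal scene-classification sketch: MFCC summary statistics fed
# to an SVM. Assumes librosa and scikit-learn; the filenames and
# labels below are hypothetical, not the D-CASE data layout.
import numpy as np
import librosa
from sklearn.svm import SVC

def scene_features(path):
    """Summarise a whole recording as the mean and standard
    deviation of its MFCCs over time."""
    y, sr = librosa.load(path, sr=None)
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=20)
    return np.concatenate([mfcc.mean(axis=1), mfcc.std(axis=1)])

# Hypothetical labelled training recordings.
train = [("street01.wav", "street"), ("street02.wav", "street"),
         ("park01.wav", "park"), ("park02.wav", "park")]
X = np.array([scene_features(path) for path, _ in train])
y = [label for _, label in train]

clf = SVC(kernel="rbf").fit(X, y)
print(clf.predict([scene_features("unknown_scene.wav")]))
```

A real system would replace the toy training list with the challenge recordings and be evaluated using the challenge's own train/test splits.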
JPB-MSc-2: Acoustic event detection challenge
Description
Consider the problem of listening to an audio file and trying to detect the occurrence of a specific sound event within it. For example, the task might be to detect all occurrences of doors slamming, of people laughing, or of telephones ringing. This task is known as acoustic event detection. Humans are incredibly good at this, but it is extremely hard to produce automatic systems that come close to human levels of performance. However, solutions to this problem would be incredibly useful in a huge range of applications.
This project will attempt to build an acoustic event detector for a range of commonly occurring sounds following the specification of the recent D-CASE acoustic event detection challenge described here: The D-CASE Challenge
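One simple way into the problem is frame-level detection: model frames of the target event and frames of background separately, then flag test frames where the event model wins. The sketch below is a hedged illustration of that idea, assuming librosa and scikit-learn; the GMM likelihood-ratio detector, the filenames and the zero decision threshold are illustrative assumptions, not part of the D-CASE specification.

```python
# Minimal event-detection sketch: frame-level GMM likelihood ratio.
# Assumes librosa and scikit-learn; the training files below are
# hypothetical, not the D-CASE data.
import numpy as np
import librosa
from sklearn.mixture import GaussianMixture

def frames(path):
    """Return one MFCC vector per frame (default hop of 512 samples)."""
    y, sr = librosa.load(path, sr=None)
    return librosa.feature.mfcc(y=y, sr=sr, n_mfcc=13).T, sr

event_X, _ = frames("door_slams.wav")       # frames containing the event
background_X, _ = frames("background.wav")  # frames without it

event_gmm = GaussianMixture(n_components=4).fit(event_X)
bg_gmm = GaussianMixture(n_components=4).fit(background_X)

test_X, sr = frames("test.wav")
# Flag frames where the event model outscores the background model.
llr = event_gmm.score_samples(test_X) - bg_gmm.score_samples(test_X)
hits = np.where(llr > 0)[0]
print("event frames at times (s):", hits * 512 / sr)
```

In practice the frame-level decisions would be smoothed into start/end times and scored against the reference annotations supplied with the challenge.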
JPB-MSc-3: Data Visualisation for CHiME Domestic Audio Data [Group Project]
Description
As part of a funded research project we have recently collected 50 hours of audio data from a domestic family home. This data has mainly been used as background ‘noise’ for evaluating speech recognition algorithms, but it also provides an opportunity to learn about the characteristics of everyday listening environments. This project will employ unsupervised learning, clustering and data visualisation techniques in order to infer and represent interesting structure in the data. For example, is it possible to detect changes of state in the acoustic scene (e.g. a TV being turned on, a conversation starting)? Is it possible to detect repeating patterns in the data (e.g. a telephone ringing)? Is it possible to detect novel (i.e. previously unheard) acoustic events?
The project will experiment with traditional and ‘auditory’ acoustic feature extraction techniques and will start by applying conventional clustering and background modelling ideas. For an example of similar work see here.
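As an illustration of the background-modelling idea, the sketch below clusters MFCC frames of ‘typical’ audio with k-means and flags test frames that lie unusually far from every cluster centre. It assumes librosa and scikit-learn; the filenames, cluster count and percentile threshold are arbitrary choices for illustration, not decisions made by the project.

```python
# Minimal novelty-detection sketch: k-means background model over
# MFCC frames, with distance-to-nearest-centre as a novelty score.
# Assumes librosa and scikit-learn; filenames and the threshold
# are illustrative only.
import numpy as np
import librosa
from sklearn.cluster import KMeans

def mfcc_frames(path):
    y, sr = librosa.load(path, sr=None)
    return librosa.feature.mfcc(y=y, sr=sr, n_mfcc=13).T

background = mfcc_frames("typical_day.wav")
km = KMeans(n_clusters=16, n_init=10, random_state=0).fit(background)

test = mfcc_frames("test_recording.wav")
# Distance from each test frame to its nearest background cluster.
dists = np.min(km.transform(test), axis=1)
# Calibrate the threshold on the background data itself.
threshold = np.percentile(np.min(km.transform(background), axis=1), 99)
novel = np.where(dists > threshold)[0]
print(f"{len(novel)} potentially novel frames")
```

The same frame-level features and cluster assignments could feed the visualisation side of the project, e.g. plotting cluster membership over time to expose changes of state in the acoustic scene.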
This project is suitable for a team of 2 to 3 students. Students will be sharing the same data and some of the same analysis and statistical modelling tools but will be addressing different aspects of the problem.
Requirements
An interest in audio signal processing and statistical modelling.
Initial reading
- The CHiME project page
- Christensen, H., Barker, J., Ma, N., and Green, P. (2010) The CHiME corpus: a resource and a challenge for Computational Hearing in Multisource Environments. Interspeech’10, Makuhari, Japan, September 2010.
- Ntalampiras, S., Potamitis, I. and Fakotakis, N. (2011) Probabilistic novelty detection for acoustic surveillance under real-world conditions. IEEE Trans. on Multimedia, 13(4), pp. 713-719, 2011.