- JPB-MSc-1: Acoustic scene classification challenge
- JPB-MSc-2: Acoustic event detection challenge
- JPB-MSc-3: Data Visualisation for CHiME Domestic Audio Data [Group Project]
JPB-MSc-1: Acoustic scene classification challenge
Description
Consider the problem of taking an audio recording and trying to tell where it was made. For example, is it indoors or outdoors? Is it a recording from a city street or from a country park? Is it a supermarket or a restaurant? This task is known as ‘acoustic scene classification’. Humans find it very easy, but it is very difficult to produce automatic systems that operate reliably. However, it is an important task, as it occurs as a component in many larger applications.
This project will attempt to build an acoustic scene classification system that will be evaluated using data from the recent D-CASE scene classification challenge described here: The D-CASE Challenge
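As a concrete starting point, one common recipe is to summarise each recording with MFCC statistics and train a standard classifier on them. The sketch below is a minimal illustration of that recipe, assuming librosa and scikit-learn are available; the filenames and scene labels are hypothetical, and this is not the D-CASE baseline system.

```python
# Minimal scene-classification sketch: MFCC summary statistics fed
# to an SVM. Assumes librosa and scikit-learn; the filenames and
# labels below are hypothetical, not the D-CASE data layout.
import numpy as np
import librosa
from sklearn.svm import SVC

def scene_features(path):
    """Summarise a whole recording as the mean and standard
    deviation of its MFCCs over time."""
    y, sr = librosa.load(path, sr=None)
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=20)
    return np.concatenate([mfcc.mean(axis=1), mfcc.std(axis=1)])

# Hypothetical labelled training recordings.
train = [("street01.wav", "street"), ("street02.wav", "street"),
         ("park01.wav", "park"), ("park02.wav", "park")]
X = np.array([scene_features(path) for path, _ in train])
y = [label for _, label in train]

clf = SVC(kernel="rbf").fit(X, y)
print(clf.predict([scene_features("unknown_scene.wav")]))
```

A real system would replace the toy training list with the challenge recordings and be evaluated using the challenge's own train/test splits.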
JPB-MSc-2: Acoustic event detection challenge
Description
Consider the problem of listening to an audio file and trying to detect the occurrence of a specific sound event within it. For example, the task might be to detect all occurrences of doors slamming, of people laughing, or of telephones ringing. This task is known as acoustic event detection. Humans are incredibly good at this, but it is extremely hard to produce automatic systems that come close to human levels of performance. However, solutions to this problem would be incredibly useful in a huge range of applications.
This project will attempt to build an acoustic event detector for a range of commonly occurring sounds following the specification of the recent D-CASE acoustic event detection challenge described here: The D-CASE Challenge
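One simple way into the problem is frame-level detection: model frames of the target event and frames of background separately, then flag test frames where the event model wins. The sketch below is a hedged illustration of that idea, assuming librosa and scikit-learn; the GMM likelihood-ratio detector, the filenames and the zero decision threshold are illustrative assumptions, not part of the D-CASE specification.

```python
# Minimal event-detection sketch: frame-level GMM likelihood ratio.
# Assumes librosa and scikit-learn; the training files below are
# hypothetical, not the D-CASE data.
import numpy as np
import librosa
from sklearn.mixture import GaussianMixture

def frames(path):
    """Return one MFCC vector per frame (default hop of 512 samples)."""
    y, sr = librosa.load(path, sr=None)
    return librosa.feature.mfcc(y=y, sr=sr, n_mfcc=13).T, sr

event_X, _ = frames("door_slams.wav")       # frames containing the event
background_X, _ = frames("background.wav")  # frames without it

event_gmm = GaussianMixture(n_components=4).fit(event_X)
bg_gmm = GaussianMixture(n_components=4).fit(background_X)

test_X, sr = frames("test.wav")
# Flag frames where the event model outscores the background model.
llr = event_gmm.score_samples(test_X) - bg_gmm.score_samples(test_X)
hits = np.where(llr > 0)[0]
print("event frames at times (s):", hits * 512 / sr)
```

In practice the frame-level decisions would be smoothed into start/end times and scored against the reference annotations supplied with the challenge.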
JPB-MSc-3: Data Visualisation for CHiME Domestic Audio Data [Group Project]
Description
As part of a funded research project we have recently collected 50 hours of audio data from a domestic family home. This data has mainly been used as background ‘noise’ for evaluating speech recognition algorithms, but it also provides an opportunity to learn about the characteristics of everyday listening environments. This project will employ unsupervised learning, clustering and data visualisation techniques in order to infer and represent interesting structure in the data. For example, is it possible to detect changes of state in the acoustic scene (e.g. a TV being turned on, a conversation starting)? Is it possible to detect repeating patterns in the data (e.g. a telephone ringing)? Is it possible to detect novel (i.e. previously unheard) acoustic events?
The project will experiment with traditional and ‘auditory’ acoustic feature extraction techniques and will start by applying conventional clustering and background modelling ideas. For an example of similar work see here.
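As an illustration of the background-modelling idea, the sketch below clusters MFCC frames of ‘typical’ audio with k-means and flags test frames that lie unusually far from every cluster centre. It assumes librosa and scikit-learn; the filenames, cluster count and percentile threshold are arbitrary choices for illustration, not decisions made by the project.

```python
# Minimal novelty-detection sketch: k-means background model over
# MFCC frames, with distance-to-nearest-centre as a novelty score.
# Assumes librosa and scikit-learn; filenames and the threshold
# are illustrative only.
import numpy as np
import librosa
from sklearn.cluster import KMeans

def mfcc_frames(path):
    y, sr = librosa.load(path, sr=None)
    return librosa.feature.mfcc(y=y, sr=sr, n_mfcc=13).T

background = mfcc_frames("typical_day.wav")
km = KMeans(n_clusters=16, n_init=10, random_state=0).fit(background)

test = mfcc_frames("test_recording.wav")
# Distance from each test frame to its nearest background cluster.
dists = np.min(km.transform(test), axis=1)
# Calibrate the threshold on the background data itself.
threshold = np.percentile(np.min(km.transform(background), axis=1), 99)
novel = np.where(dists > threshold)[0]
print(f"{len(novel)} potentially novel frames")
```

The same frame-level features and cluster assignments could feed the visualisation side of the project, e.g. plotting cluster membership over time to expose changes of state in the acoustic scene.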
This project is suitable for a team of 2 to 3 students. Students will be sharing the same data and some of the same analysis and statistical modelling tools but will be addressing different aspects of the problem.
Requirements
An interest in audio signal processing and statistical modelling.
Initial reading
- The CHiME project page
- Christensen, H., Barker, J., Ma, N., and Green, P. (2010) The CHiME corpus: a resource and a challenge for Computational Hearing in Multisource Environments. Interspeech’10, Makuhari, Japan, September 2010.
- Ntalampiras, S., Potamitis, I. and Fakotakis, N. (2011) Probabilistic novelty detection for acoustic surveillance under real-world conditions. IEEE Trans. on Multimedia, 13(4), pp. 713-719, 2011.