The CHiME Corpus

CHiME is collecting a corpus of domestic audio recorded using a binaural manikin. Recordings are being made in the living rooms, dining rooms and kitchens of a couple of real homes. In addition to the audio recordings, room impulse responses are being sampled at a systematic set of positions in each room. These responses will allow external speech recordings to be mixed into the background audio in a realistic yet carefully controllable manner.
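As a rough illustration of how such mixing could work, the sketch below convolves a mono speech signal with a left/right pair of impulse responses and adds the result to a two-channel background at a chosen signal-to-noise ratio. It is not the CHiME tooling: the function names (convolve, power, mix_at_snr), the list-based signal format and the SNR definition (average power over both channels and the whole background segment) are all assumptions made for the example.

```python
import math


def convolve(signal, kernel):
    """Full linear convolution of two sequences (direct form)."""
    out = [0.0] * (len(signal) + len(kernel) - 1)
    for i, s in enumerate(signal):
        for j, k in enumerate(kernel):
            out[i + j] += s * k
    return out


def power(x):
    """Mean squared amplitude of a sequence."""
    return sum(v * v for v in x) / len(x)


def mix_at_snr(speech, ir_left, ir_right, background, snr_db):
    """Spatialise mono speech and add it to a binaural background.

    background is a (left, right) pair of equal-length sample lists.
    Returns (mixed, scaled_speech), each a [left, right] list.
    All names here are hypothetical, chosen for this sketch only.
    """
    bg_left, bg_right = background
    n = len(bg_left)
    if len(bg_right) != n:
        raise ValueError("background channels must have equal length")

    # Convolve the speech with each ear's impulse response, then trim
    # or zero-pad so it matches the background length.
    wet = []
    for ir in (ir_left, ir_right):
        channel = convolve(speech, ir)[:n]
        channel += [0.0] * (n - len(channel))
        wet.append(channel)

    # Assumed SNR definition: power averaged over both channels and
    # over the full background segment (padding included).
    speech_power = (power(wet[0]) + power(wet[1])) / 2.0
    noise_power = (power(bg_left) + power(bg_right)) / 2.0
    if speech_power == 0.0 or noise_power == 0.0:
        raise ValueError("cannot set an SNR with a silent signal")

    gain = math.sqrt(noise_power * 10.0 ** (snr_db / 10.0) / speech_power)
    scaled = [[gain * v for v in ch] for ch in wet]
    mixed = [[b + s for b, s in zip(bg, ch)]
             for bg, ch in zip((bg_left, bg_right), scaled)]
    return mixed, scaled
```

Because the impulse response pair fixes the apparent position of the talker and the gain fixes the SNR, both can be varied independently of the recorded background. For signals of realistic length an FFT-based convolution would be preferable to the direct form used here.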

Once complete, the corpus will be made freely available for research use.

Recording Setup

The CHiME recording setup consists of a pair of B&K microphones mounted in a B&K anthropometric manikin (head and torso) and connected to a MacBook Pro, which records 96 kHz binaural signals direct to disk via a MOTU A-D unit. Impulse responses are measured with the same equipment using Farina's sine sweep method. The sine sweeps are played through a B&K artificial mouth in order to simulate the directivity of natural speech.
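For readers unfamiliar with the sine sweep method, the following is a minimal sketch of the general idea, not the measurement code used for the corpus. An exponential sweep is played into the room, and the recording is convolved with an inverse filter (the time-reversed sweep with an envelope that falls by 6 dB per octave towards low frequencies) so that the sweep collapses to an approximate impulse. The function names, the normalisation and all parameter values are assumptions made for illustration.

```python
import math


def exponential_sweep(f1, f2, duration, fs):
    """Exponential sine sweep from f1 to f2 Hz lasting `duration` seconds."""
    n = int(round(duration * fs))
    rate = math.log(f2 / f1)
    k = 2.0 * math.pi * f1 * duration / rate
    return [math.sin(k * (math.exp(i / fs * rate / duration) - 1.0))
            for i in range(n)]


def inverse_filter(sweep, f1, f2):
    """Time-reversed sweep with a decaying envelope, scaled so that
    sweep convolved with the filter has value 1 at zero lag."""
    n = len(sweep)
    rate = math.log(f2 / f1)
    # The envelope compensates for the extra energy the sweep puts
    # into low frequencies.
    raw = [sweep[n - 1 - i] * math.exp(-i / n * rate) for i in range(n)]
    zero_lag = sum(s * raw[n - 1 - i] for i, s in enumerate(sweep))
    return [v / zero_lag for v in raw]


def convolve(signal, kernel):
    """Full linear convolution (direct form); used here to simulate a room."""
    out = [0.0] * (len(signal) + len(kernel) - 1)
    for i, s in enumerate(signal):
        for j, k in enumerate(kernel):
            out[i + j] += s * k
    return out


def deconvolve(recorded, inv, length):
    """Return the first `length` samples of the estimated impulse response.

    Only the lags at and after the end of the inverse filter are
    computed, which is where the linear response appears.
    """
    n = len(inv)
    estimate = []
    for lag in range(length):
        m = n - 1 + lag
        acc = 0.0
        for j in range(n):
            i = m - j
            if 0 <= i < len(recorded):
                acc += recorded[i] * inv[j]
        estimate.append(acc)
    return estimate


if __name__ == "__main__":
    fs = 8000  # illustrative rate, far below the 96 kHz used for the corpus
    sweep = exponential_sweep(100.0, 3000.0, 0.25, fs)
    inv = inverse_filter(sweep, 100.0, 3000.0)
    # Toy "room": a direct path at 5 samples and an echo at 30 samples.
    room = [0.0] * 40
    room[5], room[30] = 1.0, 0.5
    recorded = convolve(sweep, room)
    estimate = deconvolve(recorded, inv, 40)
    peak = max(range(len(estimate)), key=lambda i: abs(estimate[i]))
    print("strongest path at sample", peak)
```

The recovered response is band-limited to the sweep range, so each reflection appears as a narrow pulse rather than a single sample. A practical implementation would use FFT-based convolution and a much longer sweep.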

Recordings have been made in a number of sessions, sampling roughly a week's worth of family activity in each room. Impulse responses for each room have been measured at a range of distances and angles with respect to the B&K head.

Audio Samples

The figure (an annotated ratemap) shows a time-frequency representation of a twenty-second segment of data that has already been recorded. This acoustically cluttered example illustrates some of the huge challenges presented by the target scenario: the speech is embedded in a noise background that, although quasi-stationary, can change abruptly in response to unpredictable events occurring in the room (doors opening, appliances being turned on or off); on top of the background there are abrupt impact noises, such as footsteps and doors banging, that can mask even highly energetic portions of the speech signal; there may be multiple speakers in the room producing overlapping speech; and not all speech will be directed at the system.

Some audio examples are available here.

Further details

A technical report providing details of the recordings will appear shortly.