CREATIVE APPLICATIONS

The use of technology to manipulate and alter human or machine voices interests me greatly, and I’m particularly keen to use such techniques to create voices for animated agents and robots that are ‘appropriate’ to their visual and behavioural affordances. I’m also becoming increasingly involved in using such manipulations to explore more creative possibilities, and I’ve recently enjoyed very productive interactions with colleagues from the performing arts (particularly through my involvement in CREST - the EPSRC-funded Creative Speech Technology Network). In particular, I work closely with Dr. Chris Newell, who is based at the University of Hull. As well as pursuing various collaborative projects, Chris and I are slowly compiling a book entitled The Art of Artificial Voices, in which we explore the creative uses of spoken language from both the technological and the performance perspectives.

Digital Voice Factory

Learn to talk backwards: Lets you record some speech, listen to it backwards and then attempt to reproduce what you’ve heard. With practice it’s possible to create sounds that, when reversed, sound normal.
Speak like a Dalek: Lets you record some speech and have it played back as if spoken by one of Doctor Who’s Daleks.
Have fun with your voice: Lets you record some speech and play around with how it sounds by changing the speed, pitch, modulation and echo.
Sound like a robot: Lets you record some speech and have it played back as if spoken by a robot.
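
To give a feel for the kind of signal processing behind effects like these, here is a minimal Python sketch. It is not the Pd patch used in Digital Voice Factory, and the function names are hypothetical. It assumes a recording is simply a list of float samples, that "backwards" means reversing that list, that a Dalek-style voice can be approximated by ring modulation (multiplying the speech by a low-frequency sine wave, here an assumed 30 Hz), and that a simple speed change is done by resampling, which shifts pitch along with speed.

```python
import math

def reverse(samples):
    """Return the recording played back to front."""
    return samples[::-1]

def ring_modulate(samples, sample_rate, carrier_hz=30.0):
    """Multiply the signal by a sine carrier (ring modulation).

    The 30 Hz default is an assumption chosen for illustration.
    """
    step = 2 * math.pi * carrier_hz / sample_rate
    return [s * math.sin(step * n) for n, s in enumerate(samples)]

def change_speed(samples, factor):
    """Resample by linear interpolation.

    factor > 1 plays faster (and higher); factor < 1 plays slower (and lower).
    """
    if not samples:
        return []
    length = int((len(samples) - 1) / factor) + 1
    out = []
    for i in range(length):
        pos = i * factor                      # fractional read position
        lo = int(pos)
        hi = min(lo + 1, len(samples) - 1)
        frac = pos - lo
        out.append(samples[lo] * (1 - frac) + samples[hi] * frac)
    return out
```

For example, `ring_modulate(reverse(recording), 16000)` would reverse a 16 kHz recording and then give it a buzzing, Dalek-like quality. Changing speed without changing pitch needs a different approach, which is where phase vocoding comes in.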

The Technology

The main technique underlying Digital Voice Factory is phase vocoding. Vocoding - short for voice coding - is the name given to a whole range of methods for analysing speech sounds and converting them into a digital electronic form that can be re-synthesised back into speech. All mobile phones use a vocoder. A phase vocoder is a particular type of vocoder that allows the digital signal to be manipulated in various ways, for example slowing speech down or speeding it up without changing the pitch of the voice. The basic algorithm uses the fast Fourier transform (FFT) - a standard method for calculating the energy present at different frequencies. Phase vocoding was introduced in 1966 by Dr. Jim Flanagan of Bell Laboratories in the USA, and it’s widely used in the music industry – particularly in the commercial pitch-correction software Auto-Tune.
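
The following Python sketch shows one textbook form of phase-vocoder time stretching, using only the standard library. It is an illustration under stated assumptions, not the Pd implementation in Digital Voice Factory: the names `fft`, `ifft` and `time_stretch`, the 256-point frame, the Hann window and the quarter-frame output hop are all choices made here for the example. Each frame is analysed with the FFT, the phase change between frames is used to estimate the true frequency in each bin, and the frames are then overlap-added at a different spacing with phases advanced to match.

```python
import cmath
import math

def fft(x):
    """Recursive radix-2 FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return [complex(x[0])]
    even, odd = fft(x[0::2]), fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(-2j * math.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out

def ifft(spectrum):
    """Inverse FFT using the conjugate trick."""
    n = len(spectrum)
    y = fft([v.conjugate() for v in spectrum])
    return [v.conjugate() / n for v in y]

def time_stretch(samples, stretch, n_fft=256):
    """Return audio lasting roughly `stretch` times longer at the same pitch."""
    if len(samples) < n_fft:
        raise ValueError("need at least n_fft samples")
    hop_out = n_fft // 4                        # spacing of output frames
    hop_in = max(1, round(hop_out / stretch))   # spacing of input frames
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / n_fft) for n in range(n_fft)]
    n_frames = 1 + (len(samples) - n_fft) // hop_in
    out = [0.0] * ((n_frames - 1) * hop_out + n_fft)
    norm = [0.0] * len(out)
    prev_phase = [0.0] * n_fft
    synth_phase = [0.0] * n_fft
    for m in range(n_frames):
        start = m * hop_in
        spectrum = fft([samples[start + n] * window[n] for n in range(n_fft)])
        for k, value in enumerate(spectrum):
            phase = cmath.phase(value)
            omega = 2 * math.pi * k / n_fft     # bin centre in radians/sample
            if m == 0:
                synth_phase[k] = phase
            else:
                # How far the measured phase advance deviates from the
                # advance expected at the bin centre, wrapped to +/- pi.
                delta = phase - prev_phase[k] - omega * hop_in
                delta -= 2 * math.pi * round(delta / (2 * math.pi))
                true_freq = omega + delta / hop_in
                synth_phase[k] += true_freq * hop_out
            prev_phase[k] = phase
            spectrum[k] = cmath.rect(abs(value), synth_phase[k])
        resynth = ifft(spectrum)
        for n in range(n_fft):
            out[m * hop_out + n] += resynth[n].real * window[n]
            norm[m * hop_out + n] += window[n] ** 2
    # Undo the gain introduced by the overlapping squared windows.
    return [o / w if w > 1e-8 else 0.0 for o, w in zip(out, norm)]
```

Calling `time_stretch(recording, 2.0)` should give speech about twice as long at the original pitch, while a value such as `0.5` should speed it up. A pure-Python FFT is slow, so this is only practical for short clips; a real-time system would use an optimised FFT.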

The Application

Digital Voice Factory is implemented in the Pure Data (Pd) programming environment with the addition of Pd's extra GUI package - GrIPD. The use of GrIPD means that the software will only work on a PC running Windows. Of course, the PC must have a suitable soundcard installed. Note that the interface is designed to fit a 1680x1050 screen. Follow the instructions in the included readme.txt file to install everything on your own machine.

Download Digital Voice Factory (Zip file)

LINT-PADS

At an early CREST meeting, the actor, director and lecturer Paul Elsam introduced us to his LINT-PADS scheme (described in his book Acting Characters: 20 Essential Steps from Rehearsal to Performance). Paul had designed LINT-PADS in order to help young actors become more versatile by developing characterisation through their voice and speech. LINT-PADS is a simple mnemonic for the following vocal characteristics …

VOICE

Loudness is the volume at which sound is created.
Inflection is the way in which the voice slides through different musical notes.
Note describes the root note behind a person’s voice.
Tone is the hardness or softness of the quality of sound you make.

SPEECH

Pace is how quickly the words are spoken.
Accent is the distinctive regional or social sound heard when a person speaks.
Diction describes how well articulated the words are.
Specials are variations in sound pronunciation – what used to be called speech defects.

At a subsequent CREST meeting, I demonstrated that Paul’s LINT-PADS scheme could be applied to the modification of synthetic speech in real time. Implemented in Pure Data, the program allows a user to manipulate the parameters of a parallel formant synthesiser as it is speaking. Here’s a screenshot of the application …

Go to downloads page