
Copyright © Roger K. Moore




The use of technology to manipulate and alter human or machine voices interests me greatly, and I’m particularly keen to use such techniques to create voices for animated agents and robots that are ‘appropriate’ to their visual and behavioural affordances.  I’m also becoming increasingly involved in using such manipulations to explore more creative possibilities, and I’ve recently enjoyed very productive interactions with colleagues from the performing arts (particularly through my involvement in CREST, the EPSRC-funded Creative Speech Technology Network).  In particular, I work closely with Dr. Chris Newell of the University of Hull.  As well as pursuing various collaborative projects, Chris and I are slowly compiling a book entitled The Art of Artificial Voices, in which we explore the creative uses of spoken language from both the technological and the performance perspectives.

Digital Voice Factory

The Technology

The main technique underlying Digital Voice Factory is a principle known as phase vocoding.  Vocoding (short for voice coding) is the name given to a whole range of methods for analysing speech sounds and converting them into a digital electronic form that can be re-synthesised back into speech.  All mobile phones use a vocoder.  A phase vocoder is a particular type of vocoder that allows the digital signal to be manipulated in various ways, for example slowing it down or speeding it up without changing the pitch of the voice.  The basic algorithm uses the fast Fourier transform (FFT), a standard method for calculating the energy present at different frequencies.  Phase vocoding was introduced in 1966 by Dr. Jim Flanagan of Bell Laboratories in the USA, and it’s widely used in the music industry, particularly in the commercial voice pitch-correcting software Auto-Tune.
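To give a flavour of how a phase vocoder can change speed without changing pitch, here is a minimal sketch in Python with NumPy.  It is not the Pure Data patch used in Digital Voice Factory; the function name phase_vocoder_stretch, the frame size, the hop size and the test tone are all assumptions chosen for illustration.  The idea is to read the FFT frames of the input at a different rate while keeping track of how quickly the phase in each frequency band advances.

```python
import numpy as np

def phase_vocoder_stretch(x, rate, n_fft=1024, hop=256):
    """Time-stretch x without changing pitch (illustrative sketch only).

    rate < 1 slows the sound down, rate > 1 speeds it up.
    """
    window = np.hanning(n_fft)

    # Analysis: windowed FFT frames taken every `hop` samples.
    n_frames = 1 + (len(x) - n_fft) // hop
    stft = np.array([np.fft.rfft(window * x[i * hop:i * hop + n_fft])
                     for i in range(n_frames)])

    # Phase advance expected over one hop at each bin's centre frequency.
    omega = 2 * np.pi * hop * np.arange(n_fft // 2 + 1) / n_fft

    # Fractional positions in the analysis frames to read from.
    steps = np.arange(0, n_frames - 1, rate)

    phase = np.angle(stft[0])
    out = np.zeros(len(steps) * hop + n_fft)
    for k, pos in enumerate(steps):
        i = int(pos)
        frac = pos - i
        # Interpolate the magnitude between neighbouring analysis frames.
        mag = (1 - frac) * np.abs(stft[i]) + frac * np.abs(stft[i + 1])
        # Synthesis: rebuild a frame from magnitude and accumulated phase.
        frame = np.fft.irfft(mag * np.exp(1j * phase), n_fft)
        out[k * hop:k * hop + n_fft] += window * frame
        # Measure how far the true phase advance deviates from `omega`,
        # wrap the deviation into [-pi, pi], and accumulate for the next frame.
        dphi = np.angle(stft[i + 1]) - np.angle(stft[i]) - omega
        dphi -= 2 * np.pi * np.round(dphi / (2 * np.pi))
        phase += omega + dphi

    # Compensate for the gain introduced by overlapping squared windows.
    return out / (np.sum(window ** 2) / hop)

# Usage: slow a one-second 440 Hz test tone to roughly half speed.
sr = 16000
tone = np.sin(2 * np.pi * 440 * np.arange(sr) / sr)
slow = phase_vocoder_stretch(tone, rate=0.5)
```

The stretched tone lasts roughly twice as long as the original but stays at about the same pitch.  Real speech needs more care than this sketch takes, for example in handling transients.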

The Application

Digital Voice Factory is implemented in the Pure Data (Pd) programming environment with the addition of GrIPD, an extra GUI package for Pd.  The use of GrIPD means that the software will only work on a PC running Windows.  Of course, the PC must have a suitable soundcard installed.  Note that the interface is designed to fit a 1680x1050 screen.  Follow the instructions in the included readme.txt file to install everything on your own machine.

Download Digital Voice Factory (Zip file)


At an early CREST meeting, the actor, director and lecturer Paul Elsam introduced us to his LINT-PADS scheme (described in his book Acting Characters: 20 Essential Steps from Rehearsal to Performance).  Paul had designed LINT-PADS to help young actors become more versatile by developing characterisation through their voice and speech.  LINT-PADS is a simple mnemonic for the following vocal characteristics …



At a subsequent CREST meeting, I demonstrated that Paul’s LINT-PADS scheme could be applied to the modification of synthetic speech in real time.  Implemented in Pure Data, the program allows a user to manipulate the parameters of a parallel formant synthesiser as it is speaking.  Here’s a screen shot of the application …
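For readers unfamiliar with the term, a parallel formant synthesiser passes a pitch-pulse excitation through several resonant filters side by side (one per formant) and adds the results together.  The following is a rough offline sketch in Python with NumPy, not the real-time Pure Data program described above.  The function names, the resonator design and the formant values (loosely vowel-like) are assumptions chosen for illustration.

```python
import numpy as np

def resonator(x, freq, bandwidth, sr):
    """Second-order digital resonator with unity gain at 0 Hz."""
    r = np.exp(-np.pi * bandwidth / sr)
    c = -r * r
    b = 2 * r * np.cos(2 * np.pi * freq / sr)
    a = 1 - b - c
    y = np.zeros(len(x))
    for n in range(len(x)):
        y1 = y[n - 1] if n >= 1 else 0.0
        y2 = y[n - 2] if n >= 2 else 0.0
        y[n] = a * x[n] + b * y1 + c * y2
    return y

def parallel_formant_synth(f0, formants, sr=16000, duration=0.5):
    """Sum the outputs of one resonator per formant (illustrative only).

    formants: list of (frequency_hz, bandwidth_hz, amplitude) tuples.
    """
    n = int(sr * duration)
    # Voiced excitation: one impulse per pitch period.
    excitation = np.zeros(n)
    excitation[::int(round(sr / f0))] = 1.0
    # Each formant filters the same excitation; the branches are then mixed.
    out = sum(amp * resonator(excitation, f, bw, sr)
              for f, bw, amp in formants)
    return out / np.max(np.abs(out))

# Hypothetical settings: pitch and three formants of a vowel-like sound.
vowel = [(700, 80, 1.0), (1100, 90, 0.5), (2450, 120, 0.25)]
low_voice = parallel_formant_synth(f0=100, formants=vowel)
high_voice = parallel_formant_synth(f0=200, formants=vowel)
```

Changing f0, or the frequency, bandwidth and amplitude of each formant, changes the character of the voice.  A real-time implementation would update these parameters continuously while the synthesiser is speaking.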
