P.Dey, "Visual Speech in Technology-Enhanced Learning", PhD, 2012. (Supervisors: Dr Steve Maddock, Prof Rod Nicolson (Psychology)).
This thesis investigates the use of synthetic talking heads, with lip, tongue and face movements synchronized with synthesized or natural speech, in technology-enhanced learning. This work applies talking heads in a speech tutoring application for teaching English as a second language. Previous studies have shown that speech perception is aided by visual information, but more research is needed to determine the effectiveness of visualization of articulators in pronunciation training. This thesis explores whether or not visual speech technology can give an improvement in learning pronunciation.
This thesis investigates techniques for audiovisual speech synthesis, using both viseme-based and data-driven approaches to implement multiple talking heads. Intelligibility studies found the audiovisual heads to be more intelligible than audio alone, and the data-driven head was found to be more intelligible than the viseme-driven implementation.
The talking heads are applied in a pronunciation-training application, which is evaluated by second-language learners to investigate the benefit of visual speech in technology-enhanced learning. User trials explored the efficacy of the software in demonstrating the /b/-/p/ contrast in English. The results indicate that learners showed an improvement in listening and pronunciation after using the software, while the benefit of visualization compared to auditory training alone varied between individuals. User evaluations found that the talking heads were perceived to be helpful in learning pronunciation, and the positive feedback on the tutoring system suggests that the use of talking heads in technology-enhanced learning could be useful in addition to traditional methods.