Optimising voice cloning for assistive hearing technologies

Project ID: 2531bd1656

(You will need this ID for your application)

Research Area(s): Information and communications technologies

UCL Lead department: Division of Psychology and Language Sciences (PALS)

Lead Supervisor: Patti Adank

Project Summary:

As people age, understanding speech in noisy environments becomes increasingly difficult. While hearing aids amplify sound, they often fail to improve clarity or to reduce the mental effort required to follow conversations. Recent advances in voice cloning offer a promising new direction. We recently discovered that cloned (AI-generated, synthesised) voices offer a 20% intelligibility advantage over their human counterparts. This project will test how cloned voices can be optimised to support older adults with hearing loss by enhancing intelligibility without compromising naturalness.

Another key innovation in this project is the use of voice familiarity training, which allows listeners to become accustomed to specific voices, e.g., those of family members. This approach enables personalised speech synthesis, making synthetic speech more relatable and easier to understand in everyday settings. It also supports the development of optimised synthetic voices for public-facing applications, such as transport announcements or healthcare instructions.

The PhD project will use open-source voice cloning software (e.g., Coqui TTS) to generate synthetic speech and embed it in degraded listening conditions, using different types of energetic maskers (white noise, reverberation) and informational maskers (competing speakers). The project will have the following overall approach:

1. Benchmark intelligibility across maskers (energetic/informational).
2. Identify which voice features listeners rely on most.
3. Test individualised voice enrichment strategies to enhance clarity without sacrificing naturalness.

These three steps will be iterated with the aim of optimising voice cloning for each type of degradation, and with the plan to also personalise the cloning to individuals' hearing needs. This interdisciplinary project bridges speech technology, hearing science, and cognitive psychology, and will equip the student with skills in signal processing, experimental design, and human listening studies. Outcomes will inform the design of next-generation assistive communication technologies, including hearing aids, voice assistants, and speech-to-speech systems that adaptively enhance clarity in real time for ageing populations.
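
As an illustration of what embedding speech in an energetic masker can involve, the sketch below mixes a signal with white noise at a chosen signal-to-noise ratio (SNR). It is a minimal example under stated assumptions, not the project's actual pipeline: the function name mix_at_snr, the sampling rate, and the SNR value are hypothetical, and a sine tone stands in for the synthetic speech that voice cloning software would produce.

```python
import numpy as np


def mix_at_snr(speech, masker, snr_db):
    """Add a masker to speech, scaled so the speech-to-masker power ratio is snr_db."""
    speech = np.asarray(speech, dtype=float)
    masker = np.asarray(masker, dtype=float)
    if len(masker) < len(speech):
        raise ValueError("masker must be at least as long as the speech signal")
    masker = masker[: len(speech)]

    speech_power = np.mean(speech ** 2)
    masker_power = np.mean(masker ** 2)
    if speech_power == 0 or masker_power == 0:
        raise ValueError("speech and masker must both be non-silent")

    # Gain that brings the masker to the power implied by the target SNR.
    target_masker_power = speech_power / (10 ** (snr_db / 10))
    gain = np.sqrt(target_masker_power / masker_power)
    return speech + gain * masker


# Placeholder signals (assumptions): a 1-second tone instead of real speech,
# and Gaussian white noise as the energetic masker.
fs = 16000
t = np.arange(fs) / fs
speech = 0.1 * np.sin(2 * np.pi * 220 * t)
rng = np.random.default_rng(0)
noise = rng.standard_normal(fs)

mixture = mix_at_snr(speech, noise, snr_db=-3.0)
```

The same function could in principle be reused with a recording of competing speakers as the masker, which would give an informational rather than an energetic masking condition.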