2019 saw the start of ten AHRC Collaborative Doctoral Awards: PhD projects in which students and partner organisations work together on a piece of applied research that will have immediate impact. The PhD experience these projects offer is very different from the norm and brings its own opportunities and challenges. In this series we hope to celebrate the early experiences of the students undertaking this collaborative and dynamic research.
This blog series will be regularly updated throughout the spring of 2020.
Lucy Brownson | CHATSWORTH
Christopher Wakefield | COUNCIL FOR BRITISH ARCHAEOLOGY
Louise Calf | CHATSWORTH
Katie Crowther | NATIONAL TRUST
Elliot Holmes | ACULAB PLC
Elliot Holmes
Department of Language and Linguistic Science, University of York
2019 Cohort
Over the last year, a Collaborative Doctoral Award (CDA) from WRoCAH has changed my life in unique and exciting ways. It’s given me the opportunity to research my passion, Forensic Speech Science, in a new and important way: I’m still conducting research with the support of a fantastic academic institution, the University of York, but through a partnership with Aculab, a telecommunications company, I can conduct my research with the additional perspectives of professionals in the wider world of Forensic Speech Science. This is fantastic because it allows me to get hands-on and see how my research will directly impact society.
My project is called ‘Towards linguistically-informed automatic speaker recognition’. The overall goal is to use linguistic information about the human voice, such as pitch, creakiness, and breathiness, to better understand what makes your voice different from someone else’s. Using this information, I want to improve current voice security systems, which are used by major businesses and banks such as Microsoft and HSBC, and ensure that they are as reliable as possible. I believe that this is a necessary investigation in the modern world because automatic speaker recognition systems are considered black boxes: we don’t actually know which voice features they use to distinguish you from someone else. Until we understand the linguistic distinctions between your voice and that of someone else, we cannot be sure whether these voice security systems are protecting you as well as they could be.
Aculab are a telephony company who created a cloud platform that lets users accomplish a number of tasks automatically: it can manage calls and SMS messaging, send and receive documents, recognise speech, and convert text from numerous languages into speech. Excitingly, Aculab are also the developers of VoiSentry, a voice security system. This offers novel opportunities for my research that would not be possible without the industrial links of a CDA project: through my partnership with Aculab, I will have access to state-of-the-art voice security technology that I will be able to use to work out which linguistic variables uniquely identify you.
And this isn’t the only benefit of an industrial partner. Beyond having access to their technologies, I can also consult their expert developers on my research and on where the Forensic Speech Science field is heading. Furthermore, Aculab took me with them to Interspeech 2019 in Graz, Austria, where I was able to explore other state-of-the-art technologies in the field and network with other researchers. This was a huge step in my personal development because it was my first experience of a big academic conference; as I aspire to be a researcher, it marked a major milestone in my career and I loved the experience.
A cheeky photo taken from the Interspeech 2019 introduction.
The partnership is already proving to be exceptionally fruitful. So far, it has allowed me to progress through my early PhD work at a rapid speed and I already have some interesting and important findings to share: using VoiSentry and other technologies, I have been able to identify which linguistic features separate human speech from synthetic speech. This will benefit vulnerable voice security systems, such as HSBC’s, which has recently been breached, by ensuring that hackers with state-of-the-art voice synthesisers cannot replicate your voice and access your bank account. Specifically, I have found that your voice break patterns and your shimmer patterns (shimmer being the small, random cycle-to-cycle deviations in the loudness of your voice) are the features that synthesisers cannot replicate very well.
Conferences also have added tourism benefits…
It is findings such as these that I hope will guide my PhD: using Aculab’s insights and technologies to ground my research in real-world voice security issues, I next want to examine how features such as these voice break and shimmer patterns distinguish one person’s speech from another’s. I believe that measuring these features will create the most reliable methodology for voice security because of the linguistic focus; it will be the first methodology to explicitly measure the voice for security purposes using linguistic theory. These goals, I think, sum up the best values of a CDA project: whilst I am rooted in an educational institution where I am getting the best support for my research skills and academic findings, the industrial partnership highlights exactly how this research can be used in the real world.
To conclude, my experience so far has shown me that CDA projects are opportunities from which everyone involved benefits: as a partner, you get to see how your products and developments can be used in a research context and you get to use this research to change and evolve your work and company. As a student, you get invaluable support from both your university, which provides academic support, and your industrial partner, which can provide resources, such as VoiSentry in my case, and perspectives on the real-world application of your project. This, for me personally, has ultimately become a huge source of confidence: I have the support of many experts around me, and it makes me feel proud knowing that my research can be important.
To find out more about Elliot and his PhD project, ‘Towards linguistically-informed automatic speaker recognition’, visit the WRoCAH Research pages.