thoughts / The concerns of voice technology

Can we learn from the mistakes made with previous technologies and get a head start with the new voice industry?

Increasingly, digital voices are being used to challenge stereotypes around how we respond to a voice of a specific gender. Previous research has shown that people have often preferred to hear a male voice, as it is seen as an 'authority source'. Developers want to overturn this gender stereotyping while also contributing to a global conversation about gender and ensuring that voice technology is inclusive of however people identify.

This is exactly what developers from Project Q were aiming for when they created the world's first genderless voice. They recorded the voices of people who identify as female, male, non-binary or transgender and created the voice of Q, which incorporates all of these voices. They hope that businesses will invest in this voice and contribute to a much-needed sea change in challenging gender stereotypes.

The development of voice assistants such as Alexa and Siri, and of the smart speakers they run on, has made the web far more accessible for users with disabilities. Smart speakers have enabled users with visual impairments to access the internet through voice control, making it much easier to make online purchases, access information and send messages. Equally, for users with mobility issues, smart speakers have removed the need for keyboards or touch screens, making digital services much easier to use.

There are concerns

There are some reservations about the ability to use recordings of human voices to create digital impersonations of those voices. Developments in AI make it increasingly likely that a digital voice will be capable of mimicking an original voice, blurring the line between what key public figures have and have not said. Concerns that voice impersonations could be used to create fake news, put words in the mouths of key political figures and empower criminals are becoming increasingly justifiable.

This is why there are calls to develop a defence against audio deepfakes before it's too late. Experts are requesting that, alongside the development of voice digitalisation, there is a corresponding growth in forensic technology that can detect whether or not a voice is real. These calls stem from genuine concerns that voice digitalisation could make the existing problem of fake news even worse.

The ability to digitalise a voice has grown by leaps and bounds in the last decade. It has reached the point where even Alexa sounds a little outdated next to realistic speech generated by technology that learns from real human voices. Siri in iOS 13 has a new, more human-sounding voice, and with WaveNet, the speech generation technology from Google's DeepMind, it may not be long until a generated voice is indistinguishable from a real human voice.

The benefits of voice digitalisation are numerous, including its potential to challenge gender stereotypes and to open up the internet to those with accessibility needs. Inevitably, there are also critics of this developing technology, who fear that realistic speech generation could be used to create fake news and cause chaos.