Digitalising your own voice
Voice technology has advanced rapidly in recent years, and it is no longer unusual to find an Amazon Echo or Google Home in a friend’s house. Companies such as Amazon and Google have built well-known brands around their smart speakers. Customers can buy an Alexa-enabled Echo from Amazon for as little as £35 and give spoken commands to a speaker that can play music, answer questions, check the news and weather, and even set alarms and reminders. In response we hear a computer-generated voice, and for most of us this is as close as we have come to voice digitalisation.
However, did you know that advances in technology are making digital voices sound more human and emotional? Microsoft has developed a text-to-speech artificial intelligence that can generate realistic speech from just 200 real voice samples (around 20 minutes of speaking) together with transcriptions of those samples. This marks a pivotal development in turning text into realistic speech.
To build voice technology that sounds realistic, developers need real human voices to work with. That is why more and more organisations are asking people who are interested in voice technology to donate their voice, digitising it for future use. Mozilla Common Voice is one such programme, and it aims to provide open data for voice systems. Mozilla wants to create usable voice technology but is restricted by a lack of access to the data held by larger companies, so it is asking individuals to ‘donate their voice’. This will enable it to create an ‘open-source voice database’ that anyone can use to build voice technology into new apps and devices.
We wanted to see if we could digitalise our own voices and enable others to do the same: to create a bank, a privacy-centric way for us to store our own voices for ourselves and for future generations. Unless you are a podcaster or an actor, you would be surprised by how little recorded audio there is of you talking. Yes, you will appear in videos with friends or children, but that audio is likely to be degraded by other voices and background noise.
In November 2018 we launched a teaser video of what we thought Voice Bank could be: a way for families to digitalise the voices of family members who may not be around all the time. This included those who have passed on, but also family members who work away for long periods and may not be able to communicate with home, such as those in the forces. After releasing the video, we began to be approached with many other use cases, ranging from dementia care and vocal limitations to enterprise software accessibility.
So what is Voice Bank and how does it work?
Voice Bank is a mobile app for iOS and Android that prompts you to record around 2000 sentences, which you can do at your leisure. These sentences cover all the sounds needed to reconstruct your voice digitally, and the recordings are stored ONLY on your device. Once you have recorded all the sentences, you can save your voice bank to a location of your choice (Dropbox, local machine, OneDrive etc). Currently, if you want to hear what your voice bank sounds like, your recordings have to be sent to the cloud for processing.
We are releasing the Beta version of Voice Bank shortly and are looking for testers who speak English. If you are interested in Voice Bank, please sign up via voicebank.io.