Testing voice recognition tech findings

Thursday 14th May 2020

Earlier this year, we posted a notice from Natasha Toland, who was looking for participants for a study on stammering and voice recognition software as part of her dissertation. With the study now complete, here's what she found.

A young woman holding up a smartphone and smiling

Natasha

My name is Natasha Toland and I am just finishing my third year at the University of Winchester. As part of my final year, I had to complete a dissertation (10,000 words, I know it’s a lot) and my chosen topic was the effect of stammers on voice recognition technology.

The idea for the study was based on my own experience of using voice recognition technology and the issues I have had with it growing up. I wanted to see how much a stammer affected the process of converting spoken words into text.

Methodology

The BSA/STAMMA was the main port of call for gathering participants through social media. All I asked people to do was record themselves at home saying six sentences and questions aloud (on the first attempt only) on a smartphone or voice recorder and then email the recording to me. It took five minutes to complete. I then played those samples to different voice recognition software and devices to see how they responded to the questions and statements. The technologies I used were Cortana on a Windows laptop, Siri on an Apple iPod and Google Assistant on an Android phone.

I put together a control group (people who do not stammer) and a group of people who do stammer. When the recordings of people who stammer were played to each of the voice recognition technologies, the software often cut the recording off because the sentences did not fit within the time limits it set. If a word was repeated, the voice recognition would ignore every word that came after it. Some of the technologies did ignore the stammer to a certain degree, but overall they struggled to understand what the person said.

I found that it didn't matter how severe the stammer was. The voice recognition struggled even with a mild stammer, which is what I have myself. I also found that the control group was still affected, even though the study focused on stammers.

Conclusion

From my findings, I can conclude that there are still major flaws in voice recognition technology in general. If the top technology companies designed something that gave the person a little more time to speak, or included people who stammer in the development stage of the product, something positive would most definitely come of it.

I am hoping to conduct further research into this and into stammers in general and, with some changes made to the work, to get it published. I would hope that some of the top technology companies will somehow see the work that I have produced. I also want to send my research to some of the companies that make the technology to get their opinion on it and on what they can do to adapt their devices and the adverts seen on television.
