How Speech-to-Text Software Works

Virtual assistants like Siri or Google Now on
our phones rely on speech-to-text software, and without it there would
be no way to talk to these services. Speech-to-text software has many other
uses besides our phones, and it is being used more and more. This
software takes recorded speech and determines what was said in that
recording. Assistants like these need a microphone to capture audio and an
internet connection, because the recognition runs on a server. When we ask
our phone to do something, the
phone sends the audio captured by its microphone to a server, where it is
broken down into tiny bits of speech called phonemes. Phonemes are the short,
distinctive sounds of a language; swapping one for another can change a word.
They include both consonant sounds and vowel sounds, such as the “oa” sound in
“road”. English is usually counted as having about 44 phonemes (the exact
number varies by accent), and a speech recognition program can identify
these phonemes and use their order and context to determine what you are
saying. The software uses a method similar to the if/else statements we have been
doing in class, at least conceptually. The program takes a phoneme-sized piece
of audio from your voice and compares it to the 44 phonemes in the English
language. If it finds a match, it labels that piece of audio with the matched
phoneme. The program repeats this for every piece of audio that was recorded
until it can determine the words you said. (Real recognizers score these
matches with statistical models rather than literal if/else branches.) This
type of speech recognition is used for many things, like automated telephone
systems, dictation to a computer, and (as mentioned earlier) virtual
assistants in our smartphones.
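
The matching idea described above can be sketched in a few lines of Python.
This is only a toy illustration, not how Siri or Google Now actually work: the
two-number "fingerprints", the four sample phonemes, the tiny word list, and
names such as PHONEME_TEMPLATES, LEXICON, and MAX_DISTANCE are all made up for
the example.

```python
import math

# Hypothetical two-number "fingerprints" for a handful of phonemes.
# A real system would use much richer audio features and every phoneme.
PHONEME_TEMPLATES = {
    "r": (1.0, 4.0),
    "oa": (5.0, 1.0),
    "d": (2.0, 8.0),
    "t": (3.0, 9.0),
}

# Hypothetical pronunciation dictionary: phoneme sequence -> word.
LEXICON = {
    ("r", "oa", "d"): "road",
    ("t", "oa", "d"): "toad",
}

# Assumed cutoff: a piece of audio farther than this from every
# template counts as "no match".
MAX_DISTANCE = 1.0


def match_phoneme(piece):
    """Return the name of the closest phoneme template, or None."""
    best_name, best_dist = None, math.inf
    for name, template in PHONEME_TEMPLATES.items():
        dist = math.dist(piece, template)
        if dist < best_dist:
            best_name, best_dist = name, dist
    # The if/else step: accept the best match only if it is close enough.
    if best_dist <= MAX_DISTANCE:
        return best_name
    else:
        return None


def recognize(pieces):
    """Label every piece of audio, then look the sequence up as a word."""
    phonemes = []
    for piece in pieces:
        phoneme = match_phoneme(piece)
        if phoneme is None:
            return None  # one unrecognized piece means we give up
        phonemes.append(phoneme)
    return LEXICON.get(tuple(phonemes))


# Three made-up pieces of audio that sit close to "r", "oa", and "d".
recording = [(1.1, 4.2), (4.8, 0.9), (2.1, 7.9)]
print(recognize(recording))  # prints "road"
```

Each piece of audio is compared with every known phoneme, the closest one is
kept if it is close enough, and the finished sequence of phonemes is looked up
in the word list. The order matters: the same three sounds in a different order
would not be found in this small dictionary.
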
Sources:
http://scienceline.org/2014/08/ever-wondered-how-does-speech-to-text-software-work/
http://www.lancsngfl.ac.uk/curriculum/literacy/lit_site/lit_sites/phonemes_001/
http://www.technologyguide.com/feature/ios-users-must-try-google-now/google-now-voice-search-screenshot/
http://ios.wonderhowto.com/how-to/siri-exploited-again-bypass-lock-screen-ios-8-protect-yourself-0157749/