What Are the Different Speech Recognition Techniques?

By Eugene P.
Updated: May 17, 2024

Several speech recognition techniques are used to capture spoken words and convert them into data that a software program can use. Speech is broadly analyzed in one of three ways, depending on how the words are spoken. The first is discrete speech, in which only a single word is spoken at a time. The second is connected speech, in which words must be spoken in a deliberate manner, typically with a brief pause between them, to be understood. Finally, there is continuous speech, which is how most people normally speak.

The most common algorithm used for all types of speech recognition techniques is the Hidden Markov Model (HMM). This system involves large data trees of phonemes, the basic sounds of speech, which are organized by the statistical probability of one sound following another. By comparing each incoming phoneme to a node in the data tree of sounds, the completed word can be determined with a high rate of accuracy in a relatively short period of time.
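The idea of choosing the most probable sequence of sounds can be sketched with the Viterbi algorithm, a standard way of decoding an HMM. This is a minimal illustration rather than a production recognizer: the three phonemes, the acoustic labels ("burst", "vowel", "stop"), and every probability below are made-up assumptions chosen to keep the example small.

```python
import math

def viterbi(observations, states, start_p, trans_p, emit_p):
    """Return the most probable sequence of hidden states (phonemes)."""
    # Each column maps state -> (best log-probability so far, previous state).
    # Log-probabilities avoid underflow; all toy probabilities must be > 0.
    columns = [{
        s: (math.log(start_p[s]) + math.log(emit_p[s][observations[0]]), None)
        for s in states
    }]
    for obs in observations[1:]:
        column = {}
        for s in states:
            # Pick the predecessor that makes arriving in state s most likely.
            prev, score = max(
                ((p, columns[-1][p][0] + math.log(trans_p[p][s])) for p in states),
                key=lambda item: item[1],
            )
            column[s] = (score + math.log(emit_p[s][obs]), prev)
        columns.append(column)

    # Backtrack from the best final state to recover the whole path.
    state = max(states, key=lambda s: columns[-1][s][0])
    path = [state]
    for column in reversed(columns[1:]):
        state = column[state][1]
        path.append(state)
    path.reverse()
    return path

# Hypothetical toy model for the word "cat" (phonemes k, ae, t).
states = ["k", "ae", "t"]
start_p = {"k": 0.8, "ae": 0.1, "t": 0.1}
trans_p = {  # probability of one sound following another
    "k": {"k": 0.1, "ae": 0.8, "t": 0.1},
    "ae": {"k": 0.1, "ae": 0.2, "t": 0.7},
    "t": {"k": 0.3, "ae": 0.3, "t": 0.4},
}
emit_p = {  # probability of each acoustic label given the phoneme
    "k": {"burst": 0.6, "vowel": 0.1, "stop": 0.3},
    "ae": {"burst": 0.1, "vowel": 0.8, "stop": 0.1},
    "t": {"burst": 0.3, "vowel": 0.1, "stop": 0.6},
}

decoded = viterbi(["burst", "vowel", "stop"], states, start_p, trans_p, emit_p)
print(decoded)  # ['k', 'ae', 't']
```

Real systems work with continuous acoustic features and far larger models, but the principle of scoring each candidate sound by how likely it is to follow the previous one is the same.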

One problem that is difficult to overcome with some speech recognition techniques is isolating where a word starts and ends. This task is complicated by background noise in the room and by the fact that some syllables have an audio signature that resembles a break between words. For this reason, discrete and connected speech recognition techniques are the most accurate: the way the words are spoken makes the boundaries between them easier to find.
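One simple way to look for word boundaries is to measure how loud each short slice (frame) of audio is and treat sufficiently long quiet stretches as breaks. The sketch below assumes the audio has already been reduced to a list of per-frame energy values; the function name, the threshold, and the sample numbers are all hypothetical.

```python
def find_word_segments(frame_energies, threshold, min_silence_frames=2):
    """Return (start, end) frame index pairs for runs of speech.

    `end` is exclusive. A quiet stretch only ends a segment once it has
    lasted at least `min_silence_frames` frames.
    """
    segments = []
    start = None       # index where the current segment began, if any
    silence_run = 0    # consecutive quiet frames seen inside a segment
    for i, energy in enumerate(frame_energies):
        if energy >= threshold:
            if start is None:
                start = i
            silence_run = 0
        elif start is not None:
            silence_run += 1
            if silence_run >= min_silence_frames:
                # Close the segment at the first quiet frame of this run.
                segments.append((start, i - silence_run + 1))
                start = None
                silence_run = 0
    if start is not None:
        # Audio ended mid-segment; trim any trailing quiet frames.
        segments.append((start, len(frame_energies) - silence_run))
    return segments

# Made-up energies: one word with a brief dip in the middle, then a second word.
energies = [0.0, 0.1, 0.9, 0.8, 0.1, 0.7, 0.0, 0.0, 0.0, 0.6, 0.9, 0.1]

print(find_word_segments(energies, threshold=0.5, min_silence_frames=2))
# [(2, 6), (9, 11)]
print(find_word_segments(energies, threshold=0.5, min_silence_frames=1))
# [(2, 4), (5, 6), (9, 11)]
```

With the shorter silence requirement, the brief dip inside the first word is mistaken for a break and the word is split in two, which is the kind of error described above.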

Another factor that separates different speech recognition techniques is the issue of software vocabulary. Software that interprets speech can have either a very limited vocabulary with high accuracy or a large vocabulary that must be matched to a specific user’s individual speech patterns. When a program uses the HMM method of assembling words, the fewer words it has to tell apart, the more accurate it can be. This is the approach that most automated telephone systems use to decipher numbers or responses to questions.
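The effect of vocabulary size can be illustrated with a small sketch. Note that it does not use an HMM: as a simpler stand-in, it compares a slightly misheard phoneme sequence to each vocabulary entry using edit distance. The phoneme spellings and both vocabularies are assumptions made up for the example.

```python
def edit_distance(a, b):
    """Levenshtein distance between two sequences of phoneme symbols."""
    prev = list(range(len(b) + 1))
    for i, a_sym in enumerate(a, start=1):
        curr = [i]
        for j, b_sym in enumerate(b, start=1):
            curr.append(min(
                prev[j] + 1,                      # delete a_sym
                curr[j - 1] + 1,                  # insert b_sym
                prev[j - 1] + (a_sym != b_sym),   # substitute (free on a match)
            ))
        prev = curr
    return prev[-1]

def closest_words(heard, vocabulary):
    """Return every vocabulary word tied for the smallest distance to `heard`."""
    scored = {word: edit_distance(heard, phones) for word, phones in vocabulary.items()}
    best = min(scored.values())
    return sorted(word for word, dist in scored.items() if dist == best)

# Hypothetical phone-menu vocabulary versus a slightly larger one.
small_vocab = {"yes": ["y", "eh", "s"], "no": ["n", "ow"]}
large_vocab = dict(small_vocab, yet=["y", "eh", "t"], yell=["y", "eh", "l"])

heard = ["y", "eh", "z"]  # "yes" with the last sound misheard

print(closest_words(heard, small_vocab))  # ['yes']
print(closest_words(heard, large_vocab))  # ['yell', 'yes', 'yet']
```

With only two words to choose from, the misheard input still has a single clear match. Adding similar-sounding words produces a three-way tie that the program cannot resolve without more information.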

Speech recognition techniques that understand a large vocabulary are usually designed to interact with only one user or a very small number of users. This is because the program must be trained to understand the speech patterns of the person speaking. The training involves reading pre-made paragraphs of text to the software. The words being read are known, so the program is able to build a statistical model of phonemes specific to the user. This gives the program a much better chance of understanding that user, but it also might hinder the program's understanding of people it has not been trained on.
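A very rough sketch of this kind of training follows. It assumes each frame of audio has been reduced to a single number and that, because the text being read is known, each training frame already carries the correct phoneme label. The function names and all feature values are invented for illustration; real systems use many features per frame and richer statistics than a simple average.

```python
from collections import defaultdict
from statistics import mean

def train_speaker_model(labeled_frames):
    """Build a per-phoneme average feature value from (phoneme, feature) pairs."""
    grouped = defaultdict(list)
    for phoneme, feature in labeled_frames:
        grouped[phoneme].append(feature)
    return {phoneme: mean(values) for phoneme, values in grouped.items()}

def classify(feature, model):
    """Return the phoneme whose trained average is closest to `feature`."""
    return min(model, key=lambda phoneme: abs(model[phoneme] - feature))

# Made-up training data from two speakers reading the same known text.
speaker_a = [("iy", 280.0), ("iy", 300.0), ("aa", 720.0), ("aa", 760.0)]
speaker_b = [("iy", 380.0), ("iy", 420.0), ("aa", 880.0), ("aa", 920.0)]

model_a = train_speaker_model(speaker_a)
model_b = train_speaker_model(speaker_b)

# A new frame spoken by speaker B.
new_frame = 550.0
print(classify(new_frame, model_b))  # iy
print(classify(new_frame, model_a))  # aa
```

The same sound is labeled differently depending on whose model is used, which is why a program trained on one person may struggle with another.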

The most difficult of the speech recognition techniques is interpreting continuous, or natural, speech. Many people tend to run words together and speak at different speeds, so the accuracy of programs that transcribe continuous speech is lower than that of the other methods. Still, programs do exist that can transcribe this type of speech, some of them employing fuzzy logic and neural networks to help recognize patterns and isolate words.
