Domestic Speech Recognition Technology Listed Company Summary_Current Recognition Technology Status_Speech Recognition Principle and Application

First, the introduction of speech recognition technology

Speech recognition technology, also known as Automatic Speech Recognition (ASR), aims to convert the lexical content of human speech into computer-readable input such as key presses, binary codes, or character sequences. It differs from speaker recognition and speaker verification, which attempt to identify or confirm the speaker who produced the speech rather than the lexical content it contains.

Second, the basic principle of speech recognition

A speech recognition system is essentially a pattern recognition system made up of three basic units: feature extraction, pattern matching, and a reference pattern library. Its basic structure is shown in the following figure:

[Figure: basic structure of a speech recognition system]

The unknown speech is converted into an electrical signal by a microphone and fed to the input of the recognition system. The signal is first pre-processed; a speech model is then built according to the characteristics of the human voice, the input speech signal is analyzed, and the required features are extracted. From these features, the templates needed for speech recognition are created. During recognition, the computer compares the stored speech templates with the features of the input speech signal according to the speech recognition model and, following a given search and matching strategy, finds the series of templates that best match the input speech. The recognition result is then obtained by table lookup according to the definitions of these templates. Clearly, the quality of this result depends directly on the choice of features, the quality of the speech model, and the accuracy of the templates.
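The flow described above (feature extraction, a reference pattern library, pattern matching, and a final lookup) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and not taken from the article: the per-frame features (energy and zero-crossing count), the use of dynamic time warping as the matching strategy, the function names, and the toy signals standing in for recorded speech.

```python
import math

def extract_features(signal, frame_len=4):
    """Split the signal into frames; return (energy, zero-crossing count) per frame."""
    features = []
    for start in range(0, len(signal) - frame_len + 1, frame_len):
        frame = signal[start:start + frame_len]
        energy = sum(x * x for x in frame) / frame_len
        crossings = sum(1 for a, b in zip(frame, frame[1:]) if a * b < 0)
        features.append((energy, float(crossings)))
    return features

def dtw_distance(seq_a, seq_b):
    """Dynamic time warping distance between two feature sequences."""
    inf = float("inf")
    n, m = len(seq_a), len(seq_b)
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = math.dist(seq_a[i - 1], seq_b[j - 1])
            # Extend the cheapest of the three allowed predecessor paths.
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]

def recognize(signal, library):
    """Return the label of the stored template that best matches the input."""
    features = extract_features(signal)
    return min(library, key=lambda label: dtw_distance(features, library[label]))

# Hypothetical reference pattern library: label -> template feature sequence.
library = {
    "low": extract_features([1, 1, -1, -1] * 4),
    "high": extract_features([1, -1, 1, -1] * 4),
}

# An unknown input: quieter and longer than the "low" template.
unknown = [0.9, 0.9, -0.9, -0.9] * 6
print(recognize(unknown, library))  # prints "low"
```

Dynamic time warping is used here because it tolerates inputs that are longer or shorter than the stored template. A practical system would use richer features and a statistical or neural model instead of these toy templates.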

Third, the classification of speech recognition systems

Speech recognition systems can be classified according to the restrictions they place on the input speech. In terms of the relationship between the speaker and the recognition system, there are three categories: (1) Speaker-dependent speech recognition systems, which recognize the voice of only one specific person. (2) Speaker-independent speech recognition systems, in which recognition does not depend on who is speaking; such systems are usually trained on speech databases collected from a large number of different people. (3) Multi-speaker recognition systems, which can usually recognize the voices of a group of people; also called group-specific speech recognition systems, they only need to be trained on the voices of the group to be recognized.

In terms of speaking style, recognition systems can also be divided into three categories: (1) Isolated-word speech recognition systems, which require a pause after each word is entered. (2) Connected-word speech recognition systems, which require each word to be pronounced clearly, although some words begin to run together. (3) Continuous speech recognition systems, which accept natural, fluent continuous speech containing a large amount of linking between words and sound changes.
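The pause requirement of isolated-word systems can be made concrete with a small sketch that splits an input into words wherever the signal stays quiet for long enough. This is only an illustration under stated assumptions: the function name `split_on_pauses`, the frame length, the energy threshold, and the minimum pause length are all hypothetical choices, not values from the article.

```python
def split_on_pauses(samples, frame_len=4, threshold=0.01, min_pause_frames=2):
    """Return (start, end) sample index pairs for segments separated by pauses."""
    segments = []
    start = None      # sample index where the current segment began
    silent_run = 0    # consecutive quiet frames seen inside a segment
    n_frames = len(samples) // frame_len
    for f in range(n_frames):
        frame = samples[f * frame_len:(f + 1) * frame_len]
        energy = sum(x * x for x in frame) / frame_len
        if energy >= threshold:
            if start is None:
                start = f * frame_len
            silent_run = 0
        elif start is not None:
            silent_run += 1
            if silent_run >= min_pause_frames:
                # The pause is long enough: close the segment where the silence began.
                end = (f - silent_run + 1) * frame_len
                segments.append((start, end))
                start = None
                silent_run = 0
    if start is not None:
        # Input ended mid-segment; drop any trailing quiet frames.
        segments.append((start, (n_frames - silent_run) * frame_len))
    return segments

word = [0.5, -0.5, 0.5, -0.5] * 2   # toy stand-in for one spoken word
long_pause = [0.0] * 8              # two quiet frames
short_pause = [0.0] * 4             # one quiet frame

print(split_on_pauses(word + long_pause + word))   # [(0, 8), (16, 24)]
print(split_on_pauses(word + short_pause + word))  # [(0, 20)]
```

With a clear pause the two words are separated cleanly. With only a brief gap they merge into one segment, which is the situation connected-word and continuous systems have to handle by other means.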

In terms of vocabulary size, recognition systems can also be divided into three categories: (1) Small-vocabulary speech recognition systems, which typically cover dozens of words. (2) Medium-vocabulary speech recognition systems, which typically cover hundreds to thousands of words. (3) Large-vocabulary speech recognition systems, which typically cover thousands to tens of thousands of words. As the computing power of computers and digital signal processors increases and the accuracy of recognition systems improves, the boundaries of this vocabulary-based classification keep shifting: what counts as a medium-vocabulary system today may be regarded as a small-vocabulary system in the future. These different restrictions also determine how difficult a speech recognition system is to build.


Fourth, the application of speech recognition

The application areas of speech recognition fall roughly into five categories:

Office or business systems: Typical applications include filling out data forms, database management and control, and keyboard enhancement.

Manufacturing: In quality control, speech recognition systems provide "hands-free" and "eyes-free" inspection (component inspection) for the manufacturing process.

Telecommunications: A fairly wide range of applications is possible on dial-up telephone systems, including automation of operator-assisted services, international and domestic remote e-commerce, voice call distribution, voice dialing, and classified ordering.

Medical: The main application in this area is the generation and editing of professional medical reports by voice.

Others: Games and toys controlled and operated by voice, speech recognition systems that help people with disabilities, and voice control of non-critical functions such as on-board traffic control systems and sound systems.
