The Microsoft Speech API provides the SpeechRecognitionEngine
class, which works as the backbone of a speech-enabled application. With the Kinect SDK as well, the SpeechRecognitionEngine
class accepts the audio stream from the Kinect sensor and processes it. The recognition engine requires a set of grammars (sets of commands) that tell the recognizer what to match. You can build the grammars either using the Choices
class or using the XML grammar document approach.
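As a minimal sketch of the Choices approach, the snippet below builds a grammar from a hypothetical set of commands ("start", "stop", "exit" are placeholders, not commands from the original post). It assumes the Microsoft.Speech assembly that ships for use with the Kinect SDK, which exposes the same Choices / GrammarBuilder / Grammar types as System.Speech:

```csharp
using Microsoft.Speech.Recognition;

// Define the set of commands the recognizer should match.
// These command words are illustrative placeholders.
var commands = new Choices();
commands.Add("start");
commands.Add("stop");
commands.Add("exit");

// Wrap the choices in a GrammarBuilder, then build the Grammar.
var grammarBuilder = new GrammarBuilder();
grammarBuilder.Append(commands);
var grammar = new Grammar(grammarBuilder);
```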
Once you are done defining the grammar, you need to load it into the speech recognizer as follows:
Grammar grammar = new Grammar(grammarBuilder);
speechRecognizer.LoadGrammar(grammar);
Fig: Building Grammar using Choices / XML and Loading into Speech Recognizer
To build an extensive speech-enabled application you need multiple sets of commands, which means you may need to load multiple grammar modules into the speech recognizer.
Fig : Loading Multiple Grammar
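The sketch below shows two hypothetical grammar modules loaded side by side; the grammar names and command words are illustrative, and speechRecognizer is assumed to be an initialized SpeechRecognitionEngine attached to the Kinect audio stream:

```csharp
using Microsoft.Speech.Recognition;

// One grammar module for navigation commands...
var navigationGrammar = new Grammar(
    new GrammarBuilder(new Choices("forward", "back", "left", "right")));
navigationGrammar.Name = "Navigation";

// ...and another for media commands.
var mediaGrammar = new Grammar(
    new GrammarBuilder(new Choices("play", "pause", "stop")));
mediaGrammar.Name = "Media";

// Multiple grammars can be loaded and active at the same time;
// the recognizer matches against all of them.
speechRecognizer.LoadGrammar(navigationGrammar);
speechRecognizer.LoadGrammar(mediaGrammar);
```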
Now, when you have a large set of grammar modules, you may need to unload some of them and load a few others; that is how you can enable the dynamic nature of an application. The SpeechRecognitionEngine
class has the UnloadGrammar
method, which you can use to unload grammars when required. To unload a grammar, you pass the grammar object as an argument to the UnloadGrammar
method.
speechRecognizer.UnloadGrammar(grammar);
You can also use the UnloadAllGrammars
method to unload all the grammars at once. Once a grammar is unloaded, you have to load it again before further use.
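Putting the two together, a dynamic grammar swap might look like the sketch below. Here speechRecognizer is assumed to be an initialized SpeechRecognitionEngine, and mainMenuGrammar / gameGrammar are hypothetical, already-built Grammar objects standing in for two application states:

```csharp
// Unload the grammar that is no longer relevant to the current state...
speechRecognizer.UnloadGrammar(mainMenuGrammar);

// ...and load the grammar for the new application state.
speechRecognizer.LoadGrammar(gameGrammar);

// Alternatively, clear every loaded grammar in one call.
// Note: unloaded grammars must be loaded again before they can match.
speechRecognizer.UnloadAllGrammars();
speechRecognizer.LoadGrammar(mainMenuGrammar);
```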
Fig: Loading and Unloading the grammars
A couple of days back, one of my Kinect developer friends asked me this question. I answered him, and while looking back I found one of my replies on the MSDN Forum, which was given almost one and a half years back. So I thought to share it over here with some more detail for your further reference and understanding.
Hope you liked it!
Don’t forget to check out the Kinect for Windows SDK Tips