Limiting speech recognition results on Android (tags: speech recognition, results, Android)

2023-09-14 23:06:07 Author: 欲权少女通缉犯

I'm making an app that lets people choose between a few options (Strings) by speaking. I'm having some trouble making the Android speech recognizer fit this idea.

Is there a way to pass the SpeechRecognizer only the "valid" options and have it select the "best" match among them?

I don't need code, just some guidance, as my google-fu seems to be failing me today.

Recommended answer

Our solution to this problem is described at http://kaljurand.github.io/Grammars/. See, for example, the paper linked from that page:

Kaarel Kaljurand, Tanel Alumäe. Controlled Natural Language in Speech Recognition Based User Interfaces (CNL 2012)

The basic idea is:

- Don't use Google's speech recognizer, because you cannot (currently) pass a language model (e.g. a grammar) to it. In our case it also didn't support the input language we wanted to use. Instead, implement your own speech recognizer (e.g. based on Sphinx) and make it accept grammars as part of the input.
- Implement the grammar. If it is a simple list of acceptable phrases, JSGF will do as the grammar description language. For more complex grammars I recommend Grammatical Framework, which can be compiled automatically to JSGF or to finite-state automata.
- Implement an Android app that extends the RecognizerIntent API by adding a way to pass the grammar to the recognizer. You can base it on, e.g., Kõnele.
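For the simple case mentioned above (a flat list of acceptable phrases), the JSGF grammar can be generated from the list itself. The following is only a minimal sketch: the class name `JsgfBuilder`, the rule name `<option>`, and the lower-casing of phrases are assumptions made for illustration, and how the resulting text is handed to a recognizer depends on the recognizer you build.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;
import java.util.StringJoiner;

/** Hypothetical helper: turns a list of phrases into a one-rule JSGF grammar. */
public final class JsgfBuilder {
    private JsgfBuilder() {}

    /**
     * Builds a JSGF grammar whose single public rule accepts exactly one of
     * the given phrases. Phrases are trimmed and lower-cased (an assumption;
     * adjust to whatever your recognizer's dictionary expects).
     */
    public static String build(String grammarName, List<String> phrases) {
        if (phrases.isEmpty()) {
            throw new IllegalArgumentException("at least one phrase is required");
        }
        StringJoiner alternatives = new StringJoiner(" | ");
        for (String phrase : phrases) {
            alternatives.add(phrase.trim().toLowerCase(Locale.ROOT));
        }
        return "#JSGF V1.0;\n"
                + "grammar " + grammarName + ";\n"
                + "public <option> = " + alternatives + ";\n";
    }

    public static void main(String[] args) {
        // Prints a grammar that accepts "red", "green" or "dark blue".
        System.out.print(build("options", Arrays.asList("red", "green", "dark blue")));
    }
}
```

A phrase list like this is about the limit of what is comfortable to write by hand in JSGF; beyond that, generating the grammar from a higher-level description becomes more attractive.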

All this might be overkill in your case. Post-processing Google's results (as @gregm suggests) is certainly easier to implement. But if you want to scale to more complex and/or multilingual language models, our approach provides the required modularity and expressive power.
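For comparison, post-processing can be as simple as comparing each hypothesis returned by the recognizer against the list of valid options and keeping the closest one. The sketch below is one possible form of this, not necessarily what @gregm proposed. It assumes the hypotheses have already been read from the recognizer (on Android, for example, the list stored under `RecognizerIntent.EXTRA_RESULTS`), and it uses plain Levenshtein distance with a caller-chosen threshold; the class name `OptionMatcher` and the `maxDistance` parameter are illustrative.

```java
import java.util.List;
import java.util.Locale;

/** Hypothetical helper: maps recognizer hypotheses onto a fixed set of options. */
public final class OptionMatcher {
    private OptionMatcher() {}

    /** Levenshtein edit distance, computed with two rolling rows. */
    static int editDistance(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j;
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1), prev[j - 1] + cost);
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    /**
     * Returns the valid option closest to any hypothesis, or null if even the
     * closest one is more than maxDistance edits away.
     */
    public static String bestMatch(List<String> hypotheses, List<String> validOptions, int maxDistance) {
        String best = null;
        int bestDistance = Integer.MAX_VALUE;
        for (String hypothesis : hypotheses) {
            String heard = hypothesis.trim().toLowerCase(Locale.ROOT);
            for (String option : validOptions) {
                int d = editDistance(heard, option.toLowerCase(Locale.ROOT));
                if (d < bestDistance) {
                    bestDistance = d;
                    best = option;
                }
            }
        }
        return bestDistance <= maxDistance ? best : null;
    }
}
```

The threshold matters: without it, any utterance at all would be mapped to some option. Character-level edit distance is a crude stand-in for acoustic similarity, so a phonetic comparison may work better for options that sound alike.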