#9 Handle numbers in a smarter way

Closed
clara opened this issue 7 years ago · 3 comments
clara commented 7 years ago

Right now, we’re just adding number words to the sentences.corpus file one per line, and removing the %d from commands with numbers, then hoping for the best. This isn’t a good approach from Pocketsphinx’s perspective, because it makes its job of understanding the commands we want to hear more difficult. It’s easy to notice now, for example, that the “start a %d minute timer” example command is often heard as “start a minute timer” even when there is a number spoken in place of the %d.
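As a rough illustration of the approach described above (not Kaylee's actual code), the sketch below strips `%d` from each command and appends number words one per line. The function name `build_corpus` and the `NUMBER_WORDS` list are assumptions made for this example.

```python
import re

# Hypothetical number vocabulary; the real word list is not shown in this issue.
NUMBER_WORDS = ["one", "two", "three", "four", "five", "ten", "twenty"]


def build_corpus(commands):
    """Strip %d from each command, then append number words one per line."""
    lines = []
    for command in commands:
        # Drop the placeholder and collapse the doubled space it leaves behind.
        stripped = re.sub(r"\s+", " ", command.replace("%d", "")).strip()
        lines.append(stripped)
    # Number words are only needed if at least one command takes a number.
    if any("%d" in command for command in commands):
        lines.extend(NUMBER_WORDS)
    return lines


corpus = build_corpus(["start a %d minute timer", "open the terminal"])
print("\n".join(corpus))
```

The corpus ends up containing the literal sentence "start a minute timer" and never shows a number between "a" and "minute", which is consistent with the misrecognition described above.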

One possible solution to this might be to generate a grammar for commands based on our configuration, then use that grammar instead of a simple corpus of sentences. This may require generating our language model differently, but I’m not sure about that right now.
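A minimal sketch of what generating a grammar from the configured commands might look like, assuming JSGF as the target format (a grammar format pocketsphinx can read). The helper names, the `kaylee` grammar name, and the short `NUMBER_WORDS` list are hypothetical; a real number rule would need to cover compound numbers as well.

```python
# Hypothetical number vocabulary, kept short for illustration.
NUMBER_WORDS = ["one", "two", "three", "four", "five", "ten", "twenty"]


def command_to_rule(command):
    """Turn a command template into a JSGF expansion, mapping %d to <number>."""
    return command.replace("%d", "<number>")


def build_jsgf(commands, grammar_name="kaylee"):
    """Build a JSGF grammar with one alternative per configured command."""
    alternatives = " | ".join(
        "( " + command_to_rule(command) + " )" for command in commands
    )
    numbers = " | ".join(NUMBER_WORDS)
    return "\n".join([
        "#JSGF V1.0;",
        f"grammar {grammar_name};",
        f"public <command> = {alternatives};",
        f"<number> = {numbers};",
    ]) + "\n"


grammar = build_jsgf(["start a %d minute timer", "open the terminal"])
print(grammar)
```

Here the number slot stays part of the command rule, so "start a minute timer" without a number is not a valid sentence in the grammar.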

clara commented 7 years ago
Owner

It appears that using grammars in pocketsphinx with GStreamer is not currently possible. Note that it only appears this way to me right now, and I might be mistaken. If it is the case though, I might want to make some patches to the pocketsphinx GStreamer plugin, because this would be the second deficiency I’ve found in that plugin.

clara commented 7 years ago
Owner

Grammars appear to be possible in the latest version of pocketsphinx, which Kaylee supports as of 0.1.1. Further progress on this issue is possible, but #10 is probably a blocker: the lmtool doesn’t produce grammars, and its FAQ suggests that Kaylee might have to produce them on its own.

clara commented 7 years ago
Owner

Some initial testing with grammars indicates that they may not produce satisfactory results in this application. There were many false positives in my tests, so I will likely not be implementing grammars in Kaylee any time soon.

The alternatives for how to fix this issue seem less than ideal as well. For commands with one number, Kaylee could generate a corpus with the command as it is now, and with some numbers (logarithmically distributed?) filling the place of %d. That wouldn’t be horrible, but a command with n numbers and k sample values per slot would need on the order of k^n corpus entries, an n-th degree polynomial in k. This could easily outgrow the size allowed by the lmtool, and would just be horribly large in general.
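To make the growth concrete, here is a small sketch of the expansion idea, assuming a hypothetical list of seven roughly logarithmically spaced sample numbers. `expand_command` and `SAMPLE_NUMBERS` are names invented for this example, and the sketch counts only the filled-in entries, not the number-free form of each command.

```python
from itertools import product

# Hypothetical sample values, roughly logarithmically spaced, already as words.
SAMPLE_NUMBERS = ["one", "two", "five", "ten", "twenty", "fifty", "one hundred"]


def expand_command(command, numbers=SAMPLE_NUMBERS):
    """Return one corpus entry per combination of sample numbers."""
    slots = command.count("%d")
    if slots == 0:
        return [command]
    parts = command.split("%d")
    expanded = []
    # One entry for every way of filling all slots with a sample number.
    for combo in product(numbers, repeat=slots):
        pieces = [parts[0]]
        for word, tail in zip(combo, parts[1:]):
            pieces.append(word)
            pieces.append(tail)
        expanded.append("".join(pieces))
    return expanded


one_slot = expand_command("start a %d minute timer")
two_slots = expand_command("set a %d minute %d second timer")
print(len(one_slot), len(two_slots))
```

With seven sample values, one slot yields 7 entries and two slots yield 49; a third slot would bring that to 343 for a single command.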

Since I cannot think of any efficient, simple way to handle numbers better than Kaylee does now, I am closing this issue.

clara closed this issue 7 years ago
No milestone
No assignees
1 participant