#9 Handle numbers in a smarter way

Closed
opened 7 years ago by clara · 3 comments
clara commented 7 years ago

Right now, we’re just adding number words to the sentences.corpus file one per line, and removing the %d from commands with numbers, then hoping for the best. This isn’t a good approach from Pocketsphinx’s perspective, because it makes its job of understanding the commands we want to hear more difficult. It’s easy to notice now, for example, that the “start a %d minute timer” example command is often heard as “start a minute timer” even when there is a number spoken in place of the %d.
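To make the problem concrete, here is a minimal sketch of the corpus-building approach described above. It is not Kaylee's actual code: the function name `build_corpus` and the abbreviated `NUMBER_WORDS` list are assumptions made for illustration.

```python
# Illustrative only: names and the word list are assumptions, not Kaylee's code.
NUMBER_WORDS = ["one", "two", "three"]  # abbreviated for the example


def build_corpus(commands, number_words=NUMBER_WORDS):
    """Strip %d from each command, then append the number words one per line."""
    # Replace %d with a space and collapse the resulting double spaces.
    lines = [" ".join(c.replace("%d", " ").split()) for c in commands]
    # Number words are only needed if some command actually takes a number.
    if any("%d" in c for c in commands):
        lines.extend(number_words)
    return "\n".join(lines) + "\n"


print(build_corpus(["start a %d minute timer"]))
```

The corpus contains "start a minute timer" as a complete sentence while the number words only appear in isolation, which is consistent with the recognizer preferring the number-free phrase.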

One possible solution to this might be to generate a grammar for commands based on our configuration, then use that grammar instead of a simple corpus of sentences. This may require generating our language model differently, but I’m not sure about that right now.
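A grammar generated from the configuration might look roughly like the JSGF output of the sketch below. The helper `build_jsgf`, the rule names, and the short number list are hypothetical, and whether Pocketsphinx accepts the result in this setup is not something the sketch verifies.

```python
# Hypothetical helper: turns configured commands into a JSGF grammar string.
NUMBER_WORDS = ["one", "two", "three", "four", "five",
                "six", "seven", "eight", "nine", "ten"]


def build_jsgf(commands, grammar_name="commands"):
    """Return a JSGF grammar in which each %d becomes a <number> rule reference."""
    alternatives = [c.replace("%d", "<number>") for c in commands]
    lines = [
        "#JSGF V1.0;",
        f"grammar {grammar_name};",
        "public <command> = " + " | ".join(alternatives) + ";",
        "<number> = " + " | ".join(NUMBER_WORDS) + ";",
    ]
    return "\n".join(lines) + "\n"


print(build_jsgf(["start a %d minute timer", "stop the timer"]))
```

Unlike the flat corpus, the grammar states that a number must appear where %d was, so "start a minute timer" is no longer a valid sentence.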

clara commented 7 years ago
Owner

It appears that using grammars in pocketsphinx with GStreamer is not currently possible. Note that it only appears this way to me right now, and I might be mistaken. If it is the case though, I might want to make some patches to the pocketsphinx GStreamer plugin, because this would be the second deficiency I’ve found in that plugin.

clara commented 7 years ago
Owner

Grammars appear to be possible in the latest version of pocketsphinx, which Kaylee supports as of 0.1.1. Further progress on this issue is possible, but #10 is probably a blocker: the lmtool doesn’t produce grammars, and its FAQ suggests that Kaylee might have to produce them on its own.

clara commented 7 years ago
Owner

Some initial testing with grammars indicates that they may not produce satisfactory results in this application. There were many false positives in my tests, so I will likely not be implementing grammars in Kaylee any time soon.

The alternatives for fixing this issue seem less than ideal as well. For commands with one number, Kaylee could generate a corpus with the command as it is now, plus copies with some numbers (logarithmically distributed?) filling the place of %d. That wouldn’t be horrible, but the corpus grows exponentially with the number of placeholders: with k sample numbers, a command containing n numbers would need k^n entries. This could easily outgrow the size allowed by the lmtool, and would just be horribly large in general.
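The growth described here can be seen in a small sketch. The sample values, their spoken forms, and the name `expand_command` are all assumptions chosen for illustration. The sketch covers only the substituted variants, not the number-free command.

```python
from itertools import product

# Hypothetical, roughly logarithmically spaced sample values and spoken forms.
SAMPLE_NUMBERS = {1: "one", 2: "two", 5: "five", 10: "ten",
                  20: "twenty", 50: "fifty", 100: "one hundred"}


def expand_command(command, words=tuple(SAMPLE_NUMBERS.values())):
    """Yield the command with every combination of sample numbers filling each %d."""
    parts = command.split("%d")
    slots = len(parts) - 1  # number of %d placeholders in the command
    for combo in product(words, repeat=slots):
        sentence = parts[0]
        for word, rest in zip(combo, parts[1:]):
            sentence += word + rest
        yield sentence


one_slot = list(expand_command("start a %d minute timer"))
two_slots = list(expand_command("set volume to %d for %d minutes"))
print(len(one_slot), len(two_slots))
```

With seven sample values, one placeholder yields 7 sentences and two placeholders yield 49. Each additional placeholder multiplies the count by seven again.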

Since I cannot think of any efficient, simple way to handle numbers better than Kaylee does now, I am closing this issue.
