#12 Actions other than shell commands

Closed

opened 7 years ago by clara · 5 comments

I would really like to have the ability for voice commands to perform actions other than running shell commands. While it is true that running shell commands is very versatile, having other actions could bring several benefits:

- It is common to run text-to-speech to provide output. If the microphone and speakers are not isolated from one another, the microphone must be muted while TTS is running to prevent accidentally triggering more commands. A TTS action could do this automatically.
- Automatic microphone muting could be extended to an audio output action that plays a sound file. This could be useful with an implementation of timers, discussed below.
- Some commands may create internal state that should be modifiable with other commands. An example of this is cancelable timers, as mentioned in #8. Doing this with shell commands alone may be complex without relying on an external timer management program. It could instead be built into Kaylee using a timer action.
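
To make the internal-state point concrete, here is a rough sketch of how a timer action might keep a registry of running timers so that a later command can cancel one by name. It uses only the standard library's `threading.Timer`; `TimerAction`, `start`, `cancel`, and `active` are hypothetical names for illustration, not anything that exists in Kaylee.

```python
import threading


class TimerAction:
    """Hypothetical timer action: keeps named timers so other commands can cancel them."""

    def __init__(self):
        self._timers = {}
        self._lock = threading.Lock()

    def start(self, name, seconds, on_done):
        """Start (or restart) a named timer that calls on_done(name) when it expires."""
        def fire():
            with self._lock:
                # Only fire if this timer is still the one registered under the name.
                if self._timers.get(name) is not timer:
                    return
                del self._timers[name]
            on_done(name)

        timer = threading.Timer(seconds, fire)
        timer.daemon = True
        with self._lock:
            old = self._timers.pop(name, None)
            self._timers[name] = timer
        if old is not None:
            old.cancel()
        timer.start()

    def cancel(self, name):
        """Cancel a running timer; return True if one was found."""
        with self._lock:
            timer = self._timers.pop(name, None)
        if timer is None:
            return False
        timer.cancel()
        return True

    def active(self):
        """Return the names of timers that have not fired or been cancelled."""
        with self._lock:
            return sorted(self._timers)
```

A "set a tea timer" command could call `start("tea", 180, callback)` and a "cancel the tea timer" command could call `cancel("tea")`, with both commands sharing the same `TimerAction` instance.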

These are really the first steps towards my long-standing idea of drop-in modules for functions such as timers, weather information, music control, &c. As such, considerable design must be done to allow for these long-term goals without having to redesign this feature later. I will discuss the design of this feature with probably-just-myself in this issue in order to organize my thoughts in a public environment.

Reminder to myself: instead of muting the microphone, we should pause the recognizer. This wasn’t possible at all with the shell scripts I had written for Kaylee, so I muted the microphone instead.

PulseAudio’s module-echo-cancel works pretty well, making recognizer pausing less important. It would still be nice to have, because sometimes the echo cancellation doesn’t kick in until the speakers have been playing sound for a second or so, which may be enough to get a “hello world” loop started. Recognizer pausing should definitely be optional though, because echo cancellation can allow commands to be recognized while TTS is running, e.g. at the end of a timer.
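
One way optional pausing could look is a small context manager wrapped around the speech call. This is a sketch only: `Recognizer` here is a stand-in that just tracks a flag, and `maybe_paused` and `speak` are made-up names, not Kaylee's actual API.

```python
from contextlib import contextmanager


class Recognizer:
    """Stand-in for the real recognizer; only tracks whether it is listening."""

    def __init__(self):
        self.listening = True

    def pause(self):
        self.listening = False

    def resume(self):
        self.listening = True


@contextmanager
def maybe_paused(recognizer, pause=True):
    """Pause the recognizer for the duration of the block if pause is true."""
    if not pause:
        yield
        return
    recognizer.pause()
    try:
        yield
    finally:
        # Always resume, even if speaking raised an error.
        recognizer.resume()


def speak(recognizer, text, tts, pause=True):
    """Run the tts callable on text, optionally pausing recognition meanwhile."""
    with maybe_paused(recognizer, pause):
        tts(text)
```

With `pause=False` the recognizer keeps listening while TTS runs, which is the case where echo cancellation is trusted to do its job.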

Built-in TTS functionality might be harder to implement in a nice way than I expected. It should be available to Python code directly for things like a weather module, but it should also be available to shell commands to provide consistency and recognizer pausing. How can I pause the recognizer of another process? Maybe dbus is a good solution. I should look into that more.

Thinking about this more, I’m less convinced that pausing the recognizer while Kaylee is speaking should be optional. It seems like being able to interrupt her and be understood just wouldn’t be useful in many situations.

dbus is definitely the way to go for IPC. I plan on using pydbus for this, which I’m happy with both because it has a very friendly API and because it uses Gio underneath and I’m already using other GLib stuff.
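
For reference, a minimal sketch of what exporting recognizer pausing over the session bus with pydbus might look like. The bus name `org.example.Kaylee`, the interface name, and the `Pause`/`Resume`/`Paused` members are all assumptions made up for this example; nothing here reflects an interface Kaylee actually defines. The pydbus and GLib imports are kept inside `serve()` so the class itself can be exercised without D-Bus available.

```python
class RecognizerControl:
    """Object exported over D-Bus so other processes can pause recognition."""

    # Introspection XML that pydbus reads from the `dbus` class attribute.
    # The interface name is invented for this sketch.
    dbus = """
    <node>
      <interface name='org.example.Kaylee.Recognizer'>
        <method name='Pause'/>
        <method name='Resume'/>
        <property name='Paused' type='b' access='read'/>
      </interface>
    </node>
    """

    def __init__(self):
        self._paused = False

    def Pause(self):
        self._paused = True

    def Resume(self):
        self._paused = False

    @property
    def Paused(self):
        return self._paused


def serve():
    """Publish the control object on the session bus and run the main loop."""
    # Imported here so the class above can be used without D-Bus installed.
    from gi.repository import GLib
    from pydbus import SessionBus

    bus = SessionBus()
    bus.publish("org.example.Kaylee", RecognizerControl())
    GLib.MainLoop().run()
```

Under these assumptions, a helper run from a shell command could call `SessionBus().get("org.example.Kaylee").Pause()` before speaking and `Resume()` afterwards.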

clara referenced this issue from a commit 7 years ago

Initial support for plugins

As part of the effort for resolving #12, I've started work on a plugin API for Kaylee. While very much a work in progress, it will allow Python plugins to be written, loaded from user configuration, and hooked into events from necessary portions of Kaylee to handle voice commands. Currently there is only one plugin, a partial implementation of shell command support as it existed previously. It works in that it executes commands, but several old features are missing. Also, the GUIs are probably broken, but I'm not worried about that at the moment.
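
As an illustration of the general shape described in that commit message (plugins loaded from configuration and hooked into events), here is a small self-contained sketch. It is not the plugin API from the commit: `EventBus`, `ShellPlugin`, `load_plugins`, the `"module:Class"` config format, and the `"phrase-recognized"` event name are all hypothetical.

```python
import importlib
import subprocess


class EventBus:
    """Minimal event hub: plugins subscribe handlers to named events."""

    def __init__(self):
        self._handlers = {}

    def connect(self, event, handler):
        self._handlers.setdefault(event, []).append(handler)

    def emit(self, event, *args):
        # Return every handler's result so callers can inspect them.
        return [handler(*args) for handler in self._handlers.get(event, [])]


class ShellPlugin:
    """Runs the shell command configured for a recognized phrase."""

    def __init__(self, bus, commands):
        self.commands = commands
        bus.connect("phrase-recognized", self.on_phrase)

    def on_phrase(self, phrase):
        command = self.commands.get(phrase)
        if command is None:
            return None
        result = subprocess.run(command, shell=True, capture_output=True, text=True)
        return result.stdout


def load_plugins(bus, config):
    """Instantiate each plugin named in config as 'module:Class' with its options."""
    plugins = []
    for spec, options in config.items():
        module_name, class_name = spec.split(":")
        cls = getattr(importlib.import_module(module_name), class_name)
        plugins.append(cls(bus, options))
    return plugins
```

The recognizer would then only need to call `bus.emit("phrase-recognized", phrase)`, and a TTS or timer plugin could subscribe to the same event without the core knowing about it.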

Plugin support has been merged into master. TTS can be done by sending signals, and it’s now within reach to implement a timer plugin. As such, I think this issue is mostly done, so I’m closing it.

clara closed 7 years ago

enhancement

0.2

clara
