#12 Actions other than shell commands

Closed
opened 7 years ago by clara · 5 comments
clara commented 7 years ago

I would really like to have the ability for voice commands to perform actions other than running shell commands. While it is true that running shell commands is very versatile, having other actions could bring several benefits:

  • It is common to run text-to-speech for providing output. If the microphone and speakers are not isolated from one another, it is then necessary to mute the microphone while TTS is running to prevent accidentally triggering more commands. This could be done automatically with a TTS action.
  • Automatic microphone muting could be extended to an audio output action that plays a sound file. This could be useful with an implementation of timers, discussed below.
  • Some commands may create internal state that should be modifiable with other commands. An example of this is cancelable timers as mentioned in #8. This may be a complex task using shell commands without relying on an external timer management program. This could instead be built into Kaylee using a timer action.
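
A rough sketch of the timer idea from the last bullet, using only the Python standard library. The `TimerAction` class, the `run_action` dispatcher, and the action dictionary layout (`type`, `name`, `seconds`, `command`) are all hypothetical names for illustration, not part of Kaylee's configuration format.

```python
import subprocess
import threading


class TimerAction:
    """Keeps named timers as internal state so a later command can cancel them."""

    def __init__(self, on_expire):
        self._on_expire = on_expire   # callback(name), run when a timer fires
        self._timers = {}             # name -> threading.Timer
        self._lock = threading.Lock()

    def start(self, name, seconds):
        timer = threading.Timer(seconds, self._expire, args=(name,))
        timer.daemon = True
        with self._lock:
            old = self._timers.pop(name, None)
            if old is not None:
                old.cancel()          # restarting a timer replaces the old one
            self._timers[name] = timer
        timer.start()

    def cancel(self, name):
        """Cancel a pending timer; returns False if no such timer exists."""
        with self._lock:
            timer = self._timers.pop(name, None)
        if timer is None:
            return False
        timer.cancel()
        return True

    def pending(self):
        with self._lock:
            return sorted(self._timers)

    def _expire(self, name):
        # This runs inside the Timer's own thread, so current_thread() is the
        # Timer object; skip the callback if it was cancelled or replaced.
        with self._lock:
            if self._timers.get(name) is not threading.current_thread():
                return
            del self._timers[name]
        self._on_expire(name)


def run_action(action, timers):
    """Dispatch one action dictionary (hypothetical layout) by its type."""
    kind = action["type"]
    if kind == "shell":
        return subprocess.run(action["command"], shell=True).returncode
    if kind == "timer_start":
        return timers.start(action["name"], action["seconds"])
    if kind == "timer_cancel":
        return timers.cancel(action["name"])
    raise ValueError("unknown action type: " + kind)
```

With something like this, "cancel the tea timer" becomes `run_action({"type": "timer_cancel", "name": "tea"}, timers)`, with no external timer program involved.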
clara commented 7 years ago
Owner

This really represents the first steps towards my long-standing idea of drop-in modules for functions such as timers, weather information, music control, &c. As such, considerable design must be done to allow these long-term goals without having to redesign this feature. I will discuss the design of this feature with probably-just-myself in this issue in order to organize my thoughts in a public environment.
clara commented 7 years ago
Owner

Reminder to myself: instead of muting the microphone, we should pause the recognizer. This wasn’t possible at all with the shell scripts I had written for Kaylee, so I muted the microphone instead.
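
A minimal sketch of what pausing the recognizer (rather than muting the microphone) could look like. `PausableRecognizer`, `speak`, and the `decode` and `tts` callables are hypothetical stand-ins, not Kaylee's actual recognizer or TTS code.

```python
import contextlib
import threading


class PausableRecognizer:
    """Wraps a decoder and drops audio while at least one pause is active."""

    def __init__(self, decode):
        self._decode = decode     # hypothetical callable: audio chunk -> text
        self._pause_depth = 0     # a counter, so nested pauses behave sanely
        self._lock = threading.Lock()

    @contextlib.contextmanager
    def paused(self):
        with self._lock:
            self._pause_depth += 1
        try:
            yield
        finally:
            with self._lock:
                self._pause_depth -= 1

    def feed(self, chunk):
        """Decode a chunk, or return None if the recognizer is paused."""
        with self._lock:
            if self._pause_depth > 0:
                return None
        return self._decode(chunk)


def speak(recognizer, text, tts):
    """Run TTS with the recognizer paused so speech cannot trigger commands."""
    with recognizer.paused():
        tts(text)
```

The microphone stays live the whole time; only the recognizer ignores what it hears while `tts` is running.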
clara commented 7 years ago
Owner

PulseAudio’s module-echo-cancel works pretty well, making recognizer pausing less important. It would still be nice to have, because sometimes the echo cancellation doesn’t kick in until the speakers have been playing sound for a second or so, which may be enough to get a “hello world” loop started. Recognizer pausing should definitely be optional though, because echo cancellation can allow commands to be recognized while TTS is running, e.g. at the end of a timer.

Built-in TTS functionality might be harder to implement in a nice way than I expected. It should be available to Python code directly for things like a weather module, but it should also be available to shell commands to provide consistency and recognizer pausing. How can I pause the recognizer of another process? Maybe D-Bus is a good solution. I should look into that more.
clara commented 7 years ago
Owner

Thinking about this more, I’m less convinced that pausing the recognizer while Kaylee is speaking should be optional. It seems like being able to interrupt her and be understood just wouldn’t be useful in many situations.

D-Bus is definitely the way to go for IPC. I plan on using pydbus (https://github.com/LEW21/pydbus) for this, which I’m happy with both because it has a very friendly API and because it uses Gio underneath, and I’m already using other GLib stuff.
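
As an illustration of the pydbus approach, here is a sketch of a service that lets another process pause and resume the recognizer. The bus name `org.example.Kaylee`, the interface `org.example.Kaylee.Recognizer`, and the `RecognizerControl` class are made up for this example and are not Kaylee's real D-Bus interface. Only `SessionBus`, `publish`, and the introspection-XML docstring convention come from pydbus itself.

```python
class RecognizerControl:
    """
    <node>
      <interface name='org.example.Kaylee.Recognizer'>
        <method name='Pause'/>
        <method name='Resume'/>
        <property name='Paused' type='b' access='read'/>
      </interface>
    </node>
    """
    # pydbus reads the introspection XML above from the class docstring.

    def __init__(self):
        self._paused = False

    def Pause(self):
        self._paused = True

    def Resume(self):
        self._paused = False

    @property
    def Paused(self):
        return self._paused


def serve():
    """Publish the object on the session bus and run a GLib main loop."""
    # Imported here so the class above can be exercised without a D-Bus session.
    from gi.repository import GLib
    from pydbus import SessionBus

    bus = SessionBus()
    bus.publish("org.example.Kaylee", RecognizerControl())  # hypothetical name
    GLib.MainLoop().run()
```

Once `serve()` is running, a Python client could call `SessionBus().get("org.example.Kaylee").Pause()`, and a shell command could reach the same method through any D-Bus command-line client.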
clara commented 7 years ago
Owner

Plugin support has been merged into master. TTS can be done by sending signals, and it’s now within reach to implement a timer plugin. As such, I think this issue is mostly done, so I’m closing it.