Willow doesn't recognize my wife's voice #111

Open
dslugPX opened this issue May 26, 2023 · 6 comments
Labels
1.0: Issues to address for 1.0 release
audio: Issues related to audio wake, speech recognition, audio quality, etc
dynamic configuration: Config and behavior changes for post-dynamic configuration support

Comments

@dslugPX

dslugPX commented May 26, 2023

Filing this (1 of 3) from our conversation on reddit.

My wife has a relatively high voice, and it's higher when she's "trying" as she sometimes is when getting used to Willow.

Her voice is often not recognized at all until she actively works to use a much lower register when she speaks (Not Elizabeth Holmes low... but you get the idea!).

Happy to provide whatever additional details would make sense to help. And she's up for helping test too - so we can do some things specifically to help test if you provide some instructions.

@kristiankielhofner
Contributor

Welcome to Willow GH!

We are working through exposing virtually all of the configuration options we have for wake word detection and speech recognition in the Willow Configuration section of config. We'll reference this issue in those commits (likely later today) so you can experiment with them in testing with her.

One question - is wake the only issue? That is to say, can she speak in her natural (however high and excited) voice once she successfully activates wake and have accurate recognition of the command?

Something to try right now - we support multiple wake words. Have you tried an alternative such as Alexa? Pitch, tone, and pronunciation (among others) are fundamental aspects of wake words and we want to try to distinguish between them in working through this issue.

@dslugPX
Author

dslugPX commented May 26, 2023 via email

@kristiankielhofner
Contributor

We don't love it either but it has multiple purposes:

  1. In cases where people are replacing Alexa they have been conditioned to say "Alexa ...." without thinking about it at this point. This allows them to have a smoother onboarding experience with Willow.

  2. It's good for testing pronunciation and other issues like this one.

  3. Some people actually like it.

stintel added a commit that referenced this issue May 27, 2023
This should allow people experiencing problems with the wake word to
experiment with different accuracy.

Related to #111.
stintel added a commit that referenced this issue May 27, 2023
This should allow people experiencing problems with the wake word to
experiment with different accuracy.

Related to #111.

As this is something that should only be changed if instructed by us,
move it in an "Advanced settings" submenu.
kristiankielhofner pushed a commit that referenced this issue May 28, 2023
This should allow people experiencing problems with the wake word to
experiment with different accuracy.

Related to #111.

As this is something that should only be changed if instructed by us,
move it in an "Advanced settings" submenu.
@kristiankielhofner
Contributor

@dslugPX We have more config options in tree you can try but I just thought of something else - can you record speech samples of your wife's voice when Willow fails to wake? We do this in our testing and I'm embarrassed it took me this long to realize it would be helpful here too!

Ideally they would be high quality (two channel, 48 kHz). You can use your phone (or similar) to record several sessions of her speech when Willow fails to wake. When you play them back near Willow, Willow should again fail to wake.

Then, if you could also record a few examples with her higher register when Willow does wake and provide all of them we can do all of the debugging ourselves!
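
Before sending recordings, it may help to confirm they match the requested format (two channel, 48 kHz). The following is a minimal sketch using Python's standard `wave` module, assuming the recordings have been exported as WAV files; the function name `check_recording` and its parameters are hypothetical and not part of Willow.

```python
import wave


def check_recording(path, want_channels=2, want_rate=48000):
    """Return a list of problems found in a WAV file; an empty list means it matches.

    The defaults reflect the format requested in this thread (two channel, 48 kHz).
    This helper is illustrative only and is not part of Willow.
    """
    problems = []
    with wave.open(path, "rb") as wav:
        channels = wav.getnchannels()
        rate = wav.getframerate()
        frames = wav.getnframes()
    if channels != want_channels:
        problems.append(f"expected {want_channels} channels, found {channels}")
    if rate != want_rate:
        problems.append(f"expected {want_rate} Hz, found {rate} Hz")
    if frames == 0:
        problems.append("file contains no audio frames")
    return problems
```

Calling `check_recording("sample.wav")` on each file (the file name here is a placeholder) returns an empty list when the recording is stereo at 48 kHz and a list of mismatches otherwise.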

@dslugPX
Author

dslugPX commented Jun 2, 2023

@kristiankielhofner - Yes! We can do that.
I've also compiled some more data for you around speech recognition.
I have a bunch of aliases, and we have been trying to find ones that work better than others.

Some things that are definitely true: S sounds are really hard. "Skip Space" or "Skip Drums and Space" generally fails.
"Two give me two" works almost every single time.

Anything with a natural pause in cadence also causes issues.

Still seeing a bunch of issues with background noise too.

I owe you some files, and more data. We have been collecting it, but just haven't managed to pull it all together cohesively for you yet. But will. And soon.

@kristiankielhofner
Contributor

Very interesting information, thanks!

Now I'm really interested to hear a few recordings. We've not seen or heard any specific issues with "s sounds" or anything else... The background noise is going to be very interesting as well. The only issue of concern we have there is the end of VAD timing. In the testing we've done (different environments, accents, different types of noise), the source separation and noise cancellation seem to do a very good job as-is.

The recordings are best but can you elaborate on what you mean by "natural pause in cadence"?
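
One way a pause in cadence could interact with end-of-VAD timing is sketched below. This is a generic illustration of a silence-timeout rule, not Willow's actual VAD; the function `end_of_speech`, the per-frame speech flags, and the `timeout_frames` parameter are all assumptions made for the example.

```python
def end_of_speech(frames, timeout_frames):
    """Return the index of the frame where the utterance is declared finished, or None.

    frames: iterable of booleans, True when a frame is judged to contain speech.
    timeout_frames: number of consecutive silent frames, after speech has started,
        that ends the utterance.

    Generic silence-timeout sketch; not taken from Willow.
    """
    silent_run = 0
    heard_speech = False
    for i, is_speech in enumerate(frames):
        if is_speech:
            heard_speech = True
            silent_run = 0
        elif heard_speech:
            silent_run += 1
            if silent_run >= timeout_frames:
                return i
    return None


# A command with a 4-frame pause in the middle, followed by trailing silence.
command = [True] * 5 + [False] * 4 + [True] * 5 + [False] * 10

# With a short timeout, the pause itself ends the utterance early (frame 7),
# so the second half of the command would be dropped.
short = end_of_speech(command, timeout_frames=3)

# With a longer timeout, the pause is tolerated and the utterance ends
# only in the trailing silence (frame 19).
long = end_of_speech(command, timeout_frames=6)
```

Under this rule, any pause longer than the timeout splits the command, which is one possible reading of the "natural pause in cadence" failures; the recordings would be needed to confirm whether that is what happens here.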

kristiankielhofner added the audio, dynamic configuration, and 1.0 labels Jun 16, 2023