
Restrict number of slots #810

Open
Shotgun167 opened this issue Jun 12, 2019 · 2 comments


Shotgun167 commented Jun 12, 2019

Referencing issue #254

If a "contiguous slot" can be broken into "multiple slots", how is the unordered list of slots to be interpreted? In my case, I have an utterance:

[query](Ask) [name](John) [object](\"Will you be having dinner with us?\")

The trailing string is meant to be a single slot. Instead, it gets broken into multiple slots, some that make sense, and some that do not. It turns the string into a second query. I get:

[query](Ask) [name](John) [object](\"[query](Will) [name](you) [query](be having) [object](dinner with us)?\")

Would there be a way to say this intent can have 3, and only 3, slots?

[EDIT by maintainer: put examples in markdown code blocks]

@adrienball
Contributor

@Shotgun167
Being able to set a fixed number of slots is indeed not possible at the moment, and it would be valuable for use cases like the one you described, where you have free-text slots that are typically a bit long.

The machine learning model used to perform slot filling, a CRF, cannot directly be constrained in such a way. However, there is perhaps room for some tricks, like generating several combinations of slots, each of them of the correct size, and picking the most likely one. This could be heavy in terms of computation though.
I'll pin this in our backlog.
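
Just to illustrate the combination idea above (this is not something Snips NLU does today): assuming hypothetical per-slot confidence scores coming out of the CRF decoding step, a post-processing function could enumerate the non-overlapping combinations of exactly N candidate slots and keep the highest-scoring one. The `score` field and the candidate-slot structure below are assumptions, not the actual library output.

```python
from itertools import combinations

def best_fixed_size_combination(candidate_slots, n_slots):
    """Return the highest-scoring combination of exactly `n_slots`
    non-overlapping candidate slots, or None if no such combination exists.
    Each candidate is assumed to carry a "range" and a hypothetical "score"."""
    def overlap(a, b):
        return (a["range"]["start"] < b["range"]["end"]
                and b["range"]["start"] < a["range"]["end"])

    best, best_score = None, float("-inf")
    for combo in combinations(candidate_slots, n_slots):
        if any(overlap(a, b) for a, b in combinations(combo, 2)):
            continue  # discard combinations with overlapping spans
        score = sum(slot["score"] for slot in combo)
        if score > best_score:
            best, best_score = list(combo), score
    return best
```

As noted, this is combinatorial in the number of candidate slots, which is why it could get heavy in terms of computation.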

The current workaround for this kind of issue is to add more training utterances. Use cases with free-text slots are typically hard ones, so they generally require significantly more training data.

@joffreyvillard

IMHO, a good solution to this problem is to apply some hard-coded rules at the application level after the Snips-NLU parsing step, e.g. protecting any part enclosed between double quotes by removing any slot overlapping with this part, and turning this protected part into a pre-defined slot ("object" in the example above).
As a refinement, to handle multiple slot names / entities (not only "object"), a pre-parsing step could instead replace the whole enclosed part with some keyword (which would have been used for the training of all potential slots in that situation); after the Snips-NLU parsing, the value of the detected slot would then be replaced by the original string (all start/end indexes would need to be updated as well).
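
Not part of the comment above, but here is a minimal sketch of the first variant (protecting a double-quoted span after parsing), assuming the parse result follows the Snips NLU output format where each slot carries a `range` with `start`/`end` character offsets; the `object` slot name, the quote regex, and the `"kind": "Custom"` value are illustrative assumptions.

```python
import re

def protect_quoted_part(text, parse_result, protected_slot_name="object"):
    """Treat any span enclosed in double quotes as a single slot:
    drop every detected slot overlapping that span, then add one
    slot covering the whole quoted part."""
    match = re.search(r'"([^"]*)"', text)
    if match is None:
        return parse_result

    start, end = match.span()
    # Keep only the slots that lie entirely outside the protected span.
    kept_slots = [
        slot for slot in parse_result["slots"]
        if slot["range"]["end"] <= start or slot["range"]["start"] >= end
    ]
    # Turn the protected span into a single pre-defined slot.
    kept_slots.append({
        "range": {"start": start, "end": end},
        "rawValue": match.group(0),
        "value": {"kind": "Custom", "value": match.group(1)},
        "entity": protected_slot_name,
        "slotName": protected_slot_name,
    })
    kept_slots.sort(key=lambda slot: slot["range"]["start"])
    return {**parse_result, "slots": kept_slots}

# Hypothetical usage with a trained SnipsNLUEngine instance:
# result = engine.parse('Ask John "Will you be having dinner with us?"')
# result = protect_quoted_part('Ask John "Will you be having dinner with us?"', result)
```

The second (pre-parsing) variant would additionally need to shift the `start`/`end` indexes of every slot located after the placeholder once the original string is put back.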
