Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add new parsing apis #724

Merged
merged 7 commits into from
Dec 14, 2018
Merged

Add new parsing apis #724

merged 7 commits into from
Dec 14, 2018

Conversation

adrienball
Copy link
Contributor

@adrienball adrienball commented Dec 11, 2018

Description:
This PR introduces 2 new parsing APIs.
The first one allows to run intent classification only, and get back a list of intents along with their probability:

>>> nlu_engine.get_intents("turn the lights on")
[
 {
   "intentName": "turnLightOn",
   "probability": 0.6363648460343694
 },
 {
   "intentName": null,
   "probability": 0.2580088944934134
 },
 {
   "intentName": "turnLightOff",
   "probability": 0.22791834836267366
 },
 {
   "intentName": "setTemperature",
   "probability": 0.181781583254962
 }
]

The second API allows to extract the slots when the intent is already known:

>>> engine.get_slots(u"Hey, lights on in the lounge !", "turnLightOn")
[
 {
   "range": {
     "start": 22,
     "end": 28
   },
   "rawValue": "lounge",
   "value": {
     "kind": "Custom",
     "value": "living room"
   },
   "entity": "room",
   "slotName": "room"
 }
]

On top of these two new APIs, an optional top_n parameter is added to the parse method, allowing to perform intent parsing on the top_n most likely intents.

This should address #623, and cover the feature brought by @deeiip in #715.

Checklist:

  • My PR is ready for code review
  • I have added some tests, if applicable, and run the whole test suite, including linting tests
  • I have updated the documentation, if applicable

@adrienball adrienball requested a review from ClemDoum December 11, 2018 18:16
@adrienball adrienball force-pushed the task/new-parsing-apis branch from 76cfd38 to 3e27bd5 Compare December 11, 2018 18:20
@adrienball adrienball mentioned this pull request Dec 11, 2018
3 tasks
@adrienball adrienball force-pushed the task/new-parsing-apis branch from 1b033fa to ec1a307 Compare December 12, 2018 10:51
@codecov-io
Copy link

codecov-io commented Dec 12, 2018

Codecov Report

Merging #724 into develop will increase coverage by 0.27%.
The diff coverage is 91.24%.

@@             Coverage Diff             @@
##           develop     #724      +/-   ##
===========================================
+ Coverage    88.16%   88.44%   +0.27%     
===========================================
  Files           65       66       +1     
  Lines         3853     3963     +110     
  Branches       735      765      +30     
===========================================
+ Hits          3397     3505     +108     
- Misses         347      349       +2     
  Partials       109      109

@adrienball adrienball force-pushed the task/new-parsing-apis branch from ec1a307 to cc4bef5 Compare December 12, 2018 14:07
@@ -73,24 +69,29 @@ def classproperty(func):
# pylint: enable=invalid-name


def type_error(expected_type, found_type):
return TypeError("Expected %s but found: %s" % (expected_type, found_type))
def type_error(expected_type, found_type, object_label=None):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I find the naming a bit misleading now.



def missing_key_error(key, object_label=None):
if object_label is None:
return KeyError("Missing key: '%s'" % key)
return KeyError("Expected %s to have key: '%s'" % (object_label, key))
raise DatasetFormatError("Missing key: '%s'" % key)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I find the naming of the function a bit misleading now.

format"""


class IntentFormatError(SnipsNLUError):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't these last 2 inherit from the DatasetFormatError ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indeed

@@ -116,33 +116,49 @@ def get_intent(self, text, intents_filter=None):
NotTrained: When the intent classifier is not fitted

"""
intents_results = self._get_intents(text, intents_filter)
if not intents_results or intents_results[0][RES_INTENT_NAME] is None:
return None
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm wondering why we did this choice at the time.
I would prefer to prefer to have {"intentName": null, "probability": 0.6} as for the other intents.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe this is the right time.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would also move this logic to the IntentClassifier since the logic of returning the most likely intents, something like:

class IntentClassifier(with_metaclass(ABCMeta, ProcessingUnit)):

    @abstractmethod
    def fit(self, dataset):
        pass

    def get_intent(self, text, intents_filter):
        return self.get_intents(text)[0]

    @abstractmethod
    def get_intents(self, text):
        """Performs intent classification on the provided *text* and returns
        the list of intents ordered by decreasing probability

        The length of the returned list is exactly the number of intents in the
        dataset + 1 for the None intent

        .. note::

            The probabilities returned along with each intent are not
            guaranteed to sum to 1.0. They should be considered as scores
            between 0 and 1.
        """
        pass

since I don't think the logic of returning the most likely intent out of the intents list will change from one IntentClassifier from another.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But then you don't pass the intents_filter parameter.

@adrienball adrienball merged commit c439a81 into develop Dec 14, 2018
@adrienball adrienball deleted the task/new-parsing-apis branch December 14, 2018 15:33
@timtutt
Copy link

timtutt commented Jan 26, 2019

When will there be a new release that includes this PR?

@ClemDoum
Copy link
Collaborator

Hi @timtutt a new release should be available by the end of the week or early next week

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants