Fix dict get #91

sara-02 · 2018-02-14T15:27:56Z

Fix the dict.get() calls to prevent errors caused by fetching from a None value where a string or list is expected.

While creating the patch, the loop was placed after the property was set; it should come before the property is set.
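
For context, a minimal sketch of the failure mode the description refers to. The payload is made up for illustration; only the key names come from the diff under review.

```python
# Hypothetical payload with no 'analyses' key at all.
input_json = {"package": "example"}

# Without a default, the first .get() returns None and the chained call fails.
try:
    input_json.get("analyses").get("security_issues", {})
except AttributeError as exc:
    error = str(exc)

# With a default at every level, the chain degrades to an empty result.
details = (
    input_json.get("analyses", {})
    .get("security_issues", {})
    .get("details", [])
)

# Caveat: the default applies only when the key is missing,
# not when the stored value is None.
explicit_none = {"analyses": None}.get("analyses", {})
```

Note the caveat at the end: a key that is present but holds None still yields None, so defaults alone do not cover that case.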

centos-ci · 2018-02-14T15:36:34Z

@sara-02 Your image is available in the registry: docker pull registry.devshift.net/bayesian/data-model-importer:SNAPSHOT-PR-91

tuxdna · 2018-02-14T16:02:00Z

src/graph_populator.py


        # Add CVE property if it exists
        if 'security_issues' in input_json.get('analyses', {}):
            cves = []
-            for cve in input_json.get('analyses', {}).get('security_issues', {}).get('details', []):
-                cves.append(cve.get('id') + ":" + str(cve.get('cvss', {}).get('score')))
+            for cve in input_json.get('analyses').get('security_issues', {}).get('details', []):


Should be input_json.get('analyses', {}) ?

@tuxdna That case is already covered by the if condition above the for loop (if 'security_issues' in input_json.get('analyses', {}):), so I did not check it again.

You should extract analyses and corresponding properties into local variables, and use those variables instead:

    analyses = input_json.get('analyses', {})
    security_issues = analyses.get('security_issues', {})
    if security_issues:
        pass  # your logic here

This way you will always catch the missing keys early on, and also reduce some errors due to duplicated code and typos.
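
Applied to the CVE loop from the diff, the suggestion above might look roughly like this. It is only a sketch: the function name collect_cves is hypothetical, and it uses str.format instead of string concatenation so that a missing 'id' does not raise a TypeError.

```python
def collect_cves(input_json):
    """Return 'id:score' strings for each reported security issue."""
    # Extract each level once, with a default, instead of repeating the chain.
    analyses = input_json.get("analyses", {})
    security_issues = analyses.get("security_issues", {})

    cves = []
    for cve in security_issues.get("details", []):
        score = cve.get("cvss", {}).get("score")
        cves.append("{}:{}".format(cve.get("id"), score))
    return cves
```

With the lookups pulled into local variables, a typo in a key name has to be fixed in one place only.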

Sure, I will make that change.

tuxdna · 2018-02-14T16:03:19Z

src/graph_populator.py

@@ -75,18 +75,18 @@ def construct_version_query(cls, input_json):
        if 'code_metrics' in input_json.get('analyses', {}):
            count = 0
            tot_complexity = 0.0
-            languages = input_json.get('analyses').get('code_metrics').get('details', {}) \
+            languages = input_json.get('analyses').get('code_metrics', {}).get('details', {}) \


Should be input_json.get('analyses', {}) ?

same as above.

tuxdna · 2018-02-14T16:03:24Z

src/graph_populator.py

                         .get('total_lines', -1))

-            cm_num_files = str(input_json.get('analyses').get('code_metrics').get('summary', {})
+            cm_num_files = str(input_json.get('analyses').get('code_metrics', {}).get('summary', {})


Should be input_json.get('analyses', {}) ?

same as above.

tuxdna · 2018-02-14T16:03:31Z

src/graph_populator.py

            prp_version += " ".join(["ver.property('licenses', '{}');".format(l) for l in licenses])
+            licenses = input_json.get('analyses').get('source_licenses', {}).get('summary', {}) \


Should be input_json.get('analyses', {}) ?

Same as above.

tuxdna · 2018-02-14T16:04:45Z

Also fix pylint errors.

jpopelka

LGTM, with some space for improvements

jpopelka · 2018-02-14T15:57:57Z

src/graph_populator.py

@@ -75,18 +75,18 @@ def construct_version_query(cls, input_json):
        if 'code_metrics' in input_json.get('analyses', {}):
            count = 0
            tot_complexity = 0.0
-            languages = input_json.get('analyses').get('code_metrics').get('details', {}) \
+            languages = input_json.get('analyses').get('code_metrics', {}).get('details', {}) \


Why don't you add a default {} to all the input_json.get('analyses').get(...) ?
If that's because you expect the 'analyses' to always be there, then it's actually better to use input_json['analyses'] because:

    >>> {}.get('analyses').get('something')
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    AttributeError: 'NoneType' object has no attribute 'get'
    >>> {}['analyses'].get('something')
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    KeyError: 'analyses'

Which of the above two error messages tells you more about what's happened ? (the second, right ?)

+1. The second one is more explicit, so you suggest we replace all occurrences of input_json.get('analyses') with input_json['analyses']?
That would require modifying the if statements as well. Or should we keep the if statements as they are and only change the code inside the conditions?

Yes, my rule of thumb is to use dict['key'] if you are sure the 'key' is there (because you've already checked) or if it makes no sense to continue without it, because you want to see an explicit exception if it happens not to be there.
And you use dict.get('key', default_value) if your code can continue even without it.
I think you can leave the if statements as they are for now.
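
A small sketch of that rule of thumb, assuming 'analyses' is treated as required and 'code_metrics' as optional. The function name summarize is hypothetical; the key names come from the diff.

```python
def summarize(input_json):
    """Return the total line count, or -1 when code metrics are absent."""
    # Required: index directly so a missing key raises KeyError: 'analyses'.
    analyses = input_json["analyses"]
    # Optional: the code can continue without it, so fall back to defaults.
    summary = analyses.get("code_metrics", {}).get("summary", {})
    return summary.get("total_lines", -1)
```

A payload without 'analyses' now fails with an explicit KeyError, while a payload that merely lacks 'code_metrics' still returns -1.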

OK. We can make that check for 'analyses', because if 'analyses' is not present then our data gathering did not happen properly.

jpopelka · 2018-02-14T16:14:19Z

src/graph_populator.py

@@ -187,7 +189,7 @@ def construct_version_query(cls, input_json):
    @classmethod
    def construct_package_query(cls, input_json):
        """Construct the query to retrieve detailed information of given package."""
-        pkg_name = input_json.get('package')
+        pkg_name = input_json.get('package', '')


Technically it's OK, but, well, the question is 'what is better' ?
a) To somehow mask a missing piece to continue in any case (even if that means ingesting incorrect/incomplete data) ?
b) Or to fail early so that someone can check what's happened, file an issue and fix the cause (even if that means more work) ?

:-)

Someone who prefers (b) would actually do (the same as in my first comment)
pkg_name = input_json['package'] so that you'd get an explicit KeyError: 'package' in case the input data happens to be incomplete.

I made that change because the next line uses the string in pkg_name to generate tokens.
Even better would be to first get the value, check that it is neither None nor an empty string (either would be useless to ingest), and raise an error otherwise.

Exactly.
You have a couple of options for the 'checking' part; I'd probably use a simple assert:

    assert input_json.get('package'), "no or empty 'package'"
    pkg_name = input_json['package']

This assert raises an AssertionError if the 'package' is not in input_json or the value there is None or ''

You can even have it in a function as we do in worker/utils or worker/base
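
One way such a function could look is sketched below. This is not the helper from worker/utils or worker/base, which is not shown in this thread; the name require_non_empty is hypothetical. It raises ValueError rather than using assert, since assert statements are skipped when Python runs with -O.

```python
def require_non_empty(data, key):
    """Return data[key], raising ValueError if it is missing, None, or empty."""
    value = data.get(key)
    # Any falsy value is rejected here, which also includes 0 and empty lists.
    if not value:
        raise ValueError("no or empty '{}'".format(key))
    return value
```

Usage would then be a single line such as pkg_name = require_non_empty(input_json, 'package').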

is this PR still relevant?

abs51295 · 2018-03-13T19:18:03Z

Commenting [test] to see if the build works as expected. The build was failing for #120, and I want to know if centos-ci is the reason for the failure.

abs51295 · 2018-03-13T19:18:19Z

[test]

sara-02 and others added 2 commits February 14, 2018 20:45

Prevent None object error while fetch from dict

fb5dd60

Fix location of cve_id loop,

17ff6cb

While creating the patch, the loop was appended after setting property, should be before setting the property.

sara-02 requested a review from tuxdna February 14, 2018 15:28

tuxdna suggested changes Feb 14, 2018


jpopelka reviewed Feb 14, 2018

sara-02 commented Feb 14, 2018

centos-ci commented Feb 14, 2018

tuxdna Feb 14, 2018

sara-02 Feb 15, 2018

tuxdna Feb 15, 2018

sara-02 Feb 16, 2018

tuxdna Feb 14, 2018

sara-02 Feb 15, 2018

tuxdna Feb 14, 2018

sara-02 Feb 15, 2018

tuxdna Feb 14, 2018

sara-02 Feb 15, 2018

tuxdna commented Feb 14, 2018

jpopelka left a comment

jpopelka Feb 14, 2018

sara-02 Feb 15, 2018

jpopelka Feb 15, 2018

sara-02 Feb 16, 2018

jpopelka Feb 14, 2018

miteshvp Feb 15, 2018

sara-02 Feb 15, 2018

jpopelka Feb 15, 2018

tisnik Mar 12, 2018

abs51295 commented Mar 13, 2018

abs51295 commented Mar 13, 2018
