Skip to content

Fix dict get #91

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 2 commits into
base: master
Choose a base branch
from
Open

Conversation

sara-02
Copy link
Contributor

@sara-02 sara-02 commented Feb 14, 2018

Fix the dict.get() functions to prevent errors pertaining to fetching from None type for String or List objects.

sara-02 and others added 2 commits February 14, 2018 20:45
While creating the patch, the loop was appended after setting property, should be before setting the property.
@sara-02 sara-02 requested a review from tuxdna February 14, 2018 15:28
@centos-ci
Copy link
Collaborator

@sara-02 Your image is available in the registry: docker pull registry.devshift.net/bayesian/data-model-importer:SNAPSHOT-PR-91


# Add CVE property if it exists
if 'security_issues' in input_json.get('analyses', {}):
cves = []
for cve in input_json.get('analyses', {}).get('security_issues', {}).get('details', []):
cves.append(cve.get('id') + ":" + str(cve.get('cvss', {}).get('score')))
for cve in input_json.get('analyses').get('security_issues', {}).get('details', []):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should be input_json.get('analyses', {}) ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@tuxdna that condition is checked in the if condition above the for loop if 'security_issues' in input_json.get('analyses', {}): . So I did not check that again.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You should extract analyses and corresponding properties into local variables, and use those variables instead:

analyses = input_json.get('analyses', {})
security_issues = analyses.get('security_issues', {})
if security_issues:
    pass
    # your logic here

This way you will always catch the missing keys early on, and also reduce some errors due to duplicated code and typos.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, I will make that change.

@@ -75,18 +75,18 @@ def construct_version_query(cls, input_json):
if 'code_metrics' in input_json.get('analyses', {}):
count = 0
tot_complexity = 0.0
languages = input_json.get('analyses').get('code_metrics').get('details', {}) \
languages = input_json.get('analyses').get('code_metrics', {}).get('details', {}) \
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should be input_json.get('analyses', {}) ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same as above.

.get('total_lines', -1))

cm_num_files = str(input_json.get('analyses').get('code_metrics').get('summary', {})
cm_num_files = str(input_json.get('analyses').get('code_metrics', {}).get('summary', {})
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should be input_json.get('analyses', {}) ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same as above.

prp_version += " ".join(["ver.property('licenses', '{}');".format(l) for l in licenses])
licenses = input_json.get('analyses').get('source_licenses', {}).get('summary', {}) \
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should be input_json.get('analyses', {}) ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as above.

@tuxdna
Copy link
Contributor

tuxdna commented Feb 14, 2018

Also fix pylint errors.

Copy link
Contributor

@jpopelka jpopelka left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, with some space for improvements

@@ -75,18 +75,18 @@ def construct_version_query(cls, input_json):
if 'code_metrics' in input_json.get('analyses', {}):
count = 0
tot_complexity = 0.0
languages = input_json.get('analyses').get('code_metrics').get('details', {}) \
languages = input_json.get('analyses').get('code_metrics', {}).get('details', {}) \
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why don't you add a default {} to all the input_json.get('analyses').get(...) ?
If that's because you expect the 'analyses' to always be there, then it's actually better to use input_json['analyses'] because:

>>> {}.get('analyses').get('something')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'NoneType' object has no attribute 'get'

>>> {}['analyses'].get('something')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
KeyError: 'analyses'

Which of the above two error messages tells you more about what's happened ? (the second, right ?)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1. The second one is more explicit, so you suggest we replace all occurences of analyses.get() with ['analyses'] ?
That will require modifications in the if statements as well, or should we keep the if statements as is, and only change if code inside the condition.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, my rule of thumb is to use dict['key'] if you are sure the 'key' is there (because you've already checked) or it makes no sense to continue if it wasn't there. Because you want to see an explicit exception it happens to not be there.
And you use dict.get('key', default_value) if your code can continue even without it.
I think you can leave the if statements as they are for now.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok. We can make that check for analyses, because if analysis is not present then out data gathering did not happen properly.

@@ -187,7 +189,7 @@ def construct_version_query(cls, input_json):
@classmethod
def construct_package_query(cls, input_json):
"""Construct the query to retrieve detailed information of given package."""
pkg_name = input_json.get('package')
pkg_name = input_json.get('package', '')
Copy link
Contributor

@jpopelka jpopelka Feb 14, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Technically it's OK, but, well, the question is 'what is better' ?
a) To somehow mask a missing piece to continue in any case (even if that means ingesting incorrect/incomplete data) ?
b) Or to fail early so that someone can check what's happened, fill an issue and fix the cause (even if that means more work) ?

:-)

Someone who prefers (b) would actually do (the same as in my first comment)
pkg_name = input_json['package'] so that you'd get an explicit KeyError: 'package' in case the input data happens to be incomplete.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I made that change, because in the next line we are using the string contained in pkg_name to generate tokens in it.
Even better would be to first get the value and then check that it should not be None or Empty string, because either way it will be useless to ingest, and then throw an error.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Exactly.
You have couple options how to do the 'checking' part, I'd probably use simple assert:

assert input_json.get('package'), "no or empty 'package'"
pkg_name = input_json['package']

This assert raises an AssertionError if the 'package' is not in input_json or the value there is None or ''

You can even have it in a function as we do in worker/utils or worker/base

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is this PR still relevant?

@abs51295
Copy link
Contributor

abs51295 commented Mar 13, 2018

Commenting [test] to see if build works as expected. Build was failing for #120 and want to know if centos-ci is the reason for the failure.

@abs51295
Copy link
Contributor

[test]

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants