curate apply-geolocation-rules: Fails when fields contain null values #1758

Open
joverlee521 opened this issue Feb 14, 2025 · 0 comments
Labels
bug Something isn't working

Comments

@joverlee521
Contributor

Discovered in ebola workflow

If the NDJSON includes null values, they are parsed as None in the Python dict.
However, the command expects a string value whenever a geolocation field exists in the record, so calling .lower() on None raises an uncaught error:

Traceback (most recent call last):
  File "/nextstrain/augur/augur/__init__.py", line 70, in run
    return args.__command__.run(args)
  File "/nextstrain/augur/augur/curate/__init__.py", line 246, in run
    dump_ndjson(validated_output_records)
  File "/nextstrain/augur/augur/io/json.py", line 89, in dump_ndjson
    for item in iterable:
  File "/nextstrain/augur/augur/curate/__init__.py", line 154, in validate_records
    for idx, record in enumerate(records):
  File "/nextstrain/augur/augur/curate/apply_geolocation_rules.py", line 227, in run
    annotated_values = transform_geolocations(
  File "/nextstrain/augur/augur/curate/apply_geolocation_rules.py", line 163, in transform_geolocations
    annotated_values = get_annotated_geolocation(geolocation_rules, transformed_values, case_sensitive)
  File "/nextstrain/augur/augur/curate/apply_geolocation_rules.py", line 109, in get_annotated_geolocation
    return get_annotated_geolocation(geolocation_rules, raw_geolocation, case_sensitive, rule_traversal)
  File "/nextstrain/augur/augur/curate/apply_geolocation_rules.py", line 142, in get_annotated_geolocation
    return get_annotated_geolocation(geolocation_rules, raw_geolocation, case_sensitive, rule_traversal)
  File "/nextstrain/augur/augur/curate/apply_geolocation_rules.py", line 109, in get_annotated_geolocation
    return get_annotated_geolocation(geolocation_rules, raw_geolocation, case_sensitive, rule_traversal)
  File "/nextstrain/augur/augur/curate/apply_geolocation_rules.py", line 142, in get_annotated_geolocation
    return get_annotated_geolocation(geolocation_rules, raw_geolocation, case_sensitive, rule_traversal)
  File "/nextstrain/augur/augur/curate/apply_geolocation_rules.py", line 109, in get_annotated_geolocation
    return get_annotated_geolocation(geolocation_rules, raw_geolocation, case_sensitive, rule_traversal)
  File "/nextstrain/augur/augur/curate/apply_geolocation_rules.py", line 95, in get_annotated_geolocation
    current_rules = current_rules.get(field_value.lower())
AttributeError: 'NoneType' object has no attribute 'lower'
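
A minimal sketch of the failure outside of augur, assuming a record with a null `division` field. The field names and the `normalize_geolocation_value` helper are hypothetical and only illustrate one possible guard; this is not augur's code, and whether a null should be treated as an empty string is an assumption.

```python
import json

# One NDJSON line with a null geolocation field (illustrative field names).
ndjson_line = '{"region": "Africa", "country": "Guinea", "division": null, "location": ""}'
record = json.loads(ndjson_line)  # JSON null becomes Python None


def normalize_geolocation_value(value):
    """Hypothetical guard: treat a JSON null like an empty string."""
    return "" if value is None else value


# Same kind of call as the last frame of the traceback: field_value.lower()
try:
    record["division"].lower()
    raised = False
except AttributeError:
    raised = True

# With the guard in place, the lookup key can be built without an error.
safe_division = normalize_geolocation_value(record["division"]).lower()
```

Here `raised` ends up `True` because `None` has no `lower` method, while `safe_division` is an empty string.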
@joverlee521 joverlee521 added the bug Something isn't working label Feb 14, 2025
joverlee521 referenced this issue in nextstrain/ebola Feb 14, 2025
Preview: <https://nextstrain.org/community/victorlin/nextstrain-test@e88bb6a/ebola/all-outbreaks-pathoplexus>

Downloading the data is trivial with the public API endpoints. Some
de-duplication is required, and the column names were the main hurdle.
The augur curate commands raised many unhandled exceptions. For most of
them I could figure out what was going wrong, but in some cases, such as
apply-geolocation-rules, I couldn't figure out how to add division info
without causing an error.