Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix CI Action #14

Merged
merged 2 commits into from
Dec 4, 2024
Merged

Fix CI Action #14

merged 2 commits into from
Dec 4, 2024

Conversation

honzaflash
Copy link
Collaborator

@honzaflash honzaflash commented Dec 3, 2024

https://axds.atlassian.net/browse/ATN-717

  • fix flake8 dependency issue by updating it's version in pre-commit config
  • fix tests failing because of attempted int(math.inf) in bufr.py
    • not sure if this is actually a good behavior: I cap the number to sys.maxsize before conversion to int
    • inf is introduced when calculating drift and distance > 0 and time delta = 0 (on encoding/wildlife_computers.py:76)
      • maybe replacing inf with nan right at the source would be better? I don't know the usage context so I'm not sure.
      • turns out there is a deeper cause described in a comment below

- update version of `flake8` in `pre-commit` config
- `importlib_metadata` in higher versions of python removes deprecated endpoint that is used by older `flake8` in `pre-commit`
- also update the push Action workflow to be more consistent across the jobs regarding versions of stuff
@honzaflash
Copy link
Collaborator Author

honzaflash commented Dec 4, 2024

Turns out there is a deeper cause to the inf problem.

At some point, numpy (or pandas?) changed how datetime64 is parsed/stored.
The time column in the examples/profile.parquet used to load as datetime64[ns] so for conversion to seconds it was divided (wholly //) by 1e9.
In new versions of these packages it gets loaded as datetime64[ms] so after dividing it by extra 1e6 the resulting numbers were often 0. That was the source of divide by zero which produced inf.

New approach to conversion that I just force-pushed uses the library API so it should be slightly more bulletproof. (.astype('timedelta64[s]') should convert the units for us regardless of what they were originally)

- numpy changed how timestamps are parsed into datetime64
- old conversion relied on the units being `ns`
  - this caused the conversion from `ms` to result in many 0 time deltas
  - dividing by 0 produced `inf` and attempting `int(inf)` raised an Error
- now use numpy API to convert to `s`
@lukecampbell lukecampbell merged commit 3063377 into main Dec 4, 2024
12 checks passed
@honzaflash honzaflash deleted the fix-ci branch December 4, 2024 18:36
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants