Skip to content

Conversation

grassesi
Copy link
Contributor

@grassesi grassesi commented Oct 15, 2025

Description

Store lead time as a attribute of the zarr group corresponding to one OuputDataset. When converting OutputDataset to xarray, lead time is available as a coordinate over the axis "forecast_step".

Issue Number

closes #929

Is this PR a draft? Mark it as draft.

Checklist before asking for review

  • I have performed a self-review of my code
  • My changes comply with basic sanity checks:
    • I have fixed formatting issues with ./scripts/actions.sh lint
    • I have run unit tests with ./scripts/actions.sh unit-test
    • I have documented my code and I have updated the docstrings.
    • I have added unit tests, if relevant
  • I have tried my changes with data and code:
    • I have run the integration tests with ./scripts/actions.sh integration-test
    • (bigger changes) I have run a full training and I have written in the comment the run_id(s): launch-slurm.py --time 60
    • (bigger changes and experiments) I have shared a hegdedoc in the github issue with all the configurations and runs for this experiments
  • I have informed and aligned with people impacted by my change:
    • for config changes: the MatterMost channels and/or a design doc
    • for changes of dependencies: the MatterMost software development channel

Copy link
Collaborator

@tjhunter tjhunter left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@grassesi a few comments. Something does not seem right in one piece.

Also, your code is not backward compatible. @iluise : is this fine for you that all the sameples are invalidated? We can make it backward compatible but I am not sure it is worth it.

return list(example_stream.group_keys())

@functools.cached_property
def lead_times(self) -> list[int]:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

are you using it? I don't see a place where you would need it.

Also, let's not used cached_property unless it is vital to do so. Most people will be confused.

@grassesi
Copy link
Contributor Author

@grassesi a few comments. Something does not seem right in one piece.
Yes it is very much untested, I need to do that today, just wanted to get it out there.

Also, your code is not backward compatible. @iluise : is this fine for you that all the sameples are invalidated? We can make it backward compatible but I am not sure it is worth it.
I would very much avoid implementing Backward compatibility for this one 😅

@iluise
Copy link
Collaborator

iluise commented Oct 15, 2025

@grassesi a few comments. Something does not seem right in one piece.

Also, your code is not backward compatible. @iluise : is this fine for you that all the sameples are invalidated? We can make it backward compatible but I am not sure it is worth it.

Yes, no problem for the backward compatibility. If we have the checkpoints we can always re-run inference so I'd not worry about it.

return DataCoordinates(times, coords, geoinfo, channels, geoinfo_channels)

def _extract_sources(self, sample, stream_idx, key, lead_time):
def _extract_sources(self, sample, stream_idx, key, lead_time: int):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we add types for all args.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

Status: No status

Development

Successfully merging this pull request may close these issues.

multiply forecast_step by len_hrs in ZarrIO

4 participants