selecting terminology: steps, horizons, and intervals #1
The CF standard name table (http://cfconventions.org/Data/cf-standard-names/current/build/cf-standard-name-table.html) uses forecast_period as an attribute: "Forecast period is the time interval between the forecast reference time and the validity time. A period is an interval of time, or the time-period of an oscillation." In climpred (https://climpred.readthedocs.io/en/stable/index.html) we call this dimension lead; in s2s-ai (https://s2s-ai-challenge.github.io/#data) it is called lead_time. |
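To make the CF definition quoted above concrete, here is a minimal sketch that computes a forecast period as validity time minus forecast reference time and prints it as an ISO 8601 duration. The function names and the example timestamps are made up for illustration; they are not taken from CF, climpred, or the extension.

```python
from datetime import datetime, timedelta, timezone


def forecast_period(reference_time: datetime, validity_time: datetime) -> timedelta:
    """Interval between the forecast reference time and the validity time."""
    return validity_time - reference_time


def to_iso_duration(delta: timedelta) -> str:
    """Format a non-negative timedelta using only D/H/M/S designators."""
    total = int(delta.total_seconds())
    if total < 0:
        raise ValueError("negative durations are not handled in this sketch")
    days, rem = divmod(total, 86400)
    hours, rem = divmod(rem, 3600)
    minutes, seconds = divmod(rem, 60)
    date_part = f"{days}D" if days else ""
    time_part = "".join(
        f"{value}{unit}"
        for value, unit in ((hours, "H"), (minutes, "M"), (seconds, "S"))
        if value
    )
    if not date_part and not time_part:
        return "PT0S"
    return "P" + date_part + ("T" + time_part if time_part else "")


# Hypothetical forecast: issued at 00:00 UTC, valid 30 hours later.
reference = datetime(2022, 1, 1, 0, 0, tzinfo=timezone.utc)
validity = datetime(2022, 1, 2, 6, 0, tzinfo=timezone.utc)
period = forecast_period(reference, validity)
period_text = to_iso_duration(period)
```

Whatever the field ends up being called (lead, lead_time, step, horizon), the value is the same difference between two timestamps.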
The CF conventions also allow for intervals via bnds/boundaries: http://cfconventions.org/Data/cf-conventions/cf-conventions-1.9/cf-conventions.html#cell-boundaries |
I'm happy to adopt other names; I really had no clue when choosing them and was inspired by an ECMWF implementation that uses step. Maybe it would help (me) to clarify the different types of temporal information that are available. Do I understand correctly that:
|
|
Thanks, I've updated the extension a bit. I renamed the step to horizon and added @aaronspring I did not understand your point 3, sorry. Can you give an example? How can there be a time before the forecast is started? Disclaimer: I'm from another domain and have basically no clue about forecasts... |
Edited 3. Example instantaneous: |
https://confluence.ecmwf.int/display/COPSRV/Guide+to+NetCDF+encoding+for+C3S+providers under time coordinates is a similar overview |
Thanks @m-mohr , this is looking very good. I really like your documentation as well. If I follow correctly, you're mapping the standard STAC notion of This would more closely match the CF conventions @aaronspring cites above -- in which the valid time is called
Perhaps more importantly, this would mean that the |
Not to nitpick, but I'm not sold on the term I do really like how you note this field is formatted as an ISO 8601 duration; that makes it clear what the units are. Could we call it Minor additional comment: I think the standard should be explicit about whether this period applies to the interval before or after the valid time ( |
|
|
so
what I am still thinking about is multiple ways to express and think of
regarding
what about monthly lead forecasts such as http://iridl.ldeo.columbia.edu/expert/SOURCES/.Models/.NMME/? |
I am working with @cboettig on the ecological forecasting applications of these standards. Some thoughts
|
Good points everyone. Regarding 'before' or 'after', @rqthomas is right that it's just about where the Using @m-mohr's ISO-8601 duration format, maybe we could simply add a Re aggregation for instantaneous variables, I like @m-mohr's suggestion of using @aaronspring makes a good point about daily (or even longer) intervals, which are implicitly durations / intervals. I think existing conventions are fine here -- e.g. I think we're still deciding on the term for this "duration" field? I agree with @aaronspring's idea that 'forecast_period' would seem to cover all the aggregations (agg, ave, min, max, or any other aggregating function) -- except as Aaron points out, CF has already given a different definition to that very term, as datetime - reference_datetime (aka horizon). I think we're also still deciding whether we're ok with |
For the planetary computer (also in a STAC catalog) they already have step and this format P0DT1H0M0S |
We're discussing a bit too much at the same time here, so it's a bit confusing to me, but I'll try to wade through it now. The first thing I've done is to swap the definitions for forecast and reference datetime as proposed (also in #7). I've also changed I agree that The I must admit I did not fully get the "before" vs "after" discussion yet. I need to read through it again tomorrow when it's not so close to midnight. ;-) Regarding the granularity of datetimes and the maximum value for the horizon or duration: I don't know where this limitation comes from, but it's actually not in the spec. The datetimes have a millisecond precision and the horizon/durations (i.e. the ISO durations) can pretty much be any length up to thousands of years if you like. ;-) So I don't see an issue there.
It's exactly the same format that we use here, they just provide some optional 0's. You can also simplify to PT1H. Did I forget anything or misunderstand something? Thoughts? Looking forward to them! Thanks for all the discussions, I appreciate all of them! |
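On the point that P0DT1H0M0S and PT1H are the same duration, here is a minimal sketch that parses both and compares them. It assumes only the day, hour, minute, and second designators are used; parse_duration is a hypothetical helper, not part of STAC or of any library.

```python
import re
from datetime import timedelta

# Only D/H/M/S designators are accepted. Years and months depend on the
# calendar and are deliberately rejected in this sketch.
_DURATION = re.compile(
    r"^P(?:(?P<days>\d+)D)?"
    r"(?:T(?:(?P<hours>\d+)H)?(?:(?P<minutes>\d+)M)?(?:(?P<seconds>\d+)S)?)?$"
)


def parse_duration(text: str) -> timedelta:
    """Parse a restricted ISO 8601 duration such as 'PT1H' or 'P0DT1H0M0S'."""
    match = _DURATION.match(text)
    if match is None or text in ("P", "PT"):
        raise ValueError(f"unsupported ISO 8601 duration: {text!r}")
    parts = {
        name: int(value)
        for name, value in match.groupdict().items()
        if value is not None
    }
    return timedelta(**parts)


verbose = parse_duration("P0DT1H0M0S")
short = parse_duration("PT1H")
same = verbose == short
```

A search on the raw string would treat the two spellings as different, so an implementation may want to normalise durations before comparing them.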
🎉 thanks @m-mohr , I think this is looking really good. The forecast-specific terms, I think Okay, good call about datetimes with different precision really just being equivalent. So wrt Aaron's question about 'daily' forecasts, etc, then I guess a Overall this is great, feeling pretty good about the metadata descriptions for all these temporal components. (Though it will definitely help to flesh out some examples, as in #4) |
Just to give a bit of background: most forecast naming conventions, whether from ECMWF, NOAA, CF-NetCDF, WMO, etc., have been devised for meteorologists communicating with meteorologists. When we devised the OGC WMS Time and Elevation Best Practice about 10 years ago, it took some effort to realise that non-expert end users wanted a forecast/hindcast/nowcast for a datetime, or perhaps a specific interval, and are not really interested in whether it was a 2 day or 30 hour or a 15 minute forecast, started at a specific time. The primary datetime for them is the Generally, only the expert providers of the "best data" are concerned about all the other times, and think in terms of And of course the HTH, and apologies if the background is well-known. |
I don't think it's a good idea to add a +/- if it's not part of the ISO8601 spec (which I don't know). On the other hand, it is also not required I think. I think this comes from not enough documentation around datetime and start/end_datetime. The extension right now is meant to always use start/end_datetime if the duration is specified and the datetime is set to null (as of now, see #9). Having the start and end datetimes, you don't need these +/- because you already have the start and end defined, right?
Yes, basically duration is meant to express the difference between the start and end datetimes. So basically start_datetime + duration = end_datetime. datetime itself is not set yet, but we can discuss whether we should set it to something useful, e.g. the start_datetime. In STAC, whenever start and end datetimes are set, the individual domains that author an extension need to check what could be a good use for the datetime field and propose its usage. If there's nothing that is useful, it is set to null. For example, for satellite imagery that has a long capture time, datetime is usually set to the center datetime. I hope you get the idea, but maybe it needs more details to be easier to understand. Let me know please.
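The relation start_datetime + duration = end_datetime can be sketched as follows. This assumes the duration is already available as a fixed-length timedelta and that timestamps are RFC 3339 strings in UTC; end_from_start and the example values are hypothetical and only illustrate the arithmetic.

```python
from datetime import datetime, timedelta, timezone


def end_from_start(start_datetime: str, duration: timedelta) -> str:
    """Return the end timestamp implied by a start timestamp and a duration."""
    # fromisoformat() in older Python versions does not accept a trailing "Z".
    start = datetime.fromisoformat(start_datetime.replace("Z", "+00:00"))
    end = start + duration
    return end.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


# Hypothetical item: a 6-hour accumulation starting at midnight UTC.
end_datetime = end_from_start("2022-01-01T00:00:00Z", timedelta(hours=6))
```

With both start_datetime and end_datetime stored explicitly, the duration becomes a derived convenience value rather than something a client has to add or subtract.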
If I understand it correctly, yes. I can't fully follow @chris-little's comments, but it feels like we already follow the proposal because we set the valid time to be datetime or start/end_datetime, which are the most prominent/used datetimes in STAC. Right? |
Right! Another suggestion: ISO8601 is basically a notation overlaid with calendar rules. It has many options, so the IETF restrictive profile for timestamps, RFC3339, is more useful. But RFC3339 very very carefully does not mention durations. There are sound mathematical and semantic reasons for specifying intervals using the start and end points, rather than one endpoint plus or minus a duration. That is because 'subtracting' timestamps to get a duration, or adding a duration to a startpoint to get an endpoint, relies on software to have implemented calendar algorithms completely correctly. Also, ontologies used for reasoning can work better with intervals than with durations. When one uses durations in, for example, a forecast: T+00, +3 hours, +6, ...+72 hours, etc, technically, that is a different timescale, with origin at T+00, and a Unit of Measure of hours and normal rules of arithmetic. I find keeping that distinction between temporal 'coordinates' and Calendars helpful. |
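A small sketch of the distinction described above, using made-up dates: two calendar intervals that would both be labelled "one month" have different lengths, so they cannot be replaced by one fixed duration, while forecast leads in hours since T+00 are plain numbers with ordinary arithmetic. The helper names are illustrative only.

```python
from datetime import datetime, timezone


def utc(year: int, month: int, day: int) -> datetime:
    """Build a midnight UTC timestamp."""
    return datetime(year, month, day, tzinfo=timezone.utc)


def interval_days(start: datetime, end: datetime) -> int:
    """Length in whole days of an interval given by its two endpoints."""
    return (end - start).days


# Two "monthly" intervals, each stated by start and end point.
january = (utc(2022, 1, 1), utc(2022, 2, 1))
february = (utc(2022, 2, 1), utc(2022, 3, 1))
jan_days = interval_days(*january)
feb_days = interval_days(*february)

# Forecast leads as a separate timescale: hours since T+00, every 3 hours.
lead_hours = list(range(0, 73, 3))
```

Here jan_days and feb_days differ even though both intervals are "one month", which is one reason explicit endpoints are safer than an endpoint plus a calendar duration.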
@chris-little Thanks a lot. The STAC datetimes are all RFC3339 and I think they should be the main source if you want the exact values. The reason why I added a representation of the duration is that it's very useful for search and that's a main focus of STAC. So if you want to search for a forecast with 6h duration it's much easier at the API level to search for |
@m-mohr My suggestion then is to offer both approaches, log the queries, then when stats seem reliable, configure to optimise for the popular queries. And when in doubt, keep it simple, as I am not sure who would want to search for a 6 hour forecast! |
We'll likely offer both, but we can't log it as this is just a specification and we don't know where and how this will be implemented in the end. We are mostly done with this issue, right? I'd like to focus on more concrete/smaller issues as they are easier to discuss instead of having one issue that tackles multiple concerns. So I'm closing this, but please open new issues if anything else needs to be discussed or improved! |
Thanks @m-mohr for starting this, looks like an excellent beginning here. Hope it's okay to open a thread to discuss some terminology. I really like how you have both forecast:datetime and forecast:step, with one of the two being required.
I'm not sure that step is the ideal term. The term 'step' to me at least implies a uniform interval, i.e. that the forecast is using a 3H step size, rather than that the forecast:datetime occurs 3H after the forecast:start_datetime. Also some forecasts, like NOAA's Global Ensemble Forecast System (GEFS), use a variable step size, moving from a 3H to a 6H interval after the first 10 days. I think it is common to use the term "horizon" as the difference between start time and current time, and step as the difference between subsequent observations.
More substantively, it's not clear how to report forecasts of values that are defined only over intervals of time rather than instants. For instance, in GEFS, some values are forecast at the step, whereas others refer to interval predictions, such as the amount of rain accumulating in a 3H-6H interval, or the average, min, or max value during an interval. GEFS calls this column "forecast valid", and uses a text-based set of values like "3hr forecast", "0 -3hr acc", or "0-3 hr ave" to distinguish, which I agree is not very machine readable (see https://www.nco.ncep.noaa.gov/pmb/products/gens/gep01.t00z.pgrb2a.0p50.f003.shtml). Still, we certainly want to be able to express things like "rainfall accumulation" in this standard. (While such values could be converted to a rate, that's really not the same, and obviously NOAA has put some thought into this.) Maybe an additional field is necessary in these contexts.