Skip to content

Add some notes on RasterIndex design #4

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 4 commits into from
Apr 15, 2025
Merged

Add some notes on RasterIndex design #4

merged 4 commits into from
Apr 15, 2025

Conversation

dcherian
Copy link
Owner

@dcherian dcherian commented Apr 9, 2025

@dcherian dcherian changed the title Add some notes Add some notes on RasterIndex design Apr 9, 2025
@dcherian dcherian force-pushed the raster-index-nots branch 11 times, most recently from aa17148 to 3118ca1 Compare April 9, 2025 22:52
@dcherian dcherian force-pushed the raster-index-nots branch from 3118ca1 to 07acaf2 Compare April 10, 2025 02:31
2. All possible transforms have **offsets** which means they need to be kept up-to-date during slicing.
3. Allow extracting metadata necessary to accurately represent the information on disk.

## Some complications
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another complication is the special but very common case where the model space can be represented by orthogonal 1D coordinates (e.g., rectilinear affine with no rotation), which by design is not supported by the xarray.CoordinateTransform base class.

In corteva/rioxarray#846 the workaround consists in introducing an AxisAffineTransform(CoordinateTransform) subclass that wraps a 2D affine but handles one axis, as well as a AxisAffineTransformIndex(CoordinateTransformIndex) subclass. The latter allows providing useful features in the 1D case, such as automatically converting the index into a PandasIndex when indexing at arbitrary locations.


1. Make it easy to read a GeoTIFF with CRS and raster -> model space transformation information in to Xarray with appropriate indexes. There are at least two indexes: one associated with the CRS; and one with the transformation.
2. The raster ↔ model transformation information can be ambiguous, so an explicit API should be provided.
1. [RPCs](http://geotiff.maptools.org/rpc_prop.html):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not familiar with RPCs, but I wonder if this wouldn't allow an Xarray Dataset with, e.g.,

  • x and y coordinates associated with a RasterIndex[AffineTransformIndex]

and

  • lat and lon coordinates associated with a RasterIndex[RPCIndex].

which would enable both ds.sel(x=..., y=...) and ds.sel(lat=..., lon=...).

However, I read that

The RPC model allows a row/column location to be computed for a given lat, long and height value. It is not inherently invertable, though it is usually possible to compute at lat,long location from a row, column and height value using iterative methods.

So IIUC CoordinateTransform.forward() would be trickier to implement. Maybe we could make the implementation optional? This would mean that lat, lon are virtual coordinates that cannot be materialized. But since the RPC model allows CoordinateTransform.inverse() it would still allow ds.sel(lat=..., lon=...).

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This would mean that lat, lon are virtual coordinates that cannot be materialized.

Yes, that's right. I think this may still work with the current model we have, perhaps we need to override the repr.

Each of the wrapped index has an associated transform:
```python
@dataclass
class RasterTransform:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this class meant to inherit from xarray.CoordinateTransform or is it a different concept?

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's a different concept that represents which of the 5 ways the transformation is recorded in GeoTIFF metadata.

@benbovy
Copy link
Collaborator

benbovy commented Apr 10, 2025

That looks great @dcherian. I left a few comments mainly about how to build RasterIndex on top of Xarray coordinate transform, although I'm not sure that these are really relevant here since you seem to focus the notes on how to pass information from/to metadata.

2. ModelTransformationIndex ↔ ModelTransformationTag
3. ModelTiepointScaleIndex ↔ ModelTiepointTag + ModelPixelScaleTag
4. GCPIndex ↔ Ground Control Points
5. RPCIndex ↔ Rational Polynomial Coefficients
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Apparently a 6th possible one could be based on subsampled auxiliary coordinates, detailed in CF section 8.3 and equivalent to GDAL's geolocation arrays with PIXEL_STEP and/or LINE_STEP > 1.

Copy link
Collaborator

@benbovy benbovy Apr 11, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that a decode/encode workflow strategy makes a lot of sense in this case.

Taking the example 8.3 from CF section 8.3, the decode step may consist in:

  • turn the tie point coordinate variables lat(tp_yc, tp_xc) and lon(tp_yc, tp_xc) into lat(yc, xc) and lon(yc, xc) Xarray coordinates associated with a custom transformation index that stores only the tie points. In other words, uncompress the dimensions of the lat & lon coordinates without uncompressing their data.
  • also remove the tie point index variables and the interpolation variable, and track their data / metadata internally in the index

The encode step would then consist in restoring the compressed tie point coordinate & index variables as well as the interpolation variable.

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah nice, i agree completely

@dcherian dcherian force-pushed the raster-index-nots branch from 0097974 to e8d21c1 Compare April 15, 2025 21:54
@dcherian dcherian merged commit 633698f into main Apr 15, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants