apply_neighborhood: check if overlap is respected when no neighbours #982

Open
jdries opened this issue Jan 2, 2025 · 1 comment
jdries commented Jan 2, 2025

apply_neighborhood adds nodata at cube borders rather than properly buffering

https://github.com/Open-EO/openeo-geotrellis-extensions/blob/222ff52c8d2146029f3ef6a0847660b3be3bbd54/openeo-geotrellis/src/main/scala/org/openeo/geotrellis/OpenEOProcesses.scala#L1071

The reason: bufferTiles only buffers where neighbouring data is available. We currently work around this with 'makeSquareTiles', but that pads the missing border with nodata.

Option 1

To solve this, we can make buffer sizes a multiple of the tile size. The effect should be that we load an extra full tile on each side rather than only the border.
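The rounding this implies can be sketched as below. This is only an illustration of the arithmetic, not code from the backend: the function name `round_buffer_to_tiles` and the 256 px tile size are assumptions made for the example.

```python
def round_buffer_to_tiles(overlap_px: int, tile_size: int) -> int:
    """Round a requested overlap (in pixels) up to a whole number of tiles.

    Hypothetical helper: name and signature are illustrative only.
    """
    if tile_size <= 0:
        raise ValueError("tile_size must be positive")
    if overlap_px < 0:
        raise ValueError("overlap_px must be non-negative")
    # Integer ceiling division, then scale back to pixels.
    return ((overlap_px + tile_size - 1) // tile_size) * tile_size


# A 32 px overlap with 256 px tiles becomes one extra full tile per side.
print(round_buffer_to_tiles(32, 256))  # prints 256
```

The cost of this approach scales with the tile size: the smaller the internal tile, the less extra data is loaded for a given overlap.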

Option 2

Create 'buffered' tiles immediately at load time instead of in a separate step. This might also speed up the whole process. The caveat is that subsequent processes would need to properly preserve the buffering when transforming tiles, up until the apply_neighborhood.

Also, the 'bufferTiles' call returns an RDD that is no longer of type <SpaceTimeKey,MultibandTile>, so it won't be compatible with anything downstream. And even if we converted back to MultibandTile, it would still be of the incorrect type...

Workaround option

As a workaround, users can simply increase the extent of the load_collection that needs to be buffered, and then clip away bad data after the apply_neighborhood.
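A minimal sketch of the extent arithmetic behind this workaround follows. The helper `expand_extent`, the example coordinates, and the 10 m resolution are all assumptions for illustration; the sketch also assumes a projected CRS whose units match the resolution.

```python
def expand_extent(extent: dict, overlap_px: int, resolution: float) -> dict:
    """Grow a bounding box by overlap_px pixels on every side.

    Hypothetical helper: assumes `resolution` is in the units of the
    extent's CRS (e.g. metres per pixel for a UTM zone).
    """
    margin = overlap_px * resolution
    return {
        **extent,  # keep other keys such as "crs" unchanged
        "west": extent["west"] - margin,
        "south": extent["south"] - margin,
        "east": extent["east"] + margin,
        "north": extent["north"] + margin,
    }


# Made-up area of interest with 10 m pixels and a 32 px overlap.
aoi = {
    "west": 500000.0,
    "south": 5600000.0,
    "east": 510000.0,
    "north": 5610000.0,
    "crs": "EPSG:32631",
}
load_extent = expand_extent(aoi, overlap_px=32, resolution=10.0)
print(load_extent)
```

With the openEO Python client, `load_extent` would then be the `spatial_extent` given to `load_collection`, and the original `aoi` would be passed to `filter_bbox` after `apply_neighborhood` to clip away the border pixels.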

jdries self-assigned this Jan 2, 2025
@VictorVerhaert

In option 1, does "load an extra full tile" refer to an internal tile (set with the tileSize feature flag) or to an actual S2 tile? (I assume the former, but wanted to be sure.)
In that case, the first option might be the fastest solution, with limited performance impact provided that the tileSize is set to a small extent.
