Orfeo backscatter: preserve sparse partitioner #1018
Comments
This might also explain why the point extractions for EUGW cost more than anticipated.
* sar_backscatter: try to preserve partitioner #1018
* sar_backscatter: print partitioner in test, may need an assert #1018
* sar_backscatter: print partitioner in test, may need an assert #1018
* sar_backscatter: should now get partitioner from scala #1018
* sar_backscatter: should now get partitioner from scala #1018
* sar_backscatter: add assert on partitioner #1018
The fix is available on staging. For the weed job, costs went from ~900 to ~100. The improvement comes from the aggregate_temporal step for Sentinel-1, which is now more efficient.
This is probably caused by the fact that aggregate_temporal will insert empty tiles where it expects data.
So if I understand correctly, this influences PGs that use sar_backscatter followed by aggregate_temporal (so not only point extractions)?
The V2 code path of orfeo backscatter results in an RDD with a default partitioner:
openeo-geopyspark-driver/openeogeotrellis/collections/s1backscatter_orfeo.py
Line 1031 in 85ac055
As a result, point extraction jobs suffer from bad performance, which would not happen if a sparse partitioner were preserved.
The V1 code path does have some support for sparse partitioners:
openeo-geopyspark-driver/openeogeotrellis/collections/s1backscatter_orfeo.py
Line 609 in 85ac055
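To illustrate why this matters for point extractions, here is a minimal pure-Python sketch. It is not the driver's or GeoTrellis's actual code: `dense_partition_count` and `SparsePartitioner` are hypothetical names, and the "default" case is modelled under the simplifying assumption that partitions cover every cell of the full grid extent, whereas a sparse partitioner only indexes keys that actually hold data.

```python
from typing import Dict, Iterable, Tuple

Key = Tuple[int, int]  # (col, row) of a tile in the layout grid


def dense_partition_count(cols: int, rows: int, tiles_per_partition: int) -> int:
    """Partitions needed if every cell of a cols x rows extent is covered.

    Simplified model of a non-sparse partitioner (an assumption, not the real one).
    """
    total = cols * rows
    return -(-total // tiles_per_partition)  # ceiling division


class SparsePartitioner:
    """Hypothetical sketch: only keys that actually contain data get a partition."""

    def __init__(self, keys: Iterable[Key], tiles_per_partition: int = 1):
        ordered = sorted(set(keys))  # deduplicate and fix a stable order
        self._index: Dict[Key, int] = {
            key: i // tiles_per_partition for i, key in enumerate(ordered)
        }
        self.num_partitions = -(-len(ordered) // tiles_per_partition)

    def get_partition(self, key: Key) -> int:
        # Raises KeyError for keys without data: they have no partition at all.
        return self._index[key]


# Three point locations scattered over a 1000 x 1000 tile grid.
point_keys = [(3, 7), (900, 12), (450, 450)]
sparse = SparsePartitioner(point_keys, tiles_per_partition=2)
dense = dense_partition_count(1000, 1000, tiles_per_partition=2)
print(f"dense model: {dense} partitions, sparse: {sparse.num_partitions} partitions")
```

In this toy model the sparse variant needs a handful of partitions for a few points, while the dense model scales with the full extent, which is the kind of overhead a later step such as aggregate_temporal would then have to process.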