Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Extend-download-area-options #114

Open
wants to merge 6 commits into
base: main
Choose a base branch
from

Conversation

GISAM
Copy link
Collaborator

@GISAM GISAM commented Mar 16, 2025

This update adds new options, providing greater flexibility in choosing the download extent (#10):

Download the current viewport – Save only the visible area.
Draw an area to download – Select a custom region.
Download the entire dataset – Retrieve the full dataset.

When i get time, I'll explore extracting geometry from an uploaded file or maybe selecting a layer from the table of contents.

GISAM added 4 commits March 15, 2025 22:46
This commit introduces a new dialog class, DownloadAreaDialog, which allows users to select different download area options: current extent, drawn area, or entire dataset. The dialog includes radio buttons for each option, an information label about potential file sizes, and OK/Cancel buttons for user interaction. The selected option is managed through a toggle method, enhancing the user experience for downloading data.
This commit introduces the DrawAreaInstructionsDialog class, which provides users with step-by-step instructions for drawing a download area on the map. The dialog includes a label with detailed instructions and buttons for confirming or canceling the drawing action. Additionally, the plugin's main logic has been updated to incorporate this new dialog, enhancing the user experience when selecting download areas.
… accordingly, allowing for no spatial filter when the entire dataset is downloaded.
@mtravis
Copy link
Collaborator

mtravis commented Mar 16, 2025

@GISAM have you tested downloading the whole dataset yet? I'm not sure DuckDB would handle converting some of the larger datasets to parquet, etc without us telling it to spill in to a temp memory location. Even then it might fail on say something like Overture buildings which is over 200GB.

@GISAM
Copy link
Collaborator Author

GISAM commented Mar 16, 2025

@GISAM have you tested downloading the whole dataset yet? I'm not sure DuckDB would handle converting some of the larger datasets to parquet, etc without us telling it to spill in to a temp memory location. Even then it might fail on say something like Overture buildings which is over 200GB.

@mtravis I don't think I'd even have the memory on my computer to download 200GB! All could really do for testing was to start a download for the entire dataset, then cancel it once the process began. I can look at adding a temporary directory path for spilling and set a memory limit to ensure DuckDB spills to disk when needed. And maybe, set up a process to check available disk space, compare it to the file size, and generate an "insufficient memory" message if there’s not enough disk space to handle the download.

@mtravis
Copy link
Collaborator

mtravis commented Mar 20, 2025

@GISAM have you tested downloading the whole dataset yet? I'm not sure DuckDB would handle converting some of the larger datasets to parquet, etc without us telling it to spill in to a temp memory location. Even then it might fail on say something like Overture buildings which is over 200GB.

@mtravis I don't think I'd even have the memory on my computer to download 200GB! All could really do for testing was to start a download for the entire dataset, then cancel it once the process began. I can look at adding a temporary directory path for spilling and set a memory limit to ensure DuckDB spills to disk when needed. And maybe, set up a process to check available disk space, compare it to the file size, and generate an "insufficient memory" message if there’s not enough disk space to handle the download.

@GISAM

That could work I guess but I can see foresee there being issues with QGIS locking up for a considerable time. Maybe the plugin isn't the best tool for doing this?

@GISAM
Copy link
Collaborator Author

GISAM commented Mar 20, 2025

That could work I guess but I can see foresee there being issues with QGIS locking up for a considerable time. Maybe the plugin isn't the best tool for doing this?

Fair point! I can't imagine many users would want to download the entire datasets anyway, as for large scale analysis, something like Apache Sedona or Whereobots would be better suited than QGIS. I can remove the option to download the entire dataset.

…. This option gets the total bounding box extent of all the combined features within the uploaded file.
@GISAM
Copy link
Collaborator Author

GISAM commented Mar 24, 2025

@cholmes i think test_plugin.py might be hanging because the plugin's run method now includes additional dialog boxes (DownloadAreaDialog and DrawAreaInstructionsDialog) that wait for user input, but the test cases only mock the initial DataSourceDialog.

So, when the test runs, it gets past the first mocked dialog but then encounters these new unmocked dialogs which wait indefinitely for a user interaction that will never come.

@cholmes
Copy link
Owner

cholmes commented Mar 24, 2025

Awesome work @GISAM! Sorry for not sounding in earlier, was at a sprint last week.

@cholmes i think test_plugin.py might be hanging because the plugin's run method now includes additional dialog boxes (DownloadAreaDialog and DrawAreaInstructionsDialog) that wait for user input, but the test cases only mock the initial DataSourceDialog.

Could you try to mock those as well? I did most all the initial tests with AI support, and it was relatively easy. I did struggle some with some of the user interaction stuff, and so removed some of it. So if you need to remove some tests to get this in that's ok. But I'd like to try to move us towards always adding new tests for new functionality, and slowly try to get the test coverage to at least 90%. 0.7.x had a number of issues where smaller things broke when a new thing got added, so tests should make us more confident in adding new things. If you have to reduce the tests to get this in then just be sure to ticket that we need coverage of it.

I'm not sure DuckDB would handle converting some of the larger datasets to parquet, etc without us telling it to spill in to a temp memory location.

I'm 97% sure that DuckDB will handle this fine - in my experience it automatically spills to disk pretty seamlessly most of the time.

Even then it might fail on say something like Overture buildings which is over 200GB.

I do think it'll likely do ok. But I do think many users will have no idea what they're getting into. I do think we should warn people at least on Overture datasets just how big things are. I think we can just hard code that to start. But ideally we could query datasets and give people at least an estimate of the size, for any dataset, if it's over maybe 5 gigabytes? Or we could just query number of rows and give rough estimate ranges.

I can remove the option to download the entire dataset.

I lean towards keeping it in. I like the idea of trying to get user disk size, combined with what I said above where we give the user a warning of the size (should be with full downloads and also large areas - like always try to check size and warn users, like we do with geojson). But we could remove from this PR and ticket and add back in as a next step.

@cholmes
Copy link
Owner

cholmes commented Mar 27, 2025

Thanks for fixing the test! Really great work all around on this. I just tried it out, and I think at least a couple tweaks would be good.

  • I think it'd be better to have some way to select which way you want to get the extent before you select 'ok' on the layers you want, as this introduces an extra step for everyone, even if they just want to download the extent. An ideal could be that there is an 'extent' field that gets auto-populated by the current extent. Not sure how possible / tricky that is, but then it'd make it clear what the extent is in the default case. And then there could be a couple buttons to get the extent in alternate ways.

  • I like how the draw area works, it feels quite nice. But I'd call it 'draw box' at least some where, as then I think people have to read less of the instructions, and it also makes it clear that other types of areas aren't supported. Could potentially add 'draw polygon', though I'm not exactly sure how the interaction would work.

  • For 'upload a vector file' I'd make it clear that it's just using the bounds of the vector file, not only download the areas of the vector file.

  • Generally the 'upload a vector file' feels weird to me, since it's not really uploading, and it's really easy to drag and drop a layer into qgis. If we're to have one more interaction I'd lean towards something that leverages either the active layer or selected features - like make use of the access we have in QGIS, don't make people add a new layer (or re-upload a layer they have), just have them make use of the map.

The Planet QGIS plugin does a few of these things:

extents

And I realized I didn't do the selected feature one right in the above:

extent-select

I don't think Planet plugin is ideal here, there's too many options and too much overlap. And their draw geometry doesn't work that great. They do have the 'upload' option too, so it does seem like some precedent for it. But I do still lean away from it.

I don't have an exact thing in mind, but I lean towards more like ~3 options - one that's current extent, one that is draw (maybe with geometry, but certainly bbox to start, and then one (or two?) based on a layer. I think the ones I like most are 'selected features', and rely on QGIS's select functionality, and bounds of a selected layer. Perhaps they could be combined where if there's no selected features then it uses the selected layer.

And then I do lean towards some option in the main dialog for extent, maybe one that expands out to show more options.

But I'm also up for us just keeping it simpler to start, and mostly clarify and streamline what you already have. Thanks a ton for your contribution, I really appreciate it.

@GISAM
Copy link
Collaborator Author

GISAM commented Mar 29, 2025

@cholmes thanks for the input, I'll make some tweaks as soon as i get a chance.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants