-
Notifications
You must be signed in to change notification settings - Fork 1
Implement multi-site local sensitivity analysis workflow #1
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
|
BTW this looks really nice, well done. Though I'd like to work through the steps as I review. Could you please add a readme with instructions on how to use the repository? |
|
thanks for reviewing, updated readme in #2 |
infotroph
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just realized I left these comments unposted on my partial readthrough the other day, so adding now. Would it be useful for me to finish a full review now or wait for updates first?
scripts/012_aggregate_sensitivity.R
Outdated
| #---------------------------------- | ||
| # read.settings() uses xmlToList loses duplicate <variable> tags, so read XML directly | ||
| library(XML) | ||
| xml_doc <- XML::xmlParse(file.path("output", "pecan.CONFIGS.xml")) | ||
| sa_variables <- unique(XML::xpathSApply( | ||
| xml_doc, | ||
| "//sensitivity.analysis//variable", | ||
| XML::xmlValue | ||
| )) | ||
| XML::free(xml_doc) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| #---------------------------------- | |
| # read.settings() uses xmlToList loses duplicate <variable> tags, so read XML directly | |
| library(XML) | |
| xml_doc <- XML::xmlParse(file.path("output", "pecan.CONFIGS.xml")) | |
| sa_variables <- unique(XML::xpathSApply( | |
| xml_doc, | |
| "//sensitivity.analysis//variable", | |
| XML::xmlValue | |
| )) | |
| XML::free(xml_doc) | |
| sa_variables <- settings$sensitivity.analysis |> | |
| (\(x) x[names(x) == "variable"])() |> | |
| unlist() |> | |
| unique() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
| @@ -0,0 +1,108 @@ | |||
| #!/usr/bin/env Rscript | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this site selection process differ from what David already implemented in the downscaling repo? Seems preferable to use an existing tool rather than add that complexity to the UA process.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have refactored this script to completely remove the clustering logic. Instead it now acts purely as a consumer of the shared design_points.csv artifact (currently the david's manually selected 198 sites) to perform the necessary PFT mapping and sub-sampling for UA.
We will treat this as the interface for now, and can revisit or refactor the integration logic once the broader coordination strategy between the Downscaling and Uncertainty workflows is finalized.
see comment #1 (comment)
R/local_sensitivity.R
Outdated
| @@ -0,0 +1,497 @@ | |||
| #' Aggregate local sensitivity results across sites | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure what "local" means here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
good catch regarding the ambiguity
In this context, 'Local' refers to the mathematical method used by pecan's variance decomposition one-at-a-time (OAT) SA, where parameters are varied individually around their median to calculate partial derivatives (elasticities). This is distinct from the global sensitivity analysis we perform later.
I have updated the function documentation to explicitly state this aggregates one-at-a-time (OAT) parameter sensitivity results to avoid confusion with geographical locality.
R/local_sensitivity.R
Outdated
| #' Aggregate local sensitivity results across sites | ||
| #' | ||
| #' @param sensitivity_outdir Directory containing PEcAn sensitivity outputs | ||
| #' @param design_points Data frame of site metadata |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I recommend documenting what columns are expected in this df, and whether unexpected columns are ignored or cause errors
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍 documented
R/local_sensitivity.R
Outdated
| ) | ||
|
|
||
| # Process all files | ||
| all_results <- purrr::map_dfr(sa_files, function(sa_file) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This will be much easier to read and understand if you define it as a function with an informative name -- yes, even if it's only called once as all_results <- purrr::map_dfr(sa_files, name_of_function)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have extracted the file processing logic into a standalone helper function process_sa_file
Thanks for the comments, |
…rely as a consumer of the shared design_points.csv artifact
…t list extraction
| # Currently, this script consumes the 'design_points.csv' from the shared | ||
| # directory. At this stage of the project, these are MANUALLY SELECTED | ||
| # points (198 sites), not yet the output of the automated clustering | ||
| # workflow. | ||
| # | ||
| # TODO: Once the integration architecture between the Downscaling and | ||
| # Uncertainty repos is finalized, refactor this script, continue to consume | ||
| # the artifact generated by that pipeline. | ||
| # ======================================================================= |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
.
Summary:
Implements a complete multi-site local sensitivity analysis (SA) workflow for SIPNET as specified in #151. This PR delivers aggregated sensitivity results across design points, environmental gradient analysis, and a Quarto-based report summarizing parameter importance patterns.
Addresses Issue #151 Requirements