-
Notifications
You must be signed in to change notification settings - Fork 274
Rmd template for running workflow #3486
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: develop
Are you sure you want to change the base?
Rmd template for running workflow #3486
Conversation
mdietze
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
First, I think the idea is to work with a Rmd, not a R script. Second, if code is in a Rmd code block, I don't see the advantage of also putting it in a function.
web/workflow_modular.R
Outdated
|
|
||
| run_model_execution <- function(settings_path, debug = FALSE) { | ||
| # Load settings | ||
| settings <- PEcAn.settings::read.settings(settings_path) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think you want to re-load the settings object each time. Just pass the settings object, not the settings path
web/workflow_modular.R
Outdated
| settings <- PEcAn.settings::read.settings(settings_path) | ||
|
|
||
| # Write configs | ||
| if (PEcAn.utils::status.check("CONFIG") == 0) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
status.check doesn't do much outside of the web interface, I think these bits can be dropped
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You're right! PEcAn.utils::status.check() is primarily useful for the web interface, and for a standalone script, it doesn't add much value. We can safely remove those checks and simplify the script while ensuring proper execution
web/workflow_modular.R
Outdated
| if (PEcAn.utils::status.check("CONFIG") == 0) { | ||
| if (debug) cat("Writing model configurations...\n") | ||
|
|
||
| PEcAn.utils::status.start("CONFIG") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same with the status.start and status.end
|
thanks for reviewing ! i will make those changes in the new commits. |
|
@mdietze I think we should keep this file in |
web/workflow_modular.Rmd
Outdated
| output: html_document | ||
| --- | ||
|
|
||
| ```{r libraries} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since this is a RMarkdown file, can you maybe add some text about what each section does. This will help for novice users to understand why this specific function is needed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe add a short description and a link to the documentation.
web/workflow_modular.Rmd
Outdated
| run.ensemble.analysis(plot.timeseries=TRUE) | ||
| ``` | ||
|
|
||
| ```{r finish} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure this is needed. You know when it is done, when the last call returns.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The finish section isn't strictly necessary since the workflow naturally ends when the last function call completes. I included it as a simple confirmation message, especially useful when settings$debug is enabled.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let me know if you prefer it removed!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd prefer to have this block removed
|
@mdietze @robkooper I am curious about your input on this PR—I’ve added descriptions for each section for clarity.Could you please review this once? |
web/workflow_modular.Rmd
Outdated
| # Load PEcAn settings files. | ||
|
|
||
| Open and read in settings file for PEcAn run. | ||
| To create a pecan.xml, you can download one generated in the PEcAn web interface or one of the `pecan.<modelname>.xml` files in the tests/ directory of the PEcAn repository (github.com/pecanproject/pecan). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- This is a good place to start, but I think in the longer term (not this PR) we'll want the Rmd to help build the settings and deprecate the web interface
- There might be useful info you can pull from the tutorials (e.g. Demo 1, Demo 2, etc) to help populate the text between code blocks. Similar to (1), the long term goal (not this PR) is also to update those tutorials around the notebook-based interface
web/workflow_modular.Rmd
Outdated
|
|
||
| settings_path <- "settings.xml" | ||
| settings <- PEcAn.settings::read.settings(settings_path) | ||
| settings <- PEcAn.utils::do_conversions(settings) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do.conversions 100% needs to be in a separate block.
Also, you're missing the steps that check/update the settings. e.g. PEcAn.settings::prepare.settings
web/workflow_modular.Rmd
Outdated
| } | ||
| ``` | ||
|
|
||
| # Trait Analysis |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure if we want the Trait Analysis and Meta Analysis in the default workflow. If we do they should be combined into one block since they're run together by default. If the block is retained, the text need to be expanded to describe this as OPTIONAL and what you need to provide if you elect not to run this (i.e. specify a posterior file in the settings object).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see your point. I'll merge them into one block .
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we have to make sure to clarify that if users skip the Trait & Meta Analysis step, they must manually specify a posterior file in the settings to ensure the model has the necessary input data for further analysis.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Correct. In practice, specifying a posterior file is the default way most runs are done, but it's also true that PEcAn doesn't ship with default posterior files for any models, which argues for keeping these tasks in the workflow. That said, it would be useful to add text/code to show users how to grab the posterior from this analysis so that they can re-use it in their future analyses. It's also worth noting, in the text, that this chunk is one of the few where a connection to the BETY database is currently no optional. A workflow that can run front-to-back with BETY being optional is desirable since it makes installation much simpler and also makes it possible to run in a HPC environment. One thing we could also do is to take the posterior files associated with published analyses (e.g. Fer et al 2018) and make sure they are somewhere publically archived and machine readable so that users could do a demo that doesn't require BETY.
web/workflow_modular.Rmd
Outdated
|
|
||
| ```{r run-model} | ||
| runModule.start.model.runs(settings) | ||
| runModule.get.results(settings) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
move get.results to the Model Analyses block. This function is specific to those analyses
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, could be beyond this PR, but I think either this block, or a block after, would be a great place to show how to visualize the model outputs (e.g. a simple time series plot, a simple bivariate scatter plot). These would replace the interactive visualizations in the old web portal. I'd put this as #1 priority for the next PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
would something simple like this work?
PEcAn.visualization::plot_netcdf(datafile, yvar, xvar, width, height, filename, year) (after taking the inputs)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems reasonable. Once you have a fully working workflow it would be nice if you could post a copy of the knit report so we all can see what the workflow looks like once run.
A few other things to note:
- Please make sure to update changelogs
- Please make sure to update the overall Documentation to reference this workflow
- We should think about what sort of test need to be added to ensure this workflow continues to function (e.g., no new PRs break the workflow). My gut instinct is that this would be an integration test that runs the full workflow for any PR similar to the existing SIPNET Github Action. That said, adding an entirely new GH Action may be beyond an initial PR but is the sort of thing we should follow up on quickly.
web/workflow_modular.Rmd
Outdated
| run.sensitivity.analysis() # Run sensitivity analysis and variance decomposition on model output | ||
| run.ensemble.analysis() # Run ensemble analysis on model output. | ||
| run.ensemble.analysis(plot.timeseries=TRUE) | ||
| ``` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could be beyond this PR, but I think either this block, or a block after, would be a great place to visualize the results of these analyses.
web/workflow_modular.Rmd
Outdated
| run.ensemble.analysis(plot.timeseries=TRUE) | ||
| ``` | ||
|
|
||
| ```{r finish} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd prefer to have this block removed
|
|
|
For that bug, have you identified what line of code is throwing the error and what the values are of the arguments being passed? If you can do that it's usually clear which argument is invalid, and then you can traceback to figure out where that input got misspecified or corrupted |
yes i tried to log them.It looks something like this... But not sure why the values are NULL. |
|
@mdietze I am sharing the knit report for the workflow, with only the last step (running the workflow) removed. From my investigation, the issue seems to stem from logging into RStudio with a different user, which prevents the correct configuration of the path to |
|
I believe this PR is ready now. @mdietze @robkooper When you have a moment, could you please take a quick look and let me know your feedback? I have also shared the knit report above. |
|
|
||
| This tutorial provides a step-by-step guide to running the **PEcAn Modular Workflow**. The PEcAn (Predictive Ecosystem Analyzer) system automates ecological modeling, helping researchers analyze plant functional traits, run model simulations, and perform sensitivity analyses. | ||
|
|
||
| After setting up PEcAn locally, open **RStudio** in your browser and log in with: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These instructions are specific to the Docker stack.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah this should be in general.Will do the changes.
|
|
||
| - Installed **PEcAn** and its dependencies. | ||
| - An XML settings file (`settings.xml`) configured for your use case. | ||
| - A model binary (e.g., **SIPNET** or **ED2**) specified in your settings. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- A couple lines up refers to pecan.xml but here we refer to settings.xml, this will be confusing to new users
- Can we add an example pecan.xml? For now it could be something configured to run specifically in the default Docker stack, but in the future we'll want to add a section to the Rmd itself to build/update the settings object.
- I don't think you need an "or" in the e.g., especially if the default starting point will be SIPNET
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
added the steps for pecan.xml.
| - An XML settings file (`settings.xml`) configured for your use case. | ||
| - A model binary (e.g., **SIPNET** or **ED2**) specified in your settings. | ||
|
|
||
| --- |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
From here down appears to mostly duplicate the Rmd itself, which is unnessisary and redundant, meaning that it will also be hard to maintain as any changes there will have to be duplicated here. How to run the Rmd should be self-documenting.
...e/02_demos_tutorials_workflows/02_user_demos/06_Workflow_using_Rmd_template.Rmd/workflow.Rmd
Outdated
Show resolved
Hide resolved
...e/02_demos_tutorials_workflows/02_user_demos/06_Workflow_using_Rmd_template.Rmd/workflow.Rmd
Outdated
Show resolved
Hide resolved
| PEcAn requires that settings be converted into the correct units before running model simulations. | ||
|
|
||
| ```{r convert-settings} | ||
| if (!is.list(settings$host) || length(settings$host) > 1) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This bit isn't being explained. Conceptually, it belongs in the prepare.settings block (and probably in prepare.settings itself)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is just required for now. @infotroph already raised a Pr for this #3492 .in rstudio this is a solution for the issue described in #3492 .
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- In general, don't put temporary workarounds for unrelated issues into a feature PR. In cases like this where there's already a permanent fix proposed, I usually apply the fix for local testing but do not commit it into the feature branch.
- For this issue specifically, this is the wrong fix as well. The issue in met.process: ensure host arg is passed on as a list #3492 wasn't the format of the host block, it was how
met.processhandled it internally, and it's expected for settings$host to contain other items besidesname. Removing those other items will break later parts of the workflow that need to use them.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Apologies for that. As the issue is solved now, this can be safely removed.
|
|
||
| # Trait and Meta Analysis | ||
|
|
||
| PEcAn retrieves plant trait data and performs meta-analysis to derive parameter distributions for the model. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
still need to explain this is optional and what the alternatives are
| Model-specific configuration files are generated before running simulations. | ||
|
|
||
| ```{r run.write.configs} | ||
| settings$model$binary <- "~/pecan/models/sipnet/" # Update the path to your model |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updates to the settings need to be done higher up (e.g. around where you update outdir) and you need to explain what specifically you are doing in a way that a novice user would be able to update.
Also, this path is very misleading as 1. it points to a folder, not a binary and 2. no one should be installing the model binary inside the model coupler folder (or anywhere else in the PEcAn code itself)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
actually this will point to model path but in setting as the name of the variable is modelbinary,on the first look it seems to be a binary,but it should be model path.I have tried to write the doc in that it will be clear.
...e/02_demos_tutorials_workflows/02_user_demos/06_Workflow_using_Rmd_template.Rmd/workflow.Rmd
Outdated
Show resolved
Hide resolved
|
|
||
| --- | ||
|
|
||
| # Model Analyses |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
missing the code block that visualizes the output, which should come before the SA and EA. Also, this text should explain that it is run if those bit of the settings are configured (which they are not in a default run) and point the reader to where they would learn about how to configure them. Because this won't run by default it should also be described as optional
|
@mdietze could you review this once more? |
|
I think the description may reference the wrong issue (#2784) could you please check? |
|
@mdietze whenever you have a moment please take a look at the changes once. |
book_source/02_demos_tutorials_workflows/04_modular_workflow.Rmd
Outdated
Show resolved
Hide resolved
book_source/02_demos_tutorials_workflows/04_modular_workflow.Rmd
Outdated
Show resolved
Hide resolved
|
Hi @mdietze, |
Signed-off-by: Aritra Dey <[email protected]>
Signed-off-by: Aritra Dey <[email protected]>
Signed-off-by: Aritra Dey <[email protected]>
…ow' into feat/modular-workflow
Signed-off-by: Aritra Dey <[email protected]>
|
@AritraDey-Dev should this PR be pulled in or has it been superseded by the (already merged) Demo 1 PR (and this should be closed) |



Description
This is for the issue #1866 and following up on the discussion by @mdietze in the issue #2784
This pull request introduces a new R Markdown file for the PEcAn modular workflow, which includes loading necessary packages, reading settings, and running various analyses. The key changes are summarized below:
New R Markdown file for PEcAn modular workflow:
web/workflow_modular.Rmdwith metadata including title, author, date, and output format.Motivation and Context
This PR fixes #1866
Review Time Estimate
Types of changes
Checklist: