Member

@AritraDey-Dev AritraDey-Dev commented Mar 14, 2025

Description

This PR addresses issue #1866 and follows up on the discussion with @mdietze in issue #2784.

This pull request introduces a new R Markdown file for the PEcAn modular workflow, which includes loading necessary packages, reading settings, and running various analyses. The key changes are summarized below:

New R Markdown file for PEcAn modular workflow:

  • Added file web/workflow_modular.Rmd with metadata including title, author, date, and output format.
  • Loaded PEcAn packages and settings files to prepare for the workflow execution.
  • Implemented trait analysis to fetch plant trait data and prior distributions.
  • Performed meta-analysis to derive probabilistic distributions for model parameters.
  • Generated model configuration files, executed model simulations, and retrieved results for further analysis.

Motivation and Context

This PR fixes #1866

Review Time Estimate

  • Immediately
  • Within one week
  • When possible

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to change)

Checklist:

  • My change requires a change to the documentation.
  • My name is in the list of CITATION.cff
  • I agree that PEcAn Project may distribute my contribution under any or all of
    • the same license as the existing code,
    • and/or the BSD 3-clause license.
  • I have updated the CHANGELOG.md.
  • I have updated the documentation accordingly.
  • I have read the CONTRIBUTING document.
  • I have added tests to cover my changes.
  • All new and existing tests passed.

@AritraDey-Dev AritraDey-Dev changed the title Feat/modular workflow Feat/monolithic to modular workflow Mar 14, 2025
Member

@mdietze mdietze left a comment

First, I think the idea is to work with an Rmd, not an R script. Second, if code is in an Rmd code block, I don't see the advantage of also putting it in a function.


run_model_execution <- function(settings_path, debug = FALSE) {
# Load settings
settings <- PEcAn.settings::read.settings(settings_path)
Member

I don't think you want to re-load the settings object each time. Just pass the settings object, not the settings path
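
A minimal sketch of that suggestion, assuming the settings object behaves like a nested list. `run_step()` and the `outdir` value are hypothetical stand-ins, not PEcAn functions; in a real workflow the object would come from a single `PEcAn.settings::read.settings()` call.

```r
# Hypothetical step function: it receives the already-loaded settings object
# instead of a path, so the XML file is read only once per workflow.
run_step <- function(settings, debug = FALSE) {
  stopifnot(is.list(settings))
  if (debug) message("Running step with outdir: ", settings$outdir)
  # ... the work for this step would go here ...
  settings
}

# Stand-in for the object returned by PEcAn.settings::read.settings().
settings <- list(outdir = file.path(tempdir(), "pecan_out"))

# Read once, then pass the same object through every step.
settings <- run_step(settings)
settings <- run_step(settings, debug = TRUE)
```

Passing a path string instead of the object fails the `stopifnot()` check, which makes the mistake visible early.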

settings <- PEcAn.settings::read.settings(settings_path)

# Write configs
if (PEcAn.utils::status.check("CONFIG") == 0) {
Member

status.check doesn't do much outside of the web interface, I think these bits can be dropped

Member Author

You're right! PEcAn.utils::status.check() is primarily useful for the web interface, and for a standalone script, it doesn't add much value. We can safely remove those checks and simplify the script while ensuring proper execution

if (PEcAn.utils::status.check("CONFIG") == 0) {
if (debug) cat("Writing model configurations...\n")

PEcAn.utils::status.start("CONFIG")
Member

Same with the status.start and status.end

@AritraDey-Dev
Member Author

Thanks for reviewing! I will make those changes in new commits.

@AritraDey-Dev AritraDey-Dev requested a review from mdietze March 16, 2025 17:53
@AritraDey-Dev
Member Author

AritraDey-Dev commented Mar 17, 2025

@mdietze Should we keep this file in the base/all/inst/ directory instead of the web directory?
I'm curious to know your thoughts on the changes.

output: html_document
---

```{r libraries}
Member

Since this is an R Markdown file, can you add some text about what each section does? This will help novice users understand why each function is needed.

Member

Maybe add a short description and a link to the documentation.

run.ensemble.analysis(plot.timeseries=TRUE)
```

```{r finish}
Member

Not sure this is needed. You know when it is done, when the last call returns.

Member Author

The finish section isn't strictly necessary since the workflow naturally ends when the last function call completes. I included it as a simple confirmation message, especially useful when settings$debug is enabled.

Member Author

Let me know if you prefer it removed!

Member

I'd prefer to have this block removed

@AritraDey-Dev AritraDey-Dev requested a review from robkooper March 17, 2025 19:21
@AritraDey-Dev
Member Author

AritraDey-Dev commented Mar 19, 2025

@mdietze @robkooper I'd appreciate your input on this PR. I've added descriptions for each section for clarity. Could you please review it again?

# Load PEcAn settings files.

Open and read in settings file for PEcAn run.
To create a pecan.xml, you can download one generated in the PEcAn web interface or one of the `pecan.<modelname>.xml` files in the tests/ directory of the PEcAn repository (github.com/pecanproject/pecan).
Member

  1. This is a good place to start, but I think in the longer term (not this PR) we'll want the Rmd to help build the settings and deprecate the web interface
  2. There might be useful info you can pull from the tutorials (e.g. Demo 1, Demo 2, etc) to help populate the text between code blocks. Similar to (1), the long term goal (not this PR) is also to update those tutorials around the notebook-based interface


settings_path <- "settings.xml"
settings <- PEcAn.settings::read.settings(settings_path)
settings <- PEcAn.utils::do_conversions(settings)
Member

do_conversions 100% needs to be in a separate block.

Also, you're missing the steps that check/update the settings. e.g. PEcAn.settings::prepare.settings

}
```

# Trait Analysis
Member

I'm not sure we want the Trait Analysis and Meta Analysis in the default workflow. If we do, they should be combined into one block since they're run together by default. If the block is retained, the text needs to be expanded to describe this step as OPTIONAL and to state what you need to provide if you elect not to run it (i.e. specify a posterior file in the settings object).

Member Author

I see your point. I'll merge them into one block.

Member Author

We have to make clear that users who skip the Trait & Meta Analysis step must manually specify a posterior file in the settings, so that the model has the parameter inputs it needs for later steps.

Member

Correct. In practice, specifying a posterior file is the default way most runs are done, but it's also true that PEcAn doesn't ship with default posterior files for any models, which argues for keeping these tasks in the workflow. That said, it would be useful to add text/code to show users how to grab the posterior from this analysis so that they can re-use it in their future analyses. It's also worth noting, in the text, that this chunk is one of the few where a connection to the BETY database is currently not optional. A workflow that can run front-to-back with BETY being optional is desirable since it makes installation much simpler and also makes it possible to run in an HPC environment. One thing we could also do is to take the posterior files associated with published analyses (e.g. Fer et al. 2018) and make sure they are publicly archived somewhere and machine readable, so that users could do a demo that doesn't require BETY.
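
A rough sketch of how the posterior could be set aside for re-use, assuming the meta-analysis writes its posterior as `.Rdata` files into the PFT output directory. The file names (`post.distns.Rdata`, `trait.mcmc.Rdata`), the directory layout, and the helper `archive_posteriors()` are assumptions for illustration; check the actual PFT output directory before relying on them.

```r
# Hypothetical helper: copy posterior files out of a PFT output directory so
# they can be referenced from the settings of a later run.
archive_posteriors <- function(pft_outdir, archive_dir,
                               pattern = "^(post\\.distns|trait\\.mcmc).*\\.Rdata$") {
  files <- list.files(pft_outdir, pattern = pattern, full.names = TRUE)
  if (length(files) == 0) {
    warning("No posterior files found in ", pft_outdir)
    return(character(0))
  }
  dir.create(archive_dir, recursive = TRUE, showWarnings = FALSE)
  copied <- file.copy(files, archive_dir, overwrite = TRUE)
  file.path(archive_dir, basename(files[copied]))
}

# Demonstration with empty placeholder files in a temporary directory
# (assumed names, no real posterior data).
pft_outdir <- file.path(tempdir(), "pft", "example_pft")
dir.create(pft_outdir, recursive = TRUE, showWarnings = FALSE)
invisible(file.create(file.path(pft_outdir, c("post.distns.Rdata", "trait.mcmc.Rdata"))))

archived <- archive_posteriors(pft_outdir, file.path(tempdir(), "posterior_archive"))
```

The archived paths could then be supplied wherever the settings of a later run expect a posterior file, which would let that run skip the trait and meta-analysis step.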


```{r run-model}
runModule.start.model.runs(settings)
runModule.get.results(settings)
Member

move get.results to the Model Analyses block. This function is specific to those analyses

Member

Also, could be beyond this PR, but I think either this block, or a block after, would be a great place to show how to visualize the model outputs (e.g. a simple time series plot, a simple bivariate scatter plot). These would replace the interactive visualizations in the old web portal. I'd put this as #1 priority for the next PR.

Member Author

@AritraDey-Dev AritraDey-Dev Mar 19, 2025

would something simple like this work?
PEcAn.visualization::plot_netcdf(datafile, yvar, xvar, width, height, filename, year) (after taking the inputs)

Member

Seems reasonable. Once you have a fully working workflow it would be nice if you could post a copy of the knit report so we all can see what the workflow looks like once run.

A few other things to note:

  1. Please make sure to update changelogs
  2. Please make sure to update the overall Documentation to reference this workflow
  3. We should think about what sort of tests need to be added to ensure this workflow continues to function (e.g., no new PRs break the workflow). My gut instinct is that this would be an integration test that runs the full workflow for any PR, similar to the existing SIPNET GitHub Action. That said, adding an entirely new GH Action may be beyond an initial PR but is the sort of thing we should follow up on quickly.

run.sensitivity.analysis() # Run sensitivity analysis and variance decomposition on model output
run.ensemble.analysis() # Run ensemble analysis on model output.
run.ensemble.analysis(plot.timeseries=TRUE)
```
Member

Could be beyond this PR, but I think either this block, or a block after, would be a great place to visualize the results of these analyses.
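
As an illustration of what such a block might contain, here is a base-R sketch that summarizes an ensemble as a median line with a 95% interval. The data are synthetic placeholders generated with `rnorm()`, not PEcAn output, and the variable names are assumptions; a real block would read the ensemble output produced by the workflow.

```r
# Synthetic placeholder data: 50 ensemble members x 100 time steps.
# Real values would come from the model output, not from rnorm().
set.seed(1)
n_members <- 50
n_steps <- 100
time <- seq_len(n_steps)
ens <- t(replicate(n_members, sin(time / 10) + cumsum(rnorm(n_steps, sd = 0.05))))

# Per-time-step 2.5%, 50% and 97.5% quantiles across ensemble members.
qs <- apply(ens, 2, quantile, probs = c(0.025, 0.5, 0.975))

# Draw to a null device so the sketch runs non-interactively;
# inside an Rmd chunk the pdf()/dev.off() lines would be dropped.
pdf(NULL)
plot(time, qs[2, ], type = "n", ylim = range(qs),
     xlab = "Time step", ylab = "Model output (arbitrary units)")
polygon(c(time, rev(time)), c(qs[1, ], rev(qs[3, ])), col = "grey80", border = NA)
lines(time, qs[2, ], lwd = 2)
invisible(dev.off())
```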

@AritraDey-Dev AritraDey-Dev requested a review from mdietze March 19, 2025 19:07
@AritraDey-Dev AritraDey-Dev changed the title Feat/monolithic to modular workflow Rmd template for running workflow Mar 23, 2025
@AritraDey-Dev
Member Author

Screenshot from 2025-03-23 21-25-36
@mdietze, I've been stuck on this issue for a while despite multiple debugging attempts—any insights would be really helpful !

@mdietze
Member

mdietze commented Mar 23, 2025

For that bug, have you identified what line of code is throwing the error and what the values are of the arguments being passed? If you can do that it's usually clear which argument is invalid, and then you can traceback to figure out where that input got misspecified or corrupted
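
One way to do that in base R, sketched under assumptions: `null_args()` and `with_error_context()` are hypothetical helpers written for this example, and the argument names are illustrative rather than taken from the failing call.

```r
# Hypothetical helper: names of the arguments that are NULL.
null_args <- function(args) {
  names(args)[vapply(args, is.null, logical(1))]
}

# Hypothetical wrapper: on error, print the message and the call stack,
# then let the error continue to propagate.
with_error_context <- function(expr) {
  withCallingHandlers(
    expr,
    error = function(e) {
      calls <- vapply(sys.calls(), function(cl) deparse(cl)[1], character(1))
      message("Error message: ", conditionMessage(e))
      message("Call stack at the point of failure:\n", paste(calls, collapse = "\n"))
    }
  )
}

# Illustrative values only; 'rundir' is deliberately left NULL.
args <- list(rundir = NULL, outdir = tempdir(), host = "localhost")
missing_args <- null_args(args)
message("NULL arguments: ", paste(missing_args, collapse = ", "))
```

Wrapping the failing call in `with_error_context()` shows where the error was raised, and `null_args()` narrows down which input arrived empty, so the search can move upstream to wherever that value should have been set.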

@AritraDey-Dev
Member Author

> Screenshot from 2025-03-23 21-25-36 @mdietze, I've been stuck on this issue for a while despite multiple debugging attempts. Any insights would be really helpful!

Yes, I tried to log them. It looks something like this:

Screenshot from 2025-03-23 21-45-55

But not sure why the values are NULL.

@AritraDey-Dev
Member Author

AritraDey-Dev commented Mar 23, 2025

test.pdf

@mdietze I am sharing the knit report for the workflow, with only the last step (running the workflow) removed.

From my investigation, the issue seems to stem from logging into RStudio as a different user, which prevents the path to job.sh from being configured correctly (it sometimes takes values from the pecan directory). I believe the workflow should function correctly in a properly configured environment. I will add the changelog entries and documentation soon.

@AritraDey-Dev
Member Author

I believe this PR is ready now. @mdietze @robkooper When you have a moment, could you please take a quick look and let me know your feedback? I have also shared the knit report above.


This tutorial provides a step-by-step guide to running the **PEcAn Modular Workflow**. The PEcAn (Predictive Ecosystem Analyzer) system automates ecological modeling, helping researchers analyze plant functional traits, run model simulations, and perform sensitivity analyses.

After setting up PEcAn locally, open **RStudio** in your browser and log in with:
Member

These instructions are specific to the Docker stack.

Member Author

Yes, this should be general. I will make the changes.


- Installed **PEcAn** and its dependencies.
- An XML settings file (`settings.xml`) configured for your use case.
- A model binary (e.g., **SIPNET** or **ED2**) specified in your settings.
Member

  1. A couple lines up refers to pecan.xml but here we refer to settings.xml, this will be confusing to new users
  2. Can we add an example pecan.xml? For now it could be something configured to run specifically in the default Docker stack, but in the future we'll want to add a section to the Rmd itself to build/update the settings object.
  3. I don't think you need an "or" in the e.g., especially if the default starting point will be SIPNET

Member Author

added the steps for pecan.xml.

- An XML settings file (`settings.xml`) configured for your use case.
- A model binary (e.g., **SIPNET** or **ED2**) specified in your settings.

---
Member

From here down appears to mostly duplicate the Rmd itself, which is unnecessary and will be hard to maintain, since any changes there will have to be duplicated here. How to run the Rmd should be self-documenting.

PEcAn requires that settings be converted into the correct units before running model simulations.

```{r convert-settings}
if (!is.list(settings$host) || length(settings$host) > 1) {
Member

This bit isn't being explained. Conceptually, it belongs in the prepare.settings block (and probably in prepare.settings itself)

Member Author

@AritraDey-Dev AritraDey-Dev Mar 26, 2025

This is only required for now. @infotroph has already opened a PR for this (#3492). In RStudio, this is a workaround for the issue described in #3492.

Member

@infotroph infotroph Mar 26, 2025

  1. In general, don't put temporary workarounds for unrelated issues into a feature PR. In cases like this where there's already a permanent fix proposed, I usually apply the fix for local testing but do not commit it into the feature branch.
  2. For this issue specifically, this is the wrong fix as well. The issue in "met.process: ensure host arg is passed on as a list" (#3492) wasn't the format of the host block, it was how met.process handled it internally, and it's expected for settings$host to contain other items besides name. Removing those other items will break later parts of the workflow that need to use them.

Member Author

Apologies for that. As the issue is solved now, this can be safely removed.


# Trait and Meta Analysis

PEcAn retrieves plant trait data and performs meta-analysis to derive parameter distributions for the model.
Member

still need to explain this is optional and what the alternatives are

Model-specific configuration files are generated before running simulations.

```{r run.write.configs}
settings$model$binary <- "~/pecan/models/sipnet/" # Update the path to your model
Member

Updates to the settings need to be done higher up (e.g. around where you update outdir) and you need to explain what specifically you are doing in a way that a novice user would be able to update.

Also, this path is very misleading as 1. it points to a folder, not a binary and 2. no one should be installing the model binary inside the model coupler folder (or anywhere else in the PEcAn code itself)
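
To illustrate the distinction between a folder and a binary, here is a small sketch of a check that could sit next to the other settings updates. `check_model_binary()` is a hypothetical helper, and the temporary file only stands in for a real executable.

```r
# Hypothetical check: the model binary setting should name an existing file,
# not a directory.
check_model_binary <- function(binary) {
  if (is.null(binary) || !nzchar(binary)) {
    return("model binary is not set")
  }
  if (dir.exists(binary)) {
    return("model binary points to a directory, not a file")
  }
  if (!file.exists(binary)) {
    return("model binary file does not exist")
  }
  "ok"
}

# Placeholder file standing in for a real executable.
fake_binary <- tempfile("sipnet_")
invisible(file.create(fake_binary))

dir_result <- check_model_binary(tempdir())     # a directory is rejected
file_result <- check_model_binary(fake_binary)  # an existing file passes
```

A check like this could be applied to `settings$model$binary` right after the settings are loaded, so that a misconfigured path is reported before any configs are written.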

Member Author

Actually, this points to the model path. Because the variable in the settings is named modelbinary, at first glance it looks like it should be a binary, but it should be the model path. I have tried to write the documentation so that this is clear.


---

# Model Analyses
Member

This is missing the code block that visualizes the output, which should come before the SA and EA. Also, this text should explain that these analyses run only if those bits of the settings are configured (which they are not in a default run) and point the reader to where they can learn how to configure them. Because this won't run by default, it should also be described as optional.
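
A sketch of how an Rmd chunk might decide whether to run these optional analyses, assuming the settings object is a nested list with `ensemble` and `sensitivity.analysis` blocks. `is_configured()` is a hypothetical helper, and the stand-in settings are illustrative only.

```r
# Stand-in settings object: an ensemble block is present,
# but no sensitivity analysis block has been configured.
settings <- list(
  ensemble = list(size = 10),
  sensitivity.analysis = NULL
)

# Hypothetical helper: TRUE when the named settings block is present and non-empty.
is_configured <- function(settings, block) {
  !is.null(settings[[block]]) && length(settings[[block]]) > 0
}

run_sa <- is_configured(settings, "sensitivity.analysis")
run_ea <- is_configured(settings, "ensemble")

if (!run_sa) message("Skipping sensitivity analysis: not configured in settings.")
if (!run_ea) message("Skipping ensemble analysis: not configured in settings.")
```

The two flags could also be passed to a chunk's `eval` option so that the optional chunks are skipped cleanly when the settings do not request them.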

@AritraDey-Dev AritraDey-Dev requested a review from mdietze March 24, 2025 18:13
@AritraDey-Dev
Member Author

AritraDey-Dev commented Mar 24, 2025

@mdietze could you review this once more?
I have made some changes to the documentation. As you mentioned, it's better not to keep the entire code there, and I completely agree. I kept only a few small pieces of code that help make the explanation understandable. Once this is done, I can start working on the GitHub Action for it.

@dlebauer
Member

I think the description may reference the wrong issue (#2784). Could you please check?

@AritraDey-Dev
Member Author

> I think the description may reference the wrong issue (#2784) could you please check?

The discussion about this with @mdietze started in issue #2784, which is why it is in the description. I will point the PR to issue #1866. Thanks for the suggestion!

@AritraDey-Dev
Member Author

@mdietze whenever you have a moment please take a look at the changes once.

@AritraDey-Dev AritraDey-Dev requested a review from infotroph April 1, 2025 12:06
@AritraDey-Dev
Member Author

Hi @mdietze,
Just checking in on this PR—your review would be helpful to move things along when you have a moment.

@github-actions github-actions bot added the Base label Oct 7, 2025
…r-workflow' into feat/modular-workflow"

This reverts commit 7018a35, reversing
changes made to cf1fb28.
@mdietze
Member

mdietze commented Nov 3, 2025

@AritraDey-Dev should this PR be pulled in or has it been superseded by the (already merged) Demo 1 PR (and this should be closed)

Development

Successfully merging this pull request may close these issues.

Add Rmd template for basic PEcAn workflow
