
Conversation

@Hhh-hyc (Contributor) commented Mar 1, 2025

Description

  1. A Dockerfile for the pecan/fates container is created, based on the pecan/model:latest image on Docker Hub.
  2. template.job and write.config.fates.R are modified to generate a very simple job.sh (a minimal sketch follows this list). It runs successfully with a local dataset inside the standalone pecan/fates container.
  3. Next step: improve the write.config file to modify the FATES parameter file (ongoing).
  4. DISCUSSION/HELP NEEDED: How should the standalone pecan/fates container be linked to the other PEcAn components (e.g., BETY) to communicate input and output?
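
As a rough illustration of point 2, a minimal sketch of how write.config.fates.R might fill template placeholders and emit job.sh, following the convention of other PEcAn model packages. The placeholder names and settings fields here are assumptions for illustration, not the exact ones in this PR:

```r
## Minimal sketch (assumed placeholder names, not necessarily those in this PR)
jobsh <- readLines(con = system.file("template.job", package = "PEcAn.FATES"), n = -1)
jobsh <- gsub('@SITE_LAT@', settings$run$site$lat, jobsh)  # hypothetical placeholder
jobsh <- gsub('@SITE_LON@', settings$run$site$lon, jobsh)  # hypothetical placeholder
writeLines(jobsh, con = file.path(settings$rundir, run.id, "job.sh"))
```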

Motivation and Context

Review Time Estimate

  • Immediately
  • Within one week
  • When possible

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to change)

Checklist:

  • My change requires a change to the documentation.
  • My name is in the list of CITATION.cff
  • I agree that PEcAn Project may distribute my contribution under any or all of
    • the same license as the existing code,
    • and/or the BSD 3-clause license.
  • I have updated the CHANGELOG.md.
  • I have updated the documentation accordingly.
  • I have read the CONTRIBUTING document.
  • I have added tests to cover my changes.
  • All new and existing tests passed.

github-actions bot added the Models label on Mar 1, 2025.
(Dockerfile excerpt under review)

```dockerfile
## Load CTSM from https://github.com/NorESMhub/noresm-land-sites-platform/blob/main/docker/api/entrypoint_api.sh
## Following the structure of ED, download CTSM
WORKDIR /src
RUN git clone https://github.com/ESCOMP/CTSM.git && \
```
@mdietze (Member) commented:
Given that there are already existing CESM Docker container efforts, e.g., https://hub.docker.com/r/openeuler/cesm, I wonder if it would be easier to start from that image and just add the required PEcAn and RabbitMQ bits from pecan/models.

@huitang-earth replied Mar 3, 2025:

Thanks @mdietze for pointing this out. It would be interesting to try, though I am not sure whether installing the PEcAn-related libraries there would actually be easier.
There are two additional concerns: (1) this is a CESM Docker container, and I am not sure whether PEcAn wants the whole CESM or just CLM-FATES; (2) the CLM version used in that image is already two years old (see its Dockerfile), so some recent CLM-FATES development would be missed. I am fairly sure we would need to update the installed libraries such as PIO, NetCDF, and ESMF if we want to run a newer version of CLM-FATES.

@mdietze (Member) replied:

Yeah, it's not required to go this route; I just wanted to bring it up as an option. If you think we can set up the GH Actions to keep a FATES container continuously up to date, and other projects aren't doing that, then that's a strong argument in favor of the current route.

Reply:

I am not sure if @glemieux, @rosiealice, and @serbinsh have any comments about this? Setting up GH Actions to keep a FATES container continuously up to date sounds like a good plan to me.

@glemieux replied:

I think it'd be good to get @briandobbins' insight on what efforts on the NCAR side might exist that you could leverage. I know there are container recipes for E3SM being developed, although I'm not sure about the details of their goals. The FATES team recently built off of @serbinsh's containers for a tutorial that uses a more up-to-date version of E3SM, but we don't currently have funding to keep these up to date: https://github.com/NGEET/fates-containers.

@mdietze (Member) replied:

@glemieux the NGEET FATES containers similarly seem to be about six months old, like the other efforts that @huitang-earth concluded were lagging too far behind model development.

When you say it's built off of @serbinsh's container, does that mean this PEcAn FATES container (which Shawn helped engineer) or a different container? We definitely can't set up a circular dependency. If it's a different container, which one?

Finally, I'm not sure what you mean by "funding to keep these up-to-date", as all of our containers are built by GH Actions -- the containers update automatically every time the mainline of the code is updated on GitHub. We're OK with linking our container to our develop branch, but it's just as easy to link to updates of the stable release version. The key point is that it's not anyone's job to update the containers, so it doesn't take much time/funding (just occasional tweaks to GH Actions).

@glemieux replied Mar 8, 2025:

@mdietze the containers aren't strictly built off of Shawn's containers in the way you mean; I copied the container recipes and updated/modified them in our NGEET org repo. I also realized that I pointed to the wrong repo in my previous message. That repo is actually a fork of the BNL TEST group NGEE-Arctic workshop containers that Shawn developed: https://github.com/NGEET/tutorial-containers

Wrt funding, we simply didn't have time to set up our tutorial containers with GitHub Actions at the time. I fully agree that setting up a development pipeline to keep things up to date is the way to go; it's just currently not high on the priority list relative to other funded goals. All that said, we hope to run this training again in the future, and I would like to use the preparation time for that to set this up.

(code under review)

```r
ncdf4::ncvar_put(nc = surf.nc, varid = 'LONGXY', vals = lon)
ncdf4::ncvar_put(nc = surf.nc, varid = 'LATIXY', vals = lat)
ncdf4::nc_close(surf.nc)
#surf.default <- system.file("surfdata_0.9x1.25_hist_16pfts_Irrig_CMIP6_simyr2000_c190214.nc", package = "PEcAn.FATES")
```
@mdietze (Member) commented:

Same question for the SURFDATA file: is this no longer needed?

Reply:

SURFDATA is still needed. At the moment it is set directly in template.job. We are not sure whether the SURFDATA file is assumed to be given directly as a setting variable (from BETY) in PEcAn, or whether PEcAn wants the SURFDATA file to be generated automatically from a default global file, with variables (e.g., soil texture) then modified using the existing tools from CTSM (https://github.com/ESCOMP/CTSM/tree/master/tools/site_and_regional). We think it is up to you to decide which way is best; we can then implement the changes accordingly.

@mdietze (Member) replied:

So either way, there definitely shouldn't be any requirement to manually run CTSM tools before or after a PEcAn write.config. I think in the old setup I was just setting the lat/lon of the run. I never got around to coupling soil texture, but that can be provided by PEcAn here:

```r
mysoil <- PEcAn.data.land::soil_params(sand=soil.data$fraction_of_sand_in_soil,
```

Like met, PEcAn supports multiple soil-texture input sources, and can even ensemble-sample the uncertainties in some of these (e.g., SoilGrids).
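
For reference, a hedged completion of the truncated call above; the clay argument and the second soil.data field name are assumptions based on PEcAn.data.land conventions, not a quote from the codebase:

```r
## Hedged sketch; everything beyond the `sand` argument is an assumption
mysoil <- PEcAn.data.land::soil_params(
  sand = soil.data$fraction_of_sand_in_soil,
  clay = soil.data$fraction_of_clay_in_soil  # assumed companion field
)
```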

Reply:

The ability to change soil texture and run ensemble-sampling experiments on it sounds really cool. Could we then assume that PEcAn will in any case provide a default surface data file (in NetCDF), plus some "observed" soil or vegetation properties (like soil texture, monthly LAI, etc.), and that write.config will modify the default surface data file with the "observed" or "perturbed" values? Modifying an existing NetCDF file with new values should be easy and straightforward; we can do this together with modifying the FATES parameter file.
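
A minimal sketch of that approach, assuming PEcAn provides a default surface file (surf.default) and soil properties (mysoil, as above); PCT_SAND is a CTSM surface-dataset variable, but the exact variable list and scaling here are illustrative:

```r
## Copy the default surface dataset, then overwrite selected variables
## with PEcAn-supplied values (sketch; names and scaling are assumptions).
surf.file <- file.path(local.rundir, paste0("surfdata.", run.id, ".nc"))
file.copy(surf.default, surf.file, overwrite = TRUE)
surf.nc <- ncdf4::nc_open(surf.file, write = TRUE)
pct_sand <- ncdf4::ncvar_get(surf.nc, "PCT_SAND")
pct_sand[] <- 100 * mysoil$fraction_of_sand_in_soil  # fill all soil layers
ncdf4::ncvar_put(surf.nc, varid = "PCT_SAND", vals = pct_sand)
ncdf4::nc_close(surf.nc)
```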

(code under review)

```r
datm <- gsub('@START_YEAR@', lubridate::year(start_date), datm)
datm <- gsub('@END_YEAR@', lubridate::year(end_date), datm)
writeLines(datm, con = file.path(local.rundir, "datm_atm_in"))
#datm <- readLines(con=system.file("datm_atm_in.template",package = "PEcAn.FATES"),n=-1)
```
@mdietze (Member) commented:

I'm also not following all these commented-out sections. Are none of these needed any more? If so, how is information about start/end date, met file, etc. being passed in now?

@Hhh-hyc (Contributor, Author) replied:

Still needed, but in a different script. The start and end dates are passed via lines 180-182. The domain file is not needed, while the met file and the surface file are set in lines 104-132 of template.job.
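
For concreteness, a hedged example of what those substitutions might look like in write.config.fates.R; the placeholder names are assumptions, not quotes from the PR:

```r
## Assumed placeholder names; the PR's template.job may use different ones
jobsh <- gsub('@START_DATE@', format(as.Date(start_date), "%Y-%m-%d"), jobsh)
jobsh <- gsub('@END_DATE@', format(as.Date(end_date), "%Y-%m-%d"), jobsh)
jobsh <- gsub('@MET_PATH@', met.path, jobsh)  # hypothetical met-file placeholder
```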


(code under review)

```r
## PATHS
jobsh <- gsub('@RUNDIR@', rundir, jobsh)
#jobsh <- gsub('@RUNDIR@', rundir, jobsh) # default in /.cime
```
@mdietze (Member) commented:

Given how PEcAn works, I'm not sure I agree with the decision to not allow PEcAn to set rundir.

@Hhh-hyc (Contributor, Author) replied:

```xml
<?xml version="1.0"?>
<config_machines>

    <machine MACH="docker">
        <DESC>
            Containerized development environment (Docker/Singularity) for CESM
        </DESC>
        <OS>LINUX</OS>
        <COMPILERS>gnu</COMPILERS>
        <MPILIBS>mpich</MPILIBS>
        <CIME_OUTPUT_ROOT>$CASEROOT/</CIME_OUTPUT_ROOT>
        <DIN_LOC_ROOT>/src/data</DIN_LOC_ROOT>
        <DOUT_S_ROOT>${CIME_OUTPUT_ROOT}/archive</DOUT_S_ROOT>
        <GMAKE>make</GMAKE>
        <GMAKE_J>4</GMAKE_J>
        <BATCH_SYSTEM>none</BATCH_SYSTEM>
        <SUPPORTED_BY>cgd</SUPPORTED_BY>
        <MAX_TASKS_PER_NODE>256</MAX_TASKS_PER_NODE>
        <MAX_MPITASKS_PER_NODE>256</MAX_MPITASKS_PER_NODE>
        <PROJECT_REQUIRED>FALSE</PROJECT_REQUIRED>
        <mpirun mpilib="mpich">
            <executable>mpiexec</executable>
            <arguments>
                <arg name="anum_tasks">-n {{ total_tasks }}</arg>
            </arguments>
        </mpirun>
        <module_system type="none">
        </module_system>
        <RUNDIR>$CIME_OUTPUT_ROOT/run</RUNDIR>
        <EXEROOT>$CIME_OUTPUT_ROOT/bld</EXEROOT>
        <environment_variables>
            <env name="NETCDF_PATH">/usr/local</env>
            <env name="PNETCDF_PATH">/usr/local</env>
            <env name="FPATH">/usr/lib</env>
            <env name="CPATH">/usr/lib</env>
            <env name="ESMFMKFILE">/usr/local/lib/libO/Linux.gfortran.64.mpiuni.default/esmf.mk</env>
        </environment_variables>
        <resource_limits>
            <resource name="RLIMIT_STACK">-1</resource>
        </resource_limits>
    </machine>
</config_machines>
```

Per the configuration file in the ~/.cime folder above, in FATES the RUNDIR and BLD directories are set by default inside the case folder. So, for PEcAn, is it possible to use this information directly?
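
One hedged option (an assumption, not something implemented in this PR): if RUNDIR/EXEROOT stay defined in ~/.cime, PEcAn could read them back from the machine config instead of templating them into job.sh, e.g.:

```r
## Sketch: recover RUNDIR for the "docker" machine from config_machines.xml
cfg <- file.path(path.expand("~"), ".cime", "config_machines.xml")
machines <- xml2::read_xml(cfg)
rundir <- xml2::xml_text(
  xml2::xml_find_first(machines, "//machine[@MACH='docker']/RUNDIR")
)
## rundir is "$CIME_OUTPUT_ROOT/run"; the CIME variable still needs expanding
```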

(code under review)

```r
jobsh <- gsub('@BLD@', bld, jobsh)
jobsh <- gsub('@BINARY@', binary, jobsh)
#jobsh <- gsub('@BLD@', bld, jobsh) # default in /.cime
#jobsh <- gsub('@BINARY@', binary, jobsh) # default in /.cime
```
@mdietze (Member) commented:

OK, I'm really not following the decision to drop the build and binary info. How does PEcAn know where the model is located without it?

@Hhh-hyc (Contributor, Author) replied:

If necessary, we could pass this info directly to PEcAn. It was dropped here because it is not needed to run the model; it is already set in the configuration file above.

(code under review)

```r
clm.param.file <- file.path(local.rundir, paste0("clm_params.", run.id, ".nc"))
file.copy(clm.param.default, clm.param.file)
clm.param.nc <- ncdf4::nc_open(clm.param.file, write = TRUE)
## Position of parameters file?
```
@mdietze (Member) commented:

Here I see all the old parameter-file handling being commented out, but not what's replacing it.

Reply:

This is something we are working on. We assume that the ability to modify parameter values in the FATES parameter files according to the "prior" distributions set by PEcAn is one of the key functionalities of the PEcAn workflow. There are also many ways to do it: we have been considering calling the existing FATES tools, but it might also be possible to modify the NetCDF parameter file directly using R. We are not sure yet which way is better; it would be great to hear your opinions.
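
To make the second option concrete, a hedged sketch of modifying the FATES parameter file directly with ncdf4; the trait-to-parameter mapping (Vcmax to fates_leaf_vcmax25top) and the trait.values structure are assumptions for illustration, and the parameter's dimensioning varies across FATES versions:

```r
## Hedged sketch: write sampled trait values into a copy of the FATES
## parameter file; the names around the ncdf4 calls are assumptions.
param.nc <- ncdf4::nc_open(fates.param.file, write = TRUE)
for (pft in seq_along(trait.values)) {
  traits <- trait.values[[pft]]
  if ("Vcmax" %in% names(traits)) {
    vals <- ncdf4::ncvar_get(param.nc, "fates_leaf_vcmax25top")
    vals[pft] <- traits[["Vcmax"]]  # assumes the variable is indexed by PFT
    ncdf4::ncvar_put(param.nc, "fates_leaf_vcmax25top", vals)
  }
}
ncdf4::nc_close(param.nc)
```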
