Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

add example cfg for v3.LR.historical run on LCRC #694

Merged
merged 10 commits into from
Apr 3, 2025

Conversation

chengzhuzhang
Copy link
Collaborator

Summary

Objectives: add example cfg for helping users to transition to use zppy v3 which has user-facing changes that are not backward compatible.

Issue resolution:

Select one: This pull request is...

  • a bug fix: increment the patch version
  • a small improvement: increment the minor version
  • a new feature: increment the minor version
  • an incompatible (non-backwards compatible) API change: increment the major version

Please fill out either the "Small Change" or "Big Change" section (the latter includes the numbered subsections), and delete the other.

Small Change

  • To merge, I will use "Squash and merge". That is, this change should be a single commit.
  • Logic: I have visually inspected the entire pull request myself.
  • Pre-commit checks: All the pre-commits checks have passed.

Big Change

  • To merge, I will use "Create a merge commit". That is, this change is large enough to require multiple units of work (i.e., it should be multiple commits).

1. Does this do what we want it to do?

Required:

  • Product Management: I have confirmed with the stakeholders that the objectives above are correct and complete.
  • Testing: I have added or modified at least one "min-case" configuration file to test this change. Every objective above is represented in at least one cfg.
  • Testing: I have considered likely and/or severe edge cases and have included them in testing.

If applicable:

  • Testing: this pull request introduces an important feature or bug fix that we must test often. I have updated the weekly-test configuration files, not just a "min-case" one.
  • Testing: this pull request adds at least one new possible parameter to the cfg. I have tested using this parameter with and without any other parameter that may interact with it.

2. Are the implementation details accurate & efficient?

Required:

  • Logic: I have visually inspected the entire pull request myself.
  • Logic: I have left GitHub comments highlighting important pieces of code logic. I have had these code blocks reviewed by at least one other team member.

If applicable:

  • Dependencies: This pull request introduces a new dependency. I have discussed this requirement with at least one other team member. The dependency is noted in zppy/conda, not just an import statement.

3. Is this well documented?

Required:

  • Documentation: by looking at the docs, a new user could easily understand the functionality introduced by this pull request.

4. Is this code clean?

Required:

  • Readability: The code is as simple as possible and well-commented, such that a new team member could understand what's happening.
  • Pre-commit checks: All the pre-commits checks have passed.

If applicable:

  • Software architecture: I have discussed relevant trade-offs in design decisions with at least one other team member. It is unlikely that this pull request will increase tech debt.

@forsyth2
Copy link
Collaborator

forsyth2 commented Mar 21, 2025

@chengzhuzhang I added a second commit -- 36d29729e5b8f04c05deef6e11b99dedbcfdb, but this isn't ready to merge yet. In particular, my to-do list is:

  • Actually run the cfg to make sure it works.
  • Add any missing parameters and/or explanatory notes that are made apparent during running.
  • Have you review the final cfg.

@forsyth2
Copy link
Collaborator

After discussions with @thorntonpe, for this example cfg we'll want to add as many of the variables from the land CSV as possible, and turn on Viewers

@forsyth2
Copy link
Collaborator

@chengzhuzhang I'm currently running the cfg with the changes to display the Land Viewers. Once I get that working, I'll have you do a final review.

Just a note: there is actually one user-facing code change here -- I updated the default.ini to set make_viewer as boolean rather than a string. I don't anticipate that causing any issues, but it means I want to merge this PR before making the zppy final release.

@forsyth2
Copy link
Collaborator

@chengzhuzhang There's a bit of a problem with trying to run all available variables (vars = "") in the ts task.

I'm getting the following when running the ts task lnd_monthly_glb subtask:

ncap2: ERROR 1 dimensions of LAKEICEFRAC_tmp belong to template Internally generated template but 1 dimensions do not
ncap2: ERROR 1 dimensions of TLAKE_tmp belong to template Internally generated template but 1 dimensions do not
ncap2: ERROR 1 dimensions of SOILLIQ_ICE_tmp belong to template Internally generated template but 1 dimensions do not
ncap2: ERROR 1 dimensions of TSOI_ICE_tmp belong to template Internally generated template but 1 dimensions do not
ncap2: ERROR 1 dimensions of T_SCALAR_tmp belong to template Internally generated template but 1 dimensions do not
ncap2: ERROR 1 dimensions of O_SCALAR_tmp belong to template Internally generated template but 1 dimensions do not
ncap2: ERROR 1 dimensions of SOILPSI_tmp belong to template Internally generated template but 1 dimensions do not
ncap2: ERROR 1 dimensions of W_SCALAR_tmp belong to template Internally generated template but 1 dimensions do not
ncap2: ERROR 1 dimensions of SOILICE_ICE_tmp belong to template Internally generated template but 1 dimensions do not
ncap2: ERROR 1 dimensions of H2OSOI_tmp belong to template Internally generated template but 1 dimensions do not

Now, in https://github.com/E3SM-Project/zppy/blob/main/zppy/templates/ts.bash#L46:

{% if mapping_file == 'glb' -%}
vars={{ vars }}
# https://unix.stackexchange.com/questions/237297/the-fastest-way-to-remove-a-string-in-a-variable
# https://stackoverflow.com/questions/26457052/remove-a-substring-from-a-bash-variable
# Remove U, since it is a 3D variable and thus will not work with rgn_avg
vars=${vars//,U}
{%- else %}
vars={{ vars }}
{%- endif %}

And at https://github.com/E3SM-Project/zppy/blob/main/zppy/templates/ts.bash#L67:

cat input.txt | ncclimo \
-c {{ case }} \
{%- if vars != '' %}
-v ${vars} \
{%- endif %}

So, I'm seeing a few issues:

  1. We really need a better way of telling if a variable is 3D or not. Hard-coding the removal of U isn't robust.
  2. Adding in all the variables listed above (LAKEICEFRAC_tmp, etc.) after U won't do anything because vars is the empty string. That is, ncclimo is invoked without -v at all, which means we need some way of telling ncclimo to ignore any variables with malformed dimensions. @czender does such a mechanism exist?
  3. If we want an explicit way of specifying "I want the 354 land variables that are invoked by plots_lnd="all" in the global_time_series task, we'd need to load https://github.com/E3SM-Project/zppy-interfaces/blob/main/zppy_interfaces/global_time_series/zppy_land_fields.csv into zppy as well (it currently exists only in zppy-interfaces).

@forsyth2
Copy link
Collaborator

forsyth2 commented Mar 25, 2025

One more note: I believe I set the variables in https://github.com/E3SM-Project/zppy-interfaces/blob/main/tests/integration/global_time_series/cases_global_time_series.py#L11 explicitly to be ones where the data was actually available:

plots_lnd_metric_average = "FSH,RH2M,LAISHA,LAISUN,QINTR,QOVER,QRUNOFF,QSOIL,QVEGE,QVEGT,SOILWATER_10CM,TSA,H2OSNO,"
plots_lnd_metric_total = (
    "TOTLITC,CWDC,SOIL1C,SOIL2C,SOIL3C,SOIL4C,WOOD_HARVESTC,TOTVEGC,NBP,GPP,AR,HR"
)
plots_lnd_all = plots_lnd_metric_average + plots_lnd_metric_total

That's why https://web.lcrc.anl.gov/public/e3sm/diagnostic_output/ac.forsyth2/zi-test-webdir/global_time_series_1985-1995_results_viewers/table_lnd/index.html list far fewer than 354 variables. I remember going through test output, removing any variables that were raising not-found errors.

@chengzhuzhang
Copy link
Collaborator Author

chengzhuzhang commented Mar 25, 2025

2. Adding in all the variables listed above (LAKEICEFRAC_tmp, etc.) after U won't do anything because vars is the empty string. That is, ncclimo is invoked without -v at all, which means we need some way of telling ncclimo to ignore any variables with malformed dimensions. @czender does such a mechanism exist?

@forsyth2 This post from nco repo can be helpful in this case to have ncclimo tolerate missing variables. Specifically, @czender's comment. Though I'm not sure at this point if ncclimo can filter variables based on dimensions.

@czender
Copy link

czender commented Mar 25, 2025

@forsyth2 and @chengzhuzhang
When ncclimo timeseries mode is called without a -v var_lst then it tries to create timeseries of everything. This runs into trouble in global statistics because, as you know, those are only engineered for single-level variables. In the quick term you could try adding the options --xcl_var -v var_lst where var_lst is now the list of variables that, if found, should be excluded from timeseries mode (the --xcl_var flag means exclude rather than extract the variable list). If that works then great. A longer though doable fix would be for me to change timeseries mode with global statistics to only extract single level fields.

@forsyth2
Copy link
Collaborator

@czender @chengzhuzhang Hmm, that didn't seem to make a difference. I still get:

ncap2: ERROR 1 dimensions of LAKEICEFRAC_tmp belong to template Internally generated template but 1 dimensions do not
ncap2: ERROR 1 dimensions of TLAKE_tmp belong to template Internally generated template but 1 dimensions do not
ncap2: ERROR 1 dimensions of PCT_LANDUNIT_tmp belong to template Internally generated template but 1 dimensions do not
ncap2: ERROR 1 dimensions of T_SCALAR_tmp belong to template Internally generated template but 1 dimensions do not
ncap2: ERROR 1 dimensions of SOILLIQ_tmp belong to template Internally generated template but 1 dimensions do not
ncap2: ERROR 1 dimensions of H2OSOI_tmp belong to template Internally generated template but 1 dimensions do not
ncap2: ERROR 1 dimensions of SOILICE_tmp belong to template Internally generated template but 1 dimensions do not
ncap2: ERROR 1 dimensions of SOILICE_ICE_tmp belong to template Internally generated template but 1 dimensions do not
ncap2: ERROR 1 dimensions of W_SCALAR_tmp belong to template Internally generated template but 1 dimensions do not

despite a bash script that includes:

# Generate time series files                                                                                                                                                                                                        
# If the user-defined parameter "vars" is "", then ${vars}, defined above, will be too.                                                                                                                                             
cat input.txt | ncclimo \
-c v3.LR.historical_0051 \
--xcl_var -v PCT_LANDUNIT_tmp,TLAKE_tmp,LAKEICEFRAC_tmp,SOILLIQ_ICE_tmp,W_SCALAR_tmp,T_SCALAR_tmp,SOILICE_ICE_tmp,SOILPSI_tmp,O_SCALAR_tmp,H2OSOI_tmp \
--split \
--yr_srt=1985 \
--yr_end=2014 \
--ypf=30 \
-o output \
--rgn_avg \
--area=area \
--prc_typ=elm

@czender
Copy link

czender commented Mar 25, 2025

@forsyth2 Are you trying this with NCO 5.3.2 from the latest E3SMU RC? This --xcl_var flag only works with Bash 4.0+. Are you using that (ncclimo should print a WARNING if you are not). AFAICT it's working as expected for me. When I exclude all ELM variables except for those that start with "W", for example, it tries to process the eight variables that start with W, then fails on W_SCALAR which is in your list:

zender@spectral:~$ cd ${DATA}/ne30/raw;ls v3.LR.*elm.h0*046[01]-??.nc | ncclimo -P elm --split --dbg=0 --rgn_avg --xcl_var --var='^[a-vA-Vx-zX-Z].?' -s 2013 -e 2014 --drc_out=${DATA}/ne30/foo
Climatology operations invoked with command:
/Users/zender/bin/ncclimo -P elm --split --dbg=0 --rgn_avg --xcl_var --var=^[a-vA-Vx-zX-Z].? -s 2013 -e 2014 --drc_out=/Users/zender/data/ne30/foo
Started climatology splitting at Tue Mar 25 15:16:17 PDT 2025
Running climatology script ncclimo from directory /Users/zender/bin
NCO binaries version 5.3.3 from directory /Users/zender/bin
Parallelism mode = background
Timeseries will be created for each of 8 variables
Background parallelism processing variables in var_nbr/job_nbr = 8/8 = 1 sequential batches each concurrently processing job_nbr = 8 jobs (1 per variable), then remaining 0 jobs/variables simultaneously
Will split data for each variable into one timeseries of length 2 years and 0 months
Splitting climatology from list of 24 raw input files piped to stdin
Each input file assumed to contain statistics for one month
Hemispherically and globally averaged timeseries files to be saved to directory /Users/zender/data/ne30/foo
Split files will not be regridded
Tue Mar 25 15:16:17 PDT 2025: Generated /Users/zender/data/ne30/foo/WA_201301_201412.nc
Tue Mar 25 15:16:17 PDT 2025: Generated /Users/zender/data/ne30/foo/WOODC_201301_201412.nc
Tue Mar 25 15:16:17 PDT 2025: Generated /Users/zender/data/ne30/foo/WASTEHEAT_201301_201412.nc
Tue Mar 25 15:16:18 PDT 2025: Generated /Users/zender/data/ne30/foo/W_SCALAR_201301_201412.nc
Tue Mar 25 15:16:18 PDT 2025: Generated /Users/zender/data/ne30/foo/WOOD_HARVESTC_201301_201412.nc
Tue Mar 25 15:16:18 PDT 2025: Generated /Users/zender/data/ne30/foo/WOOD_HARVESTN_201301_201412.nc
Tue Mar 25 15:16:18 PDT 2025: Generated /Users/zender/data/ne30/foo/WIND_201301_201412.nc
Tue Mar 25 15:16:18 PDT 2025: Generated /Users/zender/data/ne30/foo/WOODC_ALLOC_201301_201412.nc
ncap2: ERROR 1 dimensions of W_SCALAR_tmp belong to template Internally generated template but 1 dimensions do not
/Users/zender/bin/ncclimo: line 3033: 67307 Abort trap: 6           OMP_PROC_BIND=false ncap2 -h -O -s '*rgn_nbr=3;defdim("rgn",rgn_nbr);*W_SCALAR_tmp=0.0f*W_SCALAR.avg($lat,$lon);*W_SCALAR_rgn[time,rgn]=W_SCALAR_tmp;W_SCALAR_rgn@coordinates="region_name";*lat_area=lat+0.0*area;*msk_sth=0*lat_area.int();delete_miss(msk_sth);*msk_nrt=0*lat_area.int();delete_miss(msk_nrt);*idx_glb=0;*idx_nrt=1;*idx_sth=2;*rgn_len=19;defdim("rgn_len",rgn_len);region_name[rgn,rgn_len]=" ";region_name(idx_glb,0:5)="Global";region_name(idx_nrt,:)="Northern Hemisphere";region_name(idx_sth,:)="Southern Hemisphere";if(0) region_name@long_name="W_SCALAR timeseries array contains area-weighted sums over these regions"; else region_name@long_name="W_SCALAR timeseries array contains area-weighted averages over these regions";where(lat_area < 0.0) msk_sth=1; elsewhere msk_nrt=1;W_SCALAR_rgn(:,idx_glb)=((W_SCALAR*area*landfrac).avg($lat,$lon)/(area*landfrac).avg($lat,$lon)).float();W_SCALAR_rgn(:,idx_nrt)=((W_SCALAR*area*landfrac*msk_nrt).avg($lat,$lon)/(area*landfrac*msk_nrt).avg($lat,$lon)).float();W_SCALAR_rgn(:,idx_sth)=((W_SCALAR*area*landfrac*msk_sth).avg($lat,$lon)/(area*landfrac*msk_sth).avg($lat,$lon)).float();if(0){W_SCALAR_rgn(:,idx_glb)=W_SCALAR_rgn(:,idx_glb)*(area*landfrac).total($lat,$lon).float()*1.0f;W_SCALAR_rgn(:,idx_nrt)=W_SCALAR_rgn(:,idx_nrt)*(area*landfrac*msk_nrt).total($lat,$lon).float()*1.0f;W_SCALAR_rgn(:,idx_sth)=W_SCALAR_rgn(:,idx_sth)*(area*landfrac*msk_sth).total($lat,$lon).float()*1.0f;}W_SCALAR=W_SCALAR_rgn;push(&W_SCALAR@cell_methods," area: mean");if(exists(time_bnds)) time_bnds=time_bnds;if(exists(time_bounds)) time_bounds=time_bounds;valid_area_per_gridcell=area*landfrac;' /Users/zender/data/ne30/foo/W_SCALAR_201301_201412.nc /Users/zender/data/ne30/foo/W_SCALAR_201301_201412.nc
Tue Mar 25 15:16:19 PDT 2025: Global and regional statistics /Users/zender/data/ne30/foo/WA_201301_201412.nc
Tue Mar 25 15:16:19 PDT 2025: Global and regional statistics /Users/zender/data/ne30/foo/WOODC_201301_201412.nc
Tue Mar 25 15:16:19 PDT 2025: Global and regional statistics /Users/zender/data/ne30/foo/WASTEHEAT_201301_201412.nc
ncclimo: ERROR Failed in global and regional statistics. cmd_stt[3] failed. Debug this:
 OMP_PROC_BIND=false ncap2 -h -O -s '*rgn_nbr=3;defdim("rgn",rgn_nbr);*W_SCALAR_tmp=0.0f*W_SCALAR.avg($lat,$lon);*W_SCALAR_rgn[time,rgn]=W_SCALAR_tmp;W_SCALAR_rgn@coordinates="region_name";*lat_area=lat+0.0*area;*msk_sth=0*lat_area.int();delete_miss(msk_sth);*msk_nrt=0*lat_area.int();delete_miss(msk_nrt);*idx_glb=0;*idx_nrt=1;*idx_sth=2;*rgn_len=19;defdim("rgn_len",rgn_len);region_name[rgn,rgn_len]=" ";region_name(idx_glb,0:5)="Global";region_name(idx_nrt,:)="Northern Hemisphere";region_name(idx_sth,:)="Southern Hemisphere";if(0) region_name@long_name="W_SCALAR timeseries array contains area-weighted sums over these regions"; else region_name@long_name="W_SCALAR timeseries array contains area-weighted averages over these regions";where(lat_area < 0.0) msk_sth=1; elsewhere msk_nrt=1;W_SCALAR_rgn(:,idx_glb)=((W_SCALAR*area*landfrac).avg($lat,$lon)/(area*landfrac).avg($lat,$lon)).float();W_SCALAR_rgn(:,idx_nrt)=((W_SCALAR*area*landfrac*msk_nrt).avg($lat,$lon)/(area*landfrac*msk_nrt).avg($lat,$lon)).float();W_SCALAR_rgn(:,idx_sth)=((W_SCALAR*area*landfrac*msk_sth).avg($lat,$lon)/(area*landfrac*msk_sth).avg($lat,$lon)).float();if(0){W_SCALAR_rgn(:,idx_glb)=W_SCALAR_rgn(:,idx_glb)*(area*landfrac).total($lat,$lon).float()*1.0f;W_SCALAR_rgn(:,idx_nrt)=W_SCALAR_rgn(:,idx_nrt)*(area*landfrac*msk_nrt).total($lat,$lon).float()*1.0f;W_SCALAR_rgn(:,idx_sth)=W_SCALAR_rgn(:,idx_sth)*(area*landfrac*msk_sth).total($lat,$lon).float()*1.0f;}W_SCALAR=W_SCALAR_rgn;push(&W_SCALAR@cell_methods," area: mean");if(exists(time_bnds)) time_bnds=time_bnds;if(exists(time_bounds)) time_bounds=time_bounds;valid_area_per_gridcell=area*landfrac;' /Users/zender/data/ne30/foo/W_SCALAR_201301_201412.nc /Users/zender/data/ne30/foo/W_SCALAR_201301_201412.nc
/Users/zender/bin/ncclimo: line 3040: kill: (67309) - No such process
/Users/zender/bin/ncclimo: line 3040: kill: (67310) - No such process
zender@spectral:~/data/ne30/raw$ 

And when I add W_SCALAR to that exclusion list, it successfully generates the seven global timeseres for all the single level fields:

zender@spectral:~$ cd ${DATA}/ne30/raw;ls v3.LR.*elm.h0*046[01]-??.nc | ncclimo -P elm --split --dbg=0 --rgn_avg --xcl_var --var='^[a-vA-Vx-zX-Z].?',W_SCALAR -s 2013 -e 2014 --drc_out=${DATA}/ne30/foo
Climatology operations invoked with command:
/Users/zender/bin/ncclimo -P elm --split --dbg=0 --rgn_avg --xcl_var --var=^[a-vA-Vx-zX-Z].?,W_SCALAR -s 2013 -e 2014 --drc_out=/Users/zender/data/ne30/foo
Started climatology splitting at Tue Mar 25 15:14:19 PDT 2025
Running climatology script ncclimo from directory /Users/zender/bin
NCO binaries version 5.3.3 from directory /Users/zender/bin
Parallelism mode = background
Timeseries will be created for each of 7 variables
Background parallelism processing variables in var_nbr/job_nbr = 7/7 = 1 sequential batches each concurrently processing job_nbr = 7 jobs (1 per variable), then remaining 0 jobs/variables simultaneously
Will split data for each variable into one timeseries of length 2 years and 0 months
Splitting climatology from list of 24 raw input files piped to stdin
Each input file assumed to contain statistics for one month
Hemispherically and globally averaged timeseries files to be saved to directory /Users/zender/data/ne30/foo
Split files will not be regridded
Tue Mar 25 15:14:20 PDT 2025: Generated /Users/zender/data/ne30/foo/WA_201301_201412.nc
Tue Mar 25 15:14:20 PDT 2025: Generated /Users/zender/data/ne30/foo/WOODC_201301_201412.nc
Tue Mar 25 15:14:20 PDT 2025: Generated /Users/zender/data/ne30/foo/WASTEHEAT_201301_201412.nc
Tue Mar 25 15:14:20 PDT 2025: Generated /Users/zender/data/ne30/foo/WOOD_HARVESTC_201301_201412.nc
Tue Mar 25 15:14:20 PDT 2025: Generated /Users/zender/data/ne30/foo/WOOD_HARVESTN_201301_201412.nc
Tue Mar 25 15:14:20 PDT 2025: Generated /Users/zender/data/ne30/foo/WIND_201301_201412.nc
Tue Mar 25 15:14:20 PDT 2025: Generated /Users/zender/data/ne30/foo/WOODC_ALLOC_201301_201412.nc
Tue Mar 25 15:14:21 PDT 2025: Global and regional statistics /Users/zender/data/ne30/foo/WA_201301_201412.nc
Tue Mar 25 15:14:21 PDT 2025: Global and regional statistics /Users/zender/data/ne30/foo/WOODC_201301_201412.nc
Tue Mar 25 15:14:21 PDT 2025: Global and regional statistics /Users/zender/data/ne30/foo/WASTEHEAT_201301_201412.nc
Tue Mar 25 15:14:21 PDT 2025: Global and regional statistics /Users/zender/data/ne30/foo/WOOD_HARVESTC_201301_201412.nc
Tue Mar 25 15:14:21 PDT 2025: Global and regional statistics /Users/zender/data/ne30/foo/WOOD_HARVESTN_201301_201412.nc
Tue Mar 25 15:14:21 PDT 2025: Global and regional statistics /Users/zender/data/ne30/foo/WIND_201301_201412.nc
Tue Mar 25 15:14:21 PDT 2025: Global and regional statistics /Users/zender/data/ne30/foo/WOODC_ALLOC_201301_201412.nc
Quick plots of last timeseries segment of last variable split:
ncvis /Users/zender/data/ne30/foo/WOODC_ALLOC_201301_201412.nc &
Completed 2-year climatology operations for input data at Tue Mar 25 15:14:21 PDT 2025
Elapsed time 0m2s
zender@spectral:~/data/ne30/raw$ 

@forsyth2
Copy link
Collaborator

Are you trying this with NCO 5.3.2 from the latest E3SMU RC

@czender I'm running NCO after running source /lcrc/soft/climate/e3sm-unified/test_e3sm_unified_1.11.0rc13_chrysalis.sh, so yes

And the second line of the output file has nco-5.3.2

@czender
Copy link

czender commented Mar 25, 2025

And what is the Bash version in the environment? Needs to be >= 4:

zender@spectral:~/data/ne30/raw$ bash --version
GNU bash, version 5.2.37(1)-release (aarch64-apple-darwin24.0.0)
Copyright (C) 2022 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>

This is free software; you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.

@forsyth2
Copy link
Collaborator

source /lcrc/soft/climate/e3sm-unified/test_e3sm_unified_1.11.0rc13_chrysalis.sh
bash --version

gives:

GNU bash, version 4.4.20(1)-release (x86_64-redhat-linux-gnu)
Copyright (C) 2016 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>

This is free software; you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.

@czender
Copy link

czender commented Mar 25, 2025

I'm stumped. If it's possible, please send me the ncclimo command with full I/O paths that fails so I can tinker with the command, and make the input files readable by me. If I can't figure out a workaround for 5.3.2, then I'll modify ncclimo to avoid attempting regional timeseries on variables with dimensions besides horizontal and time and put that in 5.3.3.

@forsyth2
Copy link
Collaborator

I used the code as of 5e888bd, specifically that includes a bit that updates the variable exclusion:

{%- if vars != '' %}
-v ${vars} \
{%- else %}
--xcl_var -v PCT_LANDUNIT_tmp,TLAKE_tmp,LAKEICEFRAC_tmp,SOILLIQ_ICE_tmp,W_SCALAR_tmp,T_SCALAR_tmp,SOILICE_ICE_tmp,SOILPSI_tmp,O_SCALAR_tmp,H2OSOI_tmp \
{%- endif %}

I'm running zppy -c examples/post.v3.LR.historical.zppy_v3.cfg (i.e., using the cfg in this PR) from a zppy dev environment, but it's important to note ncclimo itself is called from a bash script operating in the Unified environment above, not this dev environment.

Output can be seen in /lcrc/group/e3sm/ac.forsyth2/E3SMv3_20250325_try2/v3.LR.historical_0051/post/scripts, in particular ts_lnd_monthly_glb_1985-2014-0030.o716441:

ncap2: ERROR 1 dimensions of LAKEICEFRAC_tmp belong to template Internally generated template but 1 dimensions do not
ncap2: ERROR 1 dimensions of TLAKE_tmp belong to template Internally generated template but 1 dimensions do not
ncap2: ERROR 1 dimensions of PCT_LANDUNIT_tmp belong to template Internally generated template but 1 dimensions do not
ncap2: ERROR 1 dimensions of T_SCALAR_tmp belong to template Internally generated template but 1 dimensions do not
ncap2: ERROR 1 dimensions of SOILLIQ_tmp belong to template Internally generated template but 1 dimensions do not
ncap2: ERROR 1 dimensions of H2OSOI_tmp belong to template Internally generated template but 1 dimensions do not
ncap2: ERROR 1 dimensions of SOILICE_tmp belong to template Internally generated template but 1 dimensions do not
ncap2: ERROR 1 dimensions of SOILICE_ICE_tmp belong to template Internally generated template but 1 dimensions do not
ncap2: ERROR 1 dimensions of W_SCALAR_tmp belong to template Internally generated template but 1 dimensions do not

The bash file used is ts_lnd_monthly_glb_1985-2014-0030.bash, in that directory, where we see:

# Generate time series files                                                                                                                                                                                                        
# If the user-defined parameter "vars" is "", then ${vars}, defined above, will be too.                                                                                                                                             
cat input.txt | ncclimo \
-c v3.LR.historical_0051 \
--xcl_var -v PCT_LANDUNIT_tmp,TLAKE_tmp,LAKEICEFRAC_tmp,SOILLIQ_ICE_tmp,W_SCALAR_tmp,T_SCALAR_tmp,SOILICE_ICE_tmp,SOILPSI_tmp,O_SCALAR_tmp,H2OSOI_tmp \
--split \
--yr_srt=1985 \
--yr_end=2014 \
--ypf=30 \
-o output \
--rgn_avg \
--area=area \
--prc_typ=elm

That input.txt is created a few lines earlier, with:

ls v3.LR.historical_0051.elm.h0.????-*.nc > input.txt

@forsyth2 forsyth2 mentioned this pull request Mar 26, 2025
4 tasks
@forsyth2
Copy link
Collaborator

@chengzhuzhang @czender Please see my comment at #697 (comment) further diving into this.

@czender
Copy link

czender commented Mar 26, 2025

Thanks @forsyth2. I have verified that NCO 5.3.2 works as expected, for me, on Chrysalis with those files. This includes the exclusion list handling. I'm wondering if you inadvertently used the wrong exclusion list in your tests?

You used (from above):
--xcl_var -v PCT_LANDUNIT_tmp,TLAKE_tmp,LAKEICEFRAC_tmp,SOILLIQ_ICE_tmp,W_SCALAR_tmp,T_SCALAR_tmp,SOILICE_ICE_tmp,SOILPSI_tmp,O_SCALAR_tmp,H2OSOI_tmp \
However, the ELM variable names do not have _tmp as suffixes. To get the command to work as expected, I had to change your exclusion list to
--xcl_var -v PCT_LANDUNIT,TLAKE,LAKEICEFRAC,SOILLIQ_ICE,W_SCALAR,T_SCALAR,SOILICE_ICE,SOILPSI,O_SCALAR,H2OSOI \
Please verify you're excluding the correct variable names. Also, this list is interpreted by libregex. So if you want to it to work in cases where one or more of these variables may not be in the input file then you can prefix those names with a dot, i.e., this exlusion list should work whether or not the variables are in the input files:
--xcl_var -v .PCT_LANDUNIT,.TLAKE,.LAKEICEFRAC,.SOILLIQ_ICE,.W_SCALAR,.T_SCALAR,.SOILICE_ICE,.SOILPSI,.O_SCALAR,.H2OSOI \

@forsyth2
Copy link
Collaborator

@chengzhuzhang If we're adding a static example cfg, we should consider adding the auto-generated test files to the .gitignore so they're not constantly showing up in PR diffs.

@forsyth2
Copy link
Collaborator

forsyth2 commented Apr 1, 2025

@chengzhuzhang It took 2 hours to run all 300+ land variables on global_time_series in my cfg on https://github.com/E3SM-Project/zppy/pull/697/files#diff-64b61ee3f79e872293a79155b5ca3cf8e74dba56fc25602b407f152a04828559, which was only operating on 10 years of data. The cfg in this PR is actually set for 30 years, so it will take longer.

I'll check on the status in the morning, and if all good, I'll request reviews of both the cfg itself and the output from you, Chris, and Wuyin.

@forsyth2 forsyth2 force-pushed the add_example_cfg_iss_692 branch from 5e888bd to 0917f59 Compare April 1, 2025 00:25
Comment on lines 13 to 17
# Once E3SM Unified 1.11.0 is released, you can use this line instead:
# environment_commands = "source /lcrc/soft/climate/e3sm-unified/load_latest_e3sm_unified_chrysalis.sh"
environment_commands = "source /lcrc/soft/climate/e3sm-unified/test_e3sm_unified_1.11.0rc13_chrysalis.sh"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TODO: before merging, just make this the environment_commands = "source /lcrc/soft/climate/e3sm-unified/load_latest_e3sm_unified_chrysalis.sh" line.

# plot_names (plot names should now be explicitly set via the plots_atm/ice/lnd/ocn parameters)
active = True
climo_years ="1985-2014",
environment_commands = "source /gpfs/fs1/home/ac.forsyth2/miniforge3/etc/profile.d/conda.sh; conda activate zi_plots_lnd_20250326"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TODO: before merging, remove this line.

@forsyth2
Copy link
Collaborator

forsyth2 commented Apr 1, 2025

@chengzhuzhang @golaz @wlin7 This is ready for review.

The example cfg itself can be found at https://github.com/E3SM-Project/zppy/pull/694/files. Note there are 2 environment_commands lines that I'll change before merging because the cfg is meant to be run after the final zppy/zppy-interfaces releases.

Output:

cd /lcrc/group/e3sm/ac.forsyth2/E3SMv3_20250331_try2/v3.LR.historical_0051/post/scripts
grep -v "OK" *status
# No results, meaning no failed jobs. Good.

Web output:

# scratch
active = True
walltime = "02:00:00"
years = "1985:2014:30",
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consider to add a note about elapsed time for this 30 years run, and caution about needing to adjust wall-time limit for longer simulations.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For tc_analysis specifically, or in general at the top of the cfg?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For tc_analysis here, can you find the elapsed time from log files, so folks can have a rough estimate.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

tc_analysis specifically is 3296 seconds = 54.93 minutes

> grep "Elapsed time" *.o*
climo_atm_monthly_180x360_aave_1985-2014.o719881:Elapsed time 0m40s
climo_atm_monthly_180x360_aave_1985-2014.o719881:Elapsed time: 44 seconds
climo_atm_monthly_diurnal_8xdaily_180x360_aave_1985-2014.o719882:Elapsed time 2m2s
climo_atm_monthly_diurnal_8xdaily_180x360_aave_1985-2014.o719882:Elapsed time: 133 seconds
climo_land_monthly_climo_1985-2014.o719883:Elapsed time 1m21s
climo_land_monthly_climo_1985-2014.o719883:Elapsed time: 83 seconds
e3sm_diags_atm_monthly_180x360_aave_model_vs_obs_1985-2014.o719893:Elapsed time: 4473 seconds
e3sm_diags_atm_monthly_180x360_aave_mvm_model_vs_model_1985-2014_vs_1985-2014.o719894:Elapsed time: 1460 seconds
e3sm_diags_lnd_monthly_mvm_lnd_model_vs_model_1985-2014_vs_0051-0100.o719895:Elapsed time: 1855 seconds
e3sm_to_cmip_atm_monthly_180x360_aave_1985-2014-0030.o719890:Elapsed time: 71 seconds
e3sm_to_cmip_land_monthly_1985-2014-0030.o719891:Elapsed time: 74 seconds
global_time_series_1985-2014.o719959:Elapsed time: 24475 seconds
ilamb_1985-2014.o719898:Elapsed time: 1472 seconds
mpas_analysis_ts_1985-2014_climo_1985-2014.o719896:Elapsed time: 1272 seconds
tc_analysis_1985-2014.o719892:Elapsed time: 3296 seconds
ts_atm_daily_180x360_aave_1985-2014-0030.o719886:Elapsed time 1m18s
ts_atm_daily_180x360_aave_1985-2014-0030.o719886:Elapsed time: 84 seconds
ts_atm_monthly_180x360_aave_1985-2014-0030.o719884:Elapsed time 3m1s
ts_atm_monthly_180x360_aave_1985-2014-0030.o719884:Elapsed time: 192 seconds
ts_atm_monthly_glb_1985-2014-0030.o719888:Elapsed time 0m46s
ts_atm_monthly_glb_1985-2014-0030.o719888:Elapsed time: 51 seconds
ts_land_monthly_1985-2014-0030.o719885:Elapsed time 1m43s
ts_land_monthly_1985-2014-0030.o719885:Elapsed time: 114 seconds
ts_lnd_monthly_glb_1985-2014-0030.o719889:Elapsed time 7m24s
ts_lnd_monthly_glb_1985-2014-0030.o719889:Elapsed time: 449 seconds
ts_rof_monthly_1985-2014-0030.o719887:Elapsed time 0m27s
ts_rof_monthly_1985-2014-0030.o719887:Elapsed time: 32 seconds

input_subdir = "archive/lnd/hist"
mapping_file = "glb"
# vars = "FSH,RH2M,LAISHA,LAISUN,QINTR,QOVER,QRUNOFF,QSOIL,QVEGE,QVEGT,SOILWATER_10CM,TSA,H2OSNO,TOTLITC,CWDC,SOIL1C,SOIL2C,SOIL3C,SOIL4C,WOOD_HARVESTC,TOTVEGC,NBP,GPP,AR,HR"
vars = "" # This will tell zppy to use all available variables
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consider to add a note about elapsed time for this 30 years run for this option, and caution about needing to adjust wall-time limit for longer simulations.

# you can do something like the following:
# environment_commands = "source /home/ac.zhang40/y/etc/profile.d/conda.sh; conda activate e3sm_diags_dev"
# `e3sm_diags` is largely driven by which e3sm_diags sets are requested:
sets="lat_lon","zonal_mean_xy","zonal_mean_2d","polar","cosp_histogram","meridional_mean_2d","annual_cycle_zonal_mean","qbo","diurnal_cycle","zonal_mean_2d_stratosphere","aerosol_aeronet","tropical_subseasonal","tc_analysis", "tropical_subseasonal",
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fix alignment

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@golaz I think that is just how it looks in the diff view. If you look at the file in https://github.com/E3SM-Project/zppy/blob/0917f59e9d48a2f7d246ed99b50eae0b352ed9a2/examples/post.v3.LR.historical.zppy_v3.cfg, it is aligned.

@forsyth2
Copy link
Collaborator

forsyth2 commented Apr 2, 2025

That commit (2eb9c6c) appears fine by visual inspection. I'll run the cfg.

@chengzhuzhang
Copy link
Collaborator Author

Now your mvm commit (2eb9c6c) is on top of my commits

Thank you. Sorry for the trouble..

Copy link
Collaborator

@forsyth2 forsyth2 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Still waiting on the global_time_series job, but there are a couple issues with e3sm_diags

> cd /lcrc/group/e3sm/ac.forsyth2/E3SMv3_20250403_try1/v3.LR.historical_0051/post/scripts/
> grep -v "OK" *status
e3sm_diags_atm_monthly_180x360_aave_mvm_model_vs_model_1985-2014_vs_0451-0500.status:ERROR (9)
e3sm_diags_lnd_monthly_mvm_lnd_model_vs_model_1985-2014_vs_0451-0500.status:ERROR (2)
global_time_series_1985-2014.status:RUNNING 721767

The first e3sm_diags error:

  File "/lcrc/group/e3sm/ac.forsyth2/E3SMv3_20250403_try1/v3.LR.historical_0051/post/scripts/tmp.e3sm_diags_atm_monthly_180x360_aave_mvm_model_vs_model_1985-2014_vs_0451-0500.721764.pgFA/e3sm.py", line 15
    ref_start_yr = 0451

This can be resolved by changing the parameters in the cfg.

The second:

cp: cannot stat '/lcrc/group/e3sm2/ac.zhang40/E3SMv3/v3.LR.piControl_451-500/post/lnd/native/clim/50yr/20231209.v3.LR.piControl-spinup.chrysalis_*_0451??_0500??_climo.nc': No such file or directory

but:

> cd /lcrc/group/e3sm2/ac.zhang40/E3SMv3/v3.LR.piControl_451-500/post/lnd/native/clim/50yr/
v3.LR.piControl_01_045101_050001_climo.nc  v3.LR.piControl_10_045110_050010_climo.nc
v3.LR.piControl_02_045102_050002_climo.nc  v3.LR.piControl_11_045111_050011_climo.nc
v3.LR.piControl_03_045103_050003_climo.nc  v3.LR.piControl_12_045112_050012_climo.nc
v3.LR.piControl_04_045104_050004_climo.nc  v3.LR.piControl_ANN_045101_050012_climo.nc
v3.LR.piControl_05_045105_050005_climo.nc  v3.LR.piControl_DJF_045101_050012_climo.nc
v3.LR.piControl_06_045106_050006_climo.nc  v3.LR.piControl_JJA_045106_050008_climo.nc
v3.LR.piControl_07_045107_050007_climo.nc  v3.LR.piControl_MAM_045103_050005_climo.nc
v3.LR.piControl_08_045108_050008_climo.nc  v3.LR.piControl_SON_045109_050011_climo.nc
v3.LR.piControl_09_045109_050009_climo.nc

@chengzhuzhang It seems like the data is in fact there??

EDIT: oh I see -spinup.chrysalis is missing. Do we want that or not?

Comment on lines 229 to 231
ref_start_yr = 0451
ref_final_yr = 0500
ref_years = "0451-0500",
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will have to remove leading zero's for correct parsing

Comment on lines 246 to 248
ref_final_yr = 0451
ref_start_yr = 0500
ref_years = "0451-0500",
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will have to remove leading zero's for correct parsing

Comment on lines 244 to 245
reference_data_path = "/lcrc/group/e3sm2/ac.zhang40/E3SMv3/v3.LR.piControl_451-500/post/lnd/native/clim"
ref_name = "20231209.v3.LR.piControl-spinup.chrysalis"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The source of the second diags error

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the ref_name should be "v3.LR.piControl", I must have missed this when updating

@forsyth2
Copy link
Collaborator

forsyth2 commented Apr 3, 2025

Actually, I'm extremely confused why global_time_series is still running; I limited the number of variables used...

Even the longest runtimes reported in the earlier thread #694 (comment) are 3 sec/var*year. This is only 25 land vars x 30 years of data. That's 750 var x years x 3 = absolute max of 2,250 seconds = 37.5 minutes.

I checked the output:

> cd /lcrc/group/e3sm/ac.forsyth2/E3SMv3_20250403_try1/v3.LR.historical_0051/post/scripts
> emacs global_time_series_1985-2014.bash
--plots_lnd FSH,RH2M,LAISHA,LAISUN,QINTR,QOVER,QRUNOFF,QSOIL,QVEGE,QVEGT,SOILWATER_10CM,TSA,H2OSNO,TOTLITC,CWDC,SOIL1C,SOIL2C,SOIL3C,SOIL4C,WOOD_HARVESTC,TOTVEGC,NBP,GPP,AR,HR
> cat global_time_series_1985-2014.status 
RUNNING 721767
> sq
USER     ACCOUNT NODE PARTITION JOBID   ST     REASON       TIME TIME_LIMIT NAME
ac.forsy    e3sm    1   compute 721767   R       None    2:17:18   10:00:00 global_time_series_1985-2014

So, it's definitely the same job.

@chengzhuzhang
Copy link
Collaborator Author

Actually, I'm extremely confused why global_time_series is still running; I limited the number of variables used...

Even the longest runtimes reported in the earlier thread #694 (comment) are 3 sec/var*year. This is only 25 land vars x 30 years of data. That's 750 var x years x 3 = absolute max of 2,250 seconds = 37.5 minutes.

You should try to use a freshly created output directory that only include requested variables. I'm looking at /lcrc/group/e3sm/ac.forsyth2/E3SMv3_20250403_try1/v3.LR.historical_0051/post/lnd/glb/ts/monthly/5yr. it still included all the land variables. And the biggest performance bottleneck i guess is coming from reading in all these variables reside in this directory.

@forsyth2
Copy link
Collaborator

forsyth2 commented Apr 3, 2025

Ah thank you @chengzhuzhang, this is a big help. That directory has that many variables because of vars="" in the ts gs-lnd subtask though. So that's either a code change to global_time_series to not read in all the variables available (I need to dive into the code to see where that happens...) or a change in this cfg to vars=<the explicit list>

@forsyth2
Copy link
Collaborator

forsyth2 commented Apr 3, 2025

Table below constructed from runs discussed in #694 (comment), #694 (comment)

Code used make_viewer= vars= in the ts glb subtasks ts_num_years plots years runtime (seconds) runtime per (plot x year) output path (look at .settings, .bash files)
Unified rc13 False 21 atm, 25 lnd 5 8 original + 25 lnd 1985-1995 (10) 251 251/(33*10)=0.76 /lcrc/group/e3sm/ac.forsyth2/zppy_weekly_comprehensive_v3_output/test_unified_rc13_20250313/v3.LR.historical_0051/post/scripts
Testing #697 & E3SM-Project/zppy-interfaces#16 (no other global_time_series changes since Unified rc13) True 2 atm, ""=all vars for lnd 5 1 atm, "all" lnd 1985-19995 (10) 8004 8004/(354*10)=2.26 /lcrc/group/e3sm/ac.forsyth2/zppy_min_case_global_time_series_viewers_undefined_variables_output/test_pr697_20250328_try4/v3.LR.historical_0051/post/scripts
Same True 21 atm, ""=all vars for lnd 5 5 original + 4 lnd 1985-1995 (10) 306 306/(9*10)=3.4 /lcrc/group/e3sm/ac.forsyth2/zppy_min_case_global_time_series_viewers_original_atm_plus_land_output/test_pr697_20250328_try4/v3.LR.historical_0051/post/scripts
Example cfg today True 21 atm, ""=all vars for lnd 5 8 original + 1 atm + 25 lnd 1985-2014 (30) At least 2h57m=10620 At least 10620/(34*30)=10.4 /lcrc/group/e3sm/ac.forsyth2/E3SMv3_20250403_try1/v3.LR.historical_0051/post/scripts

@forsyth2
Copy link
Collaborator

forsyth2 commented Apr 3, 2025

Currently running with changes in a618c64.

from reading in all these variables reside in this directory.

I thought I had the code requesting specific variables, so this confused me. My current best guess for the relevant code is this block in zppy_interfaces:

    def __init__(self, directory):

        self.directory: str = directory

        # `directory` will be of the form `{case_dir}/post/<component>/glb/ts/monthly/{ts_num_years_str}yr/`
        self.dataset: xarray.core.dataset.Dataset = xcdat.open_mfdataset(
            f"{directory}*.nc", center_times=True
        )

        self.area_tuple = None

Is that open_mfdataset just reading in every single nc file there is? I guess we could pass along the specific vars we want included?

@chengzhuzhang
Copy link
Collaborator Author

Is that open_mfdataset just reading in every single nc file there is? I guess we could pass along the specific vars we want included?

I had this question and posed to Tom when we had a meeting. @tomvothecoder will try to see if there can be some performance problem here.

@forsyth2
Copy link
Collaborator

forsyth2 commented Apr 3, 2025

Updated table below; last two rows show clearly that the performance issue is related to having too much output from the ts glb subtask.

Code used make_viewer= vars= in the ts glb subtasks ts_num_years plots years runtime (seconds) runtime per (plot x year) output path (look at .settings, .bash files)
Unified rc13 False 21 atm, 25 lnd 5 8 original + 25 lnd 1985-1995 (10) 251 251/(33*10)=0.76 /lcrc/group/e3sm/ac.forsyth2/zppy_weekly_comprehensive_v3_output/test_unified_rc13_20250313/v3.LR.historical_0051/post/scripts
Testing #697 & E3SM-Project/zppy-interfaces#16 (no other global_time_series changes since Unified rc13) True 2 atm, ""=all vars for lnd 5 1 atm, "all" lnd 1985-19995 (10) 8004 8004/(354*10)=2.26 /lcrc/group/e3sm/ac.forsyth2/zppy_min_case_global_time_series_viewers_undefined_variables_output/test_pr697_20250328_try4/v3.LR.historical_0051/post/scripts
Same True 21 atm, ""=all vars for lnd 5 5 original + 4 lnd 1985-1995 (10) 306 306/(9*10)=3.4 /lcrc/group/e3sm/ac.forsyth2/zppy_min_case_global_time_series_viewers_original_atm_plus_land_output/test_pr697_20250328_try4/v3.LR.historical_0051/post/scripts
Example cfg try_1 today True 21 atm, ""=all vars for lnd 5 8 original + 1 atm + 25 lnd 1985-2014 (30) 12424 12424/(34*30)=12.18 /lcrc/group/e3sm/ac.forsyth2/E3SMv3_20250403_try1/v3.LR.historical_0051/post/scripts
Example cfg try_2 today True 21 atm, 25 lnd 5 8 original + 1 atm + 25 lnd 1985-2014 (30) 928 928/(34*30)=0.91 /lcrc/group/e3sm/ac.forsyth2/E3SMv3_20250403_try2/v3.LR.historical_0051/post/scripts

@forsyth2
Copy link
Collaborator

forsyth2 commented Apr 3, 2025

Is that open_mfdataset just reading in every single nc file there is?`

The docs seem to suggest so:

opens the file with read-only access. When you modify values of a Dataset, even one linked to files on disk, only the in-memory copy you are manipulating in xarray is modified: the original file on disk is never touched.

It does allow "List of file paths" though.

@tomvothecoder
Copy link
Collaborator

tomvothecoder commented Apr 3, 2025

Is that open_mfdataset just reading in every single nc file there is? I guess we could pass along the specific vars we want included?

I had this question and posed to Tom when we had a meeting. @tomvothecoder will try to see if there can be some performance problem here.

I think there is a bottleneck with xcdat.open_mfdataset(). All .nc files in a given input directory are being opened, when only a subset is required (as far as I'm aware). The issue with this approach is that Xarray/xCDAT will attempt to concatenate all of these files into a single Dataset object, which can be slow depending on the number of files, the file sizes, and the shape/structure of the data.

I added a GitHub issue with my findings and a possible solution: E3SM-Project/zppy-interfaces#21

@forsyth2
Copy link
Collaborator

forsyth2 commented Apr 3, 2025

Thanks Tom, I was just coming to that same conclusion. I have code changes I'm testing now.

@forsyth2
Copy link
Collaborator

forsyth2 commented Apr 3, 2025

@chengzhuzhang As for this example cfg itself, it looks like the jobs finished without errors:

cd /lcrc/group/e3sm/ac.forsyth2/E3SMv3_20250403_try2/v3.LR.historical_0051/post/scripts
grep -v "OK" *status
# No errors

See www results.

If that looks good to you, I think we can call this example cfg done and merge it. (I do want to remove the extra environment_commands lines before merging though).

@chengzhuzhang
Copy link
Collaborator Author

If that looks good to you, I think we can call this example cfg done and merge it. (I do want to remove the extra environment_commands lines before merging though).

The diags are all generated and I don't see a problem by a brief review!

@forsyth2
Copy link
Collaborator

forsyth2 commented Apr 3, 2025

I just made bcad046 to clean up some parameters for what users will actually want to use. @chengzhuzhang, I think if you mark approved on this, I can merge it.

@chengzhuzhang
Copy link
Collaborator Author

I just made bcad046 to clean up some parameters for what users will actually want to use. @chengzhuzhang, I think if you mark approved on this, I can merge it.

The commit looks good!
You are one of the reviewers, feel free to approve and merge. We need to write to Chris, Wuyin and Peter about the finalization and performance issue for some parameter settings.

@forsyth2
Copy link
Collaborator

forsyth2 commented Apr 3, 2025

Screenshot 2025-04-03 at 3 27 05 PM
I think because I added commits, it won't let me self-review. (I noticed the settings changed in GitHub though to require reviews; I'll see if I can change that back).

@chengzhuzhang
Copy link
Collaborator Author

@forsyth2 The branch protection is good. Let see if @tomvothecoder or @golaz can approve.

@forsyth2
Copy link
Collaborator

forsyth2 commented Apr 3, 2025

The branch protection is good.

In general yes, I agree, but it is an obstacle if it's a very small change or a non-user-facing change like #701.

Copy link
Collaborator

@forsyth2 forsyth2 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh it appears I can approve this one. I can't approve #701 though, probably because I'm the author there?

@forsyth2 forsyth2 merged commit 684512d into main Apr 3, 2025
5 checks passed
@forsyth2 forsyth2 deleted the add_example_cfg_iss_692 branch April 7, 2025 18:59
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[Doc]: Add an example configuration files including all tasks for new zppy v3 release
5 participants