Add abi_g16 & abi_g18 YAML files into templates #329

Merged
merged 15 commits into from
Mar 28, 2025

xyzemc
Contributor

@xyzemc xyzemc commented Mar 17, 2025

Description

This PR merges the GOES_16 and GOES_18 clear-sky ABI YAML files into the RDASApp templates. All validation steps were completed in the linked issues below. Both bias correction and the QC filters work properly in JEDI for assimilating ABI clear-sky radiance (CSR).

Issue(s) addressed

Resolves/Results are documented in Issue #
#57
#292
#306
#249

Checklist

  • I have performed a self-review of my own code.

  • I have run rrfs tests before creating the PR.

Ctest with abi_g16

1/1 Test #4: rrfs_mpasjedi_2024052700_Ens3Dvar ... Passed 440.84 sec

The following tests passed:
rrfs_mpasjedi_2024052700_Ens3Dvar

100% tests passed, 0 tests failed out of 1

Label Time Summary:
mpi = 440.84 sec*proc (1 test)
rdas-bundle = 440.84 sec*proc (1 test)
script = 440.84 sec*proc (1 test)

Total Test time (real) = 440.89 sec

Ctest with abi_g18 on top of abi_g16

1/1 Test #4: rrfs_mpasjedi_2024052700_Ens3Dvar ... Passed 453.52 sec

The following tests passed:
rrfs_mpasjedi_2024052700_Ens3Dvar

100% tests passed, 0 tests failed out of 1

Label Time Summary:
mpi = 453.52 sec*proc (1 test)
rdas-bundle = 453.52 sec*proc (1 test)
script = 453.52 sec*proc (1 test)

Total Test time (real) = 453.58 sec
[Xiaoyan.Zhang@hfe02 rrfs-test]$

xyzemc and others added 6 commits March 11, 2025 03:51
Sfcshp 282 yaml passed phase 1 validation
… templates (NOAA-EMC#298)

This PR addresses NOAA-EMC#273 to generate the ctest yamls when RDASApp is
built. This will allow us to keep better sync between the ctests and
updates to `rrfs-test/validated_yamls/templates`. It will also make
reviewing PRs easier if we don't also need to make changes to the super
yamls.
…cate action key. (NOAA-EMC#314)

In rrfs-workflow, GSL colleagues (Chunhua and others) found that
aircar_airTemperature_133 assimilation experiments had unrealistic
increments. It was found that this is due to the jediyaml not correctly
handling missing QualityMarker values. Essentially, those observations
were getting into the analysis causing the very large and unreasonable
analysis increments.
Adding phase 1 adpupa (132/232) yamls. There were no obs in the ctest
case to fully validate. This isn't a surprise since these ob types were
marked to be pretty sparse in the observation spreadsheet. These yamls
were developed following the (120/220) yamls exactly with the exception
of the ObsErrors and gross error checks for winds. The point of this PR
is to establish these yamls in RDASApp. This won't affect cycling DA
results because we will only use configurations that have passed phase 3
validation.
Added the abi_g16 and abi_g18 ctest into gen_yaml_ctest.sh.
@xyzemc xyzemc self-assigned this Mar 17, 2025
@xyzemc
Contributor Author

xyzemc commented Mar 17, 2025

Analysis Increment

[two images: analysis increment]

Collaborator

@delippi delippi left a comment

@xyzemc good work! I just had a few minor and very picky suggestions. It is nice to have a comment for each filter. It helps non radiance people understand what is happening and it also helps to visually break the yaml up when you're viewing it.

xyzemc and others added 2 commits March 26, 2025 02:20
@xyzemc
Contributor Author

xyzemc commented Mar 26, 2025

All mpasjedi ctests passed on Hera.

+ ctest -j8
Test project /scratch2/NCEPDEV/fv3-cam/Xiaoyan.Zhang/noscrub/JEDI/RDASApp_fork/build/rrfs-test
    Start 1: rrfs_fv3jedi_2024052700_Ens3Dvar
    Start 2: rrfs_fv3jedi_2024052700_getkf_observer
    Start 5: rrfs_mpasjedi_2024052700_getkf_observer
    Start 7: rrfs_mpasjedi_2024052700_bumploc
    Start 4: rrfs_mpasjedi_2024052700_Ens3Dvar
    Start 8: rrfs_bufr2ioda_msonet
1/8 Test #1: rrfs_fv3jedi_2024052700_Ens3Dvar ..........***Failed   23.29 sec
2/8 Test #8: rrfs_bufr2ioda_msonet .....................   Passed   32.30 sec
3/8 Test #2: rrfs_fv3jedi_2024052700_getkf_observer ....***Failed   65.61 sec
    Start 3: rrfs_fv3jedi_2024052700_getkf_solver
4/8 Test #3: rrfs_fv3jedi_2024052700_getkf_solver ......***Failed   11.37 sec
5/8 Test #5: rrfs_mpasjedi_2024052700_getkf_observer ...   Passed  204.55 sec
    Start 6: rrfs_mpasjedi_2024052700_getkf_solver
6/8 Test #7: rrfs_mpasjedi_2024052700_bumploc ..........   Passed  314.90 sec
7/8 Test #4: rrfs_mpasjedi_2024052700_Ens3Dvar .........   Passed  451.69 sec
8/8 Test #6: rrfs_mpasjedi_2024052700_getkf_solver .....   Passed  584.48 sec

63% tests passed, 3 tests failed out of 8

Label Time Summary:
mpi            = 1688.19 sec*proc (8 tests)
rdas-bundle    = 1688.19 sec*proc (8 tests)
script         = 1688.19 sec*proc (8 tests)

Total Test time (real) = 789.06 sec


@rrfsbot
Collaborator

rrfsbot commented Mar 26, 2025

FAILED on hercules

started build_and_test on hercules at UTC time: Wed Mar 26 03:05:54 UTC 2025
finished at UTC time: Wed Mar 26 03:57:56 UTC 2025

Test project /work/noaa/wrfruc/rrfsbot/PRs_RDASApp/329/build/rrfs-test
    Start 2: rrfs_fv3jedi_2024052700_getkf_observer
    Start 5: rrfs_mpasjedi_2024052700_getkf_observer
    Start 1: rrfs_fv3jedi_2024052700_Ens3Dvar
    Start 4: rrfs_mpasjedi_2024052700_Ens3Dvar
    Start 7: rrfs_mpasjedi_2024052700_bumploc
    Start 8: rrfs_bufr2ioda_msonet
1/8 Test #2: rrfs_fv3jedi_2024052700_getkf_observer ....***Failed   68.16 sec
    Start 3: rrfs_fv3jedi_2024052700_getkf_solver
2/8 Test #1: rrfs_fv3jedi_2024052700_Ens3Dvar ..........***Failed   91.69 sec
3/8 Test #5: rrfs_mpasjedi_2024052700_getkf_observer ...***Failed  188.56 sec
    Start 6: rrfs_mpasjedi_2024052700_getkf_solver
4/8 Test #8: rrfs_bufr2ioda_msonet .....................   Passed  222.42 sec
5/8 Test #3: rrfs_fv3jedi_2024052700_getkf_solver ......***Failed  219.51 sec
6/8 Test #7: rrfs_mpasjedi_2024052700_bumploc ..........   Passed  356.48 sec
7/8 Test #4: rrfs_mpasjedi_2024052700_Ens3Dvar .........***Failed  380.92 sec
8/8 Test #6: rrfs_mpasjedi_2024052700_getkf_solver .....***Failed  702.38 sec

25% tests passed, 6 tests failed out of 8

Label Time Summary:
mpi            = 2230.12 sec*proc (8 tests)
rdas-bundle    = 2230.12 sec*proc (8 tests)
script         = 2230.12 sec*proc (8 tests)

Total Test time (real) = 890.99 sec

The following tests FAILED:
	  1 - rrfs_fv3jedi_2024052700_Ens3Dvar (Failed)
	  2 - rrfs_fv3jedi_2024052700_getkf_observer (Failed)
	  3 - rrfs_fv3jedi_2024052700_getkf_solver (Failed)
	  4 - rrfs_mpasjedi_2024052700_Ens3Dvar (Failed)
	  5 - rrfs_mpasjedi_2024052700_getkf_observer (Failed)
	  6 - rrfs_mpasjedi_2024052700_getkf_solver (Failed)
Errors while running CTest
Output from these tests are in: /work/noaa/wrfruc/rrfsbot/PRs_RDASApp/329/build/rrfs-test/Testing/Temporary/LastTest.log
Use "--rerun-failed --output-on-failure" to re-run the failed cases verbosely.

workdir: /work/noaa/wrfruc/rrfsbot/PRs_RDASApp/329
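
For readers triaging logs like the one above, a short script can pull the pass/fail status of each test out of the ctest output. This is only a sketch: `summarize_ctest` and its regular expression are hypothetical helpers written for the line layout shown in this thread, not part of RDASApp or ctest.

```python
import re

# Hypothetical pattern (an assumption) for ctest result lines such as:
# "1/8 Test #2: rrfs_fv3jedi_2024052700_getkf_observer ....***Failed   68.16 sec"
RESULT_LINE = re.compile(
    r"^\s*\d+/\d+\s+Test\s+#(\d+):\s+(\S+)\s+\.*\s*(\*\*\*Failed|Passed)\s+([\d.]+)\s+sec"
)


def summarize_ctest(log_text):
    """Group test names by status, in the order they appear in the log."""
    results = {"Passed": [], "Failed": []}
    for line in log_text.splitlines():
        match = RESULT_LINE.match(line)
        if match:
            # Drop the leading "***" that ctest prints before "Failed".
            status = match.group(3).lstrip("*")
            results[status].append(match.group(2))
    return results


if __name__ == "__main__":
    sample = """\
1/8 Test #2: rrfs_fv3jedi_2024052700_getkf_observer ....***Failed   68.16 sec
    Start 3: rrfs_fv3jedi_2024052700_getkf_solver
4/8 Test #8: rrfs_bufr2ioda_msonet .....................   Passed  222.42 sec
"""
    print(summarize_ctest(sample))
```

Lines that do not match the pattern, such as the `Start N:` lines, are skipped.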

@SamuelDegelia-NOAA
Contributor

@xyzemc Would you mind also committing the new test reference files for the fv3-jedi ctests?

Regarding the failed mpas-jedi ctests on Hercules, this could just be related to the tolerance needing to be increased. But I cannot check that right now since Orion/Hercules are in PM right now.

@xyzemc
Contributor Author

xyzemc commented Mar 26, 2025

> @xyzemc Would you mind also committing the new test reference files for the fv3-jedi ctests?
>
> Regarding the failed mpas-jedi ctests on Hercules, this could just be related to the tolerance needing to be increased. But I cannot check that right now since Orion/Hercules are in PM right now.

@SamuelDegelia-NOAA
Since all the fv3-jedi ctests failed on Hera, no new fv3-jedi ctest output was generated. For the satellite radiance YAML files, a few lines differ between fv3_jedi and mpas_jedi. I need to add these two lines to the 'obs operators' section of the mpas_jedi YAML file:

SurfaceWindGeoVars: uv
IRVISlandCoeff: IGBP

So if I use gen_yaml_ctest.sh to generate the fv3_jedi YAML file, these two lines will be included in it, and the fv3_jedi run will crash.
Any suggestions?

@SamuelDegelia-NOAA
Contributor

@xyzemc We can use the commentQC.py tool in the same directory as gen_yaml_ctest.sh to comment out those lines for the fv3-jedi yamls. That script is run through gen_yaml_ctest.sh and is currently designed to comment out filters for the mpas-jedi yamls that aren't ready yet. But we could also use it to solve your problem by removing those two lines for the fv3-jedi yamls.

Can you share one of your super yamls that are generated for the fv3-jedi ctests (which cause the failure)? I'll see if I can edit commentQC.py to comment that block out.
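
The approach described above, commenting out specific keys when generating the fv3-jedi YAMLs, can be sketched roughly as follows. This is not the actual commentQC.py: the function name `comment_out_keys` and the sample fragment are hypothetical and only illustrate the idea of disabling the two lines mentioned in this thread.

```python
import re

# Keys named in this thread; the helper itself is a hypothetical illustration,
# not the real commentQC.py from RDASApp.
KEYS_TO_DISABLE = ("SurfaceWindGeoVars", "IRVISlandCoeff")


def comment_out_keys(yaml_text, keys=KEYS_TO_DISABLE):
    """Return yaml_text with every line that sets one of `keys` commented out."""
    pattern = re.compile(r"^(\s*)(" + "|".join(map(re.escape, keys)) + r")\s*:")
    out = []
    for line in yaml_text.splitlines():
        match = pattern.match(line)
        if match:
            # Keep the original indentation so the surrounding layout is unchanged.
            indent = match.group(1)
            line = f"{indent}#{line[len(indent):]}"
        out.append(line)
    return "\n".join(out)


if __name__ == "__main__":
    # Illustrative fragment only; the real templates contain many more keys.
    sample = (
        "obs operator:\n"
        "  SurfaceWindGeoVars: uv\n"
        "  IRVISlandCoeff: IGBP\n"
        "  other_key: value\n"
    )
    print(comment_out_keys(sample))
```

Working on text lines rather than parsing the YAML keeps comments and formatting in the generated file intact, at the cost of matching on key names alone.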

@xyzemc
Contributor Author

xyzemc commented Mar 26, 2025

> @xyzemc We can use the commentQC.py tool in the same directory as gen_yaml_ctest.sh to comment out those lines for the fv3-jedi yamls. That script is run through gen_yaml_ctest.sh and is currently designed to comment out filters for the mpas-jedi yamls that aren't ready yet. But we could also use it to solve your problem by removing those two lines for the fv3-jedi yamls.
>
> Can you share one of your super yamls that are generated for the fv3-jedi ctests (which cause the failure)? I'll see if I can edit commentQC.py to comment that block out.

@SamuelDegelia-NOAA Sure.
Please check /scratch2/NCEPDEV/fv3-cam/Xiaoyan.Zhang/noscrub/JEDI/RDASApp_fork/build/rrfs-test/rundir-rrfs_fv3jedi_2024052700_Ens3Dvar/rrfs_fv3jedi_2024052700_Ens3Dvar.yaml

@SamuelDegelia-NOAA
Contributor

@xyzemc Do those lines need to be commented out for the atms yaml too, or just ABI?

@xyzemc
Contributor Author

xyzemc commented Mar 27, 2025

> @xyzemc Do those lines need to be commented out for the atms yaml too, or just ABI?

@SamuelDegelia-NOAA Those lines need to be commented out for atms and amsua too in the fv3_jedi YAML file.

@xyzemc
Contributor Author

xyzemc commented Mar 28, 2025

All fv3_jedi ctests have passed on Hera with the updated fv3_jedi yaml file as described above.

/scratch1/NCEPDEV/stmp2/Xiaoyan.Zhang/RDASApp_abi_csr/build/rrfs-test

100% tests passed, 0 tests failed out of 8

Label Time Summary:
mpi = 2934.20 sec*proc (8 tests)
rdas-bundle = 2934.20 sec*proc (8 tests)
script = 2934.20 sec*proc (8 tests)

Total Test time (real) = 1024.25 sec

* Updated gen_yaml_ctest and commentQC.py to comment out two unnecessary lines in the fv3_jedi ctest yaml file
@rrfsbot
Collaborator

rrfsbot commented Mar 28, 2025

FAILED on hercules

started build_and_test on hercules at UTC time: Fri Mar 28 01:31:19 UTC 2025
finished at UTC time: Fri Mar 28 02:12:15 UTC 2025

Test project /work/noaa/wrfruc/rrfsbot/PRs_RDASApp/329/build/rrfs-test
    Start 2: rrfs_fv3jedi_2024052700_getkf_observer
    Start 5: rrfs_mpasjedi_2024052700_getkf_observer
    Start 1: rrfs_fv3jedi_2024052700_Ens3Dvar
    Start 4: rrfs_mpasjedi_2024052700_Ens3Dvar
    Start 7: rrfs_mpasjedi_2024052700_bumploc
    Start 8: rrfs_bufr2ioda_msonet
1/8 Test #8: rrfs_bufr2ioda_msonet .....................   Passed   49.40 sec
2/8 Test #2: rrfs_fv3jedi_2024052700_getkf_observer ....   Passed  180.02 sec
    Start 3: rrfs_fv3jedi_2024052700_getkf_solver
3/8 Test #5: rrfs_mpasjedi_2024052700_getkf_observer ...   Passed  194.54 sec
    Start 6: rrfs_mpasjedi_2024052700_getkf_solver
4/8 Test #6: rrfs_mpasjedi_2024052700_getkf_solver .....***Failed    0.01 sec
5/8 Test #7: rrfs_mpasjedi_2024052700_bumploc ..........   Passed  242.46 sec
6/8 Test #4: rrfs_mpasjedi_2024052700_Ens3Dvar .........   Passed  336.55 sec
7/8 Test #3: rrfs_fv3jedi_2024052700_getkf_solver ......***Failed  274.93 sec
8/8 Test #1: rrfs_fv3jedi_2024052700_Ens3Dvar ..........   Passed  677.48 sec

75% tests passed, 2 tests failed out of 8

Label Time Summary:
mpi            = 1955.40 sec*proc (8 tests)
rdas-bundle    = 1955.40 sec*proc (8 tests)
script         = 1955.40 sec*proc (8 tests)

Total Test time (real) = 677.53 sec

The following tests FAILED:
	  3 - rrfs_fv3jedi_2024052700_getkf_solver (Failed)
	  6 - rrfs_mpasjedi_2024052700_getkf_solver (Failed)
Errors while running CTest
Output from these tests are in: /work/noaa/wrfruc/rrfsbot/PRs_RDASApp/329/build/rrfs-test/Testing/Temporary/LastTest.log
Use "--rerun-failed --output-on-failure" to re-run the failed cases verbosely.

workdir: /work/noaa/wrfruc/rrfsbot/PRs_RDASApp/329

@xyzemc
Contributor Author

xyzemc commented Mar 28, 2025

@SamuelDegelia-NOAA It seems the Hercules fv3_jedi ctests still failed even with the updated reference files and the gen_yaml*sh. Could you check it one more time? Or do we not need to worry about the Hercules ctests for now?

@SamuelDegelia-NOAA
Contributor

@xyzemc The failure for rrfs_fv3jedi_2024052700_getkf_solver is a small TestReferenceFloatMismatchError for the analysis northward_wind. Per our discussion yesterday I do not think we need to worry about this.

The rrfs_mpasjedi_2024052700_getkf_solver test actually never even started. It failed with:

srun: error: Unable to allocate resources: Requested node configuration is not available
<end of output>
Test time =   0.01 sec

This is not a failure due to any reason with JEDI. That might have just been a temporary issue on Hercules? I am not really sure what would suddenly cause an error like that.

But either way, I think these failures on Hercules are not a concern and that this PR can move forward.
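
As background on why a small float mismatch can fail on one machine and pass on another, here is a minimal sketch of a relative-tolerance comparison. It is not how JEDI implements its test-reference check; the function name `within_tolerance`, the default tolerance, and the sample numbers are all assumptions chosen for illustration.

```python
import math


def within_tolerance(test_value, reference_value, rel_tol=1.0e-6):
    """True if the two values agree to within rel_tol (relative difference)."""
    # abs_tol=0.0 makes this a purely relative comparison.
    return math.isclose(test_value, reference_value, rel_tol=rel_tol, abs_tol=0.0)


if __name__ == "__main__":
    reference = 3.141592  # hypothetical reference value
    computed = 3.141599   # hypothetical value from another platform
    print(within_tolerance(computed, reference))                # strict tolerance
    print(within_tolerance(computed, reference, rel_tol=1e-5))  # looser tolerance
```

With the hypothetical numbers above, the relative difference is roughly 2e-6, so the comparison fails at `rel_tol=1e-6` and passes at `rel_tol=1e-5`.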

@SamuelDegelia-NOAA
Contributor

SamuelDegelia-NOAA commented Mar 28, 2025

PS: I just opened #346 to update the test references since they seem to be lagging behind our current develop branch. But if this PR is good to go, then we could just merge this one and ignore #346. They should both fix the issues caught with #344.

@xyzemc
Contributor Author

xyzemc commented Mar 28, 2025

> @xyzemc The failure for rrfs_fv3jedi_2024052700_getkf_solver is a small TestReferenceFloatMismatchError for the analysis northward_wind. Per our discussion yesterday I do not think we need to worry about this.
>
> The rrfs_mpasjedi_2024052700_getkf_solver test actually never even started. It failed with:
>
> srun: error: Unable to allocate resources: Requested node configuration is not available
> <end of output>
> Test time =   0.01 sec
>
> This is not a failure due to any reason with JEDI. That might have just been a temporary issue on Hercules? I am not really sure what would suddenly cause an error like that.
>
> But either way, I think these failures on Hercules are not a concern and that this PR can move forward.

Thanks for the check and confirmation.

@ShunLiu-NOAA ShunLiu-NOAA merged commit 0626361 into NOAA-EMC:develop Mar 28, 2025
1 check passed
6 participants