Update job submissions on WCOSS2 #106

Open
EdwardSafford-NOAA opened this issue Dec 7, 2023 · 3 comments
EdwardSafford-NOAA commented Dec 7, 2023

Update job submissions on WCOSS2 to conform to latest NCO guidance. Items to be addressed include:

  • Eliminate the -V option in all submissions. If environment variables are needed, they should be specified individually using -v.
  • All small serial jobs should use -l place=shared,select=1:ncpus=1:mem=100MB so that they run on shared nodes.
  • No job script should use a -l option on the shebang line. Only the cron script may use that option (on the first line only), in this form: SHELL=/bin/bash -l. I don't think this is an issue, but it should be inspected and confirmed.
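
As a rough illustration of the three items above, here is a minimal sketch in bash. The function names, job name, script path, and variable names are hypothetical and are not taken from the GSI-Monitor scripts; only the -v option and the -l resource string come from the guidance in this issue. The first function assembles a qsub command line that passes variables individually with -v and requests a shared node. The second checks whether a script's first line is a shebang carrying a standalone -l option.

```shell
#!/bin/bash
# Sketch only: function names, job names, and variables are hypothetical.

# Print a qsub command for a small serial job on a shared node.
# Variables are passed one by one with -v instead of wholesale with -V.
# Usage: build_qsub_cmd JOB_NAME SCRIPT NAME=value [NAME=value ...]
build_qsub_cmd() {
  local job_name=$1 script=$2
  shift 2
  local var_list
  # Join the remaining NAME=value arguments with commas, as -v expects.
  var_list=$(IFS=,; echo "$*")
  echo "qsub -N ${job_name} -l place=shared,select=1:ncpus=1:mem=100MB -v ${var_list} ${script}"
}

# Return 0 if the first line of the given script is a shebang that
# carries a standalone -l option (for example "#!/bin/bash -l").
shebang_has_login_flag() {
  head -n 1 "$1" | grep -Eq '^#!.*[[:space:]]-l([[:space:]]|$)'
}
```

For example, `build_qsub_cmd radmon_de ./de.sh PDATE=2025031400 NET=gfs` prints a command that could be reviewed before being run. Values containing spaces or commas would need extra quoting, which this sketch does not handle, and the shebang check does not catch combined flags such as -el.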
EdwardSafford-NOAA self-assigned this Dec 7, 2023
@EdwardSafford-NOAA
Collaborator Author

Items done:
RadMon DE and IG scripts updated. Testing now using nam and NOAA-21 parallel.

EdwardSafford-NOAA added a commit to EdwardSafford-NOAA/GSI-Monitor that referenced this issue Feb 2, 2024
Saving work very much in progress.  Will need to retest from start.
@EdwardSafford-NOAA
Collaborator Author

Have picked this back up and have the ConMon DE scripts for gfs/gdas updated. Working now on the DE for nam/regional.

EdwardSafford-NOAA added a commit to EdwardSafford-NOAA/GSI-Monitor that referenced this issue Mar 3, 2025
Add changes to ConMon DE submission, clean up j-job and ex script.
EdwardSafford-NOAA added a commit to EdwardSafford-NOAA/GSI-Monitor that referenced this issue Mar 3, 2025
Save work in progress.
EdwardSafford-NOAA added a commit to EdwardSafford-NOAA/GSI-Monitor that referenced this issue Mar 3, 2025
Save work in progress.
EdwardSafford-NOAA added a commit to EdwardSafford-NOAA/GSI-Monitor that referenced this issue Mar 4, 2025
Complete ConMon changes.
EdwardSafford-NOAA added a commit to EdwardSafford-NOAA/GSI-Monitor that referenced this issue Mar 5, 2025
Update OznMon_CP.sh.
EdwardSafford-NOAA added a commit to EdwardSafford-NOAA/GSI-Monitor that referenced this issue Mar 6, 2025
Add radmon copy
EdwardSafford-NOAA added a commit to EdwardSafford-NOAA/GSI-Monitor that referenced this issue Mar 10, 2025
Convert RadMon regional DE.
EdwardSafford-NOAA added a commit to EdwardSafford-NOAA/GSI-Monitor that referenced this issue Mar 10, 2025
Clean up item.
EdwardSafford-NOAA added a commit to EdwardSafford-NOAA/GSI-Monitor that referenced this issue Mar 12, 2025
Merge branch 'feature/job-sub-106' of https://github.com/EdwardSafford-NOAA/GSI-monitor into feature/job-sub-106
EdwardSafford-NOAA added a commit to EdwardSafford-NOAA/GSI-Monitor that referenced this issue Mar 12, 2025
Update Ozn plot job submissions.
@EdwardSafford-NOAA
Collaborator Author

I've spent quite a bit of time on a problem I ran into with the RadMon mk_digital_time.sh script. On WCOSS2 it frequently produces an error like this when it moves the output files to the pngs directory:
mv: cannot move 'cris-fsr_n20.403.omgbc.time.txt' to a subdirectory of itself, '/lfs/h2/emc/da/noscrub/edward.safford/nbns/imgn/gfs/gdas/radmon/pngs/time/cris-fsr_n20.403.omgbc.time.txt'
The error most often occurs on mv operations that involve a large number of files, and it is inconsistent from run to run: the files named in the error differ each time. The files reported do get moved to the intended location, so the end result is correct despite the message. The best explanation I've been able to come up with is an lfs file system fluke of some sort; there are scattered reports of this sort of behavior in some versions of Linux. Since what I need to have happen is happening, I'm going to call this good for now. I'll dig back into this when I've got the time, but I need to move on now and complete this issue.
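
One way to tolerate this kind of spurious message while still catching a real failure is to check the destination after mv reports an error. The sketch below is an assumption about how that could look, not what mk_digital_time.sh does; the function name and arguments are hypothetical.

```shell
#!/bin/bash
# Sketch only: move_and_verify is a hypothetical helper, not part of RadMon.

# Move files into dest_dir. If mv exits non-zero, confirm whether each
# file actually arrived before treating the move as a failure.
# Usage: move_and_verify DEST_DIR FILE [FILE ...]
move_and_verify() {
  local dest_dir=$1
  shift
  local f missing=0

  # Normal case: mv succeeded, nothing more to check.
  if mv -- "$@" "${dest_dir}/"; then
    return 0
  fi

  # mv complained; report only the files that are absent from dest_dir.
  for f in "$@"; do
    if [ ! -e "${dest_dir}/$(basename -- "$f")" ]; then
      echo "missing after mv: $f" >&2
      missing=1
    fi
  done
  return "$missing"
}
```

With this approach the mv message still appears in the job log, but the step fails only when a file is actually missing from the destination.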

EdwardSafford-NOAA added a commit to EdwardSafford-NOAA/GSI-Monitor that referenced this issue Mar 14, 2025
Add RadMon summary & time plots.