MOST test with terrain failing #1453

indra098124 · 2024-02-23T02:25:07Z

Hi there,
I tried the MOST tests provided in terrain3d_Hemisphere and WitchOfAgnesi, and both fail for me. Are they expected to run from the initial condition defined in prob, or should we run without MOST first? I am using the latest version of the code and get a "SIGILL Invalid, privileged, or ill-formed instruction" error with these tests.

Many thanks for developing the code and answering my question.

asalmgren · 2024-02-23T02:29:15Z

Hi @indra098124 -- could you try with the inputs files in those directories and see if that works for you? Here https://ccse.lbl.gov/pub/RegressionTesting1/ERF/ is our nightly regression test suite -- all of these should "just work" if you try them-- maybe also try some of these as well so we can rule out issues, then we can see about this particular problem.

AMLattanzi · 2024-02-23T03:31:17Z

Are you running these tests locally on a mac?

indra098124 · 2024-02-23T03:45:34Z

Thank you @asalmgren and @AMLattanzi for looking into this. @AMLattanzi yes, I am running these locally on Mac.
@asalmgren I can run ABL cases (that are also included in nightly tests) with no problem.

baperry2 · 2024-02-23T17:53:00Z

Set amrex.fpe_trap_invalid = 0 in the input files, which turns off some runtime error checking. The Apple Clang compilers sometimes perform optimizations that cause the AMReX checks for divide by zero and similar errors to fail spuriously (conditional branches that are never taken and involve a divide by zero may still be evaluated). These optimizations aren't performed in debug mode, so if needed you can also run with amrex.fpe_trap_invalid = 1 if you compile with DEBUG = TRUE.
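To make the mechanism concrete, here is a minimal sketch using only the standard `<cfenv>` facilities. It does not show how AMReX installs its trap; the function names are hypothetical and exist only for illustration. The point is that a division by zero raises a floating-point exception flag even when its result is never used, so an eagerly evaluated but untaken branch can trip a trap although the computed answer is correct.

```cpp
#include <cassert>
#include <cfenv>

// Returns true if evaluating num / den raises FE_INVALID or FE_DIVBYZERO.
// The volatile temporaries keep the compiler from folding the division away.
bool division_raises_fp_exception(double num, double den)
{
    std::feclearexcept(FE_ALL_EXCEPT);
    volatile double n = num;
    volatile double d = den;
    volatile double q = n / d;  // result intentionally unused
    (void)q;
    return std::fetestexcept(FE_INVALID | FE_DIVBYZERO) != 0;
}

// In source order the division only happens when den != 0. An optimizer may
// still evaluate it eagerly and select the result afterwards (assumption: this
// is the kind of transformation described in the comment above). The returned
// value is unchanged, but a trap armed on the exception would fire.
double guarded_ratio(double num, double den)
{
    return (den != 0.0) ? num / den : 0.0;
}

// Usage sketch (hypothetical helper).
void fpe_demo()
{
    assert(division_raises_fp_exception(0.0, 0.0));   // 0/0 raises FE_INVALID
    assert(division_raises_fp_exception(1.0, 0.0));   // 1/0 raises FE_DIVBYZERO
    assert(!division_raises_fp_exception(1.0, 2.0));  // ordinary division
    assert(guarded_ratio(1.0, 0.0) == 0.0);           // guarded result is fine
}
```

Calling `fpe_demo()` from `main` exercises all four cases. With trapping disabled the guarded function simply returns 0, which is why turning the trap off is safe for spurious cases but hides nothing about genuine errors elsewhere.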

asalmgren · 2024-02-23T17:57:35Z

@baperry2 -- that's really good to know -- could you add that to the docs somewhere?!

indra098124 · 2024-02-23T18:11:13Z

Thanks @baperry2, I was not aware of this.
I tried that but it did not help. I also tried to run this test on a Linux machine and I get an "erroneous arithmetic operation" error. Looking at the backtrace, the error appears to originate in the MOST calculation at "Source/BoundaryConditions/MOSTAverage.H:143:56".

Here is the code snippet where it fails.

    for (int n = 0; n < interp_comp; n++) {
        interp_vals[n] = sx_lo[0]*sx_lo[1]*sx_lo[2]*interp_array(i-1, j-1, k-1, n) +
                         sx_lo[0]*sx_lo[1]*sx_hi[2]*interp_array(i-1, j-1, k  , n) +
                         sx_lo[0]*sx_hi[1]*sx_lo[2]*interp_array(i-1, j  , k-1, n) +
                         sx_lo[0]*sx_hi[1]*sx_hi[2]*interp_array(i-1, j  , k  , n) +
                         sx_hi[0]*sx_lo[1]*sx_lo[2]*interp_array(i  , j-1, k-1, n) +
                         sx_hi[0]*sx_lo[1]*sx_hi[2]*interp_array(i  , j-1, k  , n) +
                         sx_hi[0]*sx_hi[1]*sx_lo[2]*interp_array(i  , j  , k-1, n) +
                         sx_hi[0]*sx_hi[1]*sx_hi[2]*interp_array(i  , j  , k  , n);
    }
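For readers unfamiliar with the expression, it is a trilinear interpolation over the eight cells surrounding a point. The following self-contained sketch reproduces the same weighting on a plain array. `Field3D` and `trilinear` are hypothetical stand-ins rather than ERF or AMReX types, and the sketch assumes `sx_lo[d] = 1 - sx_hi[d]`, which the snippet itself does not state.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical single-component field; valid indices are lo[d]..hi[d]
// inclusive, so ghost cells are simply part of the index range.
struct Field3D {
    std::array<int,3> lo, hi;
    std::vector<double> data;

    Field3D(const std::array<int,3>& lo_, const std::array<int,3>& hi_, double fill)
        : lo(lo_), hi(hi_),
          data(static_cast<std::size_t>(hi_[0] - lo_[0] + 1) *
               static_cast<std::size_t>(hi_[1] - lo_[1] + 1) *
               static_cast<std::size_t>(hi_[2] - lo_[2] + 1), fill) {}

    std::size_t offset(int i, int j, int k) const {
        // Same kind of check that a debug build performs on every access.
        assert(i >= lo[0] && i <= hi[0] && j >= lo[1] && j <= hi[1] &&
               k >= lo[2] && k <= hi[2]);
        const std::size_t nx = static_cast<std::size_t>(hi[0] - lo[0] + 1);
        const std::size_t ny = static_cast<std::size_t>(hi[1] - lo[1] + 1);
        return static_cast<std::size_t>(i - lo[0]) +
               nx * (static_cast<std::size_t>(j - lo[1]) +
                     ny * static_cast<std::size_t>(k - lo[2]));
    }
    double& operator()(int i, int j, int k)       { return data[offset(i, j, k)]; }
    double  operator()(int i, int j, int k) const { return data[offset(i, j, k)]; }
};

// Interpolates using (i,j,k) as the upper corner of the 2x2x2 stencil, so the
// cells (i-1, j-1, k-1) .. (i, j, k) are all read.
double trilinear(const Field3D& f, int i, int j, int k,
                 const std::array<double,3>& sx_hi)
{
    const std::array<double,3> sx_lo = {1.0 - sx_hi[0], 1.0 - sx_hi[1], 1.0 - sx_hi[2]};
    double val = 0.0;
    for (int a = 0; a < 2; ++a)
    for (int b = 0; b < 2; ++b)
    for (int c = 0; c < 2; ++c) {
        const double w = (a ? sx_hi[0] : sx_lo[0]) *
                         (b ? sx_hi[1] : sx_lo[1]) *
                         (c ? sx_hi[2] : sx_lo[2]);
        val += w * f(i - 1 + a, j - 1 + b, k - 1 + c);
    }
    return val;
}
```

Because the stencil always reaches one cell below the upper corner in every direction, interpolating near the low edge of a box reads ghost cells, and every one of those cells must hold valid data even when its weight is zero.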

baperry2 · 2024-02-23T18:30:32Z

@asalmgren will do. There appears to be more going on here, but I learned about the spurious FPEs on Macs the hard way, and it would be good to have the information more widely available.

@indra098124 - I tried again and see the same thing as you. For Witch of Agnesi, I see a spurious FPE that resolves with amrex.fpe_trap_invalid = 0 when running with inputs, but the same error as you when running with inputs_most_test, which appears to be a real error.

AMLattanzi · 2024-02-23T21:18:36Z

@indra098124 Thank you for sharing the issue with inputs_most_test. The problem was that the Theta_prim variable did not yet have its ghost cells filled when the interpolation routine (where your backtrace points) accessed that data. PR 1455 ran successfully in debug mode on my local machine with single and multiple cores. Please let me know if you have further issues.

indra098124 · 2024-02-23T22:07:13Z

Thank you @AMLattanzi. I modified my copy to have

    IntVect ng = Theta_prim[lev]->nGrowVect();

in ERF.cpp and in ERF_Advance.cpp, but it is still failing for me. I will try the version from the PR.

AMLattanzi · 2024-02-23T22:30:38Z

@indra098124 Yes, it should still fail with that revision. The creation of the MOST class and the calls to the MOST averaging needed to be moved later, after the ghost cells were populated by FillPatch. If you see the issue arise, or a new issue, with the current development (e9bcaa0), let me know.
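The ordering problem described here can be illustrated with a small 1D sketch. Everything in it is hypothetical: `Field1D` is not an AMReX type, the periodic fill only stands in for what FillPatch does, and initializing ghost cells to NaN is an assumption made so that reading unfilled data is visible.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <limits>
#include <vector>

// Hypothetical 1D field with n valid cells (0..n-1) and ng ghost cells per side.
// All cells start as NaN so that unfilled data is easy to detect.
struct Field1D {
    int n, ng;
    std::vector<double> data;

    Field1D(int n_, int ng_)
        : n(n_), ng(ng_),
          data(static_cast<std::size_t>(n_ + 2 * ng_),
               std::numeric_limits<double>::quiet_NaN()) {}

    double& operator()(int i) {
        assert(i >= -ng && i < n + ng);
        return data[static_cast<std::size_t>(i + ng)];
    }
    double operator()(int i) const {
        assert(i >= -ng && i < n + ng);
        return data[static_cast<std::size_t>(i + ng)];
    }
};

// Stand-in for a ghost-cell fill: copy periodic images of the valid cells.
void fill_ghost_periodic(Field1D& f)
{
    for (int g = 1; g <= f.ng; ++g) {
        f(-g)          = f(f.n - g);
        f(f.n - 1 + g) = f(g - 1);
    }
}

// An average whose stencil reaches one cell to the left, as interpolation does.
double two_point_average(const Field1D& f, int i)
{
    return 0.5 * (f(i - 1) + f(i));
}
```

Averaging at `i = 0` before the fill returns NaN because the stencil reads the ghost cell at `-1`; the same call after `fill_ghost_periodic` returns a finite value. Changing only the number of ghost cells, as in the attempt quoted above, does not help if the averaging still runs before the fill.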

indra098124 · 2024-02-24T00:14:07Z

@AMLattanzi unfortunately, it is still failing for me with the latest version. I tried debug version as well. With debug I get the following error (on Mac and on Linux).

amrex::Abort::1:: (127,-1,-1,0) is out of bound (125:258,-3:10,0:63,0:0) !!!
SIGABRT
amrex::Abort::0:: (117,1,-1,0) is out of bound (-3:130,-3:10,0:63,0:0) !!!
SIGABRT
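Reading each message as an index (i, j, k, n) followed by the allowed range lo:hi per direction, the first two indices fall inside their ranges in both messages, while k = -1 lies below the lower bound 0 of the range 0:63. The sketch below performs that check; `Box3` and `first_out_of_bound_dir` are hypothetical helpers, not AMReX code.

```cpp
#include <array>
#include <cassert>

// Hypothetical inclusive index box, mirroring the lo:hi ranges in the message.
struct Box3 {
    std::array<int,3> lo;  // inclusive lower corner
    std::array<int,3> hi;  // inclusive upper corner
};

// Returns the first direction (0 = i, 1 = j, 2 = k) in which idx lies outside
// the box, or -1 if idx is inside.
int first_out_of_bound_dir(const std::array<int,3>& idx, const Box3& box)
{
    for (int d = 0; d < 3; ++d) {
        if (idx[d] < box.lo[d] || idx[d] > box.hi[d]) { return d; }
    }
    return -1;
}

// Usage sketch with the two index/box pairs quoted above.
void bounds_demo()
{
    const Box3 box1{{125, -3, 0}, {258, 10, 63}};
    const Box3 box0{{-3, -3, 0}, {130, 10, 63}};
    assert(first_out_of_bound_dir({127, -1, -1}, box1) == 2);  // k is out of range
    assert(first_out_of_bound_dir({117, 1, -1}, box0) == 2);   // k is out of range
}
```

In both cases the arrays have ghost cells in i and j (lower bound -3) but none in k, so a stencil that reaches k-1 from k = 0 leaves the allocated data.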

I tried running realclean and also a fresh download.

indra098124 · 2024-02-25T21:22:54Z

@AMLattanzi and @asalmgren there are other cases as well that are failing for me. I am not sure if I am doing something wrong.

1. ABL/inputs.write -> The input file needed prob.T_0 = 300.0; after that it worked.
2. ABL/inputs.read -> This has been giving a segfault. The backtrace points to if (input_bndry_planes && m_r2d->ingested_velocity()) in ERF_init_bcs.cpp:86. Debug and assertion builds don't tell anything more. I did generate boundary files using inputs.write before trying this.
3. ABL_input_sounding does not compile. I just needed input_sounding, which put me on track to find the issue with compiling this code. The errors are related to USE_POISSON_SOLVE = TRUE:
   - /TI_headers.H:270:30: error: 'Vector' does not name a type, at 270 | const Vector<amrex::Real*> d_rayleigh_ptrs_at_lev);. I think it should be amrex::Vector.
   - use_rayleigh_damping is not declared, which might be a typo, as in other places I find it referenced as solverChoice.use_rayleigh_damping.
   - At TI_no_substep_fun.H:133:13 the code complains that incompressible is not declared.
   - At TI_slow_rhs_fun.H:357:25 I get an error: cannot convert 'std::unique_ptr<amrex::MultiFab>' to 'const amrex::MultiFab*', at erf_slow_rhs_inc(level, nrk, slow_dt.

   I could use input_sounding once I disabled the Poisson solve.
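Two of the compiler errors in item 3 follow general C++ rules and can be reproduced in isolation. The sketch below is not the change made in ERF: the namespace `lib` and the functions are hypothetical stand-ins that only show the two patterns, qualifying a type that lives in a namespace and passing the raw pointer held by a `std::unique_ptr`.

```cpp
#include <cassert>
#include <memory>
#include <vector>

// Hypothetical stand-in for a library namespace; not the real amrex namespace.
namespace lib {
    template <class T> using Vector = std::vector<T>;
    struct MultiFab { int ncomp; };
}

// Outside the namespace, an unqualified `Vector<double*>` in this signature
// would fail with "'Vector' does not name a type"; `lib::Vector` resolves it.
int count_ptrs(const lib::Vector<double*>& ptrs)
{
    return static_cast<int>(ptrs.size());
}

// Takes a non-owning pointer, like the callee in the last error message.
int ncomp_of(const lib::MultiFab* mf)
{
    return mf ? mf->ncomp : 0;
}

// Usage sketch: a unique_ptr does not convert implicitly to a raw pointer,
// so the caller passes mf.get() instead of mf.
void compile_demo()
{
    std::unique_ptr<lib::MultiFab> mf(new lib::MultiFab{3});
    assert(ncomp_of(mf.get()) == 3);

    double x = 1.0;
    lib::Vector<double*> ptrs = {&x, nullptr};
    assert(count_ptrs(ptrs) == 2);
}
```

Whether `.get()` or a change to the callee's signature is the right fix depends on who owns the object; the sketch only shows why the conversion is rejected as written.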

Thank you!

asalmgren · 2024-02-26T02:15:27Z

I believe we didn't mean to build with USE_POISSON_SOLVE on. If you set that to false, does it build OK? Thank you for all the great feedback! We need to do a better job of making sure the inputs files in the repo work correctly.


indra098124 · 2024-02-26T02:29:07Z

Thank you @asalmgren and thank you ERF development team for making the software available open source. Yes, after disabling the poisson solver, I can build and run this.

Last thing I am figuring out is to use boundary input.

AMLattanzi · 2024-02-27T19:04:58Z

@indra098124 sounds like things are alright on this front? Are we good to close this particular issue?

AMLattanzi · 2024-02-28T00:39:15Z

I believe the inputs.write and inputs.read should work once PR 1461 goes through.

indra098124 · 2024-02-28T15:52:46Z

@AMLattanzi thanks for following up. I am not sure, as MOST with terrain still fails for me with the following error:

amrex::Abort::1:: (127,-1,-1,0) is out of bound (125:258,-3:10,0:63,0:0) !!!
SIGABRT
amrex::Abort::0:: (117,1,-1,0) is out of bound (-3:130,-3:10,0:63,0:0) !!!
SIGABRT

I am not sure. May I confirm if you were able to run terrain3d_Hemisphere successfully?

AMLattanzi · 2024-02-28T15:58:45Z

Ah, I have not tested hemisphere with MOST! Let me give that a go and I can either follow up with the results or create a PR to alleviate the issue. Thanks for clarifying.

AMLattanzi · 2024-02-28T18:28:19Z

@indra098124 I believe I have corrected the issue with MOST and the 3d hemisphere in PR 1465. Thank you again for bringing these issues to our attention, we greatly appreciate the feedback.

indra098124 · 2024-02-29T14:42:50Z

Thank you @AMLattanzi for your help.

indra098124 · 2024-02-29T16:14:57Z

@AMLattanzi after the new fix, inputs_most_test in ABL seems to be broken. If I use erf.most.average_policy = 0, the code diverges at the first time step with the error "0::Assertion `cell_data(i,j,k,RhoTheta_comp) > 0.' failed, file "../../Source/TimeIntegration/ERF_slow_rhs_pre.cpp", line 566". Setting erf.most.average_policy = 1 works fine. Would you mind having a look?

Many thanks

indra098124 · 2024-02-29T18:50:33Z

Additionally, there seems to be an issue with MOST with surface temperature. It always gives "SIGILL Invalid, privileged, or ill-formed instruction". For example, see the GABLS1 case.

AMLattanzi · 2024-02-29T21:32:16Z

@indra098124 The issue with the hemisphere should be corrected in PR 1468. The salient problem was that the turbulent viscosity was 0 for the given initialization; this is inconsistent with the MOST BC and the limiting we did with 1e-16 was not sufficient for stability. I also added an option for small perturbations in the IC to give finite strain and thus non-zero turbulent viscosity with Smagorinsky (the fluctuations seem to dissipate quickly). This ran for planar and local average for 10 steps.
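The link between zero strain and zero turbulent viscosity can be seen from the textbook Smagorinsky closure, nu_t = (Cs * Delta)^2 * sqrt(2 * S_ij * S_ij). The sketch below evaluates that formula for a given velocity gradient. It is the standard form, not necessarily ERF's implementation, and the values of `Cs` and `delta` in the usage function are arbitrary illustrative choices.

```cpp
#include <array>
#include <cassert>
#include <cmath>

// grad[i][j] = d(u_i)/d(x_j)
using Grad = std::array<std::array<double,3>,3>;

// Textbook Smagorinsky eddy viscosity: (Cs*delta)^2 * sqrt(2 S_ij S_ij),
// with S_ij = 0.5 * (du_i/dx_j + du_j/dx_i).
double smagorinsky_nu_t(const Grad& grad, double Cs, double delta)
{
    double ss = 0.0;
    for (int i = 0; i < 3; ++i) {
        for (int j = 0; j < 3; ++j) {
            const double s = 0.5 * (grad[i][j] + grad[j][i]);
            ss += s * s;
        }
    }
    return (Cs * delta) * (Cs * delta) * std::sqrt(2.0 * ss);
}

// Usage sketch: a uniform flow has no strain, a weakly sheared one does.
void smagorinsky_demo()
{
    Grad uniform{};                 // all velocity gradients are zero
    assert(smagorinsky_nu_t(uniform, 0.17, 10.0) == 0.0);

    Grad sheared{};
    sheared[0][2] = 0.01;           // small du/dz, e.g. from a perturbation
    assert(smagorinsky_nu_t(sheared, 0.17, 10.0) > 0.0);
}
```

A uniform initial velocity therefore gives exactly zero eddy viscosity, while any finite shear, however small, makes it positive, which is consistent with the perturbation option described in the comment.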

With respect to the GABLS case, I am unable to replicate that issue. The instruction error you mention sounds like the mac issue Bruce explained. I have yet to see that error on a Linux machine with ERF. Perhaps try in DEBUG mode.

indra098124 · 2024-02-29T22:00:11Z

Thanks @AMLattanzi . This PR seems to have fixed the other issues (GABLS and ABLMost). I can see the ABLMost regression test ran successfully (https://ccse.lbl.gov/pub/RegressionTesting1/ERF/) while it was failing earlier today.
Also thank you for explaining what was wrong.

Many thanks

asalmgren · 2024-03-02T21:21:11Z

@indra098124 -- are we good to close this issue?

indra098124 · 2024-03-03T19:11:08Z

Thank you @AMLattanzi. Yes @asalmgren we can close this.

indra098124 changed the title ~~most test with terrain failing~~ MOST test with terrain failing Feb 23, 2024

baperry2 mentioned this issue Feb 23, 2024

add docs on amrex fpe trapping options #1456

Merged

indra098124 closed this as completed Feb 29, 2024

indra098124 reopened this Feb 29, 2024

indra098124 closed this as completed Mar 3, 2024
