
feat: Adding signal and external error detection & output #3722


Draft
wants to merge 89 commits into
base: develop

Conversation

MelReyCG
Contributor

@MelReyCG MelReyCG commented Jul 8, 2025

This PR builds on Amandine's work on adding an error YAML file to GEOS (PR #3690). It adds detection and management inside GEOS of (1) error signals and (2) external errors coming from dependencies, so that both can be handled and written to the log and to the error YAML file.

Managing those external errors gives us the opportunity to:

  • detect any kernel / system allocator errors,
  • add the stack-trace of the error,
  • output them reliably in the log, even if stderr gets lost or is used for another purpose,
  • group them with external tools / scripts, thus highlighting which rank(s) the issue originates from.

This PR also prevents the stack trace from being cut by messages from other ranks, which could previously happen on a signal.
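To illustrate the signal side, here is a minimal POSIX sketch, not the code of this PR. It installs a captureless lambda as a handler through `sigaction` and emits the whole report with one `write()` call, which makes it less likely that another rank's output lands in the middle of it. The names `sketch::installHandler`, `sketch::lastSignal` and `sketch::reportFd` are hypothetical, and the message text is invented for the example. Only async-signal-safe operations are used inside the handler, which is why the signal number is formatted by hand instead of with `snprintf`.

```cpp
#include <signal.h>
#include <unistd.h>
#include <cstddef>
#include <cstring>

namespace sketch
{

volatile sig_atomic_t lastSignal = 0;  // last signal seen by the handler
int reportFd = STDERR_FILENO;          // descriptor that receives the report

// Copy a C string into dst at pos without exceeding cap; returns the new position.
inline std::size_t append( char * dst, std::size_t pos, std::size_t cap, char const * src )
{
  while( *src != '\0' && pos < cap )
  {
    dst[pos++] = *src++;
  }
  return pos;
}

// Install a handler (hypothetical, for illustration) for the given signal.
inline bool installHandler( int signalId )
{
  struct sigaction action;
  std::memset( &action, 0, sizeof( action ) );
  sigemptyset( &action.sa_mask );
  // A captureless lambda converts to the plain function pointer sigaction expects.
  action.sa_handler = []( int sig )
  {
    lastSignal = sig;

    // Format the signal number by hand (snprintf is not async-signal-safe).
    char number[16];
    std::size_t d = sizeof( number ) - 1;
    number[d] = '\0';
    int v = sig;
    do
    {
      number[--d] = static_cast< char >( '0' + v % 10 );
      v /= 10;
    }
    while( v > 0 && d > 0 );

    // Build the full report first, then emit it with a single write().
    char msg[128];
    std::size_t n = 0;
    n = append( msg, n, sizeof( msg ), "***** SIGNAL " );
    n = append( msg, n, sizeof( msg ), number + d );
    n = append( msg, n, sizeof( msg ), " received\n" );
    ssize_t const written = write( reportFd, msg, n );
    (void) written;
  };
  return sigaction( signalId, &action, nullptr ) == 0;
}

} // namespace sketch
```

A handler for a fatal signal such as SIGSEGV would typically also restore the default disposition and re-raise the signal after reporting; that part is left out so the sketch stays short.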

Later, we could add a tag for each dependency (system, LvArray, Hypre, ...) to quickly identify / filter the source of an issue.


Message to reviewers:
As this PR is based on #3690, which lives on a fork of GEOS, the modifications of both PRs are visible here. I also opened a PR on the fork that shows only the modifications of this PR, but all the systems of this repo (code owners, CI...) are missing there.
Did I miss something?

Still, the key points are:

  • Added ExternalErrorHandler, a singleton which redirects stderr messages into the internal GEOS logger & ErrorHandler,
  • ExternalErrorHandler is based on OutputStreamDeviation to redirect external messages,
  • Added a signal handling lambda and an external error handling lambda in src/coreComponents/common/initializeEnvironment.cpp,
  • Extended ErrorHandler / ErrorMsg for signals.
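For readers unfamiliar with stderr redirection, the general idea can be sketched with POSIX file descriptors. This is an illustration under assumptions, not the implementation of `OutputStreamDeviation` or `ExternalErrorHandler`: the class name `StderrCapture` is hypothetical, and the real code may use a different mechanism. The sketch points file descriptor 2 at a pipe, so anything a dependency prints to stderr can be read back and forwarded to a logger.

```cpp
#include <cstdio>
#include <string>
#include <unistd.h>

// Hypothetical helper (not part of GEOS): captures what is written to stderr
// between construction and stop(), then restores the original stderr.
class StderrCapture
{
public:
  StderrCapture()
  {
    std::fflush( stderr );
    int fds[2];
    if( pipe( fds ) != 0 )
    {
      return;
    }
    int const saved = dup( STDERR_FILENO );
    if( saved < 0 || dup2( fds[1], STDERR_FILENO ) < 0 )
    {
      // Redirection failed: release everything and stay inactive.
      if( saved >= 0 ) { close( saved ); }
      close( fds[0] );
      close( fds[1] );
      return;
    }
    close( fds[1] );  // fd 2 is now the only write end of the pipe
    m_savedFd = saved;
    m_readFd = fds[0];
  }

  // Restore stderr and return everything captured so far.
  std::string stop()
  {
    std::string captured;
    if( m_savedFd < 0 )
    {
      return captured;
    }
    std::fflush( stderr );
    dup2( m_savedFd, STDERR_FILENO );  // also closes the pipe write end
    close( m_savedFd );
    m_savedFd = -1;

    char buffer[256];
    ssize_t n;
    while( ( n = read( m_readFd, buffer, sizeof( buffer ) ) ) > 0 )
    {
      captured.append( buffer, static_cast< std::size_t >( n ) );
    }
    close( m_readFd );
    m_readFd = -1;
    return captured;
  }

  ~StderrCapture() { stop(); }

private:
  int m_savedFd = -1;  // duplicate of the original stderr
  int m_readFd = -1;   // read end of the capture pipe
};
```

Because the pipe is only drained in `stop()`, this sketch blocks if more output than the pipe buffer can hold is written before then; a long-lived redirection would need to read the pipe continuously, for example from a dedicated thread.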

… link between GEOS_THROW_CTX_IF and LVARRAY_THROW_IF_TEST( EXP, MSG, TYPE )
… in try/catch statements

Problem: Retrieves everything that was thrown, so not just the message.
…y spaces.

The previous condition checked whether an argument was present and whether the option was immediately followed by a value like -test"value", which excluded valid cases like -test "value" and -test     "value".
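The behaviour described above can be sketched as follows. This is a standalone illustration, not the parser code touched by the commit: the function `optionValue` is hypothetical, it works on a single command-line string, and it requires C++17. It accepts a value that is attached to the option as well as one separated from it by any number of spaces.

```cpp
#include <optional>
#include <string>

// Hypothetical helper: returns the value that follows `option` in `line`,
// whether it is attached (-test"value") or separated by spaces (-test   "value").
// Limitation of the sketch: the option is matched as a plain substring.
std::optional< std::string > optionValue( std::string const & line, std::string const & option )
{
  std::size_t pos = line.find( option );
  if( pos == std::string::npos )
  {
    return std::nullopt;
  }
  // Skip the option name, then any run of spaces before the value.
  pos = line.find_first_not_of( ' ', pos + option.size() );
  if( pos == std::string::npos )
  {
    return std::nullopt;  // option present but no value
  }
  if( line[pos] == '"' )
  {
    // Quoted value: take everything up to the closing quote.
    std::size_t const end = line.find( '"', pos + 1 );
    if( end == std::string::npos )
    {
      return std::nullopt;  // unterminated quote
    }
    return line.substr( pos + 1, end - pos - 1 );
  }
  // Unquoted value: take everything up to the next space.
  std::size_t const end = line.find( ' ', pos );
  return line.substr( pos, end == std::string::npos ? std::string::npos : end - pos );
}
```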
@MelReyCG MelReyCG changed the title from "feat: Adding signal and external error detection & reliable output" to "feat: Adding signal and external error detection & output" Jul 8, 2025
@MelReyCG MelReyCG marked this pull request as draft July 8, 2025 12:41
2 participants