[GOBBLIN-2194] Close temporal metrics scope on job completion #4097

Merged

abhishekmjain commented Feb 13, 2025

Dear Gobblin maintainers,

Please accept this PR. I understand that it will not be reviewed until I have checked off all the steps below!

JIRA

Description

  • Here are some details about my PR, including screenshots (if applicable):

#4095 introduced a bug: the metrics scope object was never closed, which left a non-daemon thread alive. As a result, the AM container did not shut down when the application completed.
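
To illustrate why an unclosed scope can keep a JVM alive, here is a minimal sketch. It does not use Gobblin's real classes: `ReportingScope` and `runJob` are hypothetical names, and the assumption is that the scope owns a background reporter thread created by the default `Executors` thread factory, which produces non-daemon threads. Closing the scope in a `finally` block stops that thread even if the job body throws.

```java
import java.io.Closeable;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

public class MetricsScopeCloseSketch {

  /** Hypothetical stand-in for a metrics scope that reports on a background thread. */
  static final class ReportingScope implements Closeable {
    // The default thread factory creates non-daemon threads, so an
    // unclosed scope keeps the JVM alive after main() returns.
    private final ScheduledExecutorService reporter =
        Executors.newSingleThreadScheduledExecutor();

    ReportingScope() {
      // Placeholder for a periodic metrics flush.
      reporter.scheduleAtFixedRate(() -> { }, 0, 1, TimeUnit.SECONDS);
    }

    @Override
    public void close() {
      // Stops the reporter thread so it no longer blocks JVM exit.
      reporter.shutdownNow();
    }

    boolean isClosed() {
      return reporter.isShutdown();
    }
  }

  /** Runs the job body and always closes the scope it created. */
  static ReportingScope runJob(Runnable jobBody) {
    ReportingScope scope = new ReportingScope();
    try {
      jobBody.run();
    } finally {
      scope.close();
    }
    return scope;
  }

  public static void main(String[] args) {
    ReportingScope scope = runJob(() -> System.out.println("job body ran"));
    System.out.println("scope closed: " + scope.isClosed());
  }
}
```

If the `finally` block is removed, this program prints its output but the process does not exit, which is the same symptom as a container that never shuts down.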

This PR fixes that by closing the metrics scope wherever it is initialized.
It also calls close() on the JobLauncher once it completes, which cleanly releases the resources the JobLauncher started.
GobblinJobLauncher.close() now calls cancelJob() instead of calling executeCancellation() directly, because cancelJob() cancels the job in a synchronized manner and in turn calls executeCancellation().
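
The delegation described above can be sketched as follows. This is an illustration only, not the actual GobblinJobLauncher code: the method names `close()`, `cancelJob()` and `executeCancellation()` come from the description, while `BaseLauncher`, `WorkflowLauncher`, the lock object, the `events` list and the run-once guard are assumptions made for the example.

```java
import java.io.Closeable;
import java.util.ArrayList;
import java.util.List;

public class LauncherCloseSketch {

  /** Hypothetical base launcher; method names mirror the PR text, bodies are illustrative. */
  static class BaseLauncher implements Closeable {
    private final Object cancellationLock = new Object();
    private boolean cancellationRequested = false;
    // Records call order so the example can show what ran.
    final List<String> events = new ArrayList<>();

    /** Cancellation hook; a no-op in the base class. */
    protected void executeCancellation() {
    }

    /** Serializes cancellation; the run-once guard is an assumption of this sketch. */
    public void cancelJob() {
      synchronized (cancellationLock) {
        if (cancellationRequested) {
          return;
        }
        cancellationRequested = true;
        events.add("cancelJob");
        executeCancellation();
      }
    }

    /** Goes through cancelJob() rather than calling the hook directly. */
    @Override
    public void close() {
      cancelJob();
      events.add("close");
    }
  }

  /** Hypothetical subclass that overrides the cancellation hook. */
  static class WorkflowLauncher extends BaseLauncher {
    @Override
    protected void executeCancellation() {
      events.add("executeCancellation");
    }
  }

  public static void main(String[] args) {
    WorkflowLauncher launcher = new WorkflowLauncher();
    launcher.close();
    launcher.close(); // the guarded hook is not run a second time
    System.out.println(launcher.events);
  }
}
```

Routing close() through the synchronized method means a close that races with an explicit cancellation goes through the same lock, instead of invoking the hook unguarded.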

Tests

  • My PR adds the following unit tests OR does not need testing for this extremely good reason:
    Tested in a local project where the application master shuts down.

The log line "ApplicationLauncher has already stopped" is printed when the shutdown hook is called from ServiceBasedAppLauncher, which indicates a JVM shutdown.
This is the same end state as in the AM container shutdown fix PR; see the Tests section of that PR.
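
As a complement to checking the shutdown log line, one possible way to confirm that nothing is still holding the JVM open is to list the live non-daemon threads. The sketch below is not part of this PR; `NonDaemonThreadCheck`, `liveNonDaemonThreadNames` and the `leaked-reporter` thread are hypothetical names used only to demonstrate the check.

```java
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.stream.Collectors;

public class NonDaemonThreadCheck {

  /** Names of live non-daemon threads other than the calling thread. */
  static List<String> liveNonDaemonThreadNames() {
    Thread current = Thread.currentThread();
    return Thread.getAllStackTraces().keySet().stream()
        .filter(t -> t != current && t.isAlive() && !t.isDaemon())
        .map(Thread::getName)
        .sorted()
        .collect(Collectors.toList());
  }

  /** Returns {seenWhileRunning, seenAfterCleanup} for a simulated leaked thread. */
  static boolean[] observeLeakThenCleanup() {
    CountDownLatch release = new CountDownLatch(1);
    Thread leaked = new Thread(() -> {
      try {
        release.await(); // stands in for a reporter that never stops by itself
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
      }
    }, "leaked-reporter");
    leaked.setDaemon(false);
    leaked.start();

    boolean before = liveNonDaemonThreadNames().contains("leaked-reporter");

    release.countDown(); // stands in for closing the owning resource
    try {
      leaked.join();
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
    }
    boolean after = liveNonDaemonThreadNames().contains("leaked-reporter");
    return new boolean[] {before, after};
  }

  public static void main(String[] args) {
    boolean[] seen = observeLeakThenCleanup();
    System.out.println("listed while running: " + seen[0]);
    System.out.println("listed after cleanup: " + seen[1]);
  }
}
```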

Commits

  • My commits all reference JIRA issues in their subject lines, and I have squashed multiple commits if they address the same issue. In addition, my commits follow the guidelines from "How to write a good git commit message":
    1. Subject is separated from body by a blank line
    2. Subject is limited to 50 characters
    3. Subject does not end with a period
    4. Subject uses the imperative mood ("add", not "adding")
    5. Body wraps at 72 characters
    6. Body explains "what" and "why", not "how"

abhishekmjain force-pushed the temporal-metrics-closure-fix branch from 6aed406 to feb870c on February 13, 2025 18:25
abhishekmjain force-pushed the temporal-metrics-closure-fix branch from feb870c to 8a48450 on February 13, 2025 19:23
@@ -137,13 +137,9 @@ public GobblinJobLauncher(Properties jobProps, Path appWorkDir,
  @Override
  public void close() throws IOException {
    try {
      executeCancellation();
Contributor

This change impacts many child classes

Even though none of them implement the executeCancellation method, this is still a divergence that should be documented, IMO.

Contributor Author

Can you elaborate on what documentation you feel should be added?
executeCancellation is a no-op method implemented in GobblinJobLauncher.

abhishekmjain force-pushed the temporal-metrics-closure-fix branch 2 times, most recently from 468b824 to 27ae5c7, on February 14, 2025 12:20
abhishekmjain force-pushed the temporal-metrics-closure-fix branch from 27ae5c7 to cd1e492 on February 14, 2025 15:42
abhishekmjain merged commit 23c4481 into apache:master on Feb 17, 2025
6 checks passed