Skip to content

Conversation

edolstra
Copy link
Member

Motivation

If the garbage collector has acquired the global GC lock, but hasn't created the GC socket yet, then a client attempting to connect would get ENOENT. Note that this only happens when the GC runs for the first time on a machine. Subsequently clients will get ECONNREFUSED which was already handled.

Fixes #7370.

Context

Checklist for maintainers

Maintainers: tick if completed or explain if not relevant

  • agreed on idea
  • agreed on implementation strategy
  • tests, as appropriate
    • functional tests - tests/**.sh
    • unit tests - src/*/tests
    • integration tests - tests/nixos/*
  • documentation in the manual
  • documentation in the internal API docs
  • code and comments are self-explanatory
  • commit message explains why the change was made
  • new feature or incompatible change: updated release notes

Priorities

Add 👍 to pull requests you find important.

@edolstra edolstra requested a review from thufschmitt as a code owner June 19, 2023 14:58
@github-actions github-actions bot added the with-tests Issues related to testing. PRs with tests have some priority label Jun 19, 2023
Copy link
Member

@Ericson2314 Ericson2314 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you add comments in the test explaining what is supposed to happen / why that excises the feature?

These things are quite subtle so I think making it super clear and spelled-out helps :).

If the garbage collector has acquired the global GC lock, but hasn't
created the GC socket yet, then a client attempting to connect would
get ENOENT. Note that this only happens when the GC runs for the first
time on a machine. Subsequently clients will get ECONNREFUSED which
was already handled.

Fixes NixOS#7370.
@edolstra edolstra force-pushed the handle-missing-gc-socket branch from 3823f7a to c5fdbda Compare June 20, 2023 09:15
@edolstra edolstra enabled auto-merge January 12, 2024 11:40
@edolstra edolstra force-pushed the handle-missing-gc-socket branch from 5cabab5 to d005bad Compare January 16, 2024 14:26
@edolstra edolstra merged commit 7115edc into NixOS:master Jan 16, 2024
Copy link

Backport failed because this pull request contains merge commits. You can either backport this pull request manually, or configure the action to skip merge commits.

@edolstra edolstra deleted the handle-missing-gc-socket branch January 16, 2024 15:07
@Ericson2314
Copy link
Member

Ericson2314 commented Jan 16, 2024

@edolstra should we do bind too?

Ericson2314 added a commit to obsidiansystems/nix that referenced this pull request Jan 16, 2024
It is good to propogate the underlying error so whether or not we use a
proccess to deal with path length issues is not observable.

Also, as these helper functions got more and more complex, the code
duplication got worse and worse. Deduplicate.
Ericson2314 added a commit to obsidiansystems/nix that referenced this pull request Jan 16, 2024
It is good to propagate the underlying error so whether or not we use a
process to deal with path length issues is not observable.

Also, as these wrapper functions got more and more complex, the code
duplication got worse and worse. The new `bindConnectProcHelper`
function deduplicates them.
@Ericson2314
Copy link
Member

Did bind in #9787

Copy link

Backport failed because this pull request contains merge commits. You can either backport this pull request manually, or configure the action to skip merge commits.

1 similar comment
Copy link

Backport failed because this pull request contains merge commits. You can either backport this pull request manually, or configure the action to skip merge commits.

Ericson2314 added a commit to obsidiansystems/nix that referenced this pull request Jan 17, 2024
It is good to propagate the underlying error so whether or not we use a
process to deal with path length issues is not observable.

Also, as these wrapper functions got more and more complex, the code
duplication got worse and worse. The new `bindConnectProcHelper`
function deduplicates them.
Ericson2314 added a commit to obsidiansystems/nix that referenced this pull request Jan 18, 2024
It is good to propagate the underlying error so whether or not we use a
process to deal with path length issues is not observable.

Also, as these wrapper functions got more and more complex, the code
duplication got worse and worse. The new `bindConnectProcHelper`
function deduplicates them.
thufschmitt added a commit that referenced this pull request Jan 18, 2024
`bind`: give same treatment as `connect` in #8544, dedup
tebowy pushed a commit to tebowy/nix that referenced this pull request Jul 11, 2024
LocalStore: :addTempRoot(): Handle ENOENT
(cherry picked from commit 7115edc)
Change-Id: Ie6b1596049c3fde09b98f2f0727899f98e48e6b1
tebowy pushed a commit to tebowy/nix that referenced this pull request Jul 11, 2024
`bind`: give same treatment as `connect` in NixOS#8544, dedup

(cherry picked from commit 2867424)
Change-Id: I1ac5fc43fa10ec5f37a226730c3d84033fdbfd52
@GrahamDennis
Copy link

Backport to Nix-2.18 here: #13896. There was an attempt by @roberth to backport this to 2.18 and 2.15 above (#8544 (comment)) but that failed.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
with-tests Issues related to testing. PRs with tests have some priority
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Crash during concurrent garbage collection
4 participants