Avoid creating unnecessary temporary cat file sub process #33942
The design is not right. You should make the cat-file reader thread-safe and share it between callers, rather than keep asking developers to strictly arrange the order of the calls. I do not see why the cat-file reader can't be shared; if I understand correctly, it is a standard request-response service provider. Correct me if I am wrong.
There is no new design in this PR. The new design could go in #33934 or other new PRs. This PR just fixes the unnecessary temporary subprocess according to the original design, so that it can be backported to v1.23. Regarding the original design: cat-file relies on a strict input-output order (input1 corresponds to output1, input2 to output2, and so on), so I don't believe it can easily be shared between two goroutines without a guaranteed order. The tight coupling between input and output makes concurrent access without coordination problematic. Regarding the drawback of the original design: it spawns at least one Git subprocess for every Git-related HTTP request. This frequent creation and destruction of subprocesses can be inefficient. Introducing a managed subprocess pool in the background could mitigate this by reusing existing subprocesses, thereby reducing overhead. This approach would function similarly to how Nginx manages worker processes. However, implementing such a mechanism adds significant complexity.
OK, I won't block it, but I do not think it's worth using this temporary fix or backporting it, since it isn't fully tested or proven to fix any known user-facing bug. The risk of backporting it is much higher than the benefit. If it were up to me, I would just use a complete fix in the main branch and keep 1.23 as-is. Do not use this temp fix.
Isn't the cat-file command tied to a repository? I don't see how a pool would help, unless there's a constant (or on-demand) command running per repo with a req-reply interface in front. Question then is what if single Current solution, though inefficient when concurrent requests happen, has a clearly defined lifecycle (matching context, so in turn request). I'm kind of stuck on the idea of NATS replacing the queue system in Gitea, but the req-reply it provides would fit here - assuming we figure out how to bring the
Extracted from #33934
In the same goroutine, we should reuse the existing cat-file subprocess held by git.Repository to avoid creating an unnecessary temporary subprocess. This PR reuses the existing cat-file writer and reader in getCommitFromBatchReader. It also moves prepareLatestCommitInfo before the creation of dataRc, which holds the writer; while the writer is held, any other Git operation would create a temporary cat-file subprocess.