Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Avoid creating unnecessary temporary cat file sub process #33942

Open
wants to merge 2 commits into
base: main
Choose a base branch
from

Conversation

lunny
Copy link
Member

@lunny lunny commented Mar 19, 2025

Extract from #33934

In the same goroutine, we should reuse the exist cat file sub process which exist in git.Repository to avoid creating a unnecessary temporary subprocess.

This PR reuse the exist cate file writer and reader in getCommitFromBatchReader.
It also move prepareLatestCommitInfo before creating dataRc which will hold the writer so other git operation will create a temporary cat file subprocess.

@lunny lunny added type/bug backport/v1.23 This PR should be backported to Gitea 1.23 labels Mar 19, 2025
@GiteaBot GiteaBot added the lgtm/need 2 This PR needs two approvals by maintainers to be considered for merging. label Mar 19, 2025
@pull-request-size pull-request-size bot added the size/M Denotes a PR that changes 30-99 lines, ignoring generated files. label Mar 19, 2025
@github-actions github-actions bot added the modifies/go Pull requests that update Go code label Mar 19, 2025
@lunny lunny changed the title Avoid to create uncessary temporary cat file sub process Avoid creating unnecessary temporary cat file sub process Mar 19, 2025
Copy link
Contributor

@wxiaoguang wxiaoguang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The design is not right.

You should make the cat-file reader be thread-safe and be shared by different callers. But not keeping asking developers to strictly arrange the callers orders.

I do not see why the cat-file reader can't be shared, if I understand correctly, it is a standard request-response service provider. Correct me if I am wrong.

@GiteaBot GiteaBot added lgtm/blocked A maintainer has reservations with the PR and thus it cannot be merged and removed lgtm/need 2 This PR needs two approvals by maintainers to be considered for merging. labels Mar 20, 2025
@lunny
Copy link
Member Author

lunny commented Mar 20, 2025

The design is not right.

You should make the cat-file reader be thread-safe and be shared by different callers. But not keeping asking developers to strictly arrange the callers orders.

I do not see why the cat-file reader can't be shared, if I understand correctly, it is a standard request-response service provider. Correct me if I am wrong.

There is no new design in this PR. The new design could be in #33934 or other new PRs. This PR just fix the unnecessary temporary subprocess according to original design so that it could be backport to v1.23.

Regarding the original design: since cat-file relies on a strict input-output order — for example, input1 corresponds to output1, input2 to output2, and so on — I don’t believe it can be easily shared between two goroutines without guaranteed order. The tight coupling between input and output makes concurrent access without coordination problematic.

Regarding the drawback of the original design: it spawns at least one Git subprocess for every Git-related http request. This frequent creation and destroy of subprocesses can be inefficient. Introducing a managed subprocess pool in the background could mitigate this by reusing existing subprocesses, thereby reducing overhead. This approach would function similarly to how Nginx manages worker processes. However, implementing such a mechanism adds significant complexity.

@GiteaBot GiteaBot added lgtm/need 2 This PR needs two approvals by maintainers to be considered for merging. and removed lgtm/blocked A maintainer has reservations with the PR and thus it cannot be merged labels Mar 20, 2025
@wxiaoguang
Copy link
Contributor

wxiaoguang commented Mar 20, 2025

OK, I won't block it, but I do not think it's worth to use this temporary fix or backport since it isn't fully tested or proven to fix any known user end bugs. The risk of backporting it is much higher than the benefit.

If let me decide, just use a complete fix in main branch and just keep 1.23 as-is. Do not use this temp fix.

@TheFox0x7
Copy link
Contributor

Isn't cat-file command tied to a repository? I don't see how a pool would help, unless there's a constant (or on demand) command running per repo with req-reply interface in front. Question then is what if single cat-file instance is overloaded (I'm assuming it's possible - but I might be wrong)?

Current solution, though inefficient when concurrent requests happen has a clearly defined lifecycle (matching context, so in turn request).


I'm kind of stuck on the idea of NATS replacing queue system in gitea but the req-reply it provides would fit here - assuming we figure out how to bring the cat-file instances up and down on demand and scale if needed. I'm not sure if the OVHD would be worth it though. Just a thought

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backport/v1.23 This PR should be backported to Gitea 1.23 lgtm/need 2 This PR needs two approvals by maintainers to be considered for merging. modifies/go Pull requests that update Go code size/M Denotes a PR that changes 30-99 lines, ignoring generated files. type/bug
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants