@jgalmes2

GITSHA1 is the digest function used by git.
Details here: https://git-scm.com/book/be/v2/Git-Internals-Git-Objects.

@sluongng
Collaborator

@jgalmes2 could you provide a bit more context here? What is your use case?

This was previously discussed in the working group, and the decision was to hold off until there is more interest.

@jgalmes2
Author

We're trying to integrate Bazel with a remote execution engine whose native hashing algorithm is GITSHA1.

@sluongng
Collaborator

Did you have a conversation with Bazel team about adding such a digest function?
It would be a lot easier to know that they are open to such a change so that we have a pair of client/server implementations willing to adopt the proposal.

I'm also curious: how do you do collision detection? AFAIK the SHA-1 used in Git is modified with collisions in mind.

Btw, if you are just looking for git-interop, worth noting that Git upstream dev branches have a few patch sets that include sha256 inter-op with sha1. There has been a bit of progress there in recent months. I wonder if we can do GITSHA256 instead?

@jgalmes2
Author

jgalmes2 commented Nov 24, 2025

Oh yeah Lukács Berki and Chi Wang are fully aware of this change. @lberki @coeuvre
The GITSHA1 implementation is just the SHA1 of the bytes with the "blob <size>\0" header prefix prepended to the bytes.
Details here: https://git-scm.com/book/be/v2/Git-Internals-Git-Objects
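
For reference, the construction described above can be sketched in a few lines of Python. This is only an illustration of the hashing scheme, not the proposed Bazel or server implementation; the function name `git_sha1` is made up for this example, and it uses plain SHA-1 from `hashlib` with no collision detection.

```python
import hashlib


def git_sha1(data: bytes) -> str:
    """Hash `data` the way git hashes a blob object (hypothetical helper).

    The digest is SHA-1 over the header "blob <size>\\0" followed by the
    raw bytes, where <size> is the decimal length of the content.
    """
    header = b"blob %d\0" % len(data)
    return hashlib.sha1(header + data).hexdigest()


if __name__ == "__main__":
    # Matches `printf '' | git hash-object --stdin`
    print(git_sha1(b""))  # e69de29bb2d1d6434b8b29ae775ad8c2e48c5391
    # Matches `echo 'hello world' | git hash-object --stdin`
    print(git_sha1(b"hello world\n"))  # 3b18e512dba79e4c8300dd08aeb37f8e728b8dad
```

Because the size is part of the hashed header, the digest differs from the plain SHA-1 of the same bytes.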

@sluongng
Collaborator

> The GITSHA1 implementation is just the SHA1 of the bytes with the "blob <size>\0" header prefix prepended to the bytes.
> Details here: https://git-scm.c

This used to be the standard until 2017.
See https://github.com/git/git/blob/6ab38b7e9cc7adafc304f3204616a4debd49c6e9/Documentation/technical/hash-function-transition.adoc#background for more info.

The problem with probable collision attacks is that most CAS implementations today tend to assume a strong, collision-resistant cryptographic hash is in use, and so skip collision checks. At the very least, GITSHA1 should be defined as https://github.com/git/git/blob/master/sha1dc/sha1.h, which comes with additional checks against collision attacks.

I'm also curious: do you use git tree objects or the DirectoryNode proto to represent file trees here?
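
To make the concern concrete, here is a rough sketch of what a collision check on write could look like in a CAS. Everything in it is hypothetical (`CheckedCas`, `DigestCollisionError`, `git_sha1`); it is an in-memory toy, it is not how any existing REAPI server works, and it does not use the sha1dc detection linked above. It only shows the byte-for-byte comparison that most implementations skip.

```python
import hashlib
from typing import Callable, Dict


class DigestCollisionError(Exception):
    """Raised when two different blobs map to the same digest."""


def git_sha1(data: bytes) -> str:
    # Git blob hashing: SHA-1 over "blob <size>\0" + content.
    return hashlib.sha1(b"blob %d\0" % len(data) + data).hexdigest()


class CheckedCas:
    """In-memory content-addressable store that verifies bytes on write."""

    def __init__(self, digest_fn: Callable[[bytes], str] = git_sha1) -> None:
        self._digest_fn = digest_fn
        self._blobs: Dict[str, bytes] = {}

    def put(self, data: bytes) -> str:
        key = self._digest_fn(data)
        existing = self._blobs.get(key)
        # A CAS that trusts the hash would return here without comparing.
        if existing is not None and existing != data:
            raise DigestCollisionError(f"different content already stored under {key}")
        self._blobs[key] = data
        return key

    def get(self, key: str) -> bytes:
        return self._blobs[key]
```

The comparison costs a read of the existing blob on every duplicate upload, which is one reason stores that assume a strong hash leave it out.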

@lberki

lberki commented Nov 26, 2025

My $.02: I do realize SHA-1 is quite long in the tooth these days. However, my understanding is that migrating a large code base is non-trivial, since it changes every commit hash, and it looks like GitHub doesn't even support SHA-256?

In addition to the above adoption concerns, SHA-1 will always be an opt-in on the Bazel side so the default is a secure hash. Does this allay your concerns?
