-
Notifications
You must be signed in to change notification settings - Fork 35
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add some GitHub commit message guidance #110
base: main
Are you sure you want to change the base?
Changes from all commits
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -6,6 +6,46 @@ This document describes the merge strategy used for Pull Requests within all rep | |
|
||
Every Pull Request that is made to an InstructLab repository should meet the below requirements - certain repositories such as [taxonomy](https://github.com/instruct-lab/taxonomy) may have additional requirements. | ||
|
||
### Commit Messages and PR Description | ||
|
||
PRs should be formatted as a series of logical commits that implement the | ||
overall change proposed in the PR. Each commit should represent a logical step | ||
in the progression of the change. The commit message for each should clearly | ||
explain that step of the progression. | ||
|
||
Here are a few examples of PRs that contained a series of commits: | ||
|
||
- <https://github.com/instructlab/instructlab-bot/pull/125/commits> | ||
- <https://github.com/instructlab/instructlab/pull/994/commits> | ||
- <https://github.com/instructlab/instructlab/pull/951/commits> | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I'm sure there are even better examples out there. I'm happy to change this over time to represent some of the best examples people can find in our repos. These are a few I had dug up some weeks ago. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Apologies if this is a personal preference nit. Noting that many/most PRs are small in scope and (I believe) should usually only have 1 commit. I generally think of a commit as a unit of work that you might choose to un-commit. And breaking down work into revertable logic is a great practice. But stating multiple commits is the general case might lead to a lot of commits for work that doesn't need to be broken up. My ask would be to adjust the wording to not promote multiple commits quite so strongly but still indicate how to use them effectively. If there is a fundamental disagreement on approach, I'm happy to be a chameleon. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. You're certainly not alone in this preference, and I think there's a bit of a difference between a camp of people who developed practices before github existed, and another camp who developed practices around github and maybe are less reliant on e.g. Here's a concrete example of why I'm firmly in the "more is more" (within reason) on splitting up commits: https://github.com/instructlab/sdg/pull/74/commits Totally trivial PR, but 4 distinctly logical changes, and being able to look at those changes (and their commit message) individually is actually really valuable - e.g. separating the reordering of the parameters from the deleting unused parameters makes it trivial to review. First you make sure the reordering didn't change anything, then you confirm that the 4 parameters being dropped aren't in use. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Here are a few thoughts, we might consider adding: *Splitting commits is crucial because it provides "ease of revertability." If a specific commit causes an issue, reverting a single logical unit is much easier and less disruptive than reverting an entire feature.
This strategy helps maintain the project's health and efficiency by ensuring changes are easier to manage, review, and revert if necessary. |
||
|
||
Here are some external resources that provide guidance on writing good commit messages: | ||
|
||
- <https://www.berrange.com/tags/commit-message/> | ||
- <https://docs.kernel.org/process/submitting-patches.html#separate-your-changes> | ||
- <https://github.com/kubernetes/community/blob/master/contributors/guide/pull-requests.md#smaller-is-better-small-commits-small-pull-requests> | ||
|
||
While not a requirement right now, we may want to consider following a spec like | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Perhaps we can also mention 50/72 rule, since Conventional Commits don't seem to cover it. E.g. this: https://tbaggery.com/2008/04/19/a-note-about-git-commit-messages.html |
||
[Conventional Commits](https://www.conventionalcommits.org/en/v1.0.0/) in the | ||
future. It's worth a look for inspiration, at least. | ||
|
||
When your PR contains multiple commits, it is helpful to reuse the work you put | ||
into crafting high-quality commit messages. Here is a script you can use to generate | ||
a summary of the commits in your PR. It starts with a list of the commits one | ||
line at a time and then follows that with the full commit messages. The format works | ||
for pasting directly into GitHub. | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. The GitHub CLI tools are also good for this. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Yeah, I actually use If it's just one commit, it already does what I want by default, which is include the full commit message as the PR title and description. That reminds me of another point to make -- for a PR that is a single commit, the commit message should be equal to the PR description in most cases. Otherwise, you're probably leaving out info that should be in the commit message, as well. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I would prefer if the PR description contained a hand crafted description of the overall fix. This way the user pushing the PR has to think about the overall fix or feature being pushed. It provides a good overview to the reader of the PR afterwards. It should contain breakdown of the different parts if helpful or can be read in the commits. |
||
|
||
```sh | ||
#!/bin/bash | ||
|
||
git log --reverse --oneline origin/main..HEAD | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. While the script is useful, it highly depends on how the remote is set up. I'd add a note about making sure the remote points to There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Yes. Is |
||
echo | ||
git log --reverse origin/main..HEAD | ||
``` | ||
|
||
Here is an example PR that uses a PR description that was generated using this | ||
script: <https://github.com/instructlab/sdg/pull/67>. | ||
|
||
### CI checks | ||
|
||
We should require that all CI checks pass on a Pull Request before it can be considered for merge. Every repository should have at mimimum the following checks: | ||
|
@@ -28,7 +68,7 @@ There are [three different merge methods offered by GitHub](https://docs.github. | |
|
||
We use the default merge method of creating merge commits for PRs. This is to ensure we retain the full commit history as intentionally structured by the PR author while also retaining metadata about the PR itself in the merge commit. | ||
|
||
This requires project maintainers to include commit messages and the overall structure of the commit series as part of their review. When multiple commits are present, they should represent a logical series of changes that implement the overall change proposed in the PR. The commit message for each should clearly explain that step of the progression. | ||
This requires project maintainers to include commit messages and the overall structure of the commit series as part of their review. | ||
|
||
It is common that a PR author may need to do a final rebase to clean up their proposed commit series before a PR can be merged. It is also fine for a project maintainer to perform this step when the changes necessary are straight forward enough to do so. This includes doing a final rebase on `main` if necessary. The PR itself should NOT include any merge commits of `main` back into the developer's branch. We expect the proposed commit series to be a clean set of commits against `main` without conflicts or merge commit history. We only use a merge commit to record the PR's inclusion into `main`. | ||
|
||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you want to be explicit here about squashing fixes and cleanups into those logical commits?
I look for high-level context about "what" a change is to frame my review, as well as some reason "why" the change is needed to understand prioritization and benefit.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good idea. I have some older text at the bottom of this doc that could be moved up here as a start. I already moved up some of it. Here's what could be moved: