1y ago

Rebase Supremacy

238 comments

I know this is a meme post, but can someone succinctly explain rebase vs merge?
I am an amateur trying to learn my tool.
- Merge keeps the original timeline. Your commits go in along with anything else that happened relative to the branch you based your work off (probably main). This generates a merge commit.
  Rebase takes the commits that landed on the base branch while you were doing your work, puts them first, and then replays your commits on top, so that yours sit at HEAD as the most recent commits. If any of your commits conflict with those changes as they are replayed, you have to resolve the conflicts, the same way you would resolve any merge conflict. There is no frivolous merge commit in this scenario.
  TL;DR: End result, everything that happened to the branch minus your work happens first. Then your stuff happens after. Much tidy and clean.
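To make the difference concrete, here is a minimal sketch in a throwaway repository. The branch names (feature, feature-merged, feature-rebased), the file names, and the commit helper are made up for illustration; only the git commands themselves are real.

```shell
set -e
repo=$(mktemp -d); cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main        # call the first branch "main"
git config user.name "Example"; git config user.email "example@example.com"

# Hypothetical helper: one small file per commit so nothing conflicts.
commit() { echo "$1" > "$1.txt"; git add "$1.txt"; git commit -q -m "$1"; }

commit base
git checkout -q -b feature
commit my-work                               # your commit
git checkout -q main
commit their-work                            # landed on main in the meantime

# Option 1: merge main into a copy of the feature branch.
git checkout -q -b feature-merged feature
git merge -q --no-edit main                  # creates a merge commit

# Option 2: rebase a copy of the feature branch onto main.
git checkout -q -b feature-rebased feature
git rebase -q main                           # replays my-work on top of their-work

git log --format=%s feature-rebased          # my-work, their-work, base
git rev-list --count --merges feature-merged   # 1
git rev-list --count --merges feature-rebased  # 0
```

Both branches end up with the same files. The merged copy carries one extra merge commit, while the rebased copy is a straight line with my-work as the newest commit.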
  
  Thanks for the explanation. It makes sense. To my untrained eyes, it feels like both merge and rebase have their use. I will try to keep that in mind.
  
  Yes. They do. A lot of people will use vacuous terms like "clean history" when arguing for one over the other. In my opinion, most repositories have larger problems than rebase versus merge. Like commit messages.
  Also, remember, even if your team/repository prefers merges over rebases for getting changes into the main branch, that doesn't mean you shouldn't be using rebase locally for various things.
  
  You nailed it with the critique of commit messages. We use gitmoji to convey the at-a-glance topic of each commit and otherwise adhere to Tim Pope's school of getting to the point.
  
  I must have read that blog post in the past because that's exactly the style I use. Much of it is standard though.
  One MAJOR pet peeve of mine (and I admit it is just an opinion either way) is when people use lower case letters for the first line of the commit message. They typically argue that it is a sentence fragment so shouldn't be capitalized. My counter is that the start of sentences, even fragmented ones, should be capitalized. Also, and more relevant, is that I view the first line of the commit more like the title of something than a sentence. So I use the Wikipedia style of capitalizing.
  
  Gitmoji?
  
  https://gitmoji.dev/
  Quasi-parallel reply to your other post: this kind of echoes the desire for a capital letter at the start of the commit message. The icon indicates the overall topic of the commit.
  Let's say I am adding a database migration and my commit is the migration file and the schema. My commit message might be:
  
  🗃️ Add notes to Users table
  So anyone looking at the eventual PR will see the icon and know that this bunch of work affects the DB without all that tedious "reading the code" part of the review. That also helps team members who didn't participate in the review.
  I was initially hesitant to adopt it, but I have very reasonable, younger teammates for whom emojis are part of the standard vocabulary. I gradually came to appreciate and value the ability to convey more context in my commits this way. I'm still guilty of occasionally overusing:
  
  ♻️ Fix the thing
  type messages when I'm lazy. Gitmoji doesn't fix that bad habit, but I'm generally much happier reading my own or someone else's PR commit summary with this extra bit of context added.
  
  I looked at it and there's a lot of them!
  I see things like adding dependencies but I would add the dependency along with the code that's using it so I have that context. Is the Gitmoji way to break your commits up so that it matches a single category?
  
  Yes, that is another benefit, once you start getting muscle memory with the library. You start to parcel things by context a bit more. It's upped my habit of discrete commit-by-hunks, which also serves as a nice self-review of the work.
  
  I don't see that as a benefit tbh - if I have a dependency, I want to see why it's there as part of the commit. I'm imagining running blame on Cargo.toml and seeing "Add feature x" vs "Add dependency". I guess the idea is it's "➕ Add dep y for feature x" but I'd still rather be able to see the related code in the same commit instead of having to find the useful commit in the log.
  I suppose you could squash them together later, but then why bother splitting it out in the first place?
  I see that some use a subset of Gitmoji and that does make sense to me - after all, you wouldn't use all of them in every project anyway, e.g. 🏷️ types is only relevant for a few languages.
  
  How would rebasing my own branch work? Do I rebase the main into my branch, or make a copy of the main branch and then rebase? I have trouble grasping how that would work.
  
  You're still rebasing your branch onto main (or whatever you originally branched it off of), but you aren't then doing a fast forward merge of main to your branch.
  The terminology gets weird. When people say "merge versus rebase" they really mean it in the context of bringing changes into main. You (or the remote repository) cannot do this without a merge. What people usually mean is "merge commit versus rebase followed by a fast-forward merge".
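A rough sketch of the "rebase, then fast-forward" variant, assuming a scratch repository with a single feature branch. The branch and file names and the commit helper are invented for the example.

```shell
set -e
repo=$(mktemp -d); cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main
git config user.name "Example"; git config user.email "example@example.com"

# Hypothetical helper: one small file per commit so nothing conflicts.
commit() { echo "$1" > "$1.txt"; git add "$1.txt"; git commit -q -m "$1"; }

commit base
git checkout -q -b feature
commit my-work
git checkout -q main
commit their-work

# Step 1: rebase the feature branch onto main (done on the feature branch).
git checkout -q feature
git rebase -q main

# Step 2: main can now simply move forward to the feature tip.
git checkout -q main
git merge -q --ff-only feature               # still "a merge", but no merge commit

git log --oneline --graph main
```

After this, main and feature point at the same commit and the history is a single line. Passing --ff-only makes git refuse instead of silently creating a merge commit if the branch was not rebased first.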
  
  Yeah, I was confused because you are right: "merge" usually refers to git merge followed by git commit.
  It makes sense. Thanks for the clarification.
  
  Here's an example
  Say I work on authentication under feature/auth on Monday and get some of it done. On Tuesday an urgent feature request for some logging work comes in; I complete it on feature/logging and it merges cleanly into main. To make sure all my code from Monday still works, I then switch to feature/auth and run git pull --rebase origin main. Now my auth commits start after the merge commit from the logging PR.
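The Monday/Tuesday story can be replayed locally as a sketch. Two assumptions: a bare repository in a temp directory stands in for origin, and the logging PR is simulated with a local --no-ff merge that is then pushed. The commit helper and file names are made up.

```shell
set -e
work=$(mktemp -d)
git init -q --bare "$work/origin.git"        # stand-in for the real remote
git init -q "$work/clone"; cd "$work/clone"
git symbolic-ref HEAD refs/heads/main
git config user.name "Example"; git config user.email "example@example.com"
git remote add origin "$work/origin.git"

# Hypothetical helper: one small file per commit so nothing conflicts.
commit() { echo "$1" > "$1.txt"; git add "$1.txt"; git commit -q -m "$1"; }

commit initial
git push -q origin main

# Monday: start the auth work.
git checkout -q -b feature/auth
commit auth-part-1

# Tuesday: the logging work is done and merged into main (simulated PR).
git checkout -q -b feature/logging main
commit logging
git checkout -q main
git merge -q --no-ff -m "Merge feature/logging" feature/logging
git push -q origin main

# Back on the auth branch: replay the auth commits on top of the updated main.
git checkout -q feature/auth
git pull -q --rebase origin main

git log --oneline --graph
```

The auth commit now sits directly on top of the logging merge commit, so Monday's work is tested against Tuesday's main.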
  
  Thanks for the example. Rebase use is clearer now.
  
  100% they do. Rebase is an everyday thing, merge is for PRs (for me anyway). Or merges are for regular branches if you roll that way. The only wrong answer is the one that causes you to lose commits and have to use reflog, cos....well, then you done messed up now son... (but even then hope lives on!)
  
  Yes. My rule of thumb is that generally rebasing is the better approach, in part because if your commit history is relatively clean then it is easier to merge in changes one commit at a time than all at once. However, sometimes so much has changed that replaying your commits puts you in the position of having to solve so many problems that it is more trouble than it is worth, in which case you should feel no qualms about aborting the rebase (git rebase --abort) and using a merge instead.
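A small sketch of the bail-out path, assuming a scratch repository where main and feature edit the same line of a made-up notes.txt so that the rebase is guaranteed to stop on a conflict.

```shell
set -e
repo=$(mktemp -d); cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main
git config user.name "Example"; git config user.email "example@example.com"

echo "original" > notes.txt; git add notes.txt; git commit -q -m "base"

git checkout -q -b feature
echo "feature version" > notes.txt; git commit -q -am "feature edit"

git checkout -q main
echo "main version" > notes.txt; git commit -q -am "main edit"

git checkout -q feature
before=$(git rev-parse HEAD)

# The rebase stops on the conflict and exits non-zero, so back out of it.
if ! git rebase main >/dev/null 2>&1; then
  git rebase --abort
fi

after=$(git rev-parse HEAD)
echo "before=$before after=$after"           # same commit: nothing was lost
```

After the abort the branch is exactly where it started. A git merge main at this point would hit the same conflict, but only once for the whole branch rather than once per replayed commit.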
  
  I have the bad habit of leaving checkpoints everywhere because of merge squash that I am trying to fix. I think that forcing myself to rebase would help get rid of that habit. And the good thing is that I am the sole FW dev at work, so I can do whatever I want with the repos.
  
  Never use rebase for any branch that has left your machine (been pushed) and which another entity may have a local copy of (especially if that entity may have committed edits to it).
- Merge gives an accurate view of the history but tends to be "cluttered" with multiple lines and merge commits. Rebase cleans that up and gives you a simple A->B->C view.
  Personally I prefer merge because when I'm tracking down a bug and narrow it down to a specific commit, I get to see what change was made in what context. With rebase commits that change is in there, but it's out of context and cluttered up with zillions of other changes from the inherent merges and squashes that are included in that commit, making it harder to see what was changed and why. The same cluttered history is still in there but it's included in the commits instead of existing separately outside the commits.
  I honestly can't see the point of a rebased A->B->C history because (a) it's inaccurate and (b) it makes debugging harder. Maybe I'm missing some major benefit? I'm willing to learn.
  
  I feel the opposite, but for similar logic? Merge is the one that is cluttered up with other merges.
  With rebase you get A->B->C for the main branch, and D->E->F for the patch branch, and when submitting to main you get a nice A->B->C->D->E->F and you can find your faulty commit in the D->E->F section.
  For merge you end up with this nonsense of mixed commits and merge commits like A->D->B->B'->E->F->C->C' where the ones with the apostrophe are merge commits. And worse, in a git log there is no clear "D E F", so you don't actually know if A, D or B came from the feature branch; you just know a branch was merged at commit B'. You'd have to try to demangle it by looking at authors and dates.
  The final code ought to look the same, but now if you're debugging you can't separate the feature patch from the main path code to see which part was at fault. I always rebase because it's equivalent to checking out the latest changes and re-branching so I'm never behind and the patch is always a unique set of commits.
  
  > For merge you end up with this nonsense of mixed commits and merge commits like A->D->B->B’->E->F->C->C’ where the ones with the apostrophe are merge commits.
  Your notation does not make sense. You're representing a multi-dimensional thing in one dimension. Of course it's a mess if you do that.
  Your example is also missing a crucial fact required when reasoning about merges: The merge base.
  Typically a branch is "branched off" from some commit M. D's and A's parent would be M (though there could be any number of commits between A and M). Since A is "on the main branch", you can conclude that D is part of a "patch branch". It's quite clear if you don't omit this fact.
  I also don't understand why your example would have multiple merges.
  Here's my example of a main branch with a patch branch; in 2D because merges can't properly be represented in one dimension:
  
  M - A - B - C - C'
   \             /
    D --- E --- F

  > The final code ought to look the same, but now if you’re debugging you can’t separate the feature patch from the main path code to see which part was at fault.
  If you use a feature branch workflow and your main branch is merged into, you typically want to use first-parent bisects. They're much faster too.
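As a sketch, the graph above can be rebuilt in a scratch repository to see what a first-parent walk does. The commit helper and file names are made up, and the merge commit is given the subject "C' (merge patch)" so it is easy to spot. The example uses git log --first-parent. Recent git versions also accept --first-parent on git bisect start, which is not exercised here.

```shell
set -e
repo=$(mktemp -d); cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main
git config user.name "Example"; git config user.email "example@example.com"

# Hypothetical helper: one small file per commit so nothing conflicts.
commit() { echo "$1" > "$1.txt"; git add "$1.txt"; git commit -q -m "$1"; }

commit M                                     # the merge base
git checkout -q -b patch
commit D; commit E; commit F                 # the patch branch
git checkout -q main
commit A; commit B; commit C                 # main moves on
git merge -q --no-ff -m "C' (merge patch)" patch

git log --graph --oneline main               # 2D view, both lines visible

# Follow only the first parent of each merge: the mainline view.
git log --first-parent --format=%s main      # C' (merge patch), C, B, A, M

git rev-list --count main                    # 8 commits in total
git rev-list --count --first-parent main     # 5 on the first-parent line
```

D, E and F stay in the history but drop out of the first-parent view, so the patch branch shows up as the single commit C' on main.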
  
  You're right, I'm not representing the merge correctly. I was thinking of having multiple merges because for a long running patch branch you might merge main into the patch branch several times before merging the patch branch into main.
  I'm so used to rebasing I forgot there are tools that correctly show all the branching and merges and things.
  Idk, I just like rebase's behavior over merge.
  
  The thing is, you can get your cake and eat it too. Rebase your feature branches while in development and then merge them to the main branch when they're done.
  
  👏 Super duper this is the way. No notes!
  
  I would advocate for using each tool, where it makes sense, to achieve a more intelligible graph. This is what I've been moving towards on my personal projects (am solo). I imagine with any moderately complex group project it becomes very difficult to keep things neat.
  In order of expected usage frequency:
  1. Rebase: everything that's not 2 or 3. Keeps main and feature lines clean.
  2. Merge: ideally, merge should only be used to bring feature branches into main at stable sequence points.
  3. Squash: only use squash to remove history that truly is useless (e.g. creating a bug on a feature branch and then fixing it two commits later, prior to merge).
  History should be viewable from git log --all --decorate --oneline --graph, not buried in squash commits.
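One way to read points 1 and 2 together, as a sketch: keep the feature branch rebased while it is in progress, then bring it into main with an explicit merge commit. The branch names, file names, and commit helper are invented, and --no-ff is my assumption about how to force the merge commit once the branch could fast-forward.

```shell
set -e
repo=$(mktemp -d); cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main
git config user.name "Example"; git config user.email "example@example.com"

# Hypothetical helper: one small file per commit so nothing conflicts.
commit() { echo "$1" > "$1.txt"; git add "$1.txt"; git commit -q -m "$1"; }

commit base
git checkout -q -b feature
commit feature-1
commit feature-2
git checkout -q main
commit main-1

# 1. Rebase: keep the feature line clean on top of the latest main.
git checkout -q feature
git rebase -q main

# 2. Merge: bring the finished feature into main with a visible merge commit.
git checkout -q main
git merge -q --no-ff -m "Merge feature" feature

git log --all --decorate --oneline --graph
```

The graph shows main as a straight line with one bump for the feature branch, and the feature commits stay individually visible instead of being squashed away.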
  
  Folks should make sure the final series of commits in a pull request consists of atomic changes and that each individual commit builds and works on its own. Things like fixup commits with an autosquash rebase help with that. THIS WAY you can still narrow a bug down to one commit regardless of the approach.
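A minimal sketch of the fixup and autosquash flow in a scratch repository. The file names and commit subjects are hypothetical. Setting GIT_SEQUENCE_EDITOR=true is only there so the interactive rebase accepts the generated todo list without opening an editor, which lets the example run unattended.

```shell
set -e
repo=$(mktemp -d); cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main
git config user.name "Example"; git config user.email "example@example.com"

echo "readme" > README.txt; git add README.txt; git commit -q -m "Initial commit"

git checkout -q -b feature
echo "parse v1" > parser.txt; git add parser.txt; git commit -q -m "Add parser"
echo "docs" > docs.txt; git add docs.txt; git commit -q -m "Add docs"

# Later: a fix that logically belongs to "Add parser" (currently HEAD~1).
echo "parse v2" > parser.txt
git commit -q -a --fixup=HEAD~1              # subject becomes "fixup! Add parser"

# Fold the fixup into its target before opening the pull request.
GIT_SEQUENCE_EDITOR=true git rebase -i --autosquash main

git log --format=%s main..feature            # Add docs, Add parser
```

The branch ends up with two atomic commits again, and the parser fix lives inside "Add parser" rather than in a separate "oops" commit.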
