Colin Stagner 28a7e27cff contrib/subtree: detect rewritten subtree commits
    git subtree split --prefix P

detects splits that are outside of path prefix `P` and prunes
them from history graph processing. This improves the performance
of repeated `split --rejoin` with many different prefixes.

Both before and after 83f9dad7d6 (contrib/subtree: fix split with
squashed subtrees, 2025-09-09), the pruning logic does not detect
**rebased** or **cherry-picked** git-subtree commits. If `split`
encounters any of these commits, the split output may have
incomplete history.

All commits authored by

    git subtree merge [--squash] --prefix Q

have a first or second parent that has *only* subtree commits
as ancestors. When splitting a completely different path `P/`,
it is safe to ignore:

1. the merged tree
2. the subtree parent
3. *all* of that parent's ancestry, which applies only to
   path `Q/` and not `P/`.

But this relationship no longer holds if the git-subtree commit
is rebased or otherwise reauthored. After a rebase, the former
git-subtree commit will have other unrelated commits as ancestors.
Ignoring these commits may exclude the history of `P/`,
leading to incomplete `subtree split` output.

The pruning logic relies solely on the `git-subtree-*:` trailers
to detect git-subtree commits, which it blindly accepts without
further validation. The split logic also takes its time about
being wrong: `cmd_split()` execs a `git show` for *every* commit
in the split range… twice. This is inefficient in a shell script.

Add a "reality check" to ignore rebased or rewritten commits:

* Rewrites of non-merge commits cannot be detected, so the new
  detector no longer looks for them.

* Merges carry a `git-subtree-mainline:` trailer with the hash of
  the **first parent**. If this hash differs, or if the "merge"
  commit no longer has multiple parents, a rewrite has occurred.
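
The merge check described above can be sketched as a small POSIX shell
function. This is only an illustration, not the code added by this
commit: the name `is_intact_subtree_merge` and its calling convention
are hypothetical, and it assumes the caller has already extracted the
`git-subtree-mainline:` trailer value and the parent list (for
example, the output of `git log --format=%P`).

```shell
# Hypothetical helper; not the actual git-subtree implementation.
#   $1: hash recorded in the git-subtree-mainline: trailer
#   $2: space-separated parent hashes of the commit, first parent first
# Returns 0 if the commit still looks like an unmodified subtree merge,
# non-zero if it appears to have been rebased or otherwise rewritten.
is_intact_subtree_merge () {
	mainline="$1"
	parents="$2"

	# Re-split the parent list into positional parameters.
	set -- $parents

	# A rewritten "merge" may have been flattened to a single parent.
	test "$#" -ge 2 || return 1

	# The trailer must still name the first parent.
	test "$1" = "$mainline"
}
```

A caller would treat a non-zero return as "do not prune", so that the
commit's ancestry stays in the split range.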

To increase speed, package this logic in a new shell function,
`find_other_splits()`. Perform the check up-front by iterating
over a single `git log`. Add ignored subtrees to:

1. the `notree` cache, which excludes them from the `split` history

2. a `prune` negative refs list. The negative refs prevent
   recursing into other subtrees. Since there are potentially a
   *lot* of these, cache them on disk and use rev-list's
   `--stdin` mode.
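
As a rough illustration of the on-disk negative refs list, the sketch
below writes one `^<hash>` line per pruned commit and leaves the
`rev-list` call to the caller. The function name `write_prune_list`
and the file name `prune.txt` are hypothetical; the real script may
lay out its cache differently.

```shell
# Hypothetical sketch; not the actual git-subtree implementation.
# Reads commit hashes from standard input, one per line, and writes
# them to the file named by $1 as negative revisions ("^<hash>").
write_prune_list () {
	while read -r hash
	do
		# Skip blank lines so the output holds only valid revisions.
		if test -n "$hash"
		then
			printf '^%s\n' "$hash"
		fi
	done >"$1"
}

# Example use, inside a git repository:
#   printf '%s\n' "$hash1" "$hash2" | write_prune_list prune.txt
#   git rev-list HEAD --stdin <prune.txt
```

Reading the list from standard input keeps the command line short
no matter how many subtrees are pruned.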

Reported-by: George <george@mail.dietrich.pub>
Signed-off-by: Colin Stagner <ask+git@howdoi.land>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-01-09 20:21:43 -08:00

Please read git-subtree.adoc for documentation.

Please don't contact me using GitHub mail; it's slow, ugly, and worst of
all, redundant. Email me instead at apenwarr@gmail.com and I'll be happy to
help.

Avery