git/contrib
Colin Stagner 28a7e27cff contrib/subtree: detect rewritten subtree commits
git subtree split --prefix P

detects splits that are outside of path prefix `P` and prunes
them from history graph processing. This improves the performance
of repeated `split --rejoin` with many different prefixes.

Both before and after 83f9dad7d6 (contrib/subtree: fix split with
squashed subtrees, 2025-09-09), the pruning logic does not detect
**rebased** or **cherry-picked** git-subtree commits. If `split`
encounters any of these commits, the split output may have
incomplete history.

All commits authored by

    git subtree merge [--squash] --prefix Q

have a first or second parent that has *only* subtree commits
as ancestors. When splitting a completely different path `P/`,
it is safe to ignore:

1. the merged tree
2. the subtree parent
3. *all* of that parent's ancestry, which applies only to
   path `Q/` and not `P/`.

But this relationship no longer holds if the git-subtree commit
is rebased or otherwise reauthored. After a rebase, the former
git-subtree commit will have other unrelated commits as ancestors.
Ignoring these commits may exclude the history of `P/`,
leading to incomplete `subtree split` output.

The pruning logic relies solely on the `git-subtree-*:` trailers
to detect git-subtree commits, which it blindly accepts without
further validation. The split logic also takes its time about
being wrong: `cmd_split()` execs a `git show` for *every* commit
in the split range… twice. This is inefficient in a shell script.

Add a "reality check" to ignore rebased or rewritten commits:

* Rewrites of non-merge commits cannot be detected, so the new
  detector no longer looks for them.

* Merges carry a `git-subtree-mainline:` trailer with the hash of
  the **first parent**. If this hash differs, or if the "merge"
  commit no longer has multiple parents, a rewrite has occurred.

To increase speed, package this logic in a new method,
`find_other_splits()`. Perform the check up-front by iterating
over a single `git log`. Add ignored subtrees to:

1. the `notree` cache, which excludes them from the `split` history

2. a `prune` negative refs list. The negative refs prevent
   recursing into other subtrees. Since there are potentially a
   *lot* of these, cache them on disk and use rev-list's
   `--stdin` mode.

Reported-by: George <george@mail.dietrich.pub>
Signed-off-by: Colin Stagner <ask+git@howdoi.land>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-01-09 20:21:43 -08:00
..
2025-12-16 11:08:35 +09:00
2024-04-05 09:49:38 -07:00

Contributed Software

Although these pieces are available as part of the official git
source tree, they are in somewhat different status.  The
intention is to keep interesting tools around git here, maybe
even experimental ones, to give users an easier access to them,
and to give tools wider exposure, so that they can be improved
faster.

I am not expecting to touch these myself that much.  As far as
my day-to-day operation is concerned, these subdirectories are
owned by their respective primary authors.  I am willing to help
if users of these components and the contrib/ subtree "owners"
have technical/design issues to resolve, but the initiative to
fix and/or enhance things _must_ be on the side of the subtree
owners.  IOW, I won't be actively looking for bugs and rooms for
enhancements in them as the git maintainer -- I may only do so
just as one of the users when I want to scratch my own itch.  If
you have patches to things in contrib/ area, the patch should be
first sent to the primary author, and then the primary author
should ack and forward it to me (git pull request is nicer).
This is the same way as how I have been treating gitk, and to a
lesser degree various foreign SCM interfaces, so you know the
drill.

I expect things that start their life in the contrib/ area
to graduate out of contrib/ once they mature, either by becoming
projects on their own, or moving to the toplevel directory.  On
the other hand, I expect I'll be proposing removal of disused
and inactive ones from time to time.

If you have new things to add to this area, please first propose
it on the git mailing list, and after a list discussion proves
there is general interest (it does not have to be a
list-wide consensus for a tool targeted to a relatively narrow
audience -- for example I do not work with projects whose
upstream is svn, so I have no use for git-svn myself, but it is
of general interest for people who need to interoperate with SVN
repositories in a way git-svn works better than git-svnimport),
submit a patch to create a subdirectory of contrib/ and put your
stuff there.

-jc