79459 Commits

Author SHA1 Message Date
Junio C Hamano
79e3055bab Merge branch 'cs/rebased-subtree-split'
The split command in "git subtree" (in contrib/) has been taught to
deal better with rebased history.

* cs/rebased-subtree-split:
  contrib/subtree: detect rewritten subtree commits
2026-01-21 08:28:57 -08:00
Junio C Hamano
9813aace1e Merge branch 'je/doc-reset'
Documentation updates.

* je/doc-reset:
  doc: git-reset: clarify `git reset <pathspec>`
  doc: git-reset: clarify `git reset [mode]`
  doc: git-reset: clarify intro
  doc: git-reset: reorder the forms
2026-01-21 08:28:57 -08:00
Junio C Hamano
6edbb7b1d0 Merge branch 'en/fsck-snapshot-ref-state'
"git fsck" used inconsistent set of refs to show a confused
warning, which has been corrected.

* en/fsck-snapshot-ref-state:
  fsck: snapshot default refs before object walk
2026-01-21 08:28:57 -08:00
Junio C Hamano
b5c409c40f Merge a handful more topics after -rc0
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-01-16 12:40:28 -08:00
Junio C Hamano
3f202b56ee Merge branch 'ml/doc-blame-markup'
Doc mark-up update.

* ml/doc-blame-markup:
  doc: git-blame: convert to new doc format
  doc: blame-options: convert to new doc format
2026-01-16 12:40:28 -08:00
Junio C Hamano
ffae4da012 Merge branch 'kh/doc-patch-id'
"git patch-id" documentation updates.

* kh/doc-patch-id:
  doc: patch-id: --verbatim locks in --stable
  doc: patch-id: spell out the git-diff-tree(1) form
  doc: patch-id: use definite article for the result
  patch-id: use “patch ID” throughout
  doc: patch-id: capitalize Git version
  doc: patch-id: don’t use semicolon between bullet points
2026-01-16 12:40:28 -08:00
Junio C Hamano
1cb041f795 Merge branch 'bc/doc-stash-import-export'
Update a FAQ entry on synching two separate repositories using the
"git stash export/import" recently introduced.

* bc/doc-stash-import-export:
  gitfaq: document using stash import/export to sync working tree
2026-01-16 12:40:27 -08:00
Junio C Hamano
a7bbe907e3 Merge branch 'kj/t7101-modernize'
Test update.

* kj/t7101-modernize:
  t7101: modernize test path checks
2026-01-16 12:40:27 -08:00
Junio C Hamano
c813513c73 Merge branch 'ds/builtin-doc-update'
Update in-code comment doc to match the current API.

* ds/builtin-doc-update:
  builtin.h: update documentation
2026-01-16 12:40:27 -08:00
Junio C Hamano
5058dc1ec1 Merge branch 'ac/t1420-use-more-direct-check'
Test update.

* ac/t1420-use-more-direct-check:
  t1420: modernize the lost-found test
2026-01-16 12:40:27 -08:00
Junio C Hamano
8c5f0adf21 Merge branch 'jk/cat-file-avoid-bitmap-when-unneeded'
Fix for a performance regression in "git cat-file".

* jk/cat-file-avoid-bitmap-when-unneeded:
  cat-file: only use bitmaps when filtering
2026-01-16 12:40:27 -08:00
Junio C Hamano
18d7d02088 Merge branch 'jk/t-perf-fixes'
Perf-test fixes.

* jk/t-perf-fixes:
  t/perf/run: preserve GIT_PERF_* from environment
  t/perf/perf-lib: fix assignment of TEST_OUTPUT_DIRECTORY
2026-01-16 12:40:26 -08:00
Junio C Hamano
a3d1f391d3 Revert "Merge branch 'ar/run-command-hook'"
This reverts commit f406b8955295d01089ba2baf35eceadff2d11cae,
reversing changes made to 1627809eeff75e6ec936fc609e7be46d5eb2fa9e.

It seems to have caused a few regressions, two of the three known
ones we have proposed solutions for.  Let's give ourselves a bit
more room to maneuver during the pre-release freeze period and
restart once the 2.53 ships.
2026-01-15 13:02:38 -08:00
Junio C Hamano
7264e61d87 Git 2.53-rc0
Signed-off-by: Junio C Hamano <gitster@pobox.com>
v2.53.0-rc0
2026-01-15 07:12:41 -08:00
Junio C Hamano
87e278d837 Merge branch 'ps/clar-integers'
Import newer version of "clar", unit testing framework.

* ps/clar-integers:
  gitattributes: disable blank-at-eof errors for clar test expectations
  t/unit-tests: demonstrate use of integer comparison assertions
  t/unit-tests: update clar to 39f11fe
2026-01-15 07:12:41 -08:00
Junio C Hamano
e7b120c357 Merge branch 'kh/replay-invalid-onto-advance'
Improve the error message when a bad argument is given to the
`--onto` option of "git replay".  Test coverage of "git replay" has
been improved.

* kh/replay-invalid-onto-advance:
  t3650: add more regression tests for failure conditions
  replay: die if we cannot parse object
  replay: improve code comment and die message
  replay: die descriptively when invalid commit-ish is given
  replay: find *onto only after testing for ref name
  replay: remove dead code and rearrange
2026-01-15 07:12:41 -08:00
Junio C Hamano
24c43fb10b Merge branch 'ps/odb-misc-fixes'
Miscellaneous fixes on object database layer.

* ps/odb-misc-fixes:
  odb: properly close sources before freeing them
  builtin/gc: fix condition for whether to write commit graphs
2026-01-15 07:12:41 -08:00
Junio C Hamano
c0453835ab Merge branch 'pt/t7800-difftool-test-racefix'
Test fixup.

* pt/t7800-difftool-test-racefix:
  t7800: fix racy "difftool --dir-diff syncs worktree" test
2026-01-15 07:12:40 -08:00
Junio C Hamano
8745eae506 The 17th batch
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-01-12 05:19:53 -08:00
Junio C Hamano
5323fafdf4 Merge branch 'js/mailmap-karsten-blees'
Mailmap update for Karsten

* js/mailmap-karsten-blees:
  .mailmap: replace Karsten Blees' default address
2026-01-12 05:19:52 -08:00
Junio C Hamano
6506be0304 Merge branch 'ps/t1300-2021-use-test-path-is-helpers'
Test updates.

* ps/t1300-2021-use-test-path-is-helpers:
  t1300: use test helpers instead of `test` command
2026-01-12 05:19:52 -08:00
Junio C Hamano
3235ef374e Merge branch 'rs/commit-stack'
Code clean-up, unifying various hand-rolled "list of commit
objects" and use the commit_stack API.

* rs/commit-stack:
  commit-reach: use commit_stack
  commit-graph: use commit_stack
  commit: add commit_stack_grow()
  shallow: use commit_stack
  pack-bitmap-write: use commit_stack
  commit: add commit_stack_init()
  test-reach: use commit_stack
  remote: use commit_stack for src_commits
  remote: use commit_stack for sent_tips
  remote: use commit_stack for local_commits
  name-rev: use commit_stack
  midx: use commit_stack
  log: use commit_stack
  revision: export commit_stack
2026-01-12 05:19:52 -08:00
Junio C Hamano
0320bcd743 Merge branch 'sb/bundle-uri-without-uri'
Diagnose invalid bundle-URI that lack the URI entry, instead of
crashing.

* sb/bundle-uri-without-uri:
  bundle-uri: validate that bundle entries have a uri
2026-01-12 05:19:52 -08:00
Junio C Hamano
6a4b4e7880 Merge branch 'ja/doc-synopsis-style-more'
More doc style updates.

* ja/doc-synopsis-style-more:
  doc: convert git-remote to synopsis style
  doc: convert git stage to use synopsis block
  doc: convert git-status tables to AsciiDoc format
  doc: convert git-status to synopsis style
  doc: fix t0450-txt-doc-vs-help to select only first synopsis block
2026-01-12 05:19:52 -08:00
Johannes Schindelin
e97678c4ef .mailmap: replace Karsten Blees' default address
As per a recent email by Karsten, the @dcon.de address no longer works:
https://lore.kernel.org/git/77e768b2-6693-454f-9e11-fb0acdec703c@gmail.com

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-01-10 09:32:51 -08:00
Colin Stagner
28a7e27cff contrib/subtree: detect rewritten subtree commits
git subtree split --prefix P

detects splits that are outside of path prefix `P` and prunes
them from history graph processing. This improves the performance
of repeated `split --rejoin` with many different prefixes.

Both before and after 83f9dad7d6 (contrib/subtree: fix split with
squashed subtrees, 2025-09-09), the pruning logic does not detect
**rebased** or **cherry-picked** git-subtree commits. If `split`
encounters any of these commits, the split output may have
incomplete history.

All commits authored by

    git subtree merge [--squash] --prefix Q

have a first or second parent that has *only* subtree commits
as ancestors. When splitting a completely different path `P/`,
it is safe to ignore:

1. the merged tree
2. the subtree parent
3. *all* of that parent's ancestry, which applies only to
   path `Q/` and not `P/`.

But this relationship no longer holds if the git-subtree commit
is rebased or otherwise reauthored. After a rebase, the former
git-subtree commit will have other unrelated commits as ancestors.
Ignoring these commits may exclude the history of `P/`,
leading to incomplete `subtree split` output.

The pruning logic relies solely on the `git-subtree-*:` trailers
to detect git-subtree commits, which it blindly accepts without
further validation. The split logic also takes its time about
being wrong: `cmd_split()` execs a `git show` for *every* commit
in the split range… twice. This is inefficient in a shell script.

Add a "reality check" to ignore rebased or rewritten commits:

* Rewrites of non-merge commits cannot be detected, so the new
  detector no longer looks for them.

* Merges carry a `git-subtree-mainline:` trailer with the hash of
  the **first parent**. If this hash differs, or if the "merge"
  commit no longer has multiple parents, a rewrite has occurred.

To increase speed, package this logic in a new method,
`find_other_splits()`. Perform the check up-front by iterating
over a single `git log`. Add ignored subtrees to:

1. the `notree` cache, which excludes them from the `split` history

2. a `prune` negative refs list. The negative refs prevent
   recursing into other subtrees. Since there are potentially a
   *lot* of these, cache them on disk and use rev-list's
   `--stdin` mode.

Reported-by: George <george@mail.dietrich.pub>
Signed-off-by: Colin Stagner <ask+git@howdoi.land>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-01-09 20:21:43 -08:00
Elijah Newren
f6b262581a fsck: snapshot default refs before object walk
Fsck has a race when operating on live repositories; consider the
following simple script that writes new commits as fsck runs:

    #!/bin/bash
    git fsck &
    PID=$!

    while ps -p $PID >/dev/null; do
        sleep 3
        git commit -q --allow-empty -m "Another commit"
    done

Since fsck walks objects for connectivity and then reads the refs at the
end to check, this can cause fsck to get confused and think that the new
refs refer to missing commits and that new reflog entries are invalid.
Running the above script in a clone of git.git results in the following
(output ellipsized to remove additional errors of the same type):

    $ ./fsck-while-writing.sh
    Checking ref database: 100% (1/1), done.
    Checking object directories: 100% (256/256), done.
    warning in tag d6602ec5194c87b0fc87103ca4d67251c76f233a: missingTaggerEntry: invalid format - expected 'tagger' line
    Checking objects: 100% (835091/835091), done.
    error: HEAD: invalid reflog entry 2aac9f9286e2164fbf8e4f1d1df53044ace2b310
    error: HEAD: invalid reflog entry 2aac9f9286e2164fbf8e4f1d1df53044ace2b310
    error: HEAD: invalid reflog entry da0f5b80d61844a6f0ad2ddfd57e4fdfa246ea68
    error: HEAD: invalid reflog entry da0f5b80d61844a6f0ad2ddfd57e4fdfa246ea68
    [...]
    error: HEAD: invalid reflog entry 87c8a5c2f6b79d9afa9e941590b9a097b6f7ac09
    error: HEAD: invalid reflog entry d80887a48865e6ad165274b152cbbbed29f8a55a
    error: HEAD: invalid reflog entry d80887a48865e6ad165274b152cbbbed29f8a55a
    error: HEAD: invalid reflog entry 6724f2dfede88bfa9445a333e06e78536c0c6c0d
    error: refs/heads/mybranch invalid reflog entry 2aac9f9286e2164fbf8e4f1d1df53044ace2b310
    error: refs/heads/mybranch: invalid reflog entry 2aac9f9286e2164fbf8e4f1d1df53044ace2b310
    error: refs/heads/mybranch: invalid reflog entry da0f5b80d61844a6f0ad2ddfd57e4fdfa246ea68
    error: refs/heads/mybranch: invalid reflog entry da0f5b80d61844a6f0ad2ddfd57e4fdfa246ea68
    [...]
    error: refs/heads/mybranch: invalid reflog entry 87c8a5c2f6b79d9afa9e941590b9a097b6f7ac09
    error: refs/heads/mybranch: invalid reflog entry d80887a48865e6ad165274b152cbbbed29f8a55a
    error: refs/heads/mybranch: invalid reflog entry d80887a48865e6ad165274b152cbbbed29f8a55a
    error: refs/heads/mybranch: invalid reflog entry 6724f2dfede88bfa9445a333e06e78536c0c6c0d
    Checking connectivity: 833846, done.
    missing commit 6724f2dfede88bfa9445a333e06e78536c0c6c0d
    Verifying commits in commit graph: 100% (242243/242243), done.

We can minimize the race opportunities by taking a snapshot of refs at
program invocation, doing the connectivity check, and then checking the
snapshotted refs afterward.  This avoids races with regular refs between
fsck and adding objects to the database, though it still leaves a race
between a gc and fsck.  We are less concerned about folks simultaneously
running gc with fsck; though, if it becomes an issue, we could lock fsck
during gc.  We definitely do not want to lock fsck during operations
that may add objects to the object store; that would be problematic for
forges.

Note that refs aren't the only problem, though; reflog entries and index
entries could be problematic as well.  For now we punt on index entries
just leaving a TODO comment, and for reflogs we use a coarse solution of
taking the time at the beginning of the program and ignoring reflog
entries newer than that time.  That may be imperfect if dealing with a
network filesystem, so we leave TODO comment for those that want to
improve that handling as well.

As a high level overview:
  * In addition to fsck_handle_ref(), which now is only a few lines long
    to process a ref, there's also a snapshot_ref() which is called
    early in the program for each ref and takes all the error checking
    logic.
  * The iterating over refs that used to be in get_default_heads() plus
    a loop over the arguments now appears in shapshot_refs().
  * There's a new process_refs() as well that kind of looks like the old
    get_default_heads() though it is streamlined due to the work done by
    snapshot_refs().

This combination of changes modifies the output of running the script
(from the beginning of this commit message) to:

    $ ./fsck-while-writing.sh
    Checking ref database: 100% (1/1), done.
    Checking object directories: 100% (256/256), done.
    warning in tag d6602ec5194c87b0fc87103ca4d67251c76f233a: missingTaggerEntry: invalid format - expected 'tagger' line
    Checking objects: 100% (835091/835091), done.
    Checking connectivity: 833846, done.
    Verifying commits in commit graph: 100% (242243/242243), done.

While worries about live updates while running fsck is likely of most
interest for forge operators, it may also benefit those with
automated jobs (such as git maintenance) or even casual users who want
to do other work in their clone while fsck is running.

Helped-by: Junio C Hamano <gitster@pobox.com>
Helped-by: Jeff King <peff@peff.net>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-01-09 18:21:37 -08:00
Derrick Stolee
2ac93bfcbc builtin.h: update documentation
The documentation for the builtin API was moved from the technical
documentation and into a comment in builtin.h by ec14d4ecb5 (builtin.h: take
over documentation from api-builtin.txt, 2017-08-02). This documentation
wasn't updated as part of the major overhaul to include a repository struct
in 9b1cb5070f (builtin: add a repository parameter for builtin functions,
2024-09-13).

There was a brief update regarding the move from *.txt to *.adoc by
e8015223c7 (builtin.h: *.txt -> *.adoc fixes, 2025-03-03).

I noticed that there was quite a bit missing from the old documentation,
which is still visible on git-scm.com [1].

[1] https://github.com/git/git-scm.com/issues/2124

This change updates the documentation in the following ways:

 1. Updates the cmd_foo() prototype to include a repository.
 2. Adds some newlines to have uniformity in the list of flags.
 3. Adds a description of the NO_PARSEOPT flag.
 4. Describes the tests that perform checks on all builtins, which may trip
    up a contributor working on a new builtin.

I double-checked these instructions against a toy example in my local branch
to be sure that it was complete.

Signed-off-by: Derrick Stolee <stolee@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-01-09 06:37:02 -08:00
K Jayatheerth
dbbf6a901b t7101: modernize test path checks
Replace old-style `test -[df]` and `! test -[df]` assertions with
the modern `test_path_is_file`, `test_path_is_dir`, and
`test_path_is_missing` helpers.

These helpers provide more informative error messages in case of
failure (e.g., "File 'foo' is missing" instead of just exit code 1).

While at it, fix a typo and an incorrect path
reference in one of the test descriptions.

Signed-off-by: K Jayatheerth <jayatheerthkulkarni2005@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-01-09 06:36:07 -08:00
brian m. carlson
02fc44a989 gitfaq: document using stash import/export to sync working tree
Git 2.51 learned how to import and export stashes.  This is a
secure and robust way to transfer working tree states across machines
and comes with almost none of the pitfalls of rsync or other tools.
Recommend this as an alternative in the FAQ.

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-01-09 06:32:56 -08:00
Michael Lyons
e40e01a75a doc: git-blame: convert to new doc format
- Use _<placeholder>_ instead of <placeholder> in the description
- Use _underscores_ around math associated with <placeholders>
- Use `backticks` for keywords and more complex option
descriptions. The new rendering engine will apply synopsis rules to
these spans.

Signed-off-by: Michael Lyons <git@michael.lyo.nz>
Acked-by: Jean-Noël AVILA <jn.avila@free.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-01-09 06:15:53 -08:00
Michael Lyons
2bfc69e648 doc: blame-options: convert to new doc format
- Use _<placeholder>_ instead of <placeholder> in the description
- Modify some samples to use <placeholders>
- Use `backticks` for keywords and more complex option
descriptions. The new rendering engine will apply synopsis rules to
these spans.

Signed-off-by: Michael Lyons <git@michael.lyo.nz>
Acked-by: Jean-Noël AVILA <jn.avila@free.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-01-09 06:15:31 -08:00
Kristoffer Haugsbakk
3f051fc9c9 doc: patch-id: --verbatim locks in --stable
The default `--unstable` is a legacy format that predates `--stable`.
That’s why 2871f4d4 (builtin: patch-id: add --verbatim as a command mode,
2022-10-24) made `--verbatim` lock in[1] `--stable`:

    Users of --unstable mainly care about compatibility with old git
    versions, which unstripping the whitespace would break. Thus there
    isn't a usecase for the combination of --verbatim and --unstable,
    and we don't expose this so as to not add maintainence burden.

† 1: imply `--stable`, disallow `--unstable`

Signed-off-by: Kristoffer Haugsbakk <code@khaugsbakk.name>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-01-09 06:08:37 -08:00
Kristoffer Haugsbakk
89d4f3af16 doc: patch-id: spell out the git-diff-tree(1) form
You specifically need `--patch` since the default output is a raw diff.

Signed-off-by: Kristoffer Haugsbakk <code@khaugsbakk.name>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-01-09 06:07:22 -08:00
Kristoffer Haugsbakk
f671f5a83b doc: patch-id: use definite article for the result
Signed-off-by: Kristoffer Haugsbakk <code@khaugsbakk.name>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-01-09 06:07:21 -08:00
Kristoffer Haugsbakk
285659cc98 patch-id: use “patch ID” throughout
The “Description” section decided to introduce and use the term “patch
ID” for the ID value itself.  Let’s use the same term on the options as
well.

Also make to sure to use bare “ID” instead of “id”.

Signed-off-by: Kristoffer Haugsbakk <code@khaugsbakk.name>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-01-09 06:07:21 -08:00
Kristoffer Haugsbakk
92a61fe44d doc: patch-id: capitalize Git version
Git versions are always capitalized.

Signed-off-by: Kristoffer Haugsbakk <code@khaugsbakk.name>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-01-09 06:07:21 -08:00
Kristoffer Haugsbakk
3d61c1988b doc: patch-id: don’t use semicolon between bullet points
These bullet points are full-fledged paragraphs with sentences.  It’s
best to restrict semicolon-termination to the case when the bullet list
amounts to a list of items.[1]

† 1: Like “List: ... • first; ... • second; and ... • third.”

Signed-off-by: Kristoffer Haugsbakk <code@khaugsbakk.name>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-01-09 06:07:20 -08:00
Junio C Hamano
d529f3a197 The 16th batch
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-01-08 16:40:12 +09:00
Junio C Hamano
2db806d817 Merge branch 'en/ort-recursive-d-f-conflict-fix'
The ort merge machinery hit an assertion failure in a history with
criss-cross merges renamed a directory and a non-directory, which
has been corrected.

* en/ort-recursive-d-f-conflict-fix:
  merge-ort: fix corner case recursive submodule/directory conflict handling
2026-01-08 16:40:12 +09:00
Junio C Hamano
512351f2a8 Merge branch 'dd/t5403-modernise'
Test micro-clean-up.

* dd/t5403-modernise:
  t5403: use test_path_is_file instead of test -f
2026-01-08 16:40:12 +09:00
Junio C Hamano
c0754dc423 Merge branch 'ds/diff-lazy-fetch-with-name-only-fix'
Running "git diff" with "--name-only" and other options that allows
us not to look at the blob contents, while objects that are lazily
fetched from a promisor remote, caused use-after-free, which has
been corrected.

* ds/diff-lazy-fetch-with-name-only-fix:
  diff: avoid segfault with freed entries
2026-01-08 16:40:11 +09:00
Junio C Hamano
d28d2be5f2 Merge branch 'rs/tag-wo-the-repository'
Code clean-up.

* rs/tag-wo-the-repository:
  tag: stop using the_repository
  tag: support arbitrary repositories in parse_tag()
  tag: support arbitrary repositories in gpg_verify_tag()
  tag: use algo of repo parameter in parse_tag_buffer()
2026-01-08 16:40:11 +09:00
Andrew Chitester
6c5c7e7071 t1420: modernize the lost-found test
This test indirectly checks that the lost-found folder has 2 files in it
and then checks that the expected two files exist. Make this more
deliberate by removing the old test -f and compare the actual ls of the
lost-found directory with the expected files.

Signed-off-by: Andrew Chitester <andchi@fastmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-01-07 09:19:41 +09:00
Patrick Steinhardt
3d09968656 odb: properly close sources before freeing them
It is possible to hit a memory leak when reading data from a submodule
via git-grep(1):

  Direct leak of 192 byte(s) in 1 object(s) allocated from:
    #0 0x55555562e726 in calloc (git+0xda726)
    #1 0x555555964734 in xcalloc ../wrapper.c:154:8
    #2 0x555555835136 in load_multi_pack_index_one ../midx.c:135:2
    #3 0x555555834fd6 in load_multi_pack_index ../midx.c:382:6
    #4 0x5555558365b6 in prepare_multi_pack_index_one ../midx.c:716:17
    #5 0x55555586c605 in packfile_store_prepare ../packfile.c:1103:3
    #6 0x55555586c90c in packfile_store_reprepare ../packfile.c:1118:2
    #7 0x5555558546b3 in odb_reprepare ../odb.c:1106:2
    #8 0x5555558539e4 in do_oid_object_info_extended ../odb.c:715:4
    #9 0x5555558533d1 in odb_read_object_info_extended ../odb.c:862:8
    #10 0x5555558540bd in odb_read_object ../odb.c:920:6
    #11 0x55555580a330 in grep_source_load_oid ../grep.c:1934:12
    #12 0x55555580a13a in grep_source_load ../grep.c:1986:10
    #13 0x555555809103 in grep_source_is_binary ../grep.c:2014:7
    #14 0x555555807574 in grep_source_1 ../grep.c:1625:8
    #15 0x555555807322 in grep_source ../grep.c:1837:10
    #16 0x5555556a5c58 in run ../builtin/grep.c:208:10
    #17 0x55555562bb42 in void* ThreadStartFunc<false>(void*) lsan_interceptors.cpp.o
    #18 0x7ffff7a9a979 in start_thread (/nix/store/xx7cm72qy2c0643cm1ipngd87aqwkcdp-glibc-2.40-66/lib/libc.so.6+0x9a979) (BuildId: cddea92d6cba8333be952b5a02fd47d61054c5ab)
    #19 0x7ffff7b22d2b in __GI___clone3 (/nix/store/xx7cm72qy2c0643cm1ipngd87aqwkcdp-glibc-2.40-66/lib/libc.so.6+0x122d2b) (BuildId: cddea92d6cba8333be952b5a02fd47d61054c5ab)

The root caues of this leak is the way we set up and release the
submodule:

  1. We use `repo_submodule_init()` to initialize a new repository. This
     repository is stored in `repos_to_free`.

  2. We now read data from the submodule repository.

  3. We then call `repo_clear()` on the submodule repositories.

  4. `repo_clear()` calls `odb_free()`.

  5. `odb_free()` calls `odb_free_sources()` followed by `odb_close()`.

The issue here is the 5th step: we call `odb_free_sources()` _before_ we
call `odb_close()`. But `odb_free_sources()` already frees all sources,
so the logic that closes them in `odb_close()` now becomes a no-op. As a
consequence, we never explicitly close sources at all.

Fix the leak by closing the store before we free the sources.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-01-07 09:16:51 +09:00
Patrick Steinhardt
b3449b1517 builtin/gc: fix condition for whether to write commit graphs
When performing auto-maintenance we check whether commit graphs need to
be generated by counting the number of commits that are reachable by any
reference, but not covered by a commit graph. This search is performed
by iterating through all references and then doing a depth-first search
until we have found enough commits that are not present in the commit
graph.

This logic has a memory leak though:

  Direct leak of 16 byte(s) in 1 object(s) allocated from:
      #0 0x55555562e433 in malloc (git+0xda433)
      #1 0x555555964322 in do_xmalloc ../wrapper.c:55:8
      #2 0x5555559642e6 in xmalloc ../wrapper.c:76:9
      #3 0x55555579bf29 in commit_list_append ../commit.c:1872:35
      #4 0x55555569f160 in dfs_on_ref ../builtin/gc.c:1165:4
      #5 0x5555558c33fd in do_for_each_ref_iterator ../refs/iterator.c:431:12
      #6 0x5555558af520 in do_for_each_ref ../refs.c:1828:9
      #7 0x5555558ac317 in refs_for_each_ref ../refs.c:1833:9
      #8 0x55555569e207 in should_write_commit_graph ../builtin/gc.c:1188:11
      #9 0x55555569c915 in maintenance_is_needed ../builtin/gc.c:3492:8
      #10 0x55555569b76a in cmd_maintenance ../builtin/gc.c:3542:9
      #11 0x55555575166a in run_builtin ../git.c:506:11
      #12 0x5555557502f0 in handle_builtin ../git.c:779:9
      #13 0x555555751127 in run_argv ../git.c:862:4
      #14 0x55555575007b in cmd_main ../git.c:984:19
      #15 0x5555557523aa in main ../common-main.c:9:11
      #16 0x7ffff7a2a4d7 in __libc_start_call_main (/nix/store/xx7cm72qy2c0643cm1ipngd87aqwkcdp-glibc-2.40-66/lib/libc.so.6+0x2a4d7) (BuildId: cddea92d6cba8333be952b5a02fd47d61054c5ab)
      #17 0x7ffff7a2a59a in __libc_start_main@GLIBC_2.2.5 (/nix/store/xx7cm72qy2c0643cm1ipngd87aqwkcdp-glibc-2.40-66/lib/libc.so.6+0x2a59a) (BuildId: cddea92d6cba8333be952b5a02fd47d61054c5ab)
      #18 0x5555555f0934 in _start (git+0x9c934)

The root cause of this memory leak is our use of `commit_list_append()`.
This function expects as parameters the item to append and the _tail_ of
the list to append. This tail will then be overwritten with the new tail
of the list so that it can be used in subsequent calls. But we call it
with `commit_list_append(parent->item, &stack)`, so we end up losing
everything but the new item.

This issue only surfaces when counting merge commits. Next to being a
memory leak, it also shows that we're in fact miscounting as we only
respect children of the last parent. All previous parents are discarded,
so their children will be disregarded unless they are hit via another
reference.

While crafting a test case for the issue I was puzzled that I couldn't
establish the proper border at which the auto-condition would be
fulfilled. As it turns out, there's another bug: if an object is at the
tip of any reference we don't mark it as seen. Consequently, if it is
the tip of or reachable via another ref, we'd count that object multiple
times.

Fix both of these bugs so that we properly count objects without leaking
any memory.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-01-07 09:16:50 +09:00
Jeff King
9e8b448dd8 cat-file: only use bitmaps when filtering
Commit 8002e8ee18 (builtin/cat-file: use bitmaps to efficiently filter
by object type, 2025-04-02) introduced a performance regression when we
are not filtering objects: it uses bitmaps even when they won't help,
incurring extra costs. For example, running the new perf tests from this
commit, which check the performance of listing objects by oid:

  $ export GIT_PERF_LARGE_REPO=/path/to/linux.git
  $ git -C "$GIT_PERF_LARGE_REPO" repack -adb
  $ GIT_SKIP_TESTS=p1006.1 ./run 8002e8ee18^ 8002e8ee18 p1006-cat-file.sh
  [...]
  Test                                  8002e8ee18^       8002e8ee18
  -------------------------------------------------------------------------------
  1006.2: list all objects (sorted)     1.48(1.44+0.04)   6.39(6.35+0.04) +331.8%
  1006.3: list all objects (unsorted)   3.01(2.97+0.04)   3.40(3.29+0.10) +13.0%
  1006.4: list blobs                    4.85(4.67+0.17)   1.68(1.58+0.10) -65.4%

An invocation that filters, like listing all blobs (1006.4), does
benefit from using the bitmaps; it now doesn't have to check the type of
each object from the pack data, so the tradeoff is worth it.

But for listing all objects in sorted idx order (1006.2), we otherwise
would never open the bitmap nor the revindex file. Worse, our sorting
step gets much worse. Normally we append into an array in pack .idx
order, and the sort step is trivial. But with bitmaps, we get the
objects in pack order, which is apparently random with respect to oid,
and have to sort the whole thing. (Note that this freshly-packed state
represents the best case for .idx sorting; if we had two packs, then
we'd have their objects one after the other and qsort would have to
interleave them).

The unsorted test in 1006.3 is interesting: there we are going in pack
order, so we load the revindex for the pack anyway. And though we don't
sort the result, we do use an oidset to check for duplicates. So we can
see in the 8002e8ee18^ timings that those two things cost ~1.5s over the
sorted case (mostly the oidset hash cost). We also incur the extra cost
to open the bitmap file as of 8002e8ee18, which seems to be ~400ms.
(This would probably be faster with a bitmap lookup table, but writing
that out is not yet the default).

So we know that bitmaps help when there's filtering to be done, but
otherwise make things worse. Let's only use them when there's a filter.

The perf script shows that we've fixed the regressions without hurting
the bitmap case:

  Test                                  8002e8ee18^       8002e8ee18                HEAD
  --------------------------------------------------------------------------------------------------------
  1006.2: list all objects (sorted)     1.56(1.53+0.03)   6.44(6.37+0.06) +312.8%   1.62(1.54+0.06) +3.8%
  1006.3: list all objects (unsorted)   3.04(2.98+0.06)   3.45(3.38+0.07) +13.5%    3.04(2.99+0.04) +0.0%
  1006.4: list blobs                    5.14(4.98+0.15)   1.76(1.68+0.06) -65.8%    1.73(1.64+0.09) -66.3%

Note that there's another related case: we might have a filter that
cannot be used with bitmaps. That check is handled already for us in
for_each_bitmapped_object(), though we'd still load the bitmap and
revindex files pointlessly in that case. I don't think it can happen in
practice for cat-file, though, since it allows only blob:none,
blob:limit, and object:type filters, all of which work with bitmaps.

It would be easy-ish to insert an extra check like:

  can_filter_bitmap(&opt->objects_filter);

into the conditional, but I didn't bother here. It would be redundant
with the call in for_each_bitmapped_object(), and the can_filter helper
function is static local in the bitmap code (so we'd have to make it
public).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-01-07 09:05:12 +09:00
Jeff King
79d301c767 t/perf/run: preserve GIT_PERF_* from environment
If you run:

  GIT_PERF_LARGE_REPO=/some/path ./p1006-cat-file.sh

it will use the repo in /some/path. But if you use the "run" helper
script to aggregate and compare results, like this:

  GIT_PERF_LARGE_REPO=/some/path ./run HEAD^ HEAD p1006-cat-file.sh

it will ignore that variable. This is because the presence of the
LARGE_REPO variable in GIT-BUILD-OPTIONS overrides what's in the
environment. This started with 4638e8806e (Makefile: use common template
for GIT-BUILD-OPTIONS, 2024-12-06), which now writes even empty
variables (though arguably it was wrong even before with a non-empty
value, as we generally prefer the environment to take precedence over
on-disk config).

We had the same problem in perf-lib.sh itself, and we hacked around it
with 32b74b9809 (perf: do allow `GIT_PERF_*` to be overridden again,
2025-04-04). That's what lets the direct invocation of "./p1006" work
above.

And in fact that was sufficient for "./run", too, until it started
loading GIT-BUILD-OPTIONS itself in 5756ccd181 (t/perf: fix benchmarks
with out-of-tree builds, 2025-04-28). Now it has the same problem: it
clobbers any incoming GIT_PERF options from the environment.

We can use the same hack here in the "run" script. It's quite ugly, but
it's just short enough that I don't think it's worth trying to factor it
out into a common shell library.

In the long run, we might consider teaching GIT-BUILD-OPTIONS to be more
gentle in overwriting existing entries. There are probably other
GIT_TEST_* variables which would need the same treatment. And if and
when we come up with a more complete solution, we can use it in both
spots.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-01-07 08:56:36 +09:00
Jeff King
aad1d1c0d5 t/perf/perf-lib: fix assignment of TEST_OUTPUT_DIRECTORY
Using the perf suite's "run" helper in a vanilla build fails like this:

  $ make && (cd t/perf && ./run p0000-perf-lib-sanity.sh)
  === Running 1 tests in this tree ===
  perf 1 - test_perf_default_repo works: 1 2 3 ok
  perf 2 - test_checkout_worktree works: 1 2 3 ok
  ok 3 - test_export works
  perf 4 - export a weird var: 1 2 3 ok
  perf 5 - éḿíẗ ńöń-ÁŚĆÍÍ ćḧáŕáćẗéŕś: 1 2 3 ok
  ok 6 - test_export works with weird vars
  perf 7 - important variables available in subshells: 1 2 3 ok
  perf 8 - test-lib-functions correctly loaded in subshells: 1 2 3 ok
  # passed all 8 test(s)
  1..8
  cannot open test-results/p0000-perf-lib-sanity.subtests: No such file or directory at ./aggregate.perl line 159.

It is trying to aggregate results written into t/perf/test-results, but
the p0000 script did not write anything there.

The "run" script looks in $TEST_OUTPUT_DIRECTORY/test-results, or if
that variable is not set, in test-results in the current working
directory (which should be t/perf itself). It pulls the value of
$TEST_OUTPUT_DIRECTORY from the GIT-BUILD-OPTIONS file.

But that doesn't quite match the setup in perf-lib.sh (which is what
scripts like p0000 use). There we do this at the top of the script:

  TEST_OUTPUT_DIRECTORY=$(pwd)

and then let test-lib.sh append "/test-results" to that. Historically,
that made the vanilla case work: we'd always use t/perf/test-results.
But when $TEST_OUTPUT_DIRECTORY was set, it would break.

Commit 5756ccd181 (t/perf: fix benchmarks with out-of-tree builds,
2025-04-28) fixed that second case by loading GIT-BUILD-OPTIONS
ourselves. But that broke the vanilla case!

Now our setting of $TEST_OUTPUT_DIRECTORY in perf-lib.sh is ignored,
because it is overwritten by GIT-BUILD-OPTIONS. And when test-lib.sh
sees that the output directory is empty, it defaults to t/test-results,
rather than t/perf/test-results.

Nobody seems to have noticed, probably for two reasons:

  1. It only matters if you're trying to aggregate results (like the
     "run" script does). Just running "./p0000-perf-lib-sanity.sh"
     manually still produces useful output; the stored result files are
     just in an unexpected place.

  2. There might be leftover files in t/perf/test-results from previous
     runs (before 5756ccd181). In particular, the ".subtests" files
     don't tend to change, and the lack of that file is what causes it
     to barf completely. So it's possible that the aggregation could
     have been showing stale results that did not match the run that
     just happened.

We can fix it by setting TEST_OUTPUT_DIRECTORY only after we've loaded
GIT-BUILD-OPTIONS, so that we override its value and not the other way
around. And we'll do so only when the variable is not set, which should
retain the fix for that case from 5756ccd181.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-01-07 08:56:36 +09:00
Junio C Hamano
e0bfec3dfc The 15th batch
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-01-06 16:33:53 +09:00