79421 Commits

Author SHA1 Message Date
Junio C Hamano
6d0ac2147d Merge branch 'ps/clar-integers' into jch
Import newer version of "clar", unit testing framework.

* ps/clar-integers:
  gitattributes: disable blank-at-eof errors for clar test expectations
  t/unit-tests: demonstrate use of integer comparison assertions
  t/unit-tests: update clar to 39f11fe
2026-01-08 16:40:32 +09:00
Junio C Hamano
6856aa4803 Merge branch 'kh/replay-invalid-onto-advance' into jch
Test coverage of "git replay" has been improved.

* kh/replay-invalid-onto-advance:
  t3650: add more regression tests for failure conditions
  replay: die if we cannot parse object
  replay: improve code comment and die message
  replay: die descriptively when invalid commit-ish is given
  replay: find *onto only after testing for ref name
  replay: remove dead code and rearrange
2026-01-08 16:40:32 +09:00
Junio C Hamano
3d198e2b10 Merge branch 'ps/odb-misc-fixes' into jch
Miscellaneous fixes on object database layer.

* ps/odb-misc-fixes:
  odb: properly close sources before freeing them
  builtin/gc: fix condition for whether to write commit graphs
2026-01-08 16:40:31 +09:00
Junio C Hamano
f9c04f1e97 Merge branch 'pt/t7800-difftool-test-racefix' into jch
Test fixup.

* pt/t7800-difftool-test-racefix:
  t7800: fix racy "difftool --dir-diff syncs worktree" test
2026-01-08 16:40:31 +09:00
Junio C Hamano
fb9120bfa7 Merge branch 'ps/t1300-2021-use-test-path-is-helpers' into jch
Test updates.

* ps/t1300-2021-use-test-path-is-helpers:
  t1300: use test helpers instead of `test` command
2026-01-08 16:40:31 +09:00
Junio C Hamano
6d7996535d Merge branch 'rs/commit-stack' into jch
Code clean-up, unifying various hand-rolled "list of commit
objects" and use the commit_stack API.

* rs/commit-stack:
  commit-reach: use commit_stack
  commit-graph: use commit_stack
  commit: add commit_stack_grow()
  shallow: use commit_stack
  pack-bitmap-write: use commit_stack
  commit: add commit_stack_init()
  test-reach: use commit_stack
  remote: use commit_stack for src_commits
  remote: use commit_stack for sent_tips
  remote: use commit_stack for local_commits
  name-rev: use commit_stack
  midx: use commit_stack
  log: use commit_stack
  revision: export commit_stack
2026-01-08 16:40:31 +09:00
Junio C Hamano
96ba419a76 Merge branch 'sb/bundle-uri-without-uri' into jch
Diagnose invalid bundle-URI that lack the URI entry, instead of
crashing.

* sb/bundle-uri-without-uri:
  bundle-uri: validate that bundle entries have a uri
2026-01-08 16:40:31 +09:00
Junio C Hamano
e2124d6fe0 Merge branch 'ja/doc-synopsis-style-more' into jch
More doc style updates.

* ja/doc-synopsis-style-more:
  doc: convert git-remote to synopsis style
  doc: convert git stage to use synopsis block
  doc: convert git-status tables to AsciiDoc format
  doc: convert git-status to synopsis style
  doc: fix t0450-txt-doc-vs-help to select only first synopsis block
2026-01-08 16:40:30 +09:00
Junio C Hamano
d529f3a197 The 16th batch
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-01-08 16:40:12 +09:00
Junio C Hamano
2db806d817 Merge branch 'en/ort-recursive-d-f-conflict-fix'
The ort merge machinery hit an assertion failure in a history with
criss-cross merges renamed a directory and a non-directory, which
has been corrected.

* en/ort-recursive-d-f-conflict-fix:
  merge-ort: fix corner case recursive submodule/directory conflict handling
2026-01-08 16:40:12 +09:00
Junio C Hamano
512351f2a8 Merge branch 'dd/t5403-modernise'
Test micro-clean-up.

* dd/t5403-modernise:
  t5403: use test_path_is_file instead of test -f
2026-01-08 16:40:12 +09:00
Junio C Hamano
c0754dc423 Merge branch 'ds/diff-lazy-fetch-with-name-only-fix'
Running "git diff" with "--name-only" and other options that allows
us not to look at the blob contents, while objects that are lazily
fetched from a promisor remote, caused use-after-free, which has
been corrected.

* ds/diff-lazy-fetch-with-name-only-fix:
  diff: avoid segfault with freed entries
2026-01-08 16:40:11 +09:00
Junio C Hamano
d28d2be5f2 Merge branch 'rs/tag-wo-the-repository'
Code clean-up.

* rs/tag-wo-the-repository:
  tag: stop using the_repository
  tag: support arbitrary repositories in parse_tag()
  tag: support arbitrary repositories in gpg_verify_tag()
  tag: use algo of repo parameter in parse_tag_buffer()
2026-01-08 16:40:11 +09:00
Patrick Steinhardt
3d09968656 odb: properly close sources before freeing them
It is possible to hit a memory leak when reading data from a submodule
via git-grep(1):

  Direct leak of 192 byte(s) in 1 object(s) allocated from:
    #0 0x55555562e726 in calloc (git+0xda726)
    #1 0x555555964734 in xcalloc ../wrapper.c:154:8
    #2 0x555555835136 in load_multi_pack_index_one ../midx.c:135:2
    #3 0x555555834fd6 in load_multi_pack_index ../midx.c:382:6
    #4 0x5555558365b6 in prepare_multi_pack_index_one ../midx.c:716:17
    #5 0x55555586c605 in packfile_store_prepare ../packfile.c:1103:3
    #6 0x55555586c90c in packfile_store_reprepare ../packfile.c:1118:2
    #7 0x5555558546b3 in odb_reprepare ../odb.c:1106:2
    #8 0x5555558539e4 in do_oid_object_info_extended ../odb.c:715:4
    #9 0x5555558533d1 in odb_read_object_info_extended ../odb.c:862:8
    #10 0x5555558540bd in odb_read_object ../odb.c:920:6
    #11 0x55555580a330 in grep_source_load_oid ../grep.c:1934:12
    #12 0x55555580a13a in grep_source_load ../grep.c:1986:10
    #13 0x555555809103 in grep_source_is_binary ../grep.c:2014:7
    #14 0x555555807574 in grep_source_1 ../grep.c:1625:8
    #15 0x555555807322 in grep_source ../grep.c:1837:10
    #16 0x5555556a5c58 in run ../builtin/grep.c:208:10
    #17 0x55555562bb42 in void* ThreadStartFunc<false>(void*) lsan_interceptors.cpp.o
    #18 0x7ffff7a9a979 in start_thread (/nix/store/xx7cm72qy2c0643cm1ipngd87aqwkcdp-glibc-2.40-66/lib/libc.so.6+0x9a979) (BuildId: cddea92d6cba8333be952b5a02fd47d61054c5ab)
    #19 0x7ffff7b22d2b in __GI___clone3 (/nix/store/xx7cm72qy2c0643cm1ipngd87aqwkcdp-glibc-2.40-66/lib/libc.so.6+0x122d2b) (BuildId: cddea92d6cba8333be952b5a02fd47d61054c5ab)

The root caues of this leak is the way we set up and release the
submodule:

  1. We use `repo_submodule_init()` to initialize a new repository. This
     repository is stored in `repos_to_free`.

  2. We now read data from the submodule repository.

  3. We then call `repo_clear()` on the submodule repositories.

  4. `repo_clear()` calls `odb_free()`.

  5. `odb_free()` calls `odb_free_sources()` followed by `odb_close()`.

The issue here is the 5th step: we call `odb_free_sources()` _before_ we
call `odb_close()`. But `odb_free_sources()` already frees all sources,
so the logic that closes them in `odb_close()` now becomes a no-op. As a
consequence, we never explicitly close sources at all.

Fix the leak by closing the store before we free the sources.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-01-07 09:16:51 +09:00
Patrick Steinhardt
b3449b1517 builtin/gc: fix condition for whether to write commit graphs
When performing auto-maintenance we check whether commit graphs need to
be generated by counting the number of commits that are reachable by any
reference, but not covered by a commit graph. This search is performed
by iterating through all references and then doing a depth-first search
until we have found enough commits that are not present in the commit
graph.

This logic has a memory leak though:

  Direct leak of 16 byte(s) in 1 object(s) allocated from:
      #0 0x55555562e433 in malloc (git+0xda433)
      #1 0x555555964322 in do_xmalloc ../wrapper.c:55:8
      #2 0x5555559642e6 in xmalloc ../wrapper.c:76:9
      #3 0x55555579bf29 in commit_list_append ../commit.c:1872:35
      #4 0x55555569f160 in dfs_on_ref ../builtin/gc.c:1165:4
      #5 0x5555558c33fd in do_for_each_ref_iterator ../refs/iterator.c:431:12
      #6 0x5555558af520 in do_for_each_ref ../refs.c:1828:9
      #7 0x5555558ac317 in refs_for_each_ref ../refs.c:1833:9
      #8 0x55555569e207 in should_write_commit_graph ../builtin/gc.c:1188:11
      #9 0x55555569c915 in maintenance_is_needed ../builtin/gc.c:3492:8
      #10 0x55555569b76a in cmd_maintenance ../builtin/gc.c:3542:9
      #11 0x55555575166a in run_builtin ../git.c:506:11
      #12 0x5555557502f0 in handle_builtin ../git.c:779:9
      #13 0x555555751127 in run_argv ../git.c:862:4
      #14 0x55555575007b in cmd_main ../git.c:984:19
      #15 0x5555557523aa in main ../common-main.c:9:11
      #16 0x7ffff7a2a4d7 in __libc_start_call_main (/nix/store/xx7cm72qy2c0643cm1ipngd87aqwkcdp-glibc-2.40-66/lib/libc.so.6+0x2a4d7) (BuildId: cddea92d6cba8333be952b5a02fd47d61054c5ab)
      #17 0x7ffff7a2a59a in __libc_start_main@GLIBC_2.2.5 (/nix/store/xx7cm72qy2c0643cm1ipngd87aqwkcdp-glibc-2.40-66/lib/libc.so.6+0x2a59a) (BuildId: cddea92d6cba8333be952b5a02fd47d61054c5ab)
      #18 0x5555555f0934 in _start (git+0x9c934)

The root cause of this memory leak is our use of `commit_list_append()`.
This function expects as parameters the item to append and the _tail_ of
the list to append. This tail will then be overwritten with the new tail
of the list so that it can be used in subsequent calls. But we call it
with `commit_list_append(parent->item, &stack)`, so we end up losing
everything but the new item.

This issue only surfaces when counting merge commits. Next to being a
memory leak, it also shows that we're in fact miscounting as we only
respect children of the last parent. All previous parents are discarded,
so their children will be disregarded unless they are hit via another
reference.

While crafting a test case for the issue I was puzzled that I couldn't
establish the proper border at which the auto-condition would be
fulfilled. As it turns out, there's another bug: if an object is at the
tip of any reference we don't mark it as seen. Consequently, if it is
the tip of or reachable via another ref, we'd count that object multiple
times.

Fix both of these bugs so that we properly count objects without leaking
any memory.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-01-07 09:16:50 +09:00
Junio C Hamano
e0bfec3dfc The 15th batch
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-01-06 16:33:53 +09:00
Junio C Hamano
d39e3ed716 Merge branch 'rs/parse-config-expiry-simplify'
Code clean-up.

* rs/parse-config-expiry-simplify:
  config: use git_parse_int() in git_config_get_expiry_in_days()
2026-01-06 16:33:53 +09:00
Junio C Hamano
f406b89552 Merge branch 'ar/run-command-hook'
Use hook API to replace ad-hoc invocation of hook scripts with the
run_command() API.

* ar/run-command-hook:
  receive-pack: convert receive hooks to hook API
  receive-pack: convert update hooks to new API
  hooks: allow callers to capture output
  run-command: allow capturing of collated output
  hook: allow overriding the ungroup option
  reference-transaction: use hook API instead of run-command
  transport: convert pre-push to hook API
  hook: convert 'post-rewrite' hook in sequencer.c to hook API
  hook: provide stdin via callback
  run-command: add stdin callback for parallelization
  run-command: add first helper for pp child states
2026-01-06 16:33:53 +09:00
Junio C Hamano
1627809eef Merge branch 'rs/show-branch-prio-queue'
Code clean-up.

* rs/show-branch-prio-queue:
  show-branch: use prio_queue
2026-01-06 16:33:52 +09:00
Junio C Hamano
b39aad0b0d Merge branch 'rs/macos-iconv-workaround'
Workaround the "iconv" shipped as part of macOS, which is broken
handling stateful ISO/IEC 2022 encoded strings.

* rs/macos-iconv-workaround:
  macOS: use iconv from Homebrew if needed and present
  macOS: make Homebrew use configurable
2026-01-06 16:33:52 +09:00
Junio C Hamano
8fb86e1a42 Merge branch 'bc/checkout-error-message-fix'
Message fix.

* bc/checkout-error-message-fix:
  checkout: quote invalid treeish in error message
2026-01-06 16:33:52 +09:00
Kristoffer Haugsbakk
56b77a687e t3650: add more regression tests for failure conditions
There isn’t much test coverage for basic failure conditions. Let’s add
a few more since these are simple to write and remove if they become
obsolete.

Helped-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Kristoffer Haugsbakk <code@khaugsbakk.name>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-01-06 07:30:16 +09:00
Kristoffer Haugsbakk
6f693364cc replay: die if we cannot parse object
`parse_object` can return `NULL`. That will in turn make
`repo_peel_to_type` return the same.

Let’s die fast and descriptively with the `*_or_die` variant.

Suggested-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Kristoffer Haugsbakk <code@khaugsbakk.name>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-01-06 07:30:16 +09:00
Kristoffer Haugsbakk
f67f7ddbbd replay: improve code comment and die message
Suggested-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Kristoffer Haugsbakk <code@khaugsbakk.name>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-01-06 07:30:16 +09:00
Kristoffer Haugsbakk
3074d08cfa replay: die descriptively when invalid commit-ish is given
Giving an invalid commit-ish to `--onto` makes git-replay(1) fail with:

    fatal: Replaying down to root commit is not supported yet!

Going backwards from this point:

1. `onto` is `NULL` from `set_up_replay_mode`;
2. that function in turn calls `peel_committish`; and
3. here we return `NULL` if `repo_get_oid` fails.

Let’s die immediately with a descriptive error message instead.

Doing this also provides us with a descriptive error if we “forget” to
provide an argument to `--onto` (but we really do unintentionally):[1]

    $ git replay --onto ^main topic1
    fatal: '^main' is not a valid commit-ish

Note that the `--advance` case won’t be triggered in practice because
of the “argument to --advance must be a reference” check (see the
previous test, and commit).

† 1: The argument to `--onto` is mandatory and the option parser accepts
     both `--onto=<name>` (stuck form) and `--onto name`. The latter
     form makes it easy to unintentionally pass something to the option
     when you really meant to pass a positional argument.

Signed-off-by: Kristoffer Haugsbakk <code@khaugsbakk.name>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-01-06 07:30:16 +09:00
Kristoffer Haugsbakk
17b7965a03 replay: find *onto only after testing for ref name
We are about to make `peel_committish` die when it cannot find
a commit-ish instead of returning `NULL`. But that would make e.g.
`git replay --advance=refs/non-existent` die with a less descriptive
error message; the highest-level error message is that the name does
not exist as a ref, not that we cannot find a commit-ish based on
the name.

Let’s try to find the ref and only after that try to peel to
as a commit-ish.

Also add a regression test to protect this error order from future
modifications.

Suggested-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Kristoffer Haugsbakk <code@khaugsbakk.name>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-01-06 07:30:16 +09:00
Kristoffer Haugsbakk
76eab50f75 replay: remove dead code and rearrange
22d99f01 (replay: add --advance or 'cherry-pick' mode, 2023-11-24) both
added `--advance` and made one of `--onto` or `--advance` mandatory.
But `determine_replay_mode` claims that there is a third alternative;
neither of `--onto` or `--advance` were given:

    if (onto_name) {
    ...
    } else if (*advance_name) {
    ...
    } else {
    ...
    }

But this is false—the fallthrough else-block is dead code.

Commit 22d99f01 was iterated upon by several people.[1] The initial
author wrote code for a sort of *guess mode*, allowing for shorter
commands when that was possible. But the next person instead made one
of the aforementioned options mandatory. In turn this code was dead on
arrival in git.git.

[1]: https://lore.kernel.org/git/CABPp-BEcJqjD4ztsZo2FTZgWT5ZOADKYEyiZtda+d0mSd1quPQ@mail.gmail.com/

Let’s remove this code. We can also join the if-block with the
condition `!*advance_name` into the `*onto` block since we do not set
`*advance_name` in this function. It only looked like we might set it
since the dead code has this line:

    *advance_name = xstrdup_or_null(last_key);

Let’s also rename the function since we do not determine the
replay mode here. We just set up `*onto` and refs to update.

Note that there might be more dead code caused by this *guess mode*.
We only concern ourselves with this function for now.

Helped-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Kristoffer Haugsbakk <code@khaugsbakk.name>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-01-06 07:30:15 +09:00
Pushkar Singh
404b677229 t1300: use test helpers instead of test command
Replace `test -f` and `test -h` checks with `test_path_is_file` and
`test_path_is_symlink`. Using the test framework helpers provides
clearer diagnostics and keeps tests consistent across the suite.

Signed-off-by: Pushkar Singh <pushkarkumarsingh1970@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-01-05 10:58:58 +09:00
Paul Tarjan
0b495cd390 t7800: fix racy "difftool --dir-diff syncs worktree" test
The "difftool --dir-diff syncs worktree without unstaged change" test
fails intermittently on Windows CI, as seen at:

  https://github.com/git/git/actions/runs/20624095002/job/59231745784#step:5:416

The root cause is that the original file content and the replacement
content have identical sizes:

  - Original: "main\ntest\na\n" = 12 bytes
  - New:      "new content\n"   = 12 bytes

When difftool's sync-back mechanism checks for changes, it compares
stat data between the temporary index and the modified files. If the
modification happens within the same timestamp granularity window and
file size stays the same, the change goes undetected.

On Windows, this is more likely to manifest because Git relies on
inode changes as a fallback when other stat fields match, but Windows
filesystems lack inodes. This is a real bug that could affect users
scripting difftool similarly, as seen at:

  https://github.com/git-for-windows/git/issues/5132

Fix the test by changing the replacement content to "modified content"
(17 bytes), ensuring the size difference is detected regardless of
timestamp resolution or platform-specific stat behavior.

Note: This fixes the test flakiness but not the underlying issue in
difftool's change detection. Other tests with same-size file patterns
(t0010-racy-git.sh, t2200-add-update.sh) are not affected because they
use normal index operations with proper racy-git detection.

Signed-off-by: Paul Tarjan <github@paulisageek.com>
Reviewed-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2026-01-04 11:28:14 +09:00
Junio C Hamano
68cb7f9e92 The 14th batch
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-12-30 12:58:22 +09:00
Junio C Hamano
a37bb2ae6c Merge branch 'jk/test-curl-updates'
Update HTTP tests to adjust for changes in curl 8.18.0

* jk/test-curl-updates:
  t5563: add missing end-of-line in HTTP header
  t5551: handle trailing slashes in expected cookies output
2025-12-30 12:58:22 +09:00
Junio C Hamano
e7b1925381 Merge branch 'jc/object-read-stream-fix'
Fix a performance regression in recently graduated topic.

* jc/object-read-stream-fix:
  odb: do not use "blank" substitute for NULL
2025-12-30 12:58:22 +09:00
Junio C Hamano
a194cdc8f3 Merge branch 'js/test-func-comment-fix'
Comment fix.

* js/test-func-comment-fix:
  test_detect_ref_format: fix comment
2025-12-30 12:58:21 +09:00
Junio C Hamano
68dce01807 Merge branch 'gf/clear-path-cache-cleanup'
Code clean-up.

* gf/clear-path-cache-cleanup:
  repository: remove duplicate free of cache->squash_msg
2025-12-30 12:58:21 +09:00
Junio C Hamano
2365d4f612 Merge branch 'gf/maintenance-is-needed-fix'
Brown-paper-bag fix to a recently graduated
'kn/maintenance-is-needed' topic.

* gf/maintenance-is-needed-fix:
  refs: dereference the value of the required pointer
2025-12-30 12:58:20 +09:00
Junio C Hamano
b006b84119 Merge branch 'dk/ci-rust-fix'
Build fix.

* dk/ci-rust-fix:
  rust: build correctly without GNU sed
2025-12-30 12:58:20 +09:00
Junio C Hamano
148c8f38ee Merge branch 'mh/doc-core-attributesfile'
Doc update.

* mh/doc-core-attributesfile:
  docs: note the type of core.attributesfile
2025-12-30 12:58:19 +09:00
Junio C Hamano
4a8ee50c77 Merge branch 'ps/repack-avoid-noop-midx-rewrite'
Even when there is no changes in the packfile and no need to
recompute bitmaps, "git repack" recomputed and updated the MIDX
file, which has been corrected.

* ps/repack-avoid-noop-midx-rewrite:
  midx-write: skip rewriting MIDX with `--stdin-packs` unless needed
  midx-write: extract function to test whether MIDX needs updating
  midx: fix `BUG()` when getting preferred pack without a reverse index
2025-12-30 12:58:19 +09:00
Junio C Hamano
d8e9716b91 Merge branch 'js/test-symlink-windows'
Prepare test suite for Git for Windows that supports symbolic
links.

* js/test-symlink-windows:
  t7800: work around the MSYS path conversion on Windows
  t6423: introduce Windows-specific handling for symlinking to /dev/null
  t1305: skip symlink tests that do not apply to Windows
  t1006: accommodate for symlink support in MSYS2
  t0600: fix incomplete prerequisite for a test case
  t0301: another fix for Windows compatibility
  t0001: handle `diff --no-index` gracefully
  mingw: special-case `open(symlink, O_CREAT | O_EXCL)`
  apply: symbolic links lack a "trustable executable bit"
  t9700: accommodate for Windows paths
2025-12-30 12:58:19 +09:00
Junio C Hamano
b1792f5116 Merge branch 'jt/doc-rev-list-filter-provided-objects'
Document "rev-list --filter-provided-objects" better.

* jt/doc-rev-list-filter-provided-objects:
  docs: clarify git-rev-list(1) --filter behavior
2025-12-30 12:58:19 +09:00
Junio C Hamano
02e9bc3392 Merge branch 'jt/repo-struct-more-objinfo'
More object database related information are shown in "git repo
structure" output.

* jt/repo-struct-more-objinfo:
  builtin/repo: add object disk size info to structure table
  builtin/repo: add disk size info to keyvalue stucture output
  builtin/repo: add inflated object info to structure table
  builtin/repo: add inflated object info to keyvalue structure output
  builtin/repo: humanise count values in structure output
  strbuf: split out logic to humanise byte values
  builtin/repo: group per-type object values into struct
2025-12-30 12:58:19 +09:00
Derrick Stolee
56d388e6ad diff: avoid segfault with freed entries
When computing a diff in a partial clone, there is a chance that we
could trigger a prefetch of missing objects at the same time as we are
freeing entries from the global diff queue. This is difficult to
reproduce, as we need to have some objects be freed from the queue
before triggering the prefetch of missing objects. There is a new test
in t4067 that does trigger the segmentation fault that results in this
case.

The fix is to set the queue pointer to NULL after it is freed, and then
to be careful about NULL values in the prefetch.

The more elaborate explanation is that within diffcore_std(), we may
skip the initial prefetch due to the output format (--name-only in the
test) and go straight to diffcore_skip_stat_unmatch(). In that method,
the index entries that have been invalidated by path changes show up as
entries but may be deleted because they are not actually content diffs
and only newer timestamps than expected. As those entries are deleted,
later entries are checked with diff_filespec_check_stat_unmatch(), which
uses diff_queued_diff_prefetch() as the missing_object_cb in its diff
options. That can trigger downloading missing objects if the appropriate
scenario occurs to trigger a call to diff_popoulate_filespec(). It's
finally within that callback to diff_queued_diff_prefetch() that the
segfault occurs.

The test was hard to find because it required some real differences,
some not-different files that had a newer modified time, and the order
of those files alphabetically was important to trigger the deletion
before the prefetch was triggered.

I briefly considered a "lock" member for the diff queue, but it was a
much larger diff and introduced many more possible error scenarios.

Signed-off-by: Derrick Stolee <stolee@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-12-30 10:53:47 +09:00
Deveshi Dwivedi
861dbb1586 t5403: use test_path_is_file instead of test -f
Replace 'test -f' with the test_path_is_file in
t5403-post-checkout-hook.sh. This helper provides better error
messages when tests fail, making it easier to debug issues.

Signed-off-by: Deveshi Dwivedi <deveshigurgaon@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-12-30 09:23:00 +09:00
Elijah Newren
979ee83e8a merge-ort: fix corner case recursive submodule/directory conflict handling
At GitHub, a few repositories were triggering errors of the form:

    git: merge-ort.c:3037: process_renames: Assertion `newinfo && !newinfo->merged.clean' failed.
    Aborted (core dumped)

While these may look similar to both
    a562d90a350d (merge-ort: fix failing merges in special corner case,
                  2025-11-03)
and
    f6ecb603ff8a (merge-ort: fix directory rename on top of source of other
                  rename/delete, 2025-08-06)
the cause is different and in this case the problem is not an
over-conservative assertion, but a bug before the assertion where we did
not update all relevant state appropriately.

It sadly took me a really long time to figure out how to get a simple
reproducer for this one.  It doesn't really have that many moving parts,
but there are multiple pieces of background information needed to
understand it.

First of all, when we have two files added at the same path, merge-ort
does a two-way merge of those files.  If we have two directories added
at the same path, we basically do the same thing (taking the union of
files, and two-way merging files with the same name).  But two-way
merging requires components of the same type.  We can't merge the
contents of a regular file with a directory, or with a symlink, or with
a submodule.  Nor can any of those other types be merged with each
other, e.g. merging a submodule with a directory is a bad idea.  When
two paths have the same name but their types do not match, merge-ort is
forced to move one of them to an alternate filename (using the
unique_path() function).

Second, if two commits being merged have more than one merge-base,
merge-ort will merge the merge-bases to create a virtual merge-base, and
use that as the base commit.

Third, one of the really important optimizations in merge-ort is trivial
tree-level resolution (roughly meaning merging trees without recursing
into them).  This optimization has some nuance to it that is important
to the current bug, and to understand it, it helps to first look at the
high-level overview of how merge-ort runs; there are basically three
high-level functions that the work is divided between:
    collect_merge_info() - walks the top-level trees getting individual
                           paths of interest
    detect_renames() - detect renames between paths in order to match up
                       paths for three-way merging
    process_entries() - does a few things of interest:
      * three-way merging of files,
      * other special handling (e.g. adjusting paths with conflicting
        types to avoid path collisions)
      * as it finishes handling all the files within a subdirectory,
        writes out a new tree object for that directory

If it were not for renames, we could just always do tree-level merging
whenever the tree on at least one side was unmodified.  Unfortunately,
we need to recurse into trees to determine whether there are renames.
However, we can also do tree-level merging so long as there aren't any
*relevant* renames (another merge-ort optimization), which we can
determine without recursing into trees.

We would also be able to do tree-level merging if we somehow apriori
knew what renames existed, by only recursing into the trees which we
could otherwise trivially merge if they contained files involved in
renames.  That might not seem useful, because we need to find out the
renames and we have to recurse into trees to do so, but when you find
out that the process_entries() step is more computationally expensive
than the collect_merge_info() step, it yields an interesting strategy:
   * run collect_merge_info()
   * run detect_renames()
   * cache the renames()
   * restart -- rerun collect_merge_info(), using the cached renames to
     only recurse into the needed trees
   * we already have the renames cached so no need to re-detect
   * run process_entries() on the reduced list of paths
which was implemented back in 7bee6c100431 (merge-ort: avoid recursing
into directories when we don't need to, 2021-07-16)  Crucially, this
restarting only occurs if the number of paths we could skip recursing
into exceeds the number we still need to recurse into by some safety
factor (wanted_factor in handle_deferred_entries()); forgetting this
fact is a great way to repeatedly fail to create a minimal testcase for
several days and go down alternate wrong paths).

Now, I earlier summarized this optimization as "merging trees without
recursing into them", but this optimization does not require that all
three sides of history has a directory at a given path.  So long as the
tree on one side matches the tree in the base version, we can decide to
resolve in favor of whatever the other side of history has at that path
-- be it a directory, a file, a submodule, or a symlink.  Unfortunately,
the code in question didn't fully realize this, and was written assuming
the base version and both sides would have a directory at the given
path, as can be seen by the "ci->filemask == 0" comment in
resolve_trivial_directory_merge() that was added as part of 7bee6c100431
(merge-ort: avoid recursing into directories when we don't need to,
2021-07-16).  A few additional lines of code are needed to handle cases
where we have something other than a directory on the other side of
history.

But, knowing that resolve_trivial_directory_merge() doesn't have
sufficient state updating logic doesn't show us how to trigger a bug
without combining with the other bits of information we provided above.
Here's a relevant testcase:
   * branches A & B
   * commit A1: adds "folder" as a directory with files tracked under it
   * commit B1: adds "folder" as a submodule
   * commit A2: merges B1 into A1, keeping "folder" as a directory
     (and in fact, with no changes to "folder" since A1), discarding the
     submodule
   * commit B2: merges A1 into B1, keeping "folder" as a submodule
     (and in fact, with no changes to "folder" since B1), discarding the
     directory
Here, if we try to merge A2 & B2, the logic proceeds as follows:
   * we have multiple merge-bases: A1 & B1.  So we have to merge those
     to get a virtual merge base.
   * due to "folder" as a directory and "folder" as a submodule, the
     path collision logic triggers and renames "folder" as a submodule
     to "folder~Temporary merge branch 2" so we can keep it alongside
     "folder" as a directory.
   * we now have a virtual merge base (containing both "folder"
     directory and a "folder~Temporary merge branch 2" submodule) and
     can now do the outer merge
   * in the first step of the outer merge, we attempt to defer recursing
     into folder/ as a directory, but find we need to for rename
     detection.
   * in rename detection, we note that "folder~Temporary merge branch 2"
     has the same hash as "folder" as a submodule in B2, which means we
     have an exact rename.
   * after rename detection, we discover no path in folder/ is needed
     for renames, and so we can cache renames and restart.
   * after restarting, we avoid recursing into "folder/" and realize we
     can resolve it trivially since it hasn't been modified.  The
     resolution removes "folder/", leaving us only "folder" as a
     submodule from commit B2.
   * After this point, we should have a rename/delete conflict on
     "folder~Temporary merge branch 2" -> "folder", but our marking of
     the merge of "folder" as clean broke our ability to handle that and
     in fact triggers an assertion in process_renames().

When there was a df_conflict (directory/"file" conflict, where "file"
could be submodule or regular file or symlink), ensure
resolve_trivial_directory_merge() handles it properly.  In particular:
  * do not pre-emptively mark the path as cleanly merged if the
    remaining path is a file; allow it to be processed in
    process_entries() later to determine if it was clean
  * clear the parts of dirmask or filemask corresponding to the matching
    sides of history, since we are resolving those away
  * clear the df_conflict bit afterwards; since we cleared away the two
    matching sides and only have one side left, that one side can't
    have a directory/file conflict with itself.

Also add the above minimal testcase showcasing this bug to t6422, **with
a sufficient number of paths under the folder/ directory to actually
trigger it**.  (I wish I could have all those days back from all the
wrong paths I went down due to not having enough files under that
directory...)

I know this commit has a very high ratio of lines in the commit message
to lines of comments, and a relatively high ratio of comments to actual
code, but given how long it took me to track down, on the off chance
that we ever need to further modify this logic, I wanted it thoroughly
documented for future me and for whatever other poor soul might end up
needing to read this commit message.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-12-30 08:59:52 +09:00
René Scharfe
009fceeda2 tag: stop using the_repository
gpg_verify_tag() shows the passed in object name on error.  Both callers
provide one.  It falls back to abbreviated hashes for future callers
that pass in a NULL name.  DEFAULT_ABBREV is default_abbrev, which in
turn is a global variable that's populated by git_default_config() and
only available with USE_THE_REPOSITORY_VARIABLE.

Don't let that hypothetical hold us back from getting rid of
the_repository in tag.c.  Fall back to full hashes, which are more
appropriate for error messages anyway.  This allows us to stop setting
USE_THE_REPOSITORY_VARIABLE.

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-12-29 22:02:54 +09:00
René Scharfe
b6e4cc8c32 tag: support arbitrary repositories in parse_tag()
Allow callers of parse_tag() pass in the repository to use.  Let most of
them pass in the_repository to get the same result as before.  One of
them has stopped using the_repository in ef9b0370da (sha1-name.c: store
and use repo in struct disambiguate_state, 2019-04-16); let it pass in
its stored repository.

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-12-29 22:02:54 +09:00
René Scharfe
154717b3b0 tag: support arbitrary repositories in gpg_verify_tag()
Allow callers of gpg_verify_tag() specify the repository to use by
providing a parameter for that.  One of the two has not been using
the_repository since 43a8391977 (builtin/verify-tag: stop using
`the_repository`, 2025-03-08); let it pass in the correct repository.
The other simply passes the_repository to get the same result as before.

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-12-29 22:02:53 +09:00
René Scharfe
e61f227d06 tag: use algo of repo parameter in parse_tag_buffer()
Stop using "the_hash_algo" explicitly and implictly via parse_oid_hex()
and instead use the "hash_algo" member of the passed in repository,
which is more correct.

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-12-29 22:02:53 +09:00
Junio C Hamano
7c7698a654 The 13th batch
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-12-28 17:36:17 +09:00
Junio C Hamano
d480fd08f8 Merge branch 'ap/packfile-promisor-object-optim'
The code path that enumerates promisor objects have been optimized
to skip pointlessly parsing blob objects.

* ap/packfile-promisor-object-optim:
  packfile: skip hash checks in add_promisor_object()
  object: apply skip_hash and discard_tree optimizations to unknown blobs too
2025-12-28 17:36:17 +09:00