24285 Commits

Author SHA1 Message Date
Karthik Nayak
8ff2eef8ad fetch: fix non-conflicting tags not being committed
The commit 0e358de64a (fetch: use batched reference updates, 2025-05-19)
updated the 'git-fetch(1)' command to use batched updates. This batches
updates to gain performance improvements. When fetching references, each
update is added to the transaction. Finally, when committing, individual
updates are allowed to fail with reason, while the transaction itself
succeeds.

One scenario which was missed here, was fetching tags. When fetching
conflicting tags, the `fetch_and_consume_refs()` function returns '1',
which skipped committing the transaction and directly jumped to the
cleanup section. This mean that no updates were applied. This also
extends to backfilling tags which is done when fetching specific
refspecs which contains tags in their history.

Fix this by committing the transaction when we have an error code and
not using an atomic transaction. This ensures other references are
applied even when some updates fail.

The cleanup section is reached with `retcode` set in several scenarios:

   - `truncate_fetch_head()`, `open_fetch_head()` and `prune_refs()` set
     `retcode` before the transaction is created, so no commit is
     attempted.

   - `fetch_and_consume_refs()` and `backfill_tags()` are the primary
     cases this fix targets, both setting a positive `retcode` to
     trigger the committing of the transaction.

This simplifies error handling and ensures future modifications to
`do_fetch()` don't need special handling for batched updates.

Add tests to check for this regression. While here, add a missing
cleanup from previous test.

Reported-by: David Bohman <debohman@gmail.com>
Helped-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-12-10 20:59:58 +09:00
Junio C Hamano
fe0e6ffa19 Merge branch 'bc/zsh-testsuite'
A few tests have been updated to work under the shell compatible
mode of zsh.

* bc/zsh-testsuite:
  t5564: fix test hang under zsh's sh mode
  t0614: use numerical comparison with test_line_count
2025-12-09 07:54:54 +09:00
Junio C Hamano
48176f953f connect: plug protocol capability leak
When pushing to a set of remotes using a nickname for the group, the
client initializes the connection to each remote, talks to the
remote and reads and parses capabilities line, and holds the
capabilities in a file-scope static variable server_capabilities_v1.

There are a few other such file-scope static variables, and these
connections cannot be parallelized until they are refactored to a
structure that keeps track of active connections.

Which is *not* the theme of this patch ;-)

For a single connection, the server_capabilities_v1 variable is
initialized to NULL (at the program initialization), populated when
we talk to the other side, used to look up capabilities of the other
side possibly multiple times, and the memory is held by the variable
until program exit, without leaking.  When talking to multiple remotes,
however, the server capabilities from the second connection overwrites
without freeing the one from the first connection, which leaks.

    ==1080970==ERROR: LeakSanitizer: detected memory leaks

    Direct leak of 421 byte(s) in 2 object(s) allocated from:
	#0 0x5615305f849e in strdup (/home/gitster/g/git-jch/bin/bin/git+0x2b349e) (BuildId: 54d149994c9e85374831958f694bd0aa3b8b1e26)
	#1 0x561530e76cc4 in xstrdup /home/gitster/w/build/wrapper.c:43:14
	#2 0x5615309cd7fa in process_capabilities /home/gitster/w/build/connect.c:243:27
	#3 0x5615309cd502 in get_remote_heads /home/gitster/w/build/connect.c:366:4
	#4 0x561530e2cb0b in handshake /home/gitster/w/build/transport.c:372:3
	#5 0x561530e29ed7 in get_refs_via_connect /home/gitster/w/build/transport.c:398:9
	#6 0x561530e26464 in transport_push /home/gitster/w/build/transport.c:1421:16
	#7 0x561530800bec in push_with_options /home/gitster/w/build/builtin/push.c:387:8
	#8 0x5615307ffb99 in do_push /home/gitster/w/build/builtin/push.c:442:7
	#9 0x5615307fe926 in cmd_push /home/gitster/w/build/builtin/push.c:664:7
	#10 0x56153065673f in run_builtin /home/gitster/w/build/git.c:506:11
	#11 0x56153065342f in handle_builtin /home/gitster/w/build/git.c:779:9
	#12 0x561530655b89 in run_argv /home/gitster/w/build/git.c:862:4
	#13 0x561530652cba in cmd_main /home/gitster/w/build/git.c:984:19
	#14 0x5615308dda0a in main /home/gitster/w/build/common-main.c:9:11
	#15 0x7f051651bca7 in __libc_start_call_main csu/../sysdeps/nptl/libc_start_call_main.h:58:16

    SUMMARY: AddressSanitizer: 421 byte(s) leaked in 2 allocation(s).

Free the capablities data for the previous server before overwriting
it with the next server to plug this leak.

The added test fails without the freeing with SANITIZE=leak; I
somehow couldn't get it fail reliably with SANITIZE=leak,address
though.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-12-09 07:11:42 +09:00
Junio C Hamano
9d442ce2e2 Merge branch 'ps/object-source-management'
Code refactoring around object database sources.

* ps/object-source-management:
  odb: handle recreation of quarantine directories
  odb: handle changing a repository's commondir
  chdir-notify: add function to unregister listeners
  odb: handle initialization of sources in `odb_new()`
  http-push: stop setting up `the_repository` for each reference
  t/helper: stop setting up `the_repository` repeatedly
  builtin/index-pack: fix deferred fsck outside repos
  oidset: introduce `oidset_equal()`
  odb: move logic to disable ref updates into repo
  odb: refactor `odb_clear()` to `odb_free()`
  odb: adopt logic to close object databases
  setup: convert `set_git_dir()` to have file scope
  path: move `enter_repo()` into "setup.c"
2025-12-05 14:49:58 +09:00
Junio C Hamano
1b40ddc1a5 Merge branch 'cc/fast-import-strip-if-invalid'
"git fast-import" learns "--strip-if-invalid" option to drop
invalid cryptographic signature from objects.

* cc/fast-import-strip-if-invalid:
  fast-import: add 'strip-if-invalid' mode to --signed-commits=<mode>
  commit: refactor verify_commit_buffer()
  fast-import: refactor finalize_commit_buffer()
2025-12-05 14:49:58 +09:00
Junio C Hamano
0534b78576 Merge branch 'jc/optional-path'
"git config get --path" segfaulted on an ":(optional)path" that
does not exist, which has been corrected.

* jc/optional-path:
  config: really treat missing optional path as not configured
  config: really pretend missing :(optional) value is not there
  config: mark otherwise unused function as file-scope static
2025-12-05 14:49:56 +09:00
Lucas Seiki Oshiro
76c0704bdf repo: add -z as an alias for --format=nul to git-repo-structure
Other Git commands that have nul-terminated output, such as git-config,
git-status, git-ls-files, and git-repo-info have a flag `-z` for using
the null character as the record separator.

Add the `-z` flag to git-repo-structure as an alias for `--format=nul`,
making it consistent with the behavior of the other commands.

Signed-off-by: Lucas Seiki Oshiro <lucasseikioshiro@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-12-05 11:39:19 +09:00
Johannes Schindelin
05491b90ce last-modified: support sparse checkouts
In a sparse checkout, a user might want to run `last-modified` on a
directory outside the worktree.

And even in non-sparse checkouts, a user might need to run that command
on a directory that does not exist in the worktree.

These use cases should be supported via the `--` separator between
revision and file arguments, which is even advertised in the
documentation. This patch fixes a tiny bug that prevents that from
working.

This fixes https://github.com/git-for-windows/git/issues/5978

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Acked-by: Derrick Stolee <stolee@gmail.com>
Acked-by: Toon Claes <toon@iotcl.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-12-03 14:20:18 -08:00
Kristoffer Haugsbakk
b14f1df9f2 branch: advice using git-help(1) instead of man(1)
8fbd903e (branch: advise about ref syntax rules, 2024-03-05) added
an advice about checking git-check-ref-format(1) for the ref syntax
rules. The advice uses man(1). But git(1) is a multi-platform tool and
man(1) may not be available on some platforms. It might also be slightly
jarring to see a suggestion for running a command which is not from
the Git suite.

Let’s instead use git-help(1) in order to stay inside the land of
git(1). This also means that `help.format` (for `man`, `html` or other
formats) will be used if set.

Also change to using single quotes (') to quote the command since that
is more conventional.

While here let’s also update the test to use `{SQ}`, which is more
readable and easier to edit.

Signed-off-by: Kristoffer Haugsbakk <code@khaugsbakk.name>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-12-03 00:16:05 -08:00
Junio C Hamano
aea8cc3a10 Merge branch 'jk/asan-bonanza'
Various issues detected by Asan have been corrected.

* jk/asan-bonanza:
  t: enable ASan's strict_string_checks option
  fsck: avoid parse_timestamp() on buffer that isn't NUL-terminated
  fsck: remove redundant date timestamp check
  fsck: avoid strcspn() in fsck_ident()
  fsck: assert newline presence in fsck_ident()
  cache-tree: avoid strtol() on non-string buffer
  Makefile: turn on NO_MMAP when building with ASan
  pack-bitmap: handle name-hash lookups in incremental bitmaps
  compat/mmap: mark unused argument in git_munmap()
2025-11-30 18:31:41 -08:00
Junio C Hamano
3b212a83fe Merge branch 'jc/whitespace-incomplete-line'
Both "git apply" and "git diff" learn a new whitespace error class,
"incomplete-line".

* jc/whitespace-incomplete-line:
  attr: enable incomplete-line whitespace error for this project
  diff: highlight and error out on incomplete lines
  apply: check and fix incomplete lines
  whitespace: allocate a few more bits and define WS_INCOMPLETE_LINE
  apply: revamp the parsing of incomplete lines
  diff: update the way rewrite diff handles incomplete lines
  diff: call emit_callback ecbdata everywhere
  diff: refactor output of incomplete line
  diff: keep track of the type of the last line seen
  diff: correct suppress_blank_empty hack
  diff: emit_line_ws_markup() if/else style fix
  whitespace: correct bit assignment comments
2025-11-30 18:31:40 -08:00
Junio C Hamano
0fec747d59 Merge branch 'lo/repo-info-all'
"git repo info" learned "--all" option.

* lo/repo-info-all:
  repo: add --all to git-repo-info
  repo: factor out field printing to dedicated function
2025-11-30 18:31:39 -08:00
brian m. carlson
a92f243a94 t5564: fix test hang under zsh's sh mode
This test starts a SOCKS server in Perl in the background and then kills
it after the tests are done.  However, when using zsh (in sh mode) in
the tests, the start_socks function hangs until the background process
is killed.

Note that this does not reproduce in a simple shell script, so there is
likely some interaction between job handling, our heavy use of eval in
the test framework, and possibly other complexities of our test
framework.  What is clear, however, is that switching from a compound
statement to a subshell fixes the problem entirely and the test passes
with no problem, so do that.

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-11-27 19:06:03 -08:00
brian m. carlson
bf25fca31c t0614: use numerical comparison with test_line_count
In this comparison, we want to know whether the number of lines is
greater than 1.  Our test_line_count function passes the first argument
as the comparison operator to test, so what we want is a numerical
comparison, not a string comparison.  While this does not produce a
functional problem now, it could very well if we expected two or more
items, in which case the value "10" would not match when it should.

Furthermore, the "<" and ">" comparisons are new in POSIX 1003.1-2024
and we don't want to require such a new version of POSIX since many
popular and supported operating systems were released before that
version of POSIX was released.

Finally, zsh's builtin test operator does not like the greater-than sign
in "test", since it is only supported in the double-bracket extension.
This has been reported and will be addressed in a future version, but
since our code is also technically incorrect, as well as not very
compatible, let's fix it by using a numeric comparison.

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-11-27 19:06:01 -08:00
Junio C Hamano
536d284f3b Merge branch 'jk/ci-windows-meson-test-fix'
"Windows+meson" job at the GitHub Actions CI was hard to debug, as
it did not show and save failed test artifacts, which has been
corrected.

* jk/ci-windows-meson-test-fix:
  ci(windows-meson-test): handle options and output like other test jobs
  unit-test: ignore --no-chain-lint
2025-11-26 10:32:43 -08:00
Junio C Hamano
d65eab5d30 Merge branch 'pw/worktree-list-display-width-fix'
"git worktree list" attempts to show paths to worktrees while
aligning them, but miscounted display columns for the paths when
non-ASCII characters were involved, which has been corrected.

* pw/worktree-list-display-width-fix:
  worktree list: quote paths
  worktree list: fix column spacing
2025-11-26 10:32:42 -08:00
Junio C Hamano
24ddb3f1fc Merge branch 'jk/test-mktemp-leakfix'
Test leakfix.

* jk/test-mktemp-leakfix:
  test-mktemp: plug memory and descriptor leaks
2025-11-26 10:32:41 -08:00
Junio C Hamano
1b93acd13a Merge branch 'ad/blame-diff-algorithm'
"git blame" learns "--diff-algorithm=<algo>" option.

* ad/blame-diff-algorithm:
  blame: make diff algorithm configurable
  xdiff: add 'minimal' to XDF_DIFF_ALGORITHM_MASK
2025-11-26 10:32:40 -08:00
Junio C Hamano
716e871d50 Merge branch 'en/ort-rename-another-fix'
Yet another corner case fix around renames in the "ort" merge
strategy.

* en/ort-rename-another-fix:
  merge-ort: fix failing merges in special corner case
  merge-ort: remove debugging crud
  t6429: update comment to mention correct tool
2025-11-26 10:32:40 -08:00
Christian Couder
c20f112e51 fast-import: add 'strip-if-invalid' mode to --signed-commits=<mode>
Tools like `git filter-repo`[1] use `git fast-export` and
`git fast-import` to rewrite repository history. When rewriting
history using one such tool though, commit signatures might become
invalid because the commits they sign changed due to the changes
in the repository history made by the tool between the fast-export
and the fast-import steps.

Note that as far as signature handling goes:

  * Since fast-export doesn't know what changes filter-repo may make
to the stream, it can't know whether the signatures will still be
valid.

  * Since filter-repo doesn't know what history canonicalizations
fast-export performed (and it performs a few), it can't know whether
the signatures will still be valid.

  * Therefore, fast-import is the only process in the pipeline that
can know whether a specified signature remains valid.

Having invalid signatures in a rewritten repository could be
confusing, so users rewritting history might prefer to simply
discard signatures that are invalid at the fast-import step.

For example a common use case is to rewrite only "recent" history.
While specifying commit ranges corresponding to "recent" commits
could work, users worry about getting it wrong and want to just
automatically rewrite everything, expecting older commit signatures
to be untouched.

To let them do that, let's add a new 'strip-if-invalid' mode to the
`--signed-commits=<mode>` option of `git fast-import`.

It would be interesting for the `--signed-tags=<mode>` option to
have this mode too, but we leave that for a future improvement.

It might also be possible for `git fast-export` to have such a mode
in its `--signed-commits=<mode>` and `--signed-tags=<mode>`
options, but the use cases for it are much less clear, so we also
leave that for possible future improvements.

For now let's just die() if 'strip-if-invalid' is passed to these
options where it hasn't been implemented yet.

[1]: https://github.com/newren/git-filter-repo

Helped-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-11-26 08:43:44 -08:00
Patrick Steinhardt
eea83c010c t/helper: stop setting up the_repository repeatedly
The "repository" test helper sets up `the_repository` twice. In fact
though, we don't even have to set it up even once: all we need is to set
up its hash algorithm, because we still depend on some subsystems that
aren't free of `the_repository`.

Refactor the code accordingly. This prepares for a subsequent change,
where setting up the repository repeatedly will lead to a `BUG()`.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-11-25 12:16:00 -08:00
Patrick Steinhardt
8dc22e87f0 builtin/index-pack: fix deferred fsck outside repos
When asked to perform object consistency checks via the `--fsck-objects`
flag we verify that each object part of the pack is valid. In general,
this check can even be performed outside of a Git repository: we don't
need an initialized object database as we simply read the object from
the packfile directly.

But there's one exception: a subset of the object checks may be deferred
to a later point in time. For now, this only concerns ".gitmodules" and
".gitattributes" files: whenever we see a tree referencing these files
we queue them for a deferred check. This is done because we need to do
some extra checks for those files to ensure that they are well-formed,
and these checks need to be done regardless of whether the corresponding
blobs are part of the packfile or not.

This works inside a repository, but unfortunately the logic leads to a
segfault when running outside of one. This is because we eventually call
`odb_read_object()`, which will crash because the object database has
not been initialized.

There's multiple options here:

  - We could in theory create a purely in-memory database with only a
    packfile store that contains the single packfile. We don't really
    have the infrastructure for this yet though, and it would end up
    being quite hacky.

  - We could refuse to perform consistency checks outside of a
    repository. But most of the checks work alright, so this would be a
    regression.

  - We can skip the finalizing consistency checks when running outside
    of a repository. This is not as invasive as skipping all checks,
    but it's not great to randomly skip a subset of tests, either.

None of these options really feel perfect. The first one would be the
obvious choice if easily possible.

There's another option though: instead of skipping the final object
checks, we can die if there are any queued object checks. With this
change we now die exactly if and only if we would have previously
segfaulted. Like this we ensure that objects that _may_ fail the
consistency checks won't be silently skipped, and at the same time we
give users a much better error message.

Refactor the code accordingly and add a test that would have triggered
the segfault. Note that we also move down the logic to add the packfile
to the store. There is no point doing this any earlier than right before
we execute `fsck_finish()`, and it ensures that the logic to set up and
perform the consistency check is self-contained.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-11-25 12:15:59 -08:00
Junio C Hamano
dd8e8c786e submodule add: sanity check existing .gitmodules
"git submodule add" tries to find if a submodule with the same name
already exists at a different path, by looking up an entry in the
.gitmodules file.  If the entry in the file is incomplete, e.g.,
when the submodule.<name>.something variable is defined but there is
no definition of submodule.<name>.path variable, it accesses the
missing .path member of the submodule structure and triggers a
segfault.

A brief audit was done to make sure that the code does not assume
members other than those that are absolutely certain to exist: a
submodule obtained by submodule_from_name() should have .name
member, while a submodule obtained by submodule_from_path() should
also have .path as well as .name member, and we cannot assume
anything else.  Luckily, the module_add() codepath was the only
problematic one.  It is fairly recent code that comes from 1fa06ced
(submodule: prevent overwriting .gitmodules on path reuse,
2025-07-24).

A helper used by update_submodule() seems to assume that its call to
submodule_from_path() always yields a submodule object without a
failure, which seems to rely on the caller making sure it is the
case.  Leave an assert() with a NEEDSWORK comment there for future
developers to make sure the assumption actually holds.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-11-25 08:43:20 -08:00
Junio C Hamano
ce1a5a22a5 config: really pretend missing :(optional) value is not there
Earlier we added support for a value spelled as ":(optional)path"
for configuration variables whose values are of type "path", with
the documented semantics "if the path is missing, behave as if such
a variable definition is not even there."

This has worked OK for code paths that reads configuration files and
stores the configured value as a string, where NULL in such a string
is treated as if the setting is not there, left as the default.

However, there are other code paths that do not _ignore_ such NULL
values and misbehave.  "git config get --path" is one of them.

When git_config_pathname() helper function finds that the value of
the variable is an optional path *and* the path is missing, it
leaves the destination pointer intact (which usually is left to
NULL) and returns 0 to signal a success.  format_config() helper
however assumed that the destination pointer always gets a string,
which no longer is the case, and segfaulted.

Make sure that git_config_pathname() clears the destination pointer
in such a case, and teach format_config() to react to the condition
by returning 1 (which is different from 0 that is a normal success
and negative that is an error) to its callers.  Adjust the callers
to react to this new return value that tells them to pretend as if
they did not even see this partcular <key, value> pair.

Reported-by: Han Jiang <jhcarl0814@gmail.com>
Helped-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-11-24 17:00:47 -08:00
Junio C Hamano
a5d5c50160 Merge branch 'jx/repo-struct-utf8width-fix'
The "git repo structure" subcommand tried to align its output but
mixed up byte count and display column width, which has been
corrected.

* jx/repo-struct-utf8width-fix:
  builtin/repo: fix table alignment for UTF-8 characters
  t/unit-tests: add UTF-8 width tests for CJK chars
2025-11-24 15:46:41 -08:00
Junio C Hamano
9370a6be79 Merge branch 'sa/replay-atomic-ref-updates'
"git replay" (experimental) learned to perform ref updates itself
in a transaction by default, instead of emitting where each refs
should point at and leaving the actual update to another command.

* sa/replay-atomic-ref-updates:
  replay: add replay.refAction config option
  replay: make atomic ref updates the default behavior
  replay: use die_for_incompatible_opt2() for option validation
2025-11-24 15:46:40 -08:00
Junio C Hamano
d91d79f26d Merge branch 'bc/submodule-force-same-hash'
Adding a repository that uses a different hash function is a no-no,
but "git submodule add" did nt prevent it, which has been corrected.

* bc/submodule-force-same-hash:
  read-cache: drop submodule check from add_to_cache()
  object-file: disallow adding submodules of different hash algo
2025-11-24 15:46:40 -08:00
Junio C Hamano
54f7817456 Merge branch 'jk/attr-macroexpand-wo-recursion'
The code to expand attribute macros has been rewritten to avoid
recursion to avoid running out of stack space in an uncontrolled
way.

* jk/attr-macroexpand-wo-recursion:
  attr: avoid recursion when expanding attribute macros
2025-11-24 15:46:39 -08:00
Junio C Hamano
c62d2d3810 Merge branch 'kn/maintenance-is-needed'
"git maintenance" command learned "is-needed" subcommand to tell if
it is necessary to perform various maintenance tasks.

* kn/maintenance-is-needed:
  maintenance: add 'is-needed' subcommand
  maintenance: add checking logic in `pack_refs_condition()`
  refs: add a `optimize_required` field to `struct ref_storage_be`
  reftable/stack: add function to check if optimization is required
  reftable/stack: return stack segments directly
2025-11-21 09:14:17 -08:00
Junio C Hamano
3176576a56 Merge branch 'rs/diff-quiet-no-rename'
As "git diff --quiet" only cares about the existence of any
changes, disable rename/copy detection to skip more expensive
processing whose result will be discarded anyway.

* rs/diff-quiet-no-rename:
  diff: disable rename detection with --quiet
2025-11-21 09:14:15 -08:00
Junio C Hamano
7ccfc262d7 Merge branch 'kn/refs-optim-cleanup'
Code clean-up.

* kn/refs-optim-cleanup:
  t/pack-refs-tests: move the 'test_done' to callees
  refs: rename 'pack_refs_opts' to 'refs_optimize_opts'
  refs: move to using the '.optimize' functions
2025-11-19 10:55:40 -08:00
Junio C Hamano
13134cecb0 Merge branch 'ps/ref-peeled-tags'
Some ref backend storage can hold not just the object name of an
annotated tag, but the object name of the object the tag points at.
The code to handle this information has been streamlined.

* ps/ref-peeled-tags:
  t7004: do not chdir around in the main process
  ref-filter: fix stale parsed objects
  ref-filter: parse objects on demand
  ref-filter: detect broken tags when dereferencing them
  refs: don't store peeled object IDs for invalid tags
  object: add flag to `peel_object()` to verify object type
  refs: drop infrastructure to peel via iterators
  refs: drop `current_ref_iter` hack
  builtin/show-ref: convert to use `reference_get_peeled_oid()`
  ref-filter: propagate peeled object ID
  upload-pack: convert to use `reference_get_peeled_oid()`
  refs: expose peeled object ID via the iterator
  refs: refactor reference status flags
  refs: fully reset `struct ref_iterator::ref` on iteration
  refs: introduce `.ref` field for the base iterator
  refs: introduce wrapper struct for `each_ref_fn`
2025-11-19 10:55:39 -08:00
Lucas Seiki Oshiro
155caac7d1 repo: add --all to git-repo-info
Add a new flag `--all` to git-repo-info for requesting values for all
the available keys. By using this flag, the user can retrieve all the
values instead of searching what are the desired keys for what they
wants.

Helped-by: Karthik Nayak <karthik.188@gmail.com>
Helped-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Lucas Seiki Oshiro <lucasseikioshiro@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-11-18 13:29:10 -08:00
Phillip Wood
08dfa59835 worktree list: quote paths
If a worktree path contains newlines or other control characters
it messes up the output of "git worktree list". Fix this by using
quote_path() to display the worktree path. The output of "git worktree
list" is designed for human consumption, scripts should be using the
"--porcelain" option so this change should not break them.

Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-11-18 10:11:29 -08:00
Phillip Wood
a6238ee163 worktree list: fix column spacing
The output of "git worktree list" displays a table containing the
worktree path, HEAD OID and branch name for each worktree. The code
aligns the columns by measuring the visual width of the worktree path
when it is printed. Unfortunately it fails to use the visual width
when calculating the width of the column so, if any of the paths
contain a multibyte character, we can end up with excess padding
between columns. The simplest fix would be to replace strlen() with
utf8_strwidth() in measure_widths(). However that leaves us measuring
the visual width twice and the byte length once. By caching the visual
width and printing the padding separately to the worktree path, we only
need to calculate the visual width once and do not need the byte length
at all. The visual widths are stored in an arrays of structs rather
than an array of ints as the next commit will add more struct members.

Even if there are no multibyte characters in any of the paths we still
print an extra space between the path and the object id as the field
width is calculated as one plus the length of the path and we print an
explicit space as well. This is fixed by not printing the extra space.

The tests are updated to include multibyte characters in one of the
worktree paths and to check the spacing of the columns.

Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-11-18 10:11:19 -08:00
Jeff King
14b561e768 test-mktemp: plug memory and descriptor leaks
We test xmkstemp() in our helper by just calling:

  xmkstemp(xstrdup(argv[1]));

This leaks both the copied string as well as the descriptor returned by
the function. In practice this isn't a big deal, since we immediately
exit the program, but:

  1. LSan will complain about the memory leak. The only reason we did
     not notice this in our leak-checking builds is that both of the
     callers in the test suite (both in t0070) pass a broken template
     (and expect failure). So the function calls die() before we can
     actually leak.

     But it's an accident waiting to happen if anybody adds a call which
     succeeds.

  2. Coverity complains about the descriptor leak. There's a long list
     of uninteresting or false positives in Coverity's results, but
     since we're here we might as well fix it, too.

I didn't bother adding a new test that triggers the leak. It's not even
in real production code, but just in the test-helper itself.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-11-18 10:05:14 -08:00
Jeff King
e96105aa17 unit-test: ignore --no-chain-lint
In the same spirit as 9faf3963b6 (t: introduce compatibility options to
clar-based tests, 2024-12-13), we should ignore --no-chain-lint passed
to our clar tests, since it may appear in GIT_TEST_OPTS to be used with
other tests.

This is particularly important on Windows CI, where --no-chain-lint is
added to the test options by default, and the meson build will pass all
options to the unit tests. The only reason our meson Windows CI job does
not run into this currently is that it is not respecting GIT_TEST_OPTS
at all! So ignoring this option is a prerequisite to fixing that
situation.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-11-18 09:45:28 -08:00
Jeff King
a031b6181a t: enable ASan's strict_string_checks option
ASan has an option to enable strict string checking, where any pointer
passed to a function that expects a NUL-terminated string will be
checked for that NUL termination. This can sometimes produce false
positives. E.g., it is not wrong to pass a buffer with { '1', '2', '\n' }
into strtoul(). Even though it is not NUL-terminated, it will stop at
the newline.

But in trying it out, it identified two problematic spots in our test
suite (which have now been adjusted):

  1. The strtol() parsing in cache-tree.c was a real potential problem,
     which would have been very hard to find otherwise (since it
     required constructing a very specific broken index file).

  2. The use of string functions in fsck_ident() were false positives,
     because we knew that there was always a trailing newline which
     would stop the functions from reading off the end of the buffer.
     But the reasoning behind that is somewhat fragile, and silencing
     those complaints made the code easier to reason about.

So even though this did not find any earth-shattering bugs, and even had
a few false positives, I'm sufficiently convinced that its complaints
are more helpful than hurtful. Let's turn it on by default (since the
test suite now runs cleanly with it) and see if it ever turns up any
other instances.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-11-18 09:36:12 -08:00
Elijah Newren
a562d90a35 merge-ort: fix failing merges in special corner case
At GitHub, we had a repository that was triggering
  git: merge-ort.c:3032: process_renames: Assertion `newinfo && !newinfo->merged.clean` failed.
during git replay.

This sounds similar to the somewhat recent f6ecb603ff8a (merge-ort: fix
directory rename on top of source of other rename/delete, 2025-08-06),
but the cause is different.  Unlike that case, there are no
rename-to-self situations arising in this case, and new to this case it
can only be triggered during a replay operation on the 2nd or later
commit being replayed, never on the first merge in the sequence.

To trigger, the repository needs:
  * an upstream which:
    * renames a file to a different directory, e.g.
        old/file -> new/file
    * leaves other files remaining in the original directory (so that
      e.g. "old/" still exists upstream even though file has been
      removed from it and placed elsewhere)
  * a topic branch being rebased where:
    * a commit in the sequence:
      * modifies old/file
    * a subsequent commit in the sequence being replayed:
      * does NOT touch *anything* under new/
      * does NOT touch old/file
      * DOES modify other paths under old/
      * does NOT have any relevant renames that we need to detect
        _anywhere_ elsewhere in the tree (meaning this interacts
        interestingly with both directory renames and cached renames)

In such a case, the assertion will trigger.  The fix turns out to be
surprisingly simple.  I have a very vague recollection that I actually
considered whether to add such an if-check years ago when I added the
very similar one for oldinfo in 1b6b902d95a5 (merge-ort:
process_renames() now needs more defensiveness, 2021-01-19), but I think
I couldn't figure out a possible way to trigger it and was worried at
the time that if I didn't know how to trigger it then I wasn't so sure
that simply skipping it was correct.  Waiting did give me a chance to
put more thorough tests and checks into place for the rename-to-self
cases a few months back, which I might not have found as easily
otherwise.  Anyway, put the check in place now and add a test that
demonstrates the fix.

Note that this bug, as demonstrated by the conditions listed above,
runs at the intersection of relevant renames, trivial directory
resolutions, and cached renames.  All three of those optimizations are
ones that unfortunately make the code (and testcases!) a bit more
complex, and threading all three makes it a bit more so.  However, the
testcase isn't crazy enough that I'd expect no one to ever hit it in
practice, and was confused why we didn't see it before.  After some
digging, I discovered that merge.directoryRenames=false is a workaround
to this bug, and GitHub used that setting until recently (it was a
"temporary" match-what-libgit2-does piece of code that lasted years
longer than intended).  Since the conditions I gave above for triggering
this bug rule out the possibility of there being directory renames, one
might assume that it shouldn't matter whether you try to detect such
renames if there aren't any.  However, due to commit a16e8efe5c2b
(merge-ort: fix merge.directoryRenames=false, 2025-03-13), the heavy
hammer used there means that merge.directoryRenames=false ALSO turns off
rename caching, which is critical to triggering the bug.  This becomes
a bit more than an aside since...

Re-reading that old commit, a16e8efe5c2b (merge-ort: fix
merge.directoryRenames=false, 2025-03-13), it appears that the solution
to this latest bug might have been at least a partial alternative
solution to that old commit.  And it may have been an improved
alternative (or at least help implement one), since it may be able to
avoid the heavy-handed disabling of rename cache.  That might be an
interesting future thing to investigate, but is not critical for the
current fix.  However, since I spent time digging it all up, at least
leave a small comment tweak breadcrumb to help some future reader
(myself or others) who wants to dig further to connect the dots a little
quicker.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-11-17 14:08:09 -08:00
Elijah Newren
ffe702b3ed t6429: update comment to mention correct tool
A comment at the top of t6429 mentions why the test doesn't exercise git
rebase or git cherry-pick.  However, it claims that it uses `test-tool
fast-rebase`.  That was true when the comment was written, but commit
f920b0289ba3 (replay: introduce new builtin, 2023-11-24) changed it to
use git replay without updating this comment.

We could potentially just strike this second comment, since git replay
is a bona fide built-in, but perhaps the explanation about why it focuses
on git replay is still useful.  Update the comment to make it accurate
again.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-11-17 14:08:08 -08:00
Antonin Delpeuch
ffffb987fc blame: make diff algorithm configurable
The diff algorithm used in 'git-blame(1)' is set to 'myers',
without the possibility to change it aside from the `--minimal` option.

There has been long-standing interest in changing the default diff
algorithm to "histogram", and Git 3.0 was floated as a possible occasion
for taking some steps towards that:

https://lore.kernel.org/git/xmqqed873vgn.fsf@gitster.g/

As a preparation for this move, it is worth making sure that the diff
algorithm is configurable where useful.

Make it configurable in the `git-blame(1)` command by introducing the
`--diff-algorithm` option and make honor the `diff.algorithm` config
variable. Keep Myers diff as the default.

Signed-off-by: Antonin Delpeuch <antonin@delpeuch.eu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-11-17 09:31:59 -08:00
Jiang Xin
7a03a10a3a builtin/repo: fix table alignment for UTF-8 characters
The output table from "git repo structure" is misaligned when displaying
UTF-8 characters (e.g., non-ASCII glyphs). E.g.:

    | 仓库结构   | 值  |
    | -------------- | ---- |
    | * 引用       |      |
    |   * 计数     |   67 |

The previous implementation used simple width formatting with printf()
which didn't properly handle multi-byte UTF-8 characters, causing
misaligned table columns when displaying repository structure
information.

This change modifies the stats_table_print_structure function to use
strbuf_utf8_align() instead of basic printf width specifiers. This
ensures proper column alignment regardless of the character encoding of
the content being displayed.

Also add test cases for strbuf_utf8_align(), a function newly introduced
in "builtin/repo.c".

Signed-off-by: Jiang Xin <worldhello.net@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-11-16 16:04:24 -08:00
Jiang Xin
878fef8ebf t/unit-tests: add UTF-8 width tests for CJK chars
The file "builtin/repo.c" uses utf8_strwidth() to calculate the display
width of UTF-8 characters in a table, but the resulting output is still
misaligned. Add test cases for both utf8_strwidth and utf8_strnwidth to
verify that they correctly compute the display width for UTF-8
characters.

Also updated the build configuration in Makefile and meson.build to
include the new test suite in the build process.

Signed-off-by: Jiang Xin <worldhello.net@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-11-16 16:04:24 -08:00
Jeff King
6fe288bfbc read-cache: drop submodule check from add_to_cache()
In add_to_cache(), we treat any directories as submodules, and complain
if we can't resolve their HEAD. This call to resolve_gitlink_ref() was
added by f937bc2f86 (add: error appropriately on repository with no
commits, 2019-04-09), with the goal of improving the error message for
empty repositories.

But we already resolve the submodule HEAD in index_path(), which is
where we find the actual oid we're going to use. Resolving it again here
introduces some downsides:

  1. It's more work, since we have to open up the submodule repository's
     files twice.

  2. There are call paths that get to index_path() without going through
     add_to_cache(). For instance, we'd want a similar informative
     message if "git diff empty" finds that it can't resolve the
     submodule's HEAD. (In theory we can also get there through
     update-index, but AFAICT it refuses to consider directories as
     submodules at all, and just complains about them).

  3. The resolution in index_path() catches more errors that we don't
     handle here. In particular, it will validate that the object format
     for the submodule matches that of the superproject. This isn't a
     bug, since our call in add_to_cache() throws away the oid it gets
     without looking at it. But it certainly caused confusion for me
     when looking at where the object-format check should go.

So instead of resolving the submodule HEAD in add_to_cache(), let's just
teach the call in index_path() to actually produce an error message
(which it already does for other cases). That's probably what f937bc2f86
should have done in the first place, and it gives us a single point of
resolution when adding a submodule to the index.

The resulting output is slightly more verbose, as we propagate the error
up the call stack, but I think that's OK (and again, matches many other
errors we get when indexing fails).

I've left the text of the error message as-is, though it is perhaps
overly specific.  There are many reasons that resolving the submodule
HEAD might fail, though outside of corruption or system errors it is
probably most likely that the submodule HEAD is simply on an unborn
branch.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-11-15 21:18:49 -08:00
brian m. carlson
66c78e0653 object-file: disallow adding submodules of different hash algo
The design of the hash algorithm transition plan is that objects stored
must be entirely in one algorithm since we lack any way to indicate a
mix of algorithms.  This also includes submodules, but we have
traditionally not enforced this, which leads to various problems when
trying to clone or check out the the submodule from the remote.

Since this cannot work in the general case, restrict adding a submodule
of a different algorithm to the index.  Add tests for git add and git
submodule add that these are rejected.

Note that we cannot check this in git fsck because the malformed
submodule is stored in the tree as an object ID which is either
truncated (when a SHA-256 submodule is added to a SHA-1 repository) or
padded with zeros (when a SHA-1 submodule is added to a SHA-256
repository).  We cannot detect even the latter case because someone
could have an actual submodule that actually ends in 24 zeros, which
would be a false positive.

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-11-15 11:51:37 -08:00
Junio C Hamano
ab2693cb52 diff: highlight and error out on incomplete lines
Teach "git diff" to highlight "\ No newline at end of file" message
as a whitespace error when incomplete-line whitespace error class is
in effect.  Thanks to the previous refactoring of complete rewrite
code path, we can do this at a single place.

Unlike whitespace errors in the payload where we need to annotate in
line, possibly using colors, the line that has whitespace problems,
we have a dedicated line already that can serve as the error
message, so paint it as a whitespace error message.

Also teach "git diff --check" to notice incomplete lines as
whitespace errors and report when incomplete-line whitespace error
class is in effect.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-11-12 14:04:05 -08:00
Junio C Hamano
9fb15a8e14 apply: check and fix incomplete lines
The final line of a file that lacks the terminating newline at its
end is called an incomplete line.  In general they are frowned upon
for many reasons (imagine concatenating two files with "cat A B" and
what happens when A ends in an incomplete line, for example), and
text-oriented tools often mishandle such a line.

Implement checks in "git apply" for incomplete lines, which is off
by default for backward compatibility's sake, so that "git apply
--whitespace={fix,warn,error}" can notice, warn against, and fix
them.

As one of the new test shows, if you modify contents on an
incomplete line in the original and leave the resulting line
incomplete, it is still considered a whitespace error, the reasoning
being that "you'd better fix it while at it if you are making a
change on an incomplete line anyway", which may controversial.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-11-12 14:04:04 -08:00
Junio C Hamano
99bd5a5c9f Merge branch 'tc/last-modified-active-paths-optimization'
"git last-modified" was optimized by narrowing the set of paths to
follow as it dug deeper in the history.

* tc/last-modified-active-paths-optimization:
  last-modified: implement faster algorithm
2025-11-12 11:45:24 -08:00
Jeff King
42ed046866 attr: avoid recursion when expanding attribute macros
Given a set of attribute macros like:

   [attr]a1 a2
   [attr]a2 a3
   ...
   [attr]a300000 -text
   file a1

expanding the attributes for "file" requires expanding "a1" to "a2",
"a2" to "a3", and so on until hitting a non-macro expansion ("-text", in
this case). We implement this via recursion: fill_one() calls
macroexpand_one(), which then recurses back to fill_one(). As a result,
very deep macro chains like the one above can run out of stack space and
cause us to segfault.

The required stack space is fairly small; I needed on the order of
200,000 entries to get a segfault on Linux. So it's unlikely anybody
would hit this accidentally, leaving only malicious inputs. There you
can easily construct a repo which will segfault on clone (we look at
attributes during the checkout step, but you'd see the same trying to do
other operations, like diff in a bare repo). It's mostly harmless, since
anybody constructing such a repo is only preventing victims from cloning
their evil garbage, but it could be a nuisance for hosting sites.

One option to prevent this is to limit the depth of recursion we'll
allow. This is conceptually easy to implement, but it raises other
questions: what should the limit be, and do we need a configuration knob
for it?

The recursion here is simple enough that we can avoid those questions by
just converting it to iteration instead. Rather than iterate over the
states of a match_attr in fill_one(), we'll put them all in a queue, and
the expansion of each can add to the queue rather than recursing. Note
that this is a LIFO queue in order to keep the same depth-first order we
did with the recursive implementation. I've avoided using the word
"stack" in the code because the term is already heavily used to refer to
the stack of .gitattribute files that matches the tree structure of the
repository.

The test uses a limited stack size so we can trigger the problem with a
much smaller input than the one shown above. The value here (3000) is
enough to trigger the issue on my x86_64 Linux machine.

Reported-by: Ben Stav <benstav@miggo.io>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-11-12 10:30:04 -08:00
René Scharfe
fa052367ef diff: disable rename detection with --quiet
Detecting renames and copies improves diff's output.  This effort is
wasted if we don't show any.  Disable detection in that case.

This actually fixes the error code when using the options --cached,
--find-copies-harder, --no-ext-diff and --quiet together:
run_diff_index() indirectly calls diff-lib.c::show_modified(), which
queues even non-modified entries using diff_change() because we need
them for copy detection.  diff_change() sets flags.has_changes, though,
which causes diff_can_quit_early() to declare we're done after seeing
only the very first entry -- way too soon.

Using --cached, --find-copies-harder and --quiet together without
--no-ext-diff was not affected even before, as it causes the flag
flags.diff_from_contents to be set, which disables the optimization
in a different way.

Reported-by: D. Ben Knoble <ben.knoble@gmail.com>
Suggested-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-11-10 11:23:57 -08:00