"git fetch" that involves fetching tags, when a tag being fetched
needs to overwrite existing one, failed to fetch other tags, which
has been corrected.
* kn/fix-fetch-backfill-tag-with-batched-ref-updates:
fetch: fix failed batched updates skipping operations
fetch: fix non-conflicting tags not being committed
fetch: extract out reference committing logic
MEMZERO_ARRAY() helper is introduced to avoid clearing only the
first N bytes of an N-element array whose elements are larger than
a byte.
* tc/memzero-array:
contrib/coccinelle: pass include paths to spatch(1)
git-compat-util: introduce MEMZERO_ARRAY() macro
"git submodule add" to add a submodule under <name> segfaulted,
when a submodule.<name>.something is already in .gitmodules file
without defining where its submodule.<name>.path is, which has been
corrected.
* jc/submodule-add:
submodule add: sanity check existing .gitmodules
"git replay --onto=<commit> ...", when <commit> is mistyped,
started to segfault with recent change, which has been corrected.
* rs/replay-wrong-onto-fix:
replay: move onto NULL check before first use
"git replay" documentation updates.
* kh/doc-replay-updates:
doc: replay: link section using markup
replay: improve --contained and add to doc
doc: replay: mention no output on conflicts
The "git_istream" abstraction has been revamped to make it easier
to interface with pluggable object database design.
* ps/object-read-stream:
streaming: drop redundant type and size pointers
streaming: move into object database subsystem
streaming: refactor interface to be object-database-centric
streaming: move logic to read packed objects streams into backend
streaming: move logic to read loose objects streams into backend
streaming: make the `odb_read_stream` definition public
streaming: get rid of `the_repository`
streaming: rely on object sources to create object stream
packfile: introduce function to read object info from a store
streaming: move zlib stream into backends
streaming: create structure for filtered object streams
streaming: create structure for packed object streams
streaming: create structure for loose object streams
streaming: create structure for in-core object streams
streaming: allocate stream inside the backend-specific logic
streaming: explicitly pass packfile info when streaming a packed object
streaming: propagate final object type via the stream
streaming: drop the `open()` callback function
streaming: rename `git_istream` into `odb_read_stream`
"git repo struct" learned to take "-z" as a synonym to "--format=nul".
* lo/repo-struct-z:
repo: add -z as an alias for --format=nul to git-repo-structure
repo: use [--format=... | -z] instead of [-z] in git-repo-info synopsis
repo: remove blank line from Documentation/git-repo.adoc
A help message from "git branch" now mentions "git help" instead of
"man" when suggesting to read some documentation.
* kh/advise-w-git-help-in-branch:
branch: advice using git-help(1) instead of man(1)
"git last-modified" used to mishandle "--" to mark the beginning of
pathspec, which has been corrected.
* js/last-modified-with-sparse-checkouts:
last-modified: support sparse checkouts
Recent optimization to "last-modified" command introduced use of
uninitialized block of memory, which has been corrected.
* tc/last-modified-active-paths-optimization:
last-modified: fix use of uninitialized memory
There is no documentation for `--contained`.
Start by copying the text from `replay_options` in `builtin/
replay.c`. But some people think that the existing text is a
bit unclear; what does it mean for a branch to be contained
in a revision range? Let’s include the implied commits here:
the branches that point at commits in the range.
Also use “update” instead of “advance”. “Update” is the verb
commonly used in this context.
Helped-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Kristoffer Haugsbakk <code@khaugsbakk.name>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Unless there are good reasons, it is customary to have the options[]
array used with the parse-options API declared in function scope rather
than at file scope.
Move builtin/pull.c:cmd_pull()’s options[] array into the function to
match that convention.
Signed-off-by: K Jayatheerth <jayatheerthkulkarni2005@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
cmd_replay() aborts if the pointer "onto" is NULL after argument
parsing, e.g. when specifying a non-existing commit with --onto.
15cd4ef1f4 (replay: make atomic ref updates the default behavior,
2025-11-06) added code that dereferences this pointer before the check.
Switch their places to avoid a segmentation fault.
Reported-by: Kristoffer Haugsbakk <kristofferhaugsbakk@fastmail.com>
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Introduce a new macro MEMZERO_ARRAY() that zeroes the memory allocated
by ALLOC_ARRAY() and friends. And add coccinelle rule to enforce the use
of this macro.
Signed-off-by: Toon Claes <toon@iotcl.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Fix a regression introduced with batched updates in 0e358de64a (fetch:
use batched reference updates, 2025-05-19) when fetching references. In
the `do_fetch()` function, we jump to cleanup if committing the
transaction fails, regardless of whether using batched or atomic
updates. This skips three subsequent operations:
- Update 'FETCH_HEAD' as part of `commit_fetch_head()`.
- Add upstream tracking information via `set_upstream()`.
- Setting remote 'HEAD' values when `do_set_head` is true.
For atomic updates, this is expected behavior. For batched updates,
we want to continue with these operations even if some refs fail to
update.
Skipping `commit_fetch_head()` isn't actually a regression because
'FETCH_HEAD' is already updated via `append_fetch_head()` when not
using '--atomic'. However, we add a test to validate this behavior.
Skipping the other two operations (upstream tracking and remote HEAD)
is a regression. Fix this by only jumping to cleanup when using
'--atomic', allowing batched updates to continue with post-fetch
operations. Add tests to prevent future regressions.
Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The commit 0e358de64a (fetch: use batched reference updates, 2025-05-19)
updated the 'git-fetch(1)' command to use batched updates. This batches
updates to gain performance improvements. When fetching references, each
update is added to the transaction. Finally, when committing, individual
updates are allowed to fail with reason, while the transaction itself
succeeds.
One scenario which was missed here, was fetching tags. When fetching
conflicting tags, the `fetch_and_consume_refs()` function returns '1',
which skipped committing the transaction and directly jumped to the
cleanup section. This mean that no updates were applied. This also
extends to backfilling tags which is done when fetching specific
refspecs which contains tags in their history.
Fix this by committing the transaction when we have an error code and
not using an atomic transaction. This ensures other references are
applied even when some updates fail.
The cleanup section is reached with `retcode` set in several scenarios:
- `truncate_fetch_head()`, `open_fetch_head()` and `prune_refs()` set
`retcode` before the transaction is created, so no commit is
attempted.
- `fetch_and_consume_refs()` and `backfill_tags()` are the primary
cases this fix targets, both setting a positive `retcode` to
trigger the committing of the transaction.
This simplifies error handling and ensures future modifications to
`do_fetch()` don't need special handling for batched updates.
Add tests to check for this regression. While here, add a missing
cleanup from previous test.
Reported-by: David Bohman <debohman@gmail.com>
Helped-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The use of "revision" (a connected set of commits) has been
clarified in the "git replay" documentation.
* en/replay-doc-revision-range:
Documentation/git-replay.adoc: fix errors around revision range
"git replay" forgot to omit the "gpgsig-sha256" extended header
from the resulting commit the same way it omits "gpgsig", which has
been corrected.
* pw/replay-exclude-gpgsig-fix:
replay: do not copy "gpgsign-sha256" header
The error message given by "git config set", when the variable
being updated has more than one values defined, used old style "git
config" syntax with an incorrect option in its hint, both of which
have been corrected.
* rs/config-set-multi-error-message-fix:
config: fix suggestion for failed set of multi-valued option
The option help text given by "git config unset -h" described
the "--all" option to "replace", not "unset", multiple variables,
which has been corrected.
* rs/config-unset-opthelp-fix:
config: fix short help of unset flags
Code refactoring around object database sources.
* ps/object-source-management:
odb: handle recreation of quarantine directories
odb: handle changing a repository's commondir
chdir-notify: add function to unregister listeners
odb: handle initialization of sources in `odb_new()`
http-push: stop setting up `the_repository` for each reference
t/helper: stop setting up `the_repository` repeatedly
builtin/index-pack: fix deferred fsck outside repos
oidset: introduce `oidset_equal()`
odb: move logic to disable ref updates into repo
odb: refactor `odb_clear()` to `odb_free()`
odb: adopt logic to close object databases
setup: convert `set_git_dir()` to have file scope
path: move `enter_repo()` into "setup.c"
"git config get --path" segfaulted on an ":(optional)path" that
does not exist, which has been corrected.
* jc/optional-path:
config: really treat missing optional path as not configured
config: really pretend missing :(optional) value is not there
config: mark otherwise unused function as file-scope static
Other Git commands that have nul-terminated output, such as git-config,
git-status, git-ls-files, and git-repo-info have a flag `-z` for using
the null character as the record separator.
Add the `-z` flag to git-repo-structure as an alias for `--format=nul`,
making it consistent with the behavior of the other commands.
Signed-off-by: Lucas Seiki Oshiro <lucasseikioshiro@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The flag -z is only an alias for --format=null and even though --format
and -z can be used together and repeated, only the last one is
considered.
Replace `[-z]` in the synopsis of git-repo-info by
`[--format=... | -z]`, expliciting that the use of one of those flags
replace the other.
Signed-off-by: Lucas Seiki Oshiro <lucasseikioshiro@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In a sparse checkout, a user might want to run `last-modified` on a
directory outside the worktree.
And even in non-sparse checkouts, a user might need to run that command
on a directory that does not exist in the worktree.
These use cases should be supported via the `--` separator between
revision and file arguments, which is even advertised in the
documentation. This patch fixes a tiny bug that prevents that from
working.
This fixes https://github.com/git-for-windows/git/issues/5978
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Acked-by: Derrick Stolee <stolee@gmail.com>
Acked-by: Toon Claes <toon@iotcl.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
8fbd903e (branch: advise about ref syntax rules, 2024-03-05) added
an advice about checking git-check-ref-format(1) for the ref syntax
rules. The advice uses man(1). But git(1) is a multi-platform tool and
man(1) may not be available on some platforms. It might also be slightly
jarring to see a suggestion for running a command which is not from
the Git suite.
Let’s instead use git-help(1) in order to stay inside the land of
git(1). This also means that `help.format` (for `man`, `html` or other
formats) will be used if set.
Also change to using single quotes (') to quote the command since that
is more conventional.
While here let’s also update the test to use `{SQ}`, which is more
readable and easier to edit.
Signed-off-by: Kristoffer Haugsbakk <code@khaugsbakk.name>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
git-last-modified(1) uses a scratch bitmap to keep track of paths that
have been changed between commits. To avoid reallocating a bitmap on
each call of process_parent(), the scratch bitmap is kept and reused.
Although, it seems an incorrect length is passed to memset(3).
`struct bitmap` uses `eword_t` to for internal storage. This type is
typedef'd to uint64_t. To fully zero the memory used by the bitmap,
multiply the length (saved in `struct bitmap::word_alloc`) by the size
of `eword_t`.
Reported-by: Anders Kaseorg <andersk@mit.edu>
Helped-by: Jeff King <peff@peff.net>
Signed-off-by: Toon Claes <toon@iotcl.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
There was significant confusion in the git-replay manual about what
constitutes a revision range. As noted in f302c1e4aa09 (revisions(7):
clarify that most commands take a single revision range, 2021-05-18):
Commands that are specifically designed to take two distinct ranges
(e.g. "git range-diff R1 R2" to compare two ranges) do exist, but they
are exceptions. Unless otherwise noted, all "git" commands that operate
on a set of commits work on a single revision range.
`git replay` is not an exception, but a few places in the manual were
written as though it were. These appear to have come in revisions to
the original series, between v3->v4 (see
https://lore.kernel.org/git/CAP8UFD3bpLrVW97DH7j=V9H2GsTSAkksC9L3QujQERFk_kLnZA@mail.gmail.com/
, "More than one <revision-range> can be passed") and between v6->v7
(https://lore.kernel.org/git/20231115143327.2441397-1-christian.couder@gmail.com/,
"Takes ranges of commits"), and I missed both of these revisions when
reviewing. Fix them now.
There was also a reference to the "Commit Limiting options below", but
this page has no such section of options; strike the misleading
reference.
It is worth noting that we are documenting existing behavior, rather
than optimal behavior. Junio has multiple times suggested introducing
alternative ways to walk revisions and use them in `git replay
--advance`, e.g. at
* https://lore.kernel.org/git/xmqqy1mqo6kv.fsf@gitster.g/
* https://lore.kernel.org/git/xmqq8rb3is8c.fsf@gitster.g/
* https://lore.kernel.org/git/xmqqtsydj2zk.fsf@gitster.g/ (item (2))
If/when we introduce some new revision walking flag that implements one
of these alternate types of revision walks, we can update the --advance
option and this manual appropriately.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git worktree list" attempts to show paths to worktrees while
aligning them, but miscounted display columns for the paths when
non-ASCII characters were involved, which has been corrected.
* pw/worktree-list-display-width-fix:
worktree list: quote paths
worktree list: fix column spacing
When "git replay" replays a commit it copies the extended headers
across from the original commit. However, if the original commit
was signed, we do not want to copy the header associated with the
signature is it wont be valid for the new commit. The code already
knows to avoid coping the "gpgsig" header but does not know to avoid
copying the "gpgsig-sha256" header. Add that header to the list of
exclusions to match what "git commit --amend" does.
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Tools like `git filter-repo`[1] use `git fast-export` and
`git fast-import` to rewrite repository history. When rewriting
history using one such tool though, commit signatures might become
invalid because the commits they sign changed due to the changes
in the repository history made by the tool between the fast-export
and the fast-import steps.
Note that as far as signature handling goes:
* Since fast-export doesn't know what changes filter-repo may make
to the stream, it can't know whether the signatures will still be
valid.
* Since filter-repo doesn't know what history canonicalizations
fast-export performed (and it performs a few), it can't know whether
the signatures will still be valid.
* Therefore, fast-import is the only process in the pipeline that
can know whether a specified signature remains valid.
Having invalid signatures in a rewritten repository could be
confusing, so users rewritting history might prefer to simply
discard signatures that are invalid at the fast-import step.
For example a common use case is to rewrite only "recent" history.
While specifying commit ranges corresponding to "recent" commits
could work, users worry about getting it wrong and want to just
automatically rewrite everything, expecting older commit signatures
to be untouched.
To let them do that, let's add a new 'strip-if-invalid' mode to the
`--signed-commits=<mode>` option of `git fast-import`.
It would be interesting for the `--signed-tags=<mode>` option to
have this mode too, but we leave that for a future improvement.
It might also be possible for `git fast-export` to have such a mode
in its `--signed-commits=<mode>` and `--signed-tags=<mode>`
options, but the use cases for it are much less clear, so we also
leave that for possible future improvements.
For now let's just die() if 'strip-if-invalid' is passed to these
options where it hasn't been implemented yet.
[1]: https://github.com/newren/git-filter-repo
Helped-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When asked to perform object consistency checks via the `--fsck-objects`
flag we verify that each object part of the pack is valid. In general,
this check can even be performed outside of a Git repository: we don't
need an initialized object database as we simply read the object from
the packfile directly.
But there's one exception: a subset of the object checks may be deferred
to a later point in time. For now, this only concerns ".gitmodules" and
".gitattributes" files: whenever we see a tree referencing these files
we queue them for a deferred check. This is done because we need to do
some extra checks for those files to ensure that they are well-formed,
and these checks need to be done regardless of whether the corresponding
blobs are part of the packfile or not.
This works inside a repository, but unfortunately the logic leads to a
segfault when running outside of one. This is because we eventually call
`odb_read_object()`, which will crash because the object database has
not been initialized.
There's multiple options here:
- We could in theory create a purely in-memory database with only a
packfile store that contains the single packfile. We don't really
have the infrastructure for this yet though, and it would end up
being quite hacky.
- We could refuse to perform consistency checks outside of a
repository. But most of the checks work alright, so this would be a
regression.
- We can skip the finalizing consistency checks when running outside
of a repository. This is not as invasive as skipping all checks,
but it's not great to randomly skip a subset of tests, either.
None of these options really feel perfect. The first one would be the
obvious choice if easily possible.
There's another option though: instead of skipping the final object
checks, we can die if there are any queued object checks. With this
change we now die exactly if and only if we would have previously
segfaulted. Like this we ensure that objects that _may_ fail the
consistency checks won't be silently skipped, and at the same time we
give users a much better error message.
Refactor the code accordingly and add a test that would have triggered
the segfault. Note that we also move down the logic to add the packfile
to the store. There is no point doing this any earlier than right before
we execute `fsck_finish()`, and it ensures that the logic to set up and
perform the consistency check is self-contained.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git submodule add" tries to find if a submodule with the same name
already exists at a different path, by looking up an entry in the
.gitmodules file. If the entry in the file is incomplete, e.g.,
when the submodule.<name>.something variable is defined but there is
no definition of submodule.<name>.path variable, it accesses the
missing .path member of the submodule structure and triggers a
segfault.
A brief audit was done to make sure that the code does not assume
members other than those that are absolutely certain to exist: a
submodule obtained by submodule_from_name() should have .name
member, while a submodule obtained by submodule_from_path() should
also have .path as well as .name member, and we cannot assume
anything else. Luckily, the module_add() codepath was the only
problematic one. It is fairly recent code that comes from 1fa06ced
(submodule: prevent overwriting .gitmodules on path reuse,
2025-07-24).
A helper used by update_submodule() seems to assume that its call to
submodule_from_path() always yields a submodule object without a
failure, which seems to rely on the caller making sure it is the
case. Leave an assert() with a NEEDSWORK comment there for future
developers to make sure the assumption actually holds.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
These callers expect that git_config_pathname() that returns 0 is a
signal that the variable they passed has a string they need to act
on. But with the introduction of ":(optional)path" earlier, that is
no longer the case. If the path specified by the configuration
variable is missing, their variable will get a NULL in it, and they
need to act on it (often, just refraining from copying it elsewhere).
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Earlier we added support for a value spelled as ":(optional)path"
for configuration variables whose values are of type "path", with
the documented semantics "if the path is missing, behave as if such
a variable definition is not even there."
This has worked OK for code paths that reads configuration files and
stores the configured value as a string, where NULL in such a string
is treated as if the setting is not there, left as the default.
However, there are other code paths that do not _ignore_ such NULL
values and misbehave. "git config get --path" is one of them.
When git_config_pathname() helper function finds that the value of
the variable is an optional path *and* the path is missing, it
leaves the destination pointer intact (which usually is left to
NULL) and returns 0 to signal a success. format_config() helper
however assumed that the destination pointer always gets a string,
which no longer is the case, and segfaulted.
Make sure that git_config_pathname() clears the destination pointer
in such a case, and teach format_config() to react to the condition
by returning 1 (which is different from 0 that is a normal success
and negative that is an error) to its callers. Adjust the callers
to react to this new return value that tells them to pretend as if
they did not even see this partcular <key, value> pair.
Reported-by: Han Jiang <jhcarl0814@gmail.com>
Helped-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The "git repo structure" subcommand tried to align its output but
mixed up byte count and display column width, which has been
corrected.
* jx/repo-struct-utf8width-fix:
builtin/repo: fix table alignment for UTF-8 characters
t/unit-tests: add UTF-8 width tests for CJK chars
A part of code paths that deals with loose objects has been cleaned
up.
* ps/object-source-loose:
object-file: refactor writing objects via a stream
object-file: rename `write_object_file()`
object-file: refactor freshening of objects
object-file: rename `has_loose_object()`
object-file: read objects via the loose object source
object-file: move loose object map into loose source
object-file: hide internals when we need to reprepare loose sources
object-file: move loose object cache into loose source
object-file: introduce `struct odb_source_loose`
object-file: move `fetch_if_missing`
odb: adjust naming to free object sources
odb: introduce `odb_source_new()`
odb: fix subtle logic to check whether an alternate is usable
"git replay" (experimental) learned to perform ref updates itself
in a transaction by default, instead of emitting where each refs
should point at and leaving the actual update to another command.
* sa/replay-atomic-ref-updates:
replay: add replay.refAction config option
replay: make atomic ref updates the default behavior
replay: use die_for_incompatible_opt2() for option validation
The flags --all and --value of "git config unset" don't make the command
"replace" or "show" anything, they are about selecting what to unset.
Change their help text accordingly.
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The command "git config set <name> <value>" fails for an option that has
multiple values. List the "git config set" flags that can be used,
instead of old-style "git config" actions.
Reported-by: Paul Wintz <pwintz@ucsc.edu>
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In the preceding commits we have turned `struct odb_read_stream` into a
publicly visible structure. Furthermore, this structure now contains the
type and size of the object that we are about to stream. Consequently,
the out-pointers that we used before to propagate the type and size of
the streamed object are now somewhat redundant with the data contained
in the structure itself.
Drop these out-pointers and adapt callers accordingly.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The "streaming" terminology is somewhat generic, so it may not be
immediately obvious that "streaming.{c,h}" is specific to the object
database. Rectify this by moving it into the "odb/" directory so that it
can be immediately attributed to the object subsystem.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Refactor the streaming interface to be centered around object databases
instead of centered around the repository. Rename the functions
accordingly.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Subsequent commits will move the backend-specific logic of object
streaming into their respective subsystems. These subsystems have gotten
rid of `the_repository` already, but we still use it in two locations in
the streaming subsystem.
Prepare for the move by fixing those two cases. Converting the logic in
`open_istream_pack_non_delta()` is trivial as we already got the object
database as input.
But for `stream_blob_to_fd()` we have to add a new parameter to make it
accessible. So, as we already have to adjust all callers anyway, rename
the function to `odb_stream_blob_to_fd()` to indicate it's part of the
object subsystem.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>