Code clean-up.
* rs/tag-wo-the-repository:
tag: stop using the_repository
tag: support arbitrary repositories in parse_tag()
tag: support arbitrary repositories in gpg_verify_tag()
tag: use algo of repo parameter in parse_tag_buffer()
Use hook API to replace ad-hoc invocation of hook scripts with the
run_command() API.
* ar/run-command-hook:
receive-pack: convert receive hooks to hook API
receive-pack: convert update hooks to new API
hooks: allow callers to capture output
run-command: allow capturing of collated output
hook: allow overriding the ungroup option
reference-transaction: use hook API instead of run-command
transport: convert pre-push to hook API
hook: convert 'post-rewrite' hook in sequencer.c to hook API
hook: provide stdin via callback
run-command: add stdin callback for parallelization
run-command: add first helper for pp child states
More object database related information are shown in "git repo
structure" output.
* jt/repo-struct-more-objinfo:
builtin/repo: add object disk size info to structure table
builtin/repo: add disk size info to keyvalue stucture output
builtin/repo: add inflated object info to structure table
builtin/repo: add inflated object info to keyvalue structure output
builtin/repo: humanise count values in structure output
strbuf: split out logic to humanise byte values
builtin/repo: group per-type object values into struct
Allow callers of parse_tag() pass in the repository to use. Let most of
them pass in the_repository to get the same result as before. One of
them has stopped using the_repository in ef9b0370da (sha1-name.c: store
and use repo in struct disambiguate_state, 2019-04-16); let it pass in
its stored repository.
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Allow callers of gpg_verify_tag() specify the repository to use by
providing a parameter for that. One of the two has not been using
the_repository since 43a8391977 (builtin/verify-tag: stop using
`the_repository`, 2025-03-08); let it pass in the correct repository.
The other simply passes the_repository to get the same result as before.
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This converts the last remaining hooks to the new hook API, for
the same benefits as the previous conversions (no need to toggle
signals, manage custom struct child_process, call find_hook(),
prepares for specifyinig hooks via configs, etc.).
I noticed a performance degradation when processing large amounts
of hook input with just 1 line per callback, due to run-command's
poll loop, therefore I batched 500 lines per callback, to ensure
similar pipe throughput as before and to avoid hook child waiting
on stdin.
Signed-off-by: Emily Shaffer <emilyshaffer@google.com>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Adrian Ratiu <adrian.ratiu@collabora.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Use the new hook sideband API introduced in the previous commit.
The hook API avoids creating a custom struct child_process and other
internal hook plumbing (e.g. calling find_hook()) and prepares for
the specification of hooks via configs or running parallel hooks.
Execution is still sequential through the current hook.[ch] via the
run_process_parallel_opts.processes=1 arg.
Signed-off-by: Emily Shaffer <emilyshaffer@google.com>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Adrian Ratiu <adrian.ratiu@collabora.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When calling run_process_parallel() in run_hooks_opt(), the
ungroup option is currently hardcoded to .ungroup = 1.
This causes problems when ungrouping should be disabled, for
example when sideband-reading collated output from child hooks,
because sideband-reading and ungrouping are mutually exclusive.
Thus a new hook.h option is added to allow overriding.
The existing ungroup=1 behavior is preserved in the run_hooks()
API and the "hook run" command. We could modify these to take
an option if necessary, so I added two code comments there.
Signed-off-by: Adrian Ratiu <adrian.ratiu@collabora.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Building a list using commit_list_insert_by_date() has quadratic worst
case complexity. Avoid it by using prio_queue.
Use prio_queue_peek()+prio_queue_replace() instead of prio_queue_get()+
prio_queue_put() if possible, as the former only rebalance the
prio_queue heap once instead of twice.
In sane repositories this won't make much of a difference because the
number of items in the list or queue won't be very high:
Benchmark 1: ./git_v2.52.0 show-branch origin/main origin/next origin/seen origin/todo
Time (mean ± σ): 538.2 ms ± 0.8 ms [User: 527.6 ms, System: 9.6 ms]
Range (min … max): 537.0 ms … 539.2 ms 10 runs
Benchmark 2: ./git show-branch origin/main origin/next origin/seen origin/todo
Time (mean ± σ): 530.6 ms ± 0.4 ms [User: 519.8 ms, System: 9.8 ms]
Range (min … max): 530.1 ms … 531.3 ms 10 runs
Summary
./git show-branch origin/main origin/next origin/seen origin/todo ran
1.01 ± 0.00 times faster than ./git_v2.52.0 show-branch origin/main origin/next origin/seen origin/todo
That number is not limited, though, and in pathological cases like the
one in p6010 we see a sizable improvement:
Test v2.52.0 HEAD
------------------------------------------------------------------
6010.4: git show-branch 2.19(2.19+0.00) 0.03(0.02+0.00) -98.6%
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We received a report that invoking "git restore -source my_base_branch"
resulted in the confusing error message "fatal: could not resolve
ource". This looked like a typo in our error message, but it is
actually because "-source" is missing its second dash and is being
resolved as "-s ource". However, due to the lack of the quoting
recommended in CodingGuidelines, this is confusing to the reader and
we can do better.
Add the necessary quoting to this message. With this change, we now get
this less confusing message:
fatal: could not resolve 'ource'
Reported-by: Zhelyo Zhelev <zhelyo@gmail.com>
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git fetch" that involves fetching tags, when a tag being fetched
needs to overwrite existing one, failed to fetch other tags, which
has been corrected.
* kn/fix-fetch-backfill-tag-with-batched-ref-updates:
fetch: fix failed batched updates skipping operations
fetch: fix non-conflicting tags not being committed
fetch: extract out reference committing logic
MEMZERO_ARRAY() helper is introduced to avoid clearing only the
first N bytes of an N-element array whose elements are larger than
a byte.
* tc/memzero-array:
contrib/coccinelle: pass include paths to spatch(1)
git-compat-util: introduce MEMZERO_ARRAY() macro
"git submodule add" to add a submodule under <name> segfaulted,
when a submodule.<name>.something is already in .gitmodules file
without defining where its submodule.<name>.path is, which has been
corrected.
* jc/submodule-add:
submodule add: sanity check existing .gitmodules
"git replay --onto=<commit> ...", when <commit> is mistyped,
started to segfault with recent change, which has been corrected.
* rs/replay-wrong-onto-fix:
replay: move onto NULL check before first use
"git replay" documentation updates.
* kh/doc-replay-updates:
doc: replay: link section using markup
replay: improve --contained and add to doc
doc: replay: mention no output on conflicts
Similar to a prior commit, update the table output format for the
git-repo(1) structure command to display the total object disk usage by
object type.
Signed-off-by: Justin Tobler <jltobler@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Similar to a prior commit, extend the keyvalue and nul output formats of
the git-repo(1) structure command to additionally provide info regarding
total object disk sizes by object type.
Signed-off-by: Justin Tobler <jltobler@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Update the table output format for the git-repo(1) structure command to
begin printing the total inflated object size info by object type. To be
more human-friendly, larger values are scaled down and displayed with
the appropriate unit prefix. Output for the keyvalue and nul formats
remains unchanged.
Signed-off-by: Justin Tobler <jltobler@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The structure subcommand for git-repo(1) outputs basic count information
for objects and references. Extend this output to also provide
information regarding total size of inflated objects by object type.
For now, object size by object type info is only added to the keyvalue
and nul output formats. In a subsequent commit, this info is also added
to the table format.
Signed-off-by: Justin Tobler <jltobler@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The table output format for the git-repo(1) structure subcommand is used
by default and intended to provide output to users in a human-friendly
manner. When the reference/object count values in a repository are
large, it becomes more cumbersome for users to read the values.
For larger values, update the table output format to instead produce
more human-friendly count values that are scaled down with the
appropriate unit prefix. Output for the keyvalue and nul formats remains
unchanged.
Signed-off-by: Justin Tobler <jltobler@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The `object_stats` structure stores object counts by type. In a
subsequent commit, additional per-type object measurements will also be
stored. Group per-type object values into a new struct to allow better
reuse.
Signed-off-by: Justin Tobler <jltobler@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The "git_istream" abstraction has been revamped to make it easier
to interface with pluggable object database design.
* ps/object-read-stream:
streaming: drop redundant type and size pointers
streaming: move into object database subsystem
streaming: refactor interface to be object-database-centric
streaming: move logic to read packed objects streams into backend
streaming: move logic to read loose objects streams into backend
streaming: make the `odb_read_stream` definition public
streaming: get rid of `the_repository`
streaming: rely on object sources to create object stream
packfile: introduce function to read object info from a store
streaming: move zlib stream into backends
streaming: create structure for filtered object streams
streaming: create structure for packed object streams
streaming: create structure for loose object streams
streaming: create structure for in-core object streams
streaming: allocate stream inside the backend-specific logic
streaming: explicitly pass packfile info when streaming a packed object
streaming: propagate final object type via the stream
streaming: drop the `open()` callback function
streaming: rename `git_istream` into `odb_read_stream`
"git repo struct" learned to take "-z" as a synonym to "--format=nul".
* lo/repo-struct-z:
repo: add -z as an alias for --format=nul to git-repo-structure
repo: use [--format=... | -z] instead of [-z] in git-repo-info synopsis
repo: remove blank line from Documentation/git-repo.adoc
A help message from "git branch" now mentions "git help" instead of
"man" when suggesting to read some documentation.
* kh/advise-w-git-help-in-branch:
branch: advice using git-help(1) instead of man(1)
"git last-modified" used to mishandle "--" to mark the beginning of
pathspec, which has been corrected.
* js/last-modified-with-sparse-checkouts:
last-modified: support sparse checkouts
Recent optimization to "last-modified" command introduced use of
uninitialized block of memory, which has been corrected.
* tc/last-modified-active-paths-optimization:
last-modified: fix use of uninitialized memory
There is no documentation for `--contained`.
Start by copying the text from `replay_options` in `builtin/
replay.c`. But some people think that the existing text is a
bit unclear; what does it mean for a branch to be contained
in a revision range? Let’s include the implied commits here:
the branches that point at commits in the range.
Also use “update” instead of “advance”. “Update” is the verb
commonly used in this context.
Helped-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Kristoffer Haugsbakk <code@khaugsbakk.name>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Unless there are good reasons, it is customary to have the options[]
array used with the parse-options API declared in function scope rather
than at file scope.
Move builtin/pull.c:cmd_pull()’s options[] array into the function to
match that convention.
Signed-off-by: K Jayatheerth <jayatheerthkulkarni2005@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
cmd_replay() aborts if the pointer "onto" is NULL after argument
parsing, e.g. when specifying a non-existing commit with --onto.
15cd4ef1f4 (replay: make atomic ref updates the default behavior,
2025-11-06) added code that dereferences this pointer before the check.
Switch their places to avoid a segmentation fault.
Reported-by: Kristoffer Haugsbakk <kristofferhaugsbakk@fastmail.com>
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Introduce a new macro MEMZERO_ARRAY() that zeroes the memory allocated
by ALLOC_ARRAY() and friends. And add coccinelle rule to enforce the use
of this macro.
Signed-off-by: Toon Claes <toon@iotcl.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Fix a regression introduced with batched updates in 0e358de64a (fetch:
use batched reference updates, 2025-05-19) when fetching references. In
the `do_fetch()` function, we jump to cleanup if committing the
transaction fails, regardless of whether using batched or atomic
updates. This skips three subsequent operations:
- Update 'FETCH_HEAD' as part of `commit_fetch_head()`.
- Add upstream tracking information via `set_upstream()`.
- Setting remote 'HEAD' values when `do_set_head` is true.
For atomic updates, this is expected behavior. For batched updates,
we want to continue with these operations even if some refs fail to
update.
Skipping `commit_fetch_head()` isn't actually a regression because
'FETCH_HEAD' is already updated via `append_fetch_head()` when not
using '--atomic'. However, we add a test to validate this behavior.
Skipping the other two operations (upstream tracking and remote HEAD)
is a regression. Fix this by only jumping to cleanup when using
'--atomic', allowing batched updates to continue with post-fetch
operations. Add tests to prevent future regressions.
Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The commit 0e358de64a (fetch: use batched reference updates, 2025-05-19)
updated the 'git-fetch(1)' command to use batched updates. This batches
updates to gain performance improvements. When fetching references, each
update is added to the transaction. Finally, when committing, individual
updates are allowed to fail with reason, while the transaction itself
succeeds.
One scenario which was missed here, was fetching tags. When fetching
conflicting tags, the `fetch_and_consume_refs()` function returns '1',
which skipped committing the transaction and directly jumped to the
cleanup section. This mean that no updates were applied. This also
extends to backfilling tags which is done when fetching specific
refspecs which contains tags in their history.
Fix this by committing the transaction when we have an error code and
not using an atomic transaction. This ensures other references are
applied even when some updates fail.
The cleanup section is reached with `retcode` set in several scenarios:
- `truncate_fetch_head()`, `open_fetch_head()` and `prune_refs()` set
`retcode` before the transaction is created, so no commit is
attempted.
- `fetch_and_consume_refs()` and `backfill_tags()` are the primary
cases this fix targets, both setting a positive `retcode` to
trigger the committing of the transaction.
This simplifies error handling and ensures future modifications to
`do_fetch()` don't need special handling for batched updates.
Add tests to check for this regression. While here, add a missing
cleanup from previous test.
Reported-by: David Bohman <debohman@gmail.com>
Helped-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The use of "revision" (a connected set of commits) has been
clarified in the "git replay" documentation.
* en/replay-doc-revision-range:
Documentation/git-replay.adoc: fix errors around revision range
"git replay" forgot to omit the "gpgsig-sha256" extended header
from the resulting commit the same way it omits "gpgsig", which has
been corrected.
* pw/replay-exclude-gpgsig-fix:
replay: do not copy "gpgsign-sha256" header
The error message given by "git config set", when the variable
being updated has more than one values defined, used old style "git
config" syntax with an incorrect option in its hint, both of which
have been corrected.
* rs/config-set-multi-error-message-fix:
config: fix suggestion for failed set of multi-valued option
The option help text given by "git config unset -h" described
the "--all" option to "replace", not "unset", multiple variables,
which has been corrected.
* rs/config-unset-opthelp-fix:
config: fix short help of unset flags
Code refactoring around object database sources.
* ps/object-source-management:
odb: handle recreation of quarantine directories
odb: handle changing a repository's commondir
chdir-notify: add function to unregister listeners
odb: handle initialization of sources in `odb_new()`
http-push: stop setting up `the_repository` for each reference
t/helper: stop setting up `the_repository` repeatedly
builtin/index-pack: fix deferred fsck outside repos
oidset: introduce `oidset_equal()`
odb: move logic to disable ref updates into repo
odb: refactor `odb_clear()` to `odb_free()`
odb: adopt logic to close object databases
setup: convert `set_git_dir()` to have file scope
path: move `enter_repo()` into "setup.c"
"git config get --path" segfaulted on an ":(optional)path" that
does not exist, which has been corrected.
* jc/optional-path:
config: really treat missing optional path as not configured
config: really pretend missing :(optional) value is not there
config: mark otherwise unused function as file-scope static
Other Git commands that have nul-terminated output, such as git-config,
git-status, git-ls-files, and git-repo-info have a flag `-z` for using
the null character as the record separator.
Add the `-z` flag to git-repo-structure as an alias for `--format=nul`,
making it consistent with the behavior of the other commands.
Signed-off-by: Lucas Seiki Oshiro <lucasseikioshiro@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The flag -z is only an alias for --format=null and even though --format
and -z can be used together and repeated, only the last one is
considered.
Replace `[-z]` in the synopsis of git-repo-info by
`[--format=... | -z]`, expliciting that the use of one of those flags
replace the other.
Signed-off-by: Lucas Seiki Oshiro <lucasseikioshiro@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In a sparse checkout, a user might want to run `last-modified` on a
directory outside the worktree.
And even in non-sparse checkouts, a user might need to run that command
on a directory that does not exist in the worktree.
These use cases should be supported via the `--` separator between
revision and file arguments, which is even advertised in the
documentation. This patch fixes a tiny bug that prevents that from
working.
This fixes https://github.com/git-for-windows/git/issues/5978
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Acked-by: Derrick Stolee <stolee@gmail.com>
Acked-by: Toon Claes <toon@iotcl.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
8fbd903e (branch: advise about ref syntax rules, 2024-03-05) added
an advice about checking git-check-ref-format(1) for the ref syntax
rules. The advice uses man(1). But git(1) is a multi-platform tool and
man(1) may not be available on some platforms. It might also be slightly
jarring to see a suggestion for running a command which is not from
the Git suite.
Let’s instead use git-help(1) in order to stay inside the land of
git(1). This also means that `help.format` (for `man`, `html` or other
formats) will be used if set.
Also change to using single quotes (') to quote the command since that
is more conventional.
While here let’s also update the test to use `{SQ}`, which is more
readable and easier to edit.
Signed-off-by: Kristoffer Haugsbakk <code@khaugsbakk.name>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
git-last-modified(1) uses a scratch bitmap to keep track of paths that
have been changed between commits. To avoid reallocating a bitmap on
each call of process_parent(), the scratch bitmap is kept and reused.
Although, it seems an incorrect length is passed to memset(3).
`struct bitmap` uses `eword_t` to for internal storage. This type is
typedef'd to uint64_t. To fully zero the memory used by the bitmap,
multiply the length (saved in `struct bitmap::word_alloc`) by the size
of `eword_t`.
Reported-by: Anders Kaseorg <andersk@mit.edu>
Helped-by: Jeff King <peff@peff.net>
Signed-off-by: Toon Claes <toon@iotcl.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>