79260 Commits

Author SHA1 Message Date
Junio C Hamano
e50c3ca095 Merge branch 'pw/3.0-default-initial-branch-to-main'
Declare that "git init" that is not otherwise configured uses
'main' as the initial branch, not 'master', starting Git 3.0.

* pw/3.0-default-initial-branch-to-main:
  t0613: stop setting default initial branch
  t9902: switch default branch name to main
  t4013: switch default branch name to main
  breaking-changes: switch default branch to main
2025-09-29 11:40:34 -07:00
Junio C Hamano
d235f69ae8 Merge branch 'nb/send-email-no-dup-reply-to'
"git send-email --compose --reply-to=<address>" used to add
duplicated Reply-To: header, which made mailservers unhappy.  This
has been corrected.

* nb/send-email-no-dup-reply-to:
  send-email: don't duplicate Reply-to: in intro message
2025-09-29 11:40:33 -07:00
Junio C Hamano
347af012db Merge branch 'ps/clar-updates'
Import a newer version of the clar unit testing framework.

* ps/clar-updates:
  t/unit-tests: update to 10e96bc
  t/unit-tests: update clar to fcbed04
2025-09-29 11:40:33 -07:00
Junio C Hamano
9fab7ec7ff Merge branch 'ps/packfile-store' into tb/incremental-midx-part-3.1
* ps/packfile-store:
  packfile: refactor `get_packed_git_mru()` to work on packfile store
  packfile: refactor `get_all_packs()` to work on packfile store
  packfile: refactor `get_packed_git()` to work on packfile store
  packfile: move `get_multi_pack_index()` into "midx.c"
  packfile: introduce function to load and add packfiles
  packfile: refactor `install_packed_git()` to work on packfile store
  packfile: split up responsibilities of `reprepare_packed_git()`
  packfile: refactor `prepare_packed_git()` to work on packfile store
  packfile: reorder functions to avoid function declaration
  odb: move kept cache into `struct packfile_store`
  odb: move MRU list of packfiles into `struct packfile_store`
  odb: move packfile map into `struct packfile_store`
  odb: move initialization bit into `struct packfile_store`
  odb: move list of packfiles into `struct packfile_store`
  packfile: introduce a new `struct packfile_store`
2025-09-29 09:31:08 -07:00
Junio C Hamano
666b29b58f t7500: make each piece more independent
These tests prepare the working tree & index state to have something
to be committed, and try a sequence of "test_must_fail git commit".
If an earlier one did not fail by a bug, a later one will fail for
a wrong reason (namely, "nothing to commit").

Give them "--allow-empty" to make sure that they would work even
when there is nothing to commit by accident.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: D. Ben Knoble <ben.knoble+github@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-09-28 15:40:30 -07:00
Justin Tobler
3721541d35 clang-format: exclude control macros from SpaceBeforeParens
The formatter currently suggests adding a space between a control macro
and parentheses. In the Git project, this is not typically expected. Set
`SpaceBeforeParens` to `ControlStatementsExceptControlMacros`
accordingly.

Helped-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: Justin Tobler <jltobler@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-09-28 08:37:23 -07:00
Jean-Noël Avila
5a12fd2a8c doc: change the markup of paragraphs following a nested list item
Asciidoctor and asciidoc.py have different behaviors when a paragraph
follows a nested list item. Asciidoctor has a bug[1] that makes it keep a
plus sign (+) used to attached paragraphs at the beginning of the paragraph.

This commit uses workarounds to avoid this problem by using second level
definition lists and open blocks.

[1]:https://github.com/asciidoctor/asciidoctor/issues/4704

Signed-off-by: Jean-Noël Avila <jn.avila@free.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-09-27 17:35:29 -07:00
Ezekiel Newren
efaf553b1a xdiff: delete unnecessary fields from xrecord_t and xdfile_t
xrecord_t.next, xdfile_t.hbits, xdfile_t.rhash are initialized,
but never used for anything by the code. Remove them.

Best-viewed-with: --color-words
Signed-off-by: Ezekiel Newren <ezekielnewren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-09-26 16:08:55 -07:00
Ezekiel Newren
d1c028bdf7 xdiff: delete local variables and initialize/free xdfile_t directly
These local variables are essentially a hand-rolled additional
implementation of xdl_free_ctx() inlined into xdl_prepare_ctx(). Modify
the code to use the existing xdl_free_ctx() function so there aren't
two ways to free such variables.

Signed-off-by: Ezekiel Newren <ezekielnewren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-09-26 16:08:54 -07:00
Ezekiel Newren
43d5f52ac4 xdiff: delete static forward declarations in xprepare
Move xdl_prepare_env() later in the file to avoid the need
for static forward declarations.

Best-viewed-with: --color-moved
Signed-off-by: Ezekiel Newren <ezekielnewren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-09-26 16:08:54 -07:00
Johannes Schindelin
ecc5749578 http-push: avoid new compile error
With the recent update in Git for Windows/ARM64 as of
https://github.com/git-for-windows/git-sdk-arm64/commit/21b288e16358
cURL was updated from v8.15.0 to v8.16.0, and the LLVM-based builds (but
strangely not the GCC-based builds) continuously greet me thusly:

  http-push.c:211:2: error: call to '_curl_easy_setopt_err_long' declared
  with 'warning' attribute: curl_easy_setopt expects a long argument
  [-Werror,-Wattribute-warning]
      CC builtin/apply.o
    211 |         curl_easy_setopt(curl, CURLOPT_INFILESIZE, buffer->buf.len);
        |         ^
  C:/a/git-sdk-arm64/git-sdk-arm64/minimal-sdk/clangarm64/include/curl/typecheck-gcc.h:50:15:
  note: expanded from macro 'curl_easy_setopt'
     50 |               _curl_easy_setopt_err_long();                             \
        |               ^
  1 error generated.
  make: *** [Makefile:2877: http-push.o] Error 1

The easiest way to shut up that compile error (which is legitimate,
seeing as the `CURLOPT_INFILESIZE` options expects a `long` parameter,
but `buffer->buf.len` refers to the `size_t` attribute of a `strbuf`)
would be to simply cast the parameter to a `long`.

However, there is a much better solution: To use the
`CURLOPT_INFILESIZE_LARGE` option instead, which was added in cURL
v7.11.0 (see https://curl.se/ch/7.11.0.html) and which Git _already_
uses in `curl_append_msgs_to_imap()`.

This fix was the motivation for renaming `xcurl_off_t()` to
`cast_size_t_to_curl_off_t()` and making it available more broadly,
which is the reason why it is used here, too.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-09-26 10:38:18 -07:00
Johannes Schindelin
580cf0f2f6 imap-send: be more careful when casting to curl_off_t
When casting a `size_t` to `curl_off_t`, there is a currently uncommon
chance that the value can be cut off (`curl_off_t` is expected to be a
signed 64-bit data type).

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-09-26 10:38:18 -07:00
Johannes Schindelin
e4efcd7060 http: offer to cast size_t to curl_off_t safely
This commit moves the `xcurl_off_t()` function, which validates that a
given value fits within the `curl_off_t` data type and then casts it, to
a more central place so that it can be used outside of `remote-curl.c`,
too.

At the same time, this function is renamed to conform better with the
naming convention of the helper functions that safely cast from one data
type to another which has been well established in `git-compat-util.h`.

With this move, `gettext.h` must be `#include`d in `http.h` to allow the
error message to remain translatable.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-09-26 10:38:18 -07:00
Mark Levedahl
fe2005e723 gitk: make sha1but a ttk::button
gitk's 'Commit ID' button uses a classic widget, not a themed one,
leading to inconsistent style. Commit 51a7e8b654 (d93f1713b0 ("gitk: Use
themed tk widgets", 2009-04-17) that added themed widgets did not touch
this particular widget, but does not say why. Regardless, let's use a
themed button to be consistent with the rest of the interface.

Signed-off-by: Mark Levedahl <mlevedahl@gmail.com>
2025-09-25 15:55:57 -04:00
Jacob Keller
c0bec06cfe diff --no-index: fix logic for paths ending in '/'
If one of the two provided paths for git diff --no-index ends in a '/',
a failure similar to the following occurs:

  $ git diff --no-index -- /tmp/ /tmp/ ':!'
  fatal: `pos + len' is too far after the end of the buffer

This occurs because of an incorrect calculation of the skip lengths in
diff_no_index(). The code wants to calculate the length of the string,
but add one in case the string doesn't end with a slash.

The method it uses is incorrect, as it always checks the trailing NUL
character of the string. This will never be a '/', so we always add one.
In the event that we *do* have a trailing slash, this will create an
off-by-one length error later when using the skip value.

The most straightforward fix would be to correct the skip1 and skip2
lengths by using ends_with().

However, Johannes made a good point that the existing logic is wasting a
lot of computation. We generate the match string by copying the path in
and then skipping almost all of it immediately with a potentially
expensive memmove() from the strbuf_remove() call. We also re-initialize
the match stringbuf each time we call read_directory_contents.

The read_directory_contents really wants a path that is rooted at the
start of the directory scan. We're currently building this by taking the
full path and stripping out the start portion. Instead, replace this
logic by building up the portion of the match as we go.

Start by initializing two strbuf in diff_no_index containing the empty
string. Pass these into queue_diff, which in turn passes the appropriate
left or right side into read_directory_contents.

As before, we build up the matches by appending elements to the match
path and then clearing them using strbuf_setlen.

In the recursive portion of the queue_diff algorithm, we build up new
match paths the same way that we build up new buffer paths, by appending
the elements and then clearing them with strbuf_setlen after each
iteration. This is cheaper as it avoids repeated allocations, and is a
bit simpler to track what is going on.

Add a couple of test cases that pass in paths already ending in '/', to
ensure the tests cover this regression.

Reported-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Closes: https://lore.kernel.org/git/c75ec5f9-407a-6555-d4fb-bb629d54ec61@gmx.de/
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
[jc: small leakfixes at the end of diff_no_index()]
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-09-25 11:35:20 -07:00
Kristoffer Haugsbakk
155986b49b format-patch: handle range-diff on notes correctly for single patches
(The two next paragraphs are taken from the previous commit.)

git-format-patch(1) supports Git notes by showing them beneath the
patch/commit message, similar to git-log(1). The command also supports
showing those same notes ref names in the range diff output.

Note *the same* ref names; any Git notes options or configuration
variables need to be handed off to the range-diff machinery. This works
correctly in the case when the range diff is on the cover letter. But it
does not work correctly when the output is a single patch with an
embedded range diff.

Concretely, git-format-patch(1) needs to pass `--[no-]notes` options on
to the range-diff subprocess in `range-diff.c`. Range diffs for single-
commit series are handled in `log-tree.c`. But `log-tree.c` had no
access to any `log_arg` variable before we added it to `rev_info` in the
previous commit.

Use that new struct member to fix this inconsistency.

Signed-off-by: Kristoffer Haugsbakk <code@khaugsbakk.name>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-09-25 11:34:12 -07:00
Kristoffer Haugsbakk
85bd88a7e8 revision: add rdiff_log_arg to rev_info
git-format-patch(1) supports Git notes by showing them beneath the
patch/commit message, similar to git-log(1). The command also supports
showing those same notes ref names in the range diff output.

Note *the same* ref names; any Git notes options or configuration
variables need to be handed off to the range-diff machinery. This works
correctly in the case when the range diff is on the cover letter. But it
does not work correctly when the output is a single patch with an
embedded range diff.

Concretely, git-format-patch(1) needs to pass `--[no-]notes` options
on to the range-diff subprocess in `range-diff.c`. This is handled in
`builtin/log.c` by the local variable `log_arg` in the case of mul-
tiple commits, but not in the single commit case where there is no
cover letter and the range diff is embedded in the patch output; the
range diff is then made in `log-tree.c`, whither `log_arg` has not
been propagated. This means that the range-diff subprocess reverts
to its default behavior, which is to act like git-log(1) w.r.t. notes.

We need to fix this. But first lay the groundwork by converting
`log_arg` to a struct member; next we can simply use that member
in `log-tree.c` without having to thread it from `builtin/log.c`.

No functional changes.

Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Kristoffer Haugsbakk <code@khaugsbakk.name>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-09-25 11:34:12 -07:00
Kristoffer Haugsbakk
71fd6c695c range-diff: rename other_arg to log_arg
Rename `other_arg` to `log_arg` in `range_diff_options` and
related places.

“Other argument” comes from bd361918 (range-diff: pass through --notes
to `git log`, 2019-11-20) which introduced Git notes handling to
git-range-diff(1) by passing that option on to git-log(1). And that kind
of name might be fine in a local context. However, it was initially
spread among multiple files, and is now[1] part of the
`range_diff_options` struct. It is, prima facie, difficult to guess what
“other” means, especially when just looking at the struct.

But with a little reading we find out that it is used for `--[no-]notes`
and `--diff-merges`, which are both passed on to git-log(1). We should
just rename it to reflect this role; `log_arg` suggests, along with the
`strvec` type, that it is used to pass extra arguments to git-log(1).

† 1: since f1ce6c19 (range-diff: combine all options in a single data
     structure, 2021-02-05)

Suggested-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Kristoffer Haugsbakk <code@khaugsbakk.name>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-09-25 11:34:11 -07:00
Phillip Wood
732650e263 add-patch: update hunk splitability after editing
If, when the user edits a hunk, they change deletion lines into
context lines or vice versa, then the number of hunks that the edited
hunk can be split into may differ from the unedited hunk. This means
that so we should recalculate `hunk->splittable_into` after the hunk
has been edited. In practice users are unlikely to hit this bug as it
is doubtful that a user who has edited a hunk will split it afterwards.

Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-09-25 10:13:23 -07:00
Phillip Wood
3b9532dab2 add -p: mark split hunks as undecided
When a hunk is split, each of the new hunks inherits whether it is
selected or not from the original hunk. If a selected hunk is split
all of the new hunks are marked as "selected" and the user is only
prompted with the first of the split hunks. The user is not asked
whether or not they wish to select the rest of the new hunks. This
means that if they wish to deselect any of the new hunks apart from
the first one they have to navigate back to the hunk they want to
deselect before they can deselect it. This is unfortunate as the user
is presumably splitting the original hunk because they only want to
select some sub-set of it.

Instead mark all the new hunks as "undecided" so that the user is
prompted whether they wish to select each one in turn. In the case
where the user only wants to change the selection of the first of
the split hunks they will now have to do more work re-selecting the
remaining split hunks. However, changing the selection of any of the
other newly created hunks is now much simpler as the user no-longer has
to navigate back to them in order to change their selected state.

Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-09-25 10:13:22 -07:00
Mark Levedahl
811b8a34b9 gitk: use themed spinboxes
gitk uses classic (non-themed) spinboxes rather than the ttk variants.
Commit d93f1713b0 ("gitk: Use themed tk widgets", 2009-04-17) that added
ttk makes no mention of why ttk:spinboxes were omitted, but this leads
to an inconsistent interface. Let's use the ttk version.

Signed-off-by: Mark Levedahl <mlevedahl@gmail.com>
2025-09-25 12:04:02 -04:00
Julia Evans
657586a5a6 doc: git-push: rewrite refspec specification
From user feedback, there was a request for examples, as well as a
comment that one person found "If git push [<repository>] without
any <refspec> argument is set to update some ref at the destination
with <src> with remote.<repository>.push configuration variable..."
impossible to understand.

To make the section easier to navigate, create a list of every possible
refspec form, with examples for each form as well as 2 forms which were
previously missing (patterns and negative refspecs).

Made a few changes to use more familiar language, but ultimately
refspecs are a pretty advanced feature so I've mostly left the
terminology alone.

Signed-off-by: Julia Evans <julia@jvns.ca>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-09-24 12:29:34 -07:00
Julia Evans
cc1cc31e2a doc: git-push: create PUSH RULES section
Right now the rules for when a `git push` is allowed are buried at the
bottom of the description of `<refspec>`. Put them in their own section
so that we can reference them from `--force` and give some context for
why they exist.

Having the "PUSH RULES" section also lets us be a little bit more
specific with the rule in `--force`: we can just focus on the rule
for pushing for a branch (which is likely the one that's most relevant)
and leave the details about what happens when you push to a tag or a ref
that isn't a branch to the later section.

Signed-off-by: Julia Evans <julia@jvns.ca>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-09-24 12:29:34 -07:00
Patrick Steinhardt
dd52a29b78 packfile: refactor get_packed_git_mru() to work on packfile store
The `get_packed_git_mru()` function prepares the packfile store and then
returns its packfiles in most-recently-used order. Refactor it to accept
a packfile store instead of a repository to clarify its scope.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-09-24 11:53:51 -07:00
Patrick Steinhardt
d2779beb36 packfile: refactor get_all_packs() to work on packfile store
The `get_all_packs()` function prepares the packfile store and then
returns its packfiles. Refactor it to accept a packfile store instead of
a repository to clarify its scope.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-09-24 11:53:51 -07:00
Patrick Steinhardt
751808b2a1 packfile: refactor get_packed_git() to work on packfile store
The `get_packed_git()` function prepares the packfile store and then
returns its packfiles. Refactor it to accept a packfile store instead of
a repository to clarify its scope.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-09-24 11:53:51 -07:00
Patrick Steinhardt
ab8aff4a6b packfile: move get_multi_pack_index() into "midx.c"
The `get_multi_pack_index()` function is declared and implemented in the
packfile subsystem, even though it really belongs into the multi-pack
index subsystem. The reason for this is likely that it needs to call
`packfile_store_prepare()`, which is not exposed by the packfile system.
In a subsequent commit we're about to add another caller outside of the
packfile system though, so we'll have to expose the function anyway.

Do so now already and move `get_multi_pack_index()` into the MIDX
subsystem.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-09-24 11:53:50 -07:00
Patrick Steinhardt
d67530f6bb packfile: introduce function to load and add packfiles
We have a recurring pattern where we essentially perform an upsert of a
packfile in case it isn't yet known by the packfile store. The logic to
do so is non-trivial as we have to reconstruct the packfile's key, check
the map of packfiles, then create the new packfile and finally add it to
the store.

Introduce a new function that does this dance for us. Refactor callsites
to use it.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-09-24 11:53:50 -07:00
Patrick Steinhardt
f6f236d926 packfile: refactor install_packed_git() to work on packfile store
The `install_packed_git()` functions adds a packfile to a specific
object store. Refactor it to accept a packfile store instead of a
repository to clarify its scope.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-09-24 11:53:50 -07:00
Patrick Steinhardt
78237ea53d packfile: split up responsibilities of reprepare_packed_git()
In `reprepare_packed_git()` we perform a couple of operations:

  - We reload alternate object directories.

  - We clear the loose object cache.

  - We reprepare packfiles.

While the logic is hosted in "packfile.c", it clearly reaches into other
subsystems that aren't related to packfiles.

Split up the responsibility and introduce `odb_reprepare()` which now
becomes responsible for repreparing the whole object database. The
existing `reprepare_packed_git()` function is refactored accordingly and
only cares about reloading the packfile store now.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-09-24 11:53:50 -07:00
Patrick Steinhardt
c36ecc0685 packfile: refactor prepare_packed_git() to work on packfile store
The `prepare_packed_git()` function and its friends are responsible for
loading packfiles as well as the multi-pack index for a given object
database. Refactor these functions to accept a packfile store instead of
a repository to clarify their scope.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-09-24 11:53:50 -07:00
Patrick Steinhardt
995ee88027 packfile: reorder functions to avoid function declaration
Reorder functions so that we can avoid a forward declaration of
`prepare_packed_git()`.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-09-24 11:53:49 -07:00
Patrick Steinhardt
bd1a521de8 odb: move kept cache into struct packfile_store
The object database tracks a cache of "kept" packfiles, which is used by
git-pack-objects(1) to handle cruft objects. With the introduction of
the `struct packfile_store` we have a better place to host this cache
though.

Move the cache accordingly.

This moves the last bit of packfile-related state from the object
database into the packfile store. Adapt the comment for the `packfiles`
pointer in `struct object_database` to reflect this.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-09-24 11:53:49 -07:00
Patrick Steinhardt
fe835b0ca0 odb: move MRU list of packfiles into struct packfile_store
The object database tracks the list of packfiles in most-recently-used
order, which is mostly used to favor reading from packfiles that contain
most of the objects that we're currently accessing. With the
introduction of the `struct packfile_store` we have a better place to
host this list though.

Move the list accordingly.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-09-24 11:53:49 -07:00
Patrick Steinhardt
14aaf5c9d8 odb: move packfile map into struct packfile_store
The object database tracks a map of packfiles by their respective paths,
which is used to figure out whether a given packfile has already been
loaded. With the introduction of the `struct packfile_store` we have a
better place to host this list though.

Move the map accordingly.

`pack_map_entry_cmp()` isn't used anywhere but in "packfile.c" anymore
after this change, so we convert it to a static function, as well. Note
that we also drop the `inline` hint: the function is used as a callback
function exclusively, and callbacks cannot be inlined.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-09-24 11:53:49 -07:00
Patrick Steinhardt
3421cb56a8 odb: move initialization bit into struct packfile_store
The object database knows to skip re-initializing the list of packfiles
in case it's already been initialized. Whether or not that is the case
is tracked via a separate `initialized` bit that is stored in the object
database. With the introduction of the `struct packfile_store` we have a
better place to host this bit though.

Move it accordingly. While at it, convert the field into a boolean now
that we're allowed to use them in our code base.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-09-24 11:53:49 -07:00
Patrick Steinhardt
535b7a667a odb: move list of packfiles into struct packfile_store
The object database tracks the list of packfiles it currently knows
about. With the introduction of the `struct packfile_store` we have a
better place to host this list though.

Move the list accordingly. Extract the logic from `odb_clear()` that
knows to close all such packfiles and move it into the new subsystem, as
well.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-09-24 11:53:48 -07:00
Patrick Steinhardt
b7983adb51 packfile: introduce a new struct packfile_store
Information about an object database's packfiles is currently
distributed across two different structures:

  - `struct packed_git` contains the `next` pointer as well as the
    `mru_head`, both of which serve to store the list of packfiles.

  - `struct object_database` contains several fields that relate to the
    packfiles.

So we don't really have a central data structure that tracks our
packfiles, and consequently responsibilities aren't always clear cut.
A consequence for the upcoming pluggable object databases is that this
makes it very hard to move management of packfiles from the object
database level down into the object database source.

Introduce a new `struct packfile_store` which is about to become the
single source of truth for managing packfiles. Right now this data
structure doesn't yet contain anything, but in subsequent patches we
will move all data structures that relate to packfiles and that are
currently contained in `struct object_database` into this new home.

Note that this is only a first step: most importantly, we won't (yet)
move the `struct packed_git::next` pointer around. This will happen in a
subsequent patch series though so that `struct packed_git` will really
only host information about the specific packfile it represents.

Further note that the new structure still sits at the wrong level at the
end of this patch series: as mentioned, it should eventually sit at the
level of the object database source, not at the object database level.
But introducing the packfile store now already makes it way easier to
eventually push down the now-selfcontained data structure by one level.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-09-24 11:53:48 -07:00
Junio C Hamano
bb69721404 The twelfth batch
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-09-23 11:53:40 -07:00
Junio C Hamano
3e0e2e3a5c Merge branch 'cs/subtree-squash-split-fix'
"git subtree" (in contrib/) did not work correctly when splitting
squashed subtrees, which has been improved.

* cs/subtree-squash-split-fix:
  contrib/subtree: fix split with squashed subtrees
2025-09-23 11:53:40 -07:00
Junio C Hamano
7c15d990cc Merge branch 'rs/get-oid-with-flags-cleanup'
Code clean-up.

* rs/get-oid-with-flags-cleanup:
  use repo_get_oid_with_flags()
2025-09-23 11:53:40 -07:00
Junio C Hamano
2e8d7569ea Merge branch 'jk/add-i-color'
Some among "git add -p" and friends ignored color.diff and/or
color.ui configuration variables, which is an old regression, which
has been corrected.

* jk/add-i-color:
  contrib/diff-highlight: mention interactive.diffFilter
  add-interactive: manually fall back color config to color.ui
  add-interactive: respect color.diff for diff coloring
  stash: pass --no-color to diff plumbing child processes
2025-09-23 11:53:40 -07:00
Junio C Hamano
2be606a3bd Merge branch 'cc/promisor-remote-capability'
The "promisor-remote" capability mechanism has been updated to
allow the "partialCloneFilter" settings and the "token" value to be
communicated from the server side.

* cc/promisor-remote-capability:
  promisor-remote: use string_list_split() in mark_remotes_as_accepted()
  promisor-remote: allow a client to check fields
  promisor-remote: use string_list_split() in filter_promisor_remote()
  promisor-remote: refactor how we parse advertised fields
  promisor-remote: use string constants for 'name' and 'url' too
  promisor-remote: allow a server to advertise more fields
  promisor-remote: refactor to get rid of 'struct strvec'
2025-09-23 11:53:40 -07:00
Jeff King
a04bc71725 revision: retain argv NULL invariant in setup_revisions()
In an argc/argv pair, the entry for argv[argc] is generally NULL. You
can iterate by counting up to argc, or by looking for the NULL entry in
argv.

When we pass such a pair to setup_revisions(), it shrinks argc to
account for the options we consumed and returns the result to the
caller. But it doesn't touch the entries after the reduced argc. So
argv[argc] will be left pointing at some arbitrary entry rather than
NULL.

This isn't the source of any known bugs, since all callers are aware of
the limitation and act accordingly. But it's a possible gotcha that may
be easy to miss.

Let's set the new argv[argc] to NULL, taking care to free it if the
caller asked us to do so.

It is tempting to do likewise for all of the entries afterwards, too, as
some of them may also need to be freed (e.g., if coming from a strvec).
But doing so isn't entirely trivial, as we munge argc in the function
(e.g., when we find "--" and move all of the entries after it into the
prune_data list). It would be possible with some light refactoring, but
it's probably not worth it. Nobody should ever look at them (they are
beyond the revised argc and past the NULL argv entry) outside of strvec
cleanup, and setup_revisions_from_strvec() already handles this case.

There's one other interesting gotcha: many callers which do not want to
provide arguments just pass 0/NULL for argc/argv. We need to check for
this case before assigning the final NULL.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-09-22 14:27:03 -07:00
Jeff King
18068139f2 treewide: pass strvecs around for setup_revisions_from_strvec()
The previous commit converted callers of setup_revisions() with a strvec
to use the safer and easier _from_strvec() variant.

Let's now convert spots that don't directly have a strvec, but receive
an argc/argv pair that eventually comes from one. We'll instead pass the
strvec down to the point where we call setup_revisions().

That makes these functions slightly less flexible if they were to grow
other callers that don't use strvecs, but this rigidity is buying us
some safety. It is only safe to pass the free_removed_argv_elements
option to setup_revisions() if we know the elements of argv/argc are
allocated on the heap. That isn't communicated in the type system when
we are passed the bare elements. But if we get a strvec, we know that
the elements are allocated strings.

And at any rate, each of these modified functions has only a single
caller (that has a strvec), so the loss of flexibility is unlikely to
ever matter.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-09-22 14:27:03 -07:00
Jeff King
b553332f82 treewide: use setup_revisions_from_strvec() when we have a strvec
The previous commit introduced a wrapper to make using setup_revisions()
with a strvec easier and safer. It converted spots that were already
doing most of what the wrapper did.

Let's now convert spots where we were not setting up the
free_removed_argv_elements flag. As discussed in the previous commit,
this probably isn't fixing any bugs or leaks (since these sites wouldn't
trigger the re-shuffling of argv that causes them). This is mostly
future-proofing us against setup_revisions() becoming more aggressive
about its re-shuffling.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-09-22 14:27:03 -07:00
Jeff King
f93c1d86cc revision: add wrapper to setup_revisions() from a strvec
The setup_revisions() function was designed to take the argc/argv pair
from the operating system. But we sometimes construct our own argv using
a strvec and pass that in. There are a few gotchas that callers need to
deal with here:

  1. You should always pass the free_removed_argv_elements option via
     setup_revision_opt. Otherwise, entries may be leaked if
     setup_revisions() re-shuffles options.

  2. After setup_revisions() returns, the strvec state is odd. We get a
     reduced argc from setup_revisions() telling us how many unknown
     options were left in place. Entries after that in argv may be
     retained, or may be NULL (depending on how the reshuffling
     happened). But the strvec's "nr" field still represents the
     original value, and some of the entries it thinks it is still
     storing may be NULL. Callers must be careful with how they access
     it.

Some callers deal with (1), but not all. In practice they are OK because
they do not pass any options that would cause setup_revisions() to
re-shuffle (namely unknown options which may be relayed from the user,
and the use of the "--" separator). But it's probably a good idea to
consistently pass this option anyway to future-proof ourselves against
the details of setup_revisions() changing.

No callers address (2), though I don't think there any visible bugs.
Most of them simply call strvec_clear() and never otherwise look at the
result. And in fact, if they naively set foo.nr to the argc returned by
setup_revisions(), that would cause leaks!  Because setup_revisions()
does not free consumed options[1], we have to leave the "nr" field of
the strvec at its original value to find and free them during
strvec_clear().

So I don't think there are any bugs to fix here, but we can make things
safer and simpler for callers. Let's introduce a helper function that
sets the free_removed_argv_elements automatically and shrinks the strvec
to represent the retained options afterwards (taking care to free the
now-obsolete entries).

We'll start by converting all of the call-sites which use the
free_removed_argv_elements option. There should be no behavior change
for them, except that their "shrunken" entries are cleaned up
immediately, rather than waiting for a strvec_clear() call.

[1] Arguably setup_revisions() should be doing this step for us if we
    told it to free removed options, but there are many existing callers
    which will be broken if it did. Introducing this helper is a
    possible first step towards that.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-09-22 14:27:03 -07:00
Jeff King
cd43948798 revision: manage memory ownership of argv in setup_revisions()
The setup_revisions() function takes an argc/argv pair and consumes
arguments from it, returning a reduced argc count to the caller. But it
may also overwrite entries within the argv array, as it shifts unknown
options to the front of argv (so they can be found in the range of
0..argc-1 after we return).

For a normal argc/argv coming from the operating system, this is OK.
We don't need to worry about memory ownership of the strings in those
entries. But some callers pass in allocated strings from a strvec, and
we do need to care about those.

We faced a similar issue in f92dbdbc6a (revisions API: don't leak memory
on argv elements that need free()-ing, 2022-08-02), which added an
option for callers to tell us that elements need to be freed. But the
implementation within setup_revisions() was incomplete.  It only covered
the case of dropping "--", but not the movement of unknown options.

When we shift argv entries around, we should free the elements we are
about to overwrite, so they are not leaked. For example, in:

  git stash show -p --invalid

we will pass this to setup_revisions():

  argc = 3, argv[] = { "show", "-p", "--invalid", NULL }

which will then return:

   argc = 2, argv[] = { "show", "--invalid", "--invalid", NULL }

overwriting the "-p" entry, which is leaked unless we free it at that
moment.

You can see in the output above another potential problem. We now have
two copies of the "--invalid" string. If the caller does not respect the
new argc when free-ing the strings via strvec_clear(), we'll get a
double-free. And git-stash suffers from this, and will crash with the
above command.

So it seems at first glance that the solution is to just assign the
reduced argc to the strvec.nr field in the caller. Then it would stop
after freeing only any copied entries. But that's not always right
either!

Remember that we are reducing "argc" to account for elements we've
consumed. So if there isn't an invalid option, we'd turn:

  argc = 2, argv[] = { "show", "-p", NULL }

into:

  argc = 1, argv[] = { "show", "-p", NULL }

In that case strvec_clear() must keep looking past the shortened argc we
return to find the original "-p" to free. It needs to use the original
argc to do that.

We can solve this by turning our argv writes into strict moves, not
copies. When we shuffle an unknown option to the front, we'll overwrite
its old position with NULL. That leaves an argv array that may have NULL
"holes" in it.

So in the "--invalid" example above we get:

   argc = 2, argv[] = { "show", "--invalid", NULL, NULL }

but something like "git stash -p --invalid -p" would yield:

  argc = 3, argv[] = { "show", "--invalid", NULL, "-p", NULL }

because we move "--invalid" to overwrite the first "-p", but the second
one is quietly consumed. But strvec_clear() can handle that fine (it
iterates over the "nr" field, and passing NULL to free() is OK).

To ease the implementation, I've introduced a helper function. It's a
little hacky because it must take a double-pointer to set the old
position to NULL. Which in turn means we cannot pass "&arg", our local
alias for the current entry we're parsing, but instead "&argv[i]", the
pointer in the original array. And to make it even more confusing, we
delegate some of this work to handle_revision_opt(), which is passed a
subset of the argv array, so is always working on "&argv[0]".

Likewise, because handle_revision_opt() only receives the part of argv
left to parse, it receives the array to accumulate unknown options as a
separate unkc/unkv pair. But we're always working on the same argv
array, so our strategy works fine. I suspect this would be a bit more
obvious (and avoid some pointer cleverness) if all functions saw the
full argv array and worked with positions within it (and our new helper
would take two positions, a src and dst). But that would involve
refactoring handle_revision_opt().  I punted on that, as what's here is
not too ugly and is all contained within revision.c itself.

The new test demonstrates that "git stash show -p --invalid" no longer
crashes with a double-free (because we move instead of copy). And it
passes with SANITIZE=leak because we free "-p" before overwriting.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-09-22 14:27:03 -07:00
Jeff King
3ea35c64b0 stash: tell setup_revisions() to free our allocated strings
In "git stash show", we do a first pass of parsing our command line
options by splitting them into revision args and stash args. These are
stored in strvecs, and we pass the revision args to setup_revisions().

But setup_revisions() may modify the argv we pass it, causing us to leak
some of the entries. In particular, if it sees a "--" string, that will
be dropped from argv. This is the same as other cases addressed by
f92dbdbc6a (revisions API: don't leak memory on argv elements that need
free()-ing, 2022-08-02), and we should fix it the same way: by passing
the free_removed_argv_elements option to setup_revisions().

The added test here is run only with SANITIZE=leak, without checking its
output, because the behavior of stash with "--" is a little odd:

  1. Running "git stash show" will show --stat output. But running "git
     stash show --" will show --patch.

  2. I'd expect a non-option after "--" to be treated as a pathspec, so:

       git stash show -p 1 -- foo

     would look treat "1" as a stash (a synonym for stash@{1}) and
     restrict the resulting diff to "foo". But it doesn't. We split the
     revision/stash args without any regard to "--". So in the example
     above both "1" and "foo" are stashes. Which is an error, but also:

       git stash show -- foo

     treats "foo" as a stash, not a pathspec.

These are both oddities that we may want to address (or may not, if we
want to retain historical quirks). But they are well outside the scope
of this patch. So for now we'll just let the tests confirm we aren't
leaking without otherwise expecting any behavior. If we later address
either of those points and end up with another test that covers "stash
show --", we can drop this leak-only test.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-09-22 14:24:52 -07:00
Patrick Steinhardt
93dbb6b3c5 t/unit-tests: update to 10e96bc
Update to 10e96bc (Merge pull request #127 from
pks-gitlab/pks-ci-improvements, 2025-09-22). This commit includes a
couple of changes:

  - The GitHub CI has been updated to include a 32 bit CI job.
    Furthermore, the jobs now compile with "-Werror" and more warnings
    enabled.

  - An issue was addressed where `uintptr_t` is not available on
    NonStop [1].

  - The clar selftests have been restructured so that it is now possible
    to add small test suites more readily. This was done to add tests
    for the above addressed issue, where we now use "%p" to print
    pointers in a platform dependent way.

  - An issue was addressed where the test output had a trailing
    whitespace with certain output formats, which caused whitespace
    issues in the test expectation files.

[1]: <01c101dc2842$38903640$a9b0a2c0$@nexbridge.com>

Reported-by: Randall S. Becker <rsbecker@nexbridge.com>
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-09-22 10:09:03 -07:00