22514 Commits

Author SHA1 Message Date
Patrick Steinhardt
38678e5df5 userdiff: fix leaking memory for configured diff drivers
The userdiff structures may be initialized either statically on the
stack or dynamically via configuration keys. In the latter case we end
up leaking memory because we didn't have any infrastructure to discern
those strings which have been allocated statically and those which have
been allocated dynamically.

Refactor the code such that we have two pointers for each of these
strings: one that holds the value as accessed by other subsystems, and
one that points to the same string in case it has been allocated. Like
this, we can safely free the second pointer and thus plug those memory
leaks.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-14 10:08:01 -07:00
Patrick Steinhardt
1bc158e750 builtin/format-patch: fix various trivial memory leaks
There are various memory leaks hit by git-format-patch(1). Basically all
of them are trivial, except that un-setting `diffopt.no_free` requires
us to unset the `diffopt.file` because we manually close it already.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-14 10:08:01 -07:00
Patrick Steinhardt
6b15d9ca7f diff: fix leak when parsing invalid ignore regex option
When parsing invalid ignore regexes passed via the `-I` option we don't
free already-allocated memory, leading to a memory leak. Fix this.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-14 10:08:00 -07:00
Patrick Steinhardt
4dfd4f1dfe unpack-trees: clear index when not propagating it
When provided a pointer to a destination index, then `unpack_trees()`
will end up copying its `o->internal.result` index into the provided
pointer. In those cases it is thus not necessary to free the index, as
we have transferred ownership of it.

There are cases though where we do not end up transferring ownership of
the memory, but `clear_unpack_trees_porcelain()` will never discard the
index in that case and thus cause a memory leak. And right now it cannot
do so in the first place because we have no indicator of whether we did
or didn't transfer ownership of the index.

Adapt the code to zero out the index in case we transfer its ownership.
Like this, we can now unconditionally discard the index when being asked
to clear the `unpack_trees_options`.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-14 10:08:00 -07:00
Patrick Steinhardt
2f07d228c3 sequencer: release todo list on error paths
We're not releasing the `todo_list` in `sequencer_pick_revisions()` when
hitting an error path. Restructure the function to have a common exit
path such that we can easily clean up the list and thus plug this memory
leak.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-14 10:08:00 -07:00
Patrick Steinhardt
de54b450a3 merge-ort: unconditionally release attributes index
We conditionally release the index used for reading gitattributes in
merge-ort based on whether or the index has been populated. This check
uses `cache_nr` as a condition. This isn't sufficient though, as the
variable may be zero even when some other parts of the index have been
populated. This leads to memory leaks when sparse checkouts are in use,
as we may not end up releasing the sparse checkout patterns.

Fix this issue by unconditionally releasing the index.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-14 10:08:00 -07:00
Patrick Steinhardt
a0b82622cb builtin/fast-export: plug leaking tag names
When resolving revisions in `get_tags_and_duplicates()`, we only
partially manage the lifetime of `full_name`. In fact, managing its
lifetime properly is almost impossible because we put direct pointers to
that variable into multiple lists without duplicating the string. The
consequence is that these strings will ultimately leak.

Refactor the code to make the lists we put those names into duplicate
the memory. This allows us to properly free the string as required and
thus plugs the memory leak.

While this requires us to allocate more data overall, it shouldn't be
all that bad given that the number of allocations corresponds with the
number of command line parameters, which typically aren't all that many.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-14 10:07:59 -07:00
Patrick Steinhardt
0662f0dacb builtin/fast-import: plug trivial memory leaks
Plug some trivial memory leaks in git-fast-import(1).

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-14 10:07:59 -07:00
Patrick Steinhardt
187b623eef builtin/notes: fix leaking struct notes_tree when merging notes
We allocate a `struct notes_tree` in `merge_commit()` which we then
initialize via `init_notes()`. It's not really necessary to allocate the
structure though given that we never pass ownership to the caller.
Furthermore, the allocation leads to a memory leak because despite its
name, `free_notes()` doesn't free the `notes_tree` but only clears it.

Fix this issue by converting the code to use an on-stack variable.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-14 10:07:59 -07:00
Patrick Steinhardt
1ca57bea4a builtin/rebase: fix leaking commit.gpgsign value
In `get_replay_opts()`, we override the `gpg_sign` field that already
got populated by `sequencer_init_config()` in case the user has
"commit.gpgsign" set in their config. This creates a memory leak because
we overwrite the previously assigned value, which may have already
pointed to an allocated string.

Let's plug the memory leak by freeing the value before we overwrite it.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-14 10:07:58 -07:00
Patrick Steinhardt
d1c53f6703 read-cache: fix leaking hashfile when writing index fails
In `do_write_index()`, we use a `struct hashfile` to write the index
with a trailer hash. In case the write fails though, we never clean up
the allocated `hashfile` state and thus leak memory.

Refactor the code to have a common exit path where we can free this and
other allocated memory. While at it, refactor our use of `strbuf`s such
that we reuse the same buffer to avoid some unneeded allocations.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-14 10:07:58 -07:00
Patrick Steinhardt
c81dcf630c bulk-checkin: fix leaking state TODO
When flushing a bulk-checking to disk we also reset the `struct
bulk_checkin_packfile` state. But while we free some of its members,
others aren't being free'd, leading to memory leaks:

  - The temporary packfile name is not getting freed.

  - The `struct hashfile` only gets freed in case we end up calling
    `finalize_hashfile()`. There are code paths though where that is not
    the case, namely when nothing has been written. For this, we need to
    make `free_hashfile()` public.

Fix those leaks.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-14 10:07:57 -07:00
Patrick Steinhardt
9ddd5f755d object-name: fix leaking symlink paths in object context
The object context may be populated with symlink contents when reading a
symlink, but the associated strbuf doesn't ever get released when
releasing the object context, causing a memory leak. Plug it.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-14 10:07:57 -07:00
Patrick Steinhardt
aa9ef614dc object-file: fix memory leak when reading corrupted headers
When reading corrupt object headers in `read_loose_object()`, we bail
out immediately. This causes a memory leak though because we would have
already initialized the zstream in `unpack_loose_header()`, and it is
the callers responsibility to finish the zstream even on error. While
this feels weird, other callsites do it correctly already.

Fix this leak by ending the zstream even on errors. We may want to
revisit this interface in the future such that the callee handles this
for us already when there was an error.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-14 10:07:57 -07:00
Patrick Steinhardt
ce01f92889 remote: plug memory leak when aliasing URLs
When we have a `url.*.insteadOf` configuration, then we end up aliasing
URLs when populating remotes. One place where this happens is in
`alias_all_urls()`, where we loop through all remotes and then alias
each of their URLs. The actual aliasing logic is then contained in
`alias_url()`, which returns an allocated string that contains the new
URL. This URL replaces the old URL that we have in the strvec that
contains all remote URLs.

We replace the remote URLs via `strvec_replace()`, which does not hand
over ownership of the new string to the vector. Still, we didn't free
the aliased URL and thus have a memory leak here. Fix it by freeing the
aliased string.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-14 10:07:56 -07:00
Junio C Hamano
90b801d8ff Merge branch 'ps/leakfixes-part-3' into ps/leakfixes-part-4
* ps/leakfixes-part-3: (24 commits)
  commit-reach: fix trivial memory leak when computing reachability
  convert: fix leaking config strings
  entry: fix leaking pathnames during delayed checkout
  object-name: fix leaking commit list items
  t/test-repository: fix leaking repository
  builtin/credential-cache: fix trivial leaks
  builtin/worktree: fix leaking derived branch names
  builtin/shortlog: fix various trivial memory leaks
  builtin/rerere: fix various trivial memory leaks
  builtin/credential-store: fix leaking credential
  builtin/show-branch: fix several memory leaks
  builtin/rev-parse: fix memory leak with `--parseopt`
  builtin/stash: fix various trivial memory leaks
  builtin/remote: fix various trivial memory leaks
  builtin/remote: fix leaking strings in `branch_list`
  builtin/ls-remote: fix leaking `pattern` strings
  builtin/submodule--helper: fix leaking buffer in `is_tip_reachable`
  builtin/submodule--helper: fix leaking clone depth parameter
  builtin/name-rev: fix various trivial memory leaks
  builtin/describe: fix trivial memory leak when describing blob
  ...
2024-08-06 12:40:41 -07:00
Patrick Steinhardt
f30bfafcd4 commit-reach: fix trivial memory leak when computing reachability
We don't free the local `stack` commit list that we use to compute
reachability of multiple commits at once. Do so.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-01 08:47:38 -07:00
Patrick Steinhardt
9642479a2b convert: fix leaking config strings
In `read_convert_config()`, we end up reading some string values into
variables. We don't free any potentially-existing old values though,
which will result in a memory leak in case the same key has been defined
multiple times.

Fix those leaks.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-01 08:47:38 -07:00
Patrick Steinhardt
1f08999781 entry: fix leaking pathnames during delayed checkout
When filtering files during delayed checkout, we pass a string list to
`async_query_available_blobs()`. This list is initialized with NODUP,
and thus inserted strings will not be owned by the list. In the latter
function we then try to hand over ownership by passing an `xstrup()`'d
value to `string_list_insert()`. But this is not how this works: a NODUP
list does not take ownership of allocated strings and will never free
them for the caller.

Fix this issue by initializing the list as `DUP` instead and dropping
the explicit call to `xstrdup()`. This is okay to do given that this is
the single callsite of `async_query_available_blobs()`.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-01 08:47:37 -07:00
Patrick Steinhardt
57fb139b5e object-name: fix leaking commit list items
When calling `get_oid_oneline()`, we pass in a `struct commit_list` that
gets modified by the function. This creates a weird situation where the
commit list may sometimes be empty after returning, but sometimes it
will continue to carry additional commits. In those cases the remainder
of the list leaks.

Ultimately, the design where we only pass partial ownership to
`get_oid_oneline()` feels shoddy. Refactor the code such that we only
pass a constant pointer to the list, creating a local copy as needed.
Callers are thus always responsible for freeing the commit list, which
then allows us to plug a bunch of memory leaks.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-01 08:47:37 -07:00
Patrick Steinhardt
11f841c1cc t/test-repository: fix leaking repository
The test-repository test helper zeroes out `the_repository` such that it
can be sure that our codebase only ends up using the supplied repository
that we initialize in the respective helper functions. This does cause
memory leaks though as the data that `the_repository` has been holding
onto is not referenced anymore.

Fix this by calling `repo_clear()` instead.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-01 08:47:37 -07:00
Patrick Steinhardt
145c979020 builtin/credential-cache: fix trivial leaks
There are two trivial leaks in git-credential-cache(1):

  - We leak the child process in `spawn_daemon()`. As we do not call
    `finish_command()` and instead let the created process daemonize, we
    have to clear the process manually.

  - We do not free the computed socket path in case it wasn't given via
    `--socket=`.

Plug both of these memory leaks.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-01 08:47:37 -07:00
Patrick Steinhardt
cd6d7630fa builtin/worktree: fix leaking derived branch names
There are several heuristics that git-worktree(1) uses to derive the
name of the newly created branch when not given explicitly. These
heuristics all allocate a new string, but we only end up freeing that
string in a subset of cases.

Fix the remaining cases where we didn't yet free the derived branch
names. While at it, also free `opt_track`, which is being populated via
an `OPT_PASSTHRU()`.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-01 08:47:37 -07:00
Patrick Steinhardt
06da42beec builtin/shortlog: fix various trivial memory leaks
There is a trivial memory leak in git-shortlog(1). Fix it.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-01 08:47:37 -07:00
Patrick Steinhardt
50ef4e09c3 builtin/rerere: fix various trivial memory leaks
There are multiple trivial memory leaks in git-rerere(1). Fix those.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-01 08:47:37 -07:00
Patrick Steinhardt
1d615afa8d builtin/credential-store: fix leaking credential
We never free credentials read by the credential store, leading to a
memory leak. Plug it.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-01 08:47:36 -07:00
Patrick Steinhardt
11d6a81c01 builtin/show-branch: fix several memory leaks
There are several memory leaks in git-show-branch(1). Fix them.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-01 08:47:36 -07:00
Patrick Steinhardt
2d197e4a0f builtin/rev-parse: fix memory leak with --parseopt
The `--parseopt` mode allows shell scripts to have the same option
parsing mode as we have in C builtins. It soaks up a set of option
descriptions via stdin and massages them into proper `struct option`s
that we can then use to parse a set of arguments.

We only partially free those options when done though, creating a memory
leak. Interestingly, we only end up free'ing the first option's help,
which is of course wrong.

Fix this by freeing all option's help fields as well as their `argh`
fields to plug this memory leak.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-01 08:47:36 -07:00
Patrick Steinhardt
2e875b6cb4 builtin/stash: fix various trivial memory leaks
There are multiple trivial memory leaks in git-stash(1). Fix those.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-01 08:47:36 -07:00
Patrick Steinhardt
fc68633352 builtin/remote: fix various trivial memory leaks
There are multiple trivial memory leaks in git-remote(1). Fix those.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-01 08:47:36 -07:00
Patrick Steinhardt
4119fc08e2 builtin/ls-remote: fix leaking pattern strings
Users can pass patterns to git-ls-remote(1), which allows them to filter
the list of printed references. We assemble those patterns into an array
and prefix them with "*/", but never free either the array nor the
allocated strings.

Refactor the code to use a `struct strvec` instead of manually tracking
the strings in an array. Like this, we can easily use `strvec_clear()`
to release both the vector and the contained string for us, plugging the
leak.

Helped-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-01 08:47:36 -07:00
Patrick Steinhardt
6771e2012e builtin/submodule--helper: fix leaking buffer in is_tip_reachable
The `rev` buffer in `is_tip_reachable()` is being populated with the
output of git-rev-list(1) -- if either the command fails or the buffer
contains any data, then the input commit is not reachable.

The buffer isn't used for anything else, but neither do we free it,
causing a memory leak. Fix this.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-01 08:47:35 -07:00
Patrick Steinhardt
ac3b143370 builtin/name-rev: fix various trivial memory leaks
There are several structures that we don't release after
`cmd_name_rev()` is done. Plug those leaks.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-01 08:47:35 -07:00
Patrick Steinhardt
ed041007f0 builtin/describe: fix trivial memory leak when describing blob
We never free the `struct strvec args` variable in `describe_blob()`,
which thus causes a memory leak. Fix this.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-01 08:47:35 -07:00
Patrick Steinhardt
7935a02613 builtin/log: fix leaking branch name when creating cover letters
When calling `make_cover_letter()` without a branch name, we try to
derive the branch name by calling `find_branch_name()`. But while this
function returns an allocated string, we never free the result and thus
have a memory leak. Fix this.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-01 08:47:35 -07:00
Patrick Steinhardt
34968e56de builtin/replay: plug leaking advance_name variable
The `advance_name` variable can either contain a static string when
parsed via the `--advance` command line option or it may be an allocated
string when set via `determine_replay_mode()`. Because we cannot be sure
whether it is allocated or not we just didn't free it at all, resulting
in a memory leak.

Split up the variables such that we can track the static and allocated
strings separately and then free the allocated one to fix the memory
leak.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-01 08:47:34 -07:00
Junio C Hamano
6c70d65712 Merge branch 'cp/unit-test-reftable-merged'
Another reftable test has been ported to use the unit test framework.

* cp/unit-test-reftable-merged:
  t-reftable-merged: add test for REFTABLE_FORMAT_ERROR
  t-reftable-merged: use reftable_ref_record_equal to compare ref records
  t-reftable-merged: add tests for reftable_merged_table_max_update_index
  t-reftable-merged: improve the const-correctness of helper functions
  t-reftable-merged: improve the test t_merged_single_record()
  t: harmonize t-reftable-merged.c with coding guidelines
  t: move reftable/merged_test.c to the unit testing framework
2024-07-31 13:34:19 -07:00
Junio C Hamano
90139ae377 Merge branch 'jc/checkout-no-op-switch-errors'
"git checkout --ours" (no other arguments) complained that the
option is incompatible with branch switching, which is technically
correct, but found confusing by some users.  It now says that the
user needs to give pathspec to specify what paths to checkout.

* jc/checkout-no-op-switch-errors:
  checkout: special case error messages during noop switching
2024-07-31 13:34:18 -07:00
Junio C Hamano
d71121c060 Merge branch 'pw/add-patch-with-suppress-blank-empty'
"git add -p" by users with diff.suppressBlankEmpty set to true
failed to parse the patch that represents an unmodified empty line
with an empty line (not a line with a single space on it), which
has been corrected.

* pw/add-patch-with-suppress-blank-empty:
  add-patch: use normalize_marker() when recounting edited hunk
  add-patch: handle splitting hunks with diff.suppressBlankEmpty
2024-07-31 13:34:17 -07:00
Junio C Hamano
f084c50de6 Merge branch 'ad/merge-with-diff-algorithm'
Many Porcelain commands that internally use the merge machinery
were taught to consistently honor the diff.algorithm configuration.

* ad/merge-with-diff-algorithm:
  merge-recursive: honor diff.algorithm
2024-07-31 13:34:16 -07:00
Junio C Hamano
6a52f307af Merge branch 'rs/t-strvec-use-test-msg'
Unit test clean-up.

* rs/t-strvec-use-test-msg:
  t-strvec: fix type mismatch in check_strvec
  t-strvec: improve check_strvec() output
  t-strvec: use test_msg()
2024-07-31 13:34:15 -07:00
Junio C Hamano
ec9d46588e Merge branch 'ds/midx-write-repack-fix'
Repacking a repository with multi-pack index started making stupid
pack selections in Git 2.45, which has been corrected.

* ds/midx-write-repack-fix:
  midx-write: revert use of --stdin-packs
  t5319: add failing test case for repack/expire
2024-07-23 16:54:33 -07:00
Phillip Wood
39bdd84eaf add-patch: handle splitting hunks with diff.suppressBlankEmpty
When "add -p" parses diffs, it looks for context lines starting with a
single space. But when diff.suppressBlankEmpty is in effect, an empty
context line will omit the space, giving us a true empty line. This
confuses the parser, which is unable to split based on such a line.

It's tempting to say that we should just make sure that we generate a
diff without that option.  However, although we do not parse hunks that
the user has manually edited with parse_diff() we do allow the user
to split such hunks. As POSIX calls the decision of whether to print the
space here "implementation-defined" we need to handle edited hunks where
empty context lines omit the space.

So let's handle both cases: a context line either starts with a space or
consists of a totally empty line by normalizing the first character to a
space when we parse them. Normalizing the first character rather than
changing the code to check for a space or newline will hopefully future
proof against introducing similar bugs if the code is changed.

Reported-by: Ilya Tumaykin <itumaykin@gmail.com>
Helped-by: Jeff King <peff@peff.net>
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-07-20 16:29:14 -07:00
Derrick Stolee
8fb6d11fad midx-write: revert use of --stdin-packs
This reverts b7d6f23a171 (midx-write.c: use `--stdin-packs` when
repacking, 2024-04-01) and then marks the test created in the previous
change as passing.

The fundamental issue with the reverted change is that the focus on
pack-files separates the object selection from how the multi-pack-index
selects a single pack-file for an object ID with multiple copies among
the tracked pack-files.

The change was made with the intention of improving delta compression in
the resulting pack-file, but that can be resolved with the existing
object list mechanism. There are other potential pitfalls of doing an
object walk at this time if the repository is a blobless partial clone,
and that will require additional testing on top of the one that changes
here.

Signed-off-by: Derrick Stolee <stolee@gmail.com>
Acked-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-07-19 07:19:01 -07:00
Derrick Stolee
738fab524c t5319: add failing test case for repack/expire
Git 2.45.0 included the change b7d6f23a171 (midx-write.c: use
`--stdin-packs` when repacking, 2024-04-01) which caused the 'git
multi-pack-index repack' command to use 'git pack-objects --stdin-packs'
instead of listing the objects to repack. While this change was
motivated by efficient cross-process communication and the ability to
improve delta compression, it breaks a fundamental function of the
'incremental-repack' task that is enabled by default in Scalar clones or
Git repositories that run 'git maintenance start'.

The 'incremental-repack' task performs a two-step process of the
'expire' and 'repack' subcommands of the 'git multi-pack-index' builtin.
The 'expire' command removes any pack-files listed in the
multi-pack-index but without any referenced objects. The 'repack' task
then finds a batch of pack-files to repack and sends their objects to
'git pack-objects'. Both the pack-files chosen for the batch and the
objects chosen to repack are based on the ones that the multi-pack-index
references. Objects that appear in a pack-file but have a duplicate copy
in a newer pack-file are not considered in this case. Since the
multi-pack-index references only the newest copy of an object, this
allows the next 'incremental-repack' task to remove the pack-files in
the next 'expire' task. This delay is intentional due to how Windows
handles may block deletion of files with open read handles.

However, the mentioned commit changed this behavior to divorce the set
of objects referenced by the multi-pack-index and instead use a set of
"included" and "excluded" pack-files in the 'git pack-objects' builtin.
When a pack-file is selected as "included", only the objects it contains
but are not in any "excluded" pack-files are considered for repacking.
This has led to client repositories failing to remove old pack-files as
they still have some referenced objects. This grows over time until the
point that Git is trying to repack the same pack-files over and over.

For now, create a test case that demonstrates the expected behavior, but
also fails in its final line. The setup here it attempting to recreate a
typical situation for a repository that uses a blobless partial clone.
There would be a large initial pack-file from the clone that is never
selected in the 'repack' batch. There are other pack-files that have a
combination of new objects from incremental fetches and possibly blobs
that are not connected to those incremental fetches; these blobs could
be filled in from commands like 'git checkout' or 'git blame'. The
pack-files also have some overlap on purpose so test-1 has some
duplicates in test-2 and test-2 has some duplicates in test-3.

At the end of the test, the test-2 pack-file still exists though it
should have been expired. This test will pass when reverting the
offending commit.

Signed-off-by: Derrick Stolee <stolee@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-07-18 14:53:27 -07:00
Junio C Hamano
1aac20a4b0 Merge branch 'jk/am-retry'
Test fix as a follow-up to an already graduated topic.

* jk/am-retry:
  t4153: stop redirecting input from /dev/zero
2024-07-18 08:30:27 -07:00
Junio C Hamano
76e018b9a1 Merge branch 'js/var-git-shell-path'
"git var GIT_SHELL_PATH" should report the path to the shell used
to spawn external commands, but it didn't do so on Windows, which
has been corrected.

* js/var-git-shell-path:
  var(win32): do report the GIT_SHELL_PATH that is actually used
  run-command: declare the `git_shell_path()` function globally
  run-command(win32): resolve the path to the Unix shell early
  mingw(is_msys2_sh): handle forward slashes in the `sh.exe` path, too
  win32: override `fspathcmp()` with a directory separator-aware version
  strvec: declare the `strvec_push_nodup()` function globally
  run-command: refactor getting the Unix shell path into its own function
2024-07-17 10:47:27 -07:00
Junio C Hamano
e13feda98f Merge branch 'kn/push-empty-fix'
"git push '' HEAD:there" used to hit a BUG(); it has been corrected
to die with "fatal: bad repository ''".

* kn/push-empty-fix:
  builtin/push: call set_refspecs after validating remote
2024-07-17 10:47:26 -07:00
Junio C Hamano
b19a8c00c6 Merge branch 'jk/test-body-in-here-doc'
The test framework learned to take the test body not as a single
string but as a here-document.

* jk/test-body-in-here-doc:
  t/.gitattributes: ignore whitespace in chainlint expect files
  t: convert some here-doc test bodies
  test-lib: allow test snippets as here-docs
  chainlint.pl: add tests for test body in heredoc
  chainlint.pl: recognize test bodies defined via heredoc
  chainlint.pl: check line numbers in expected output
  chainlint.pl: force CRLF conversion when opening input files
  chainlint.pl: do not spawn more threads than we have scripts
  chainlint.pl: only start threads if jobs > 1
  chainlint.pl: add test_expect_success call to test snippets
2024-07-17 10:47:25 -07:00
Junio C Hamano
6da44da936 Merge branch 'rj/test-sanitize-leak-log-fix'
Tests that use GIT_TEST_SANITIZE_LEAK_LOG feature got their exit
status inverted, which has been corrected.

* rj/test-sanitize-leak-log-fix:
  test-lib: GIT_TEST_SANITIZE_LEAK_LOG enabled by default
  test-lib: fix GIT_TEST_SANITIZE_LEAK_LOG
2024-07-17 10:47:24 -07:00