The ort merge machinery hit an assertion failure in a history with
criss-cross merges renamed a directory and a non-directory, which
has been corrected.
* en/ort-recursive-d-f-conflict-fix:
merge-ort: fix corner case recursive submodule/directory conflict handling
Running "git diff" with "--name-only" and other options that allows
us not to look at the blob contents, while objects that are lazily
fetched from a promisor remote, caused use-after-free, which has
been corrected.
* ds/diff-lazy-fetch-with-name-only-fix:
diff: avoid segfault with freed entries
Use hook API to replace ad-hoc invocation of hook scripts with the
run_command() API.
* ar/run-command-hook:
receive-pack: convert receive hooks to hook API
receive-pack: convert update hooks to new API
hooks: allow callers to capture output
run-command: allow capturing of collated output
hook: allow overriding the ungroup option
reference-transaction: use hook API instead of run-command
transport: convert pre-push to hook API
hook: convert 'post-rewrite' hook in sequencer.c to hook API
hook: provide stdin via callback
run-command: add stdin callback for parallelization
run-command: add first helper for pp child states
Update HTTP tests to adjust for changes in curl 8.18.0
* jk/test-curl-updates:
t5563: add missing end-of-line in HTTP header
t5551: handle trailing slashes in expected cookies output
Even when there is no changes in the packfile and no need to
recompute bitmaps, "git repack" recomputed and updated the MIDX
file, which has been corrected.
* ps/repack-avoid-noop-midx-rewrite:
midx-write: skip rewriting MIDX with `--stdin-packs` unless needed
midx-write: extract function to test whether MIDX needs updating
midx: fix `BUG()` when getting preferred pack without a reverse index
Prepare test suite for Git for Windows that supports symbolic
links.
* js/test-symlink-windows:
t7800: work around the MSYS path conversion on Windows
t6423: introduce Windows-specific handling for symlinking to /dev/null
t1305: skip symlink tests that do not apply to Windows
t1006: accommodate for symlink support in MSYS2
t0600: fix incomplete prerequisite for a test case
t0301: another fix for Windows compatibility
t0001: handle `diff --no-index` gracefully
mingw: special-case `open(symlink, O_CREAT | O_EXCL)`
apply: symbolic links lack a "trustable executable bit"
t9700: accommodate for Windows paths
More object database related information are shown in "git repo
structure" output.
* jt/repo-struct-more-objinfo:
builtin/repo: add object disk size info to structure table
builtin/repo: add disk size info to keyvalue stucture output
builtin/repo: add inflated object info to structure table
builtin/repo: add inflated object info to keyvalue structure output
builtin/repo: humanise count values in structure output
strbuf: split out logic to humanise byte values
builtin/repo: group per-type object values into struct
When computing a diff in a partial clone, there is a chance that we
could trigger a prefetch of missing objects at the same time as we are
freeing entries from the global diff queue. This is difficult to
reproduce, as we need to have some objects be freed from the queue
before triggering the prefetch of missing objects. There is a new test
in t4067 that does trigger the segmentation fault that results in this
case.
The fix is to set the queue pointer to NULL after it is freed, and then
to be careful about NULL values in the prefetch.
The more elaborate explanation is that within diffcore_std(), we may
skip the initial prefetch due to the output format (--name-only in the
test) and go straight to diffcore_skip_stat_unmatch(). In that method,
the index entries that have been invalidated by path changes show up as
entries but may be deleted because they are not actually content diffs
and only newer timestamps than expected. As those entries are deleted,
later entries are checked with diff_filespec_check_stat_unmatch(), which
uses diff_queued_diff_prefetch() as the missing_object_cb in its diff
options. That can trigger downloading missing objects if the appropriate
scenario occurs to trigger a call to diff_popoulate_filespec(). It's
finally within that callback to diff_queued_diff_prefetch() that the
segfault occurs.
The test was hard to find because it required some real differences,
some not-different files that had a newer modified time, and the order
of those files alphabetically was important to trigger the deletion
before the prefetch was triggered.
I briefly considered a "lock" member for the diff queue, but it was a
much larger diff and introduced many more possible error scenarios.
Signed-off-by: Derrick Stolee <stolee@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Replace 'test -f' with the test_path_is_file in
t5403-post-checkout-hook.sh. This helper provides better error
messages when tests fail, making it easier to debug issues.
Signed-off-by: Deveshi Dwivedi <deveshigurgaon@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
At GitHub, a few repositories were triggering errors of the form:
git: merge-ort.c:3037: process_renames: Assertion `newinfo && !newinfo->merged.clean' failed.
Aborted (core dumped)
While these may look similar to both
a562d90a350d (merge-ort: fix failing merges in special corner case,
2025-11-03)
and
f6ecb603ff8a (merge-ort: fix directory rename on top of source of other
rename/delete, 2025-08-06)
the cause is different and in this case the problem is not an
over-conservative assertion, but a bug before the assertion where we did
not update all relevant state appropriately.
It sadly took me a really long time to figure out how to get a simple
reproducer for this one. It doesn't really have that many moving parts,
but there are multiple pieces of background information needed to
understand it.
First of all, when we have two files added at the same path, merge-ort
does a two-way merge of those files. If we have two directories added
at the same path, we basically do the same thing (taking the union of
files, and two-way merging files with the same name). But two-way
merging requires components of the same type. We can't merge the
contents of a regular file with a directory, or with a symlink, or with
a submodule. Nor can any of those other types be merged with each
other, e.g. merging a submodule with a directory is a bad idea. When
two paths have the same name but their types do not match, merge-ort is
forced to move one of them to an alternate filename (using the
unique_path() function).
Second, if two commits being merged have more than one merge-base,
merge-ort will merge the merge-bases to create a virtual merge-base, and
use that as the base commit.
Third, one of the really important optimizations in merge-ort is trivial
tree-level resolution (roughly meaning merging trees without recursing
into them). This optimization has some nuance to it that is important
to the current bug, and to understand it, it helps to first look at the
high-level overview of how merge-ort runs; there are basically three
high-level functions that the work is divided between:
collect_merge_info() - walks the top-level trees getting individual
paths of interest
detect_renames() - detect renames between paths in order to match up
paths for three-way merging
process_entries() - does a few things of interest:
* three-way merging of files,
* other special handling (e.g. adjusting paths with conflicting
types to avoid path collisions)
* as it finishes handling all the files within a subdirectory,
writes out a new tree object for that directory
If it were not for renames, we could just always do tree-level merging
whenever the tree on at least one side was unmodified. Unfortunately,
we need to recurse into trees to determine whether there are renames.
However, we can also do tree-level merging so long as there aren't any
*relevant* renames (another merge-ort optimization), which we can
determine without recursing into trees.
We would also be able to do tree-level merging if we somehow apriori
knew what renames existed, by only recursing into the trees which we
could otherwise trivially merge if they contained files involved in
renames. That might not seem useful, because we need to find out the
renames and we have to recurse into trees to do so, but when you find
out that the process_entries() step is more computationally expensive
than the collect_merge_info() step, it yields an interesting strategy:
* run collect_merge_info()
* run detect_renames()
* cache the renames()
* restart -- rerun collect_merge_info(), using the cached renames to
only recurse into the needed trees
* we already have the renames cached so no need to re-detect
* run process_entries() on the reduced list of paths
which was implemented back in 7bee6c100431 (merge-ort: avoid recursing
into directories when we don't need to, 2021-07-16) Crucially, this
restarting only occurs if the number of paths we could skip recursing
into exceeds the number we still need to recurse into by some safety
factor (wanted_factor in handle_deferred_entries()); forgetting this
fact is a great way to repeatedly fail to create a minimal testcase for
several days and go down alternate wrong paths).
Now, I earlier summarized this optimization as "merging trees without
recursing into them", but this optimization does not require that all
three sides of history has a directory at a given path. So long as the
tree on one side matches the tree in the base version, we can decide to
resolve in favor of whatever the other side of history has at that path
-- be it a directory, a file, a submodule, or a symlink. Unfortunately,
the code in question didn't fully realize this, and was written assuming
the base version and both sides would have a directory at the given
path, as can be seen by the "ci->filemask == 0" comment in
resolve_trivial_directory_merge() that was added as part of 7bee6c100431
(merge-ort: avoid recursing into directories when we don't need to,
2021-07-16). A few additional lines of code are needed to handle cases
where we have something other than a directory on the other side of
history.
But, knowing that resolve_trivial_directory_merge() doesn't have
sufficient state updating logic doesn't show us how to trigger a bug
without combining with the other bits of information we provided above.
Here's a relevant testcase:
* branches A & B
* commit A1: adds "folder" as a directory with files tracked under it
* commit B1: adds "folder" as a submodule
* commit A2: merges B1 into A1, keeping "folder" as a directory
(and in fact, with no changes to "folder" since A1), discarding the
submodule
* commit B2: merges A1 into B1, keeping "folder" as a submodule
(and in fact, with no changes to "folder" since B1), discarding the
directory
Here, if we try to merge A2 & B2, the logic proceeds as follows:
* we have multiple merge-bases: A1 & B1. So we have to merge those
to get a virtual merge base.
* due to "folder" as a directory and "folder" as a submodule, the
path collision logic triggers and renames "folder" as a submodule
to "folder~Temporary merge branch 2" so we can keep it alongside
"folder" as a directory.
* we now have a virtual merge base (containing both "folder"
directory and a "folder~Temporary merge branch 2" submodule) and
can now do the outer merge
* in the first step of the outer merge, we attempt to defer recursing
into folder/ as a directory, but find we need to for rename
detection.
* in rename detection, we note that "folder~Temporary merge branch 2"
has the same hash as "folder" as a submodule in B2, which means we
have an exact rename.
* after rename detection, we discover no path in folder/ is needed
for renames, and so we can cache renames and restart.
* after restarting, we avoid recursing into "folder/" and realize we
can resolve it trivially since it hasn't been modified. The
resolution removes "folder/", leaving us only "folder" as a
submodule from commit B2.
* After this point, we should have a rename/delete conflict on
"folder~Temporary merge branch 2" -> "folder", but our marking of
the merge of "folder" as clean broke our ability to handle that and
in fact triggers an assertion in process_renames().
When there was a df_conflict (directory/"file" conflict, where "file"
could be submodule or regular file or symlink), ensure
resolve_trivial_directory_merge() handles it properly. In particular:
* do not pre-emptively mark the path as cleanly merged if the
remaining path is a file; allow it to be processed in
process_entries() later to determine if it was clean
* clear the parts of dirmask or filemask corresponding to the matching
sides of history, since we are resolving those away
* clear the df_conflict bit afterwards; since we cleared away the two
matching sides and only have one side left, that one side can't
have a directory/file conflict with itself.
Also add the above minimal testcase showcasing this bug to t6422, **with
a sufficient number of paths under the folder/ directory to actually
trigger it**. (I wish I could have all those days back from all the
wrong paths I went down due to not having enough files under that
directory...)
I know this commit has a very high ratio of lines in the commit message
to lines of comments, and a relatively high ratio of comments to actual
code, but given how long it took me to track down, on the off chance
that we ever need to further modify this logic, I wanted it thoroughly
documented for future me and for whatever other poor soul might end up
needing to read this commit message.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Some callers, for example server-side hooks which wish to relay hook
output to clients across a transport, want to capture what would
normally print to stderr and do something else with it. Allow that via a
callback.
By calling the callback regardless of whether there's output available,
we allow clients to send e.g. a keepalive if necessary.
Because we expose a strbuf, not a fd or FILE*, there's no need to create
a temporary pipe or similar - we can just skip the print to stderr and
instead hand it to the caller.
Signed-off-by: Emily Shaffer <emilyshaffer@google.com>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Adrian Ratiu <adrian.ratiu@collabora.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
If a user of the run_processes_parallel() API wants to pipe a large
amount of information to the stdin of each parallel command, that
data could exceed the pipe buffer of the process's stdin and can be
too big to store in-memory via strbuf & friends or to slurp to a file.
Generally this is solved by repeatedly writing to child_process.in
between calls to start_command() and finish_command(). For a specific
pre-existing example of this, see transport.c:run_pre_push_hook().
This adds a generic callback API to run_processes_parallel() to do
exactly that in a unified manner, similar to the existing callback APIs,
which can then be used by hooks.h to convert the remaining hooks to the
new, simpler parallel interface.
Signed-off-by: Emily Shaffer <emilyshaffer@google.com>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Adrian Ratiu <adrian.ratiu@collabora.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Building a list using commit_list_insert_by_date() has quadratic worst
case complexity. Avoid it by using prio_queue.
Use prio_queue_peek()+prio_queue_replace() instead of prio_queue_get()+
prio_queue_put() if possible, as the former only rebalance the
prio_queue heap once instead of twice.
In sane repositories this won't make much of a difference because the
number of items in the list or queue won't be very high:
Benchmark 1: ./git_v2.52.0 show-branch origin/main origin/next origin/seen origin/todo
Time (mean ± σ): 538.2 ms ± 0.8 ms [User: 527.6 ms, System: 9.6 ms]
Range (min … max): 537.0 ms … 539.2 ms 10 runs
Benchmark 2: ./git show-branch origin/main origin/next origin/seen origin/todo
Time (mean ± σ): 530.6 ms ± 0.4 ms [User: 519.8 ms, System: 9.8 ms]
Range (min … max): 530.1 ms … 531.3 ms 10 runs
Summary
./git show-branch origin/main origin/next origin/seen origin/todo ran
1.01 ± 0.00 times faster than ./git_v2.52.0 show-branch origin/main origin/next origin/seen origin/todo
That number is not limited, though, and in pathological cases like the
one in p6010 we see a sizable improvement:
Test v2.52.0 HEAD
------------------------------------------------------------------
6010.4: git show-branch 2.19(2.19+0.00) 0.03(0.02+0.00) -98.6%
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git fetch" that involves fetching tags, when a tag being fetched
needs to overwrite existing one, failed to fetch other tags, which
has been corrected.
* kn/fix-fetch-backfill-tag-with-batched-ref-updates:
fetch: fix failed batched updates skipping operations
fetch: fix non-conflicting tags not being committed
fetch: extract out reference committing logic
"git diff-files -R --find-copies-harder" has been taught to use
the potential copy sources from the index correctly.
* rs/diff-files-r-find-copies-fix:
diff-files: fix copy detection
"git submodule add" to add a submodule under <name> segfaulted,
when a submodule.<name>.something is already in .gitmodules file
without defining where its submodule.<name>.path is, which has been
corrected.
* jc/submodule-add:
submodule add: sanity check existing .gitmodules
When 58aaf59133b (t: introduce GIT_TEST_DEFAULT_REF_FORMAT envvar,
2023-12-29) copy-edited the `test_detect_hash` function, the code
comment was accidentally left unchanged. Let's adjust it.
Noticed-by: Matthew John Cheetham <mjcheetham@outlook.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In t5563, we test how various oddly-formatted WWW-Authenticate headers
are passed through curl to git's credential subsystem (and ultimately
out to credential helpers). One test, "access using basic auth with
wwwauth header mixed line-endings" does something odd. It does not mix
line endings at all (which must be CRLF according to the RFC anyway),
but omits the line ending entirely for the final header!
This means that the server produces an incomplete response. We send our
final header, and then the newline which is meant to mark the end of
headers (and the start of the body) becomes the line ending for that
header. And there is no header/body separator in the output at all.
Looking at strace, this is what the client reads:
recvfrom(9, "WWW-Authenticate: FooBar param1=\"value1\"\r\n \r\n\tparam2=\"value2\"\r\nWWW-Authenticate: Basic realm=\"example.com\"", 16384, 0, NULL, NULL) = 106
recvfrom(9, "\n", 16384, 0, NULL, NULL) = 1
recvfrom(9, "", 16384, 0, NULL, NULL) = 0
The headers themselves are produced from the custom-auth.challenge file
we write in the test (which is missing the final CRLF), and then the
header/body separator comes from our lib-httpd/nph-custom-auth.sh CGI.
(Ignore for a moment that it is producing a bare newline, which I think
is a bug; it should be a CRLF but curl is happy with either).
Older versions of curl seemed to be OK with the truncated output, but
the upcoming 8.18.0 release seems to get confused. Specifically, since
67ae101666 (http: unfold response headers earlier, 2025-12-12) our
request to the server fails with insufficient credentials. I traced far
enough to see that curl does relay the header back to us, which we then
pass to a credential helper, which gives us the correct
username/password combination. But on our followup request, curl refuses
to send the Authorization header (and so gets an HTTP 401 again).
The change in curl's behavior is a bit unexpected, but since we are
sending it garbage, it is hard to complain too much. Let's add the
missing CRLF to the header. I _think_ this was just an oversight and not
the intent of the test. And that the "mixed line-endings" really meant
"mixed continuations", since we differ from the previous test in
continuing with both space and tab. So I've likewise updated the test
title to match that assumption.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We check in t5551 that curl updates the expected list of cookies after
making a request. We do this by telling it to read and write cookies
from a particular text file, and then checking that after curl runs, the
file has the expected content.
However, in the upcoming curl 8.18.0, the output file has changed
slightly: curl will canonicalize the paths it writes, due to commit
a093c93994 (cookie: only keep and use the canonical cleaned up path,
2025-12-07). In particular, it strips trailing slashes from the paths we
see in the cookies.txt file.
This doesn't matter to Git, as the cookie handling is all internal to
curl. But our test is overly brittle and breaks as a result.
We can fix it by matching either format. We'll expect the new format
(without trailing slashes) and strip the slashes from curl's output
before comparing. That lets us pass with both old and new versions (I
tested against curl's 8_17_0 and rc-8_18_0-2 tags, which are
respectively before and after the curl change).
In theory it might be nice to try to future-proof this test more by
looking only for the bits we care about, rather than a byte-wise
comparison of the whole file. But after removing comments and blank
lines (which we already do), we care about most of what's there. So it's
not clear to me what a more liberal test would look like. Given that the
format doesn't change all that often, it's probably OK to stop here and
see if it ever breaks again.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Similar to a prior commit, update the table output format for the
git-repo(1) structure command to display the total object disk usage by
object type.
Signed-off-by: Justin Tobler <jltobler@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Similar to a prior commit, extend the keyvalue and nul output formats of
the git-repo(1) structure command to additionally provide info regarding
total object disk sizes by object type.
Signed-off-by: Justin Tobler <jltobler@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Update the table output format for the git-repo(1) structure command to
begin printing the total inflated object size info by object type. To be
more human-friendly, larger values are scaled down and displayed with
the appropriate unit prefix. Output for the keyvalue and nul formats
remains unchanged.
Signed-off-by: Justin Tobler <jltobler@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The structure subcommand for git-repo(1) outputs basic count information
for objects and references. Extend this output to also provide
information regarding total size of inflated objects by object type.
For now, object size by object type info is only added to the keyvalue
and nul output formats. In a subsequent commit, this info is also added
to the table format.
Signed-off-by: Justin Tobler <jltobler@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The table output format for the git-repo(1) structure subcommand is used
by default and intended to provide output to users in a human-friendly
manner. When the reference/object count values in a repository are
large, it becomes more cumbersome for users to read the values.
For larger values, update the table output format to instead produce
more human-friendly count values that are scaled down with the
appropriate unit prefix. Output for the keyvalue and nul formats remains
unchanged.
Signed-off-by: Justin Tobler <jltobler@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In a subsequent commit, byte size values displayed in table output for
the git-repo(1) "structure" subcommand will be shown in a more
human-readable format with the appropriate unit prefixes. For this
usecase, the downscaled values and unit strings must be handled
separately to ensure proper column alignment.
Split out logic from strbuf_humanise() to downscale byte values and
determine the corresponding unit prefix into a separate humanise_bytes()
function that provides seperate value and unit strings.
Note that the "byte" string in "t/helper/test-simple-ipc.c" is unmarked
for translation here so that it doesn't conflict with the newly defined
plural "byte/bytes" translation and instead uses it.
Signed-off-by: Justin Tobler <jltobler@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Git's test suite's relies on Unix shell scripting, which is
understandable, of course, given Git's firm roots (and indeed, ongoing
focus) on Linux.
This fact, combined with Unix shell scripting's natural
habitat -- which is, naturally... *drumroll*... Unix --
often has unintended side effects, where developers expect the test
suite to run in a Unix environment, which is an incorrect assumption.
One instance of this problem can be observed in the 'difftool --dir-diff
handles modified symlinks' test case in `t7800-difftool.sh`, which
assumes that all absolute paths start with a forward slash. That
assumption is incorrect in general, e.g. on Windows, where absolute
paths have many shapes and forms, none of which starts with a forward
slash.
The only saving grace is that this test case is currently not run on
Windows because of the `SYMLINK` prerequisite. However, I am currently
working towards upstreaming symbolic link support from Git for Windows
to upstream Git, which will put a crack into that saving grace.
Let's change that test case so that it does not rely on absolute paths
(which are passed to the "external command" `ls` as parameters and are
therefore part of its output, and which the test case wants to filter
out before verifying that the output is as expected) starting with a
forward slash. Let's instead rely on the much more reliable fact that
`ls` will output the path in a line that ends in a colon, and simply
filter out those lines by matching said colon instead.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The device `/dev/null` does not exist on Windows, it's called `NUL`
there. Calling `ln -s /dev/null my-symlink` in a symlink-enabled MSYS2
Bash will therefore literally link to a file or directory called `null`
that is supposed to be in the current drive's top-level `dev` directory.
Which typically does not exist.
The test, however, really wants the created symbolic link to point to
the NUL device. Let's instead use the `mklink` utility on Windows to
perform that job, and keep using `ln -s /dev/null <target>` on
non-Windows platforms.
While at it, add the missing `SYMLINKS` prereq because this test _still_
would not pass on Windows before support for symbolic links is
upstreamed from Git for Windows.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In Git for Windows, the gitdir is canonicalized so that even when the
gitdir is specified via a symbolic link, the `gitdir:` conditional
include will only match the real directory path.
Unfortunately, t1305 codifies a different behavior in two test cases,
which are hereby skipped on Windows.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The MSYS2 runtime (which inherits this trait from the Cygwin runtime,
and which is used by Git for Windows' Bash to emulate POSIX
functionality on Windows, the same Bash that is also used to run Git's
test suite on Windows) has a mode where it can create native symbolic
links on Windows.
Naturally, this is a bit of a strange feature, given that Cygwin goes
out of its way to support Unix-like paths even if no Win32 program
understands those, and the symbolic links have to use Win32 paths
instead (which Win32 programs understand very well).
As a consequence, the symbolic link targets get normalized before the
links are created.
This results in certain quirks that Git's test suite is ill equipped to
accommodate (because Git's test suite expects to be able to use
Unix-like paths even on Windows).
The test script t1006-cat-file.sh contains two prime examples, two test
cases that need to skip a couple assertions because they are simply
wrong in the context of Git for Windows.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The 'symref transaction supports symlinks' test case is guarded by the
`SYMLINK` prerequisite because `core.prefersymlinkrefs = true` requires
symbolic links to be supported.
However, the `preferSymlinkRefs` feature is not supported on Windows,
therefore this test case needs the `MINGW` prerequisite, too.
There's a couple more cases where we set this config key:
- In a subsequent test in t0600, but there we explicitly set it to
"false". So this would naturally be supported by Windows.
- In t7201 we set the value to `yes`, but we never verify that the
written reference is a symbolic link in the first place. I guess
that we could rather remove setting the configuration value here, as
we are about to deprecate support for symrefs via symbolic links in
the first place. But that's certainly outside of the scope of this
patch.
- In t9903 we do the same, but likewise, we don't check whether the
written file is a symbolic link.
Therefore this seems to be the only instance where the tests actually
need to be adapted.
Helped-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Just like 0fdcfa2f9f5 (t0301: fixes for windows compatibility,
2021-09-14) explained, we should not call `mkdir -m<mode>` in the test
suite because that would fail on Windows.
There was one forgotten instance of this which was hidden by a `SYMLINK`
prerequisite. Currently, this prevents this test case from being
executed on Windows, but with the upcoming support for symbolic links,
it would become a problem.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The test case 're-init to move gitdir symlink' wants to compare the
contents of `newdir/.git`, which is a symbolic link pointing to a file.
However, `git diff --no-index`, which is used by `test_cmp` on Windows,
does not resolve symlinks; It shows the symlink _target_ instead (with a
file mode of 120000). That is totally unexpected by the test case, which
as a consequence fails, meaning that it's a bug in the test case itself.
Co-authored-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Ever since fe53bbc9beb (Git.pm: Always set Repository to absolute path
if autodetecting, 2009-05-07), the t9700 test _must_ fail on Windows
because of that age-old Unix paths vs Windows paths problem.
The underlying root cause is that Git cannot run with a regular Win32
variant of Perl, the assumption that every path is a Unix path is just
too strong in Git's Perl code.
As a consequence, Git for Windows is basically stuck with using the
MSYS2 variant of Perl which uses a POSIX emulation layer (which is a
friendly fork of Cygwin) _and_ a best-effort Unix <-> Windows paths
conversion whenever crossing the boundary between MSYS2 and regular
Win32 processes. It is best effort only, though, using heuristics to
automagically convert correctly in most cases, but not in all cases.
In the context of this here patch, this means that asking `git.exe` for
the absolute path of the `.git/` directory will return a Win32 path
because `git.exe` is a regular Win32 executable that has no idea about
Unix-ish paths. But above-mentioned commit introduced a test that wants
to verify that this path is identical to the one that the Git Perl
module reports (which refuses to use Win32 paths and uses Unix-ish paths
instead). Obviously, this must fail because no heuristics can kick in at
that layer.
This test failure has not even been caught when Git introduced Windows
support in its CI definition in 2e90484eb4a (ci: add a Windows job to
the Azure Pipelines definition, 2019-01-29), as all tests relying on
Perl had to be disabled even from the start (because the CI runs would
otherwise have resulted in prohibitively long runtimes, not because
Windows is super slow per se, but because Git's test suite keeps
insisting on using technology that requires a POSIX emulation layer,
which _is_ super slow on Windows).
To work around this failure, let's use the `cygpath` utility to convert
the absolute `gitdir` path into the form that the Perl code expects.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Copy detection cannot work when comparing the index to the working tree
because Git ignores files that it is not explicitly told to track. It
should work in the other direction, though, i.e. for a reverse diff of
the deletion of a copy from the index.
d1f2d7e8ca (Make run_diff_index() use unpack_trees(), not read_tree(),
2008-01-19) broke it with a seemingly stray change to run_diff_files().
We didn't notice because there's no test for that. But even if we had
one, it might have gone unnoticed because the breakage only happens
with index preloading, which requires at least 1000 entries (more than
most test repos have) and is racy because it runs in parallel with the
actual command.
Fix copy detection by queuing up-to-date and skip-worktree entries using
diff_same().
While at it, use diff_same() also for queuing unchanged files not
flagged as up-to-date, i.e. clean submodules and entries where
preloading was not done at all or not quickly enough. It uses less
memory than diff_change() and doesn't unnecessarily set the diff flag
has_changes.
Add two tests to cover running both without and with preloading. The
first one passes reliably with the original code. The second one
enables preloading and thus is racy. It has a good chance to pass even
without the fix, but fails within seconds when running the test script
with --stress. With the fix it runs fine for several minutes, until
my patience runs out.
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git repo struct" learned to take "-z" as a synonym to "--format=nul".
* lo/repo-struct-z:
repo: add -z as an alias for --format=nul to git-repo-structure
repo: use [--format=... | -z] instead of [-z] in git-repo-info synopsis
repo: remove blank line from Documentation/git-repo.adoc
A help message from "git branch" now mentions "git help" instead of
"man" when suggesting to read some documentation.
* kh/advise-w-git-help-in-branch:
branch: advice using git-help(1) instead of man(1)
"git last-modified" used to mishandle "--" to mark the beginning of
pathspec, which has been corrected.
* js/last-modified-with-sparse-checkouts:
last-modified: support sparse checkouts
git --version reports its version with the prefix "git version ".
Remove precisely this string instead of everything up to and including
the rightmost space to avoid butchering version strings that contain
spaces. This helps Apple's release of Git, which reports its version
like this: "git version 2.50.1 (Apple Git-155)".
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
These config values were added in the original Scalar contribution,
d0feac4e8c (scalar: 'register' sets recommended config and starts
maintenance, 2021-12-03), but were never fully checked for validity in
the upstream Git project. At the time, Scalar was only intended for the
contrib/ directory so did not have as rigorous of an investigation.
Each config option has its own justification for removal:
* core.preloadIndex: This value is true by default, now. Removing this
causes some changes required to the tests that checked this config
value. Use gui.gcwarning=false instead.
* core.fscache: This config does not exist in the core Git project, but
is instead a config option for a Git for Windows feature.
* core.multiPackIndex: This config value is now enabled by default, so
does not need to be called out specifically. It was originally
included to make sure the background maintenance that created
multi-pack-indexes would result in the expected performance
improvements.
* credential.validate: This option is not something specific to Git but
instead an older version of Git Credential Manager for Windows. That
software was replaced several years ago by the cross-platform Git
Credential Manger so this option is no longer needed to help users who
were on that older software.
* pack.useSparse=true: This value is now Git's default as of de3a864114
(config: set pack.useSparse=true by default, 2020-03-20) so we don't
need it set by Scalar.
Signed-off-by: Derrick Stolee <stolee@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The index.skipHash config option has been set to 'false' by Scalar since
4933152cbb (scalar: enable path-walk during push via config, 2025-05-16)
but that commit message is trying to communicate the exact opposite:
that the 'true' value is what we want instead. This means that we've
been disabling this performance benefit for Scalar repos
unintentionally.
Fix this issue before we add justification for the config options set in
this list.
Oddly, enabling index.skipHash causes a test issue during 'test_commit'
in one of the Scalar tests when GIT_TEST_SPLIT_INDEX is enabled (as
caught by the linux-test-vars build). I'm fixing the test by disabling
the environment variable, but the issue should be resolved in a series
focused on the split index.
Signed-off-by: Derrick Stolee <stolee@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
A repo may have config options set by 'scalar clone' or 'scalar
register' and then updated by 'scalar reconfigure'. It can be helpful to
point out which of those options were set by the latest scalar
recommendations.
Add "# set by scalar" to the end of each config option to assist users
in identifying why these config options were set in their repo. Use a new
helper method to simplify the two callsites.
Co-authored-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Derrick Stolee <stolee@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In `write_midx_internal()` we know to skip rewriting the multi-pack
index in case the existing one already covers all packs. This logic does
not know to handle `git multi-pack-index write --stdin-packs` though, so
we end up always rewriting the MIDX in this case even if the MIDX would
not change.
With our default maintenance strategy this isn't really much of a
problem, as git-gc(1) does not use the "--stdin-packs" option. But that
is changing with geometric repacking, where "--stdin-packs" is used to
explicitly select the packfiles part of the geometric sequence.
This issue can be demonstrated trivially with a benchmark in the Git
repository: executing `git repack --geometric=2 --write-midx -d` in the
Git repository takes more than 3 seconds only to end up with the same
multi-pack index as we already had before.
The logic that decides if we need to rewrite the MIDX only checks
whether the number of packfiles covered will change. That check is of
course too lenient for "--stdin-packs", as it could happen that we want
to cover a different-but-same-size set of packfiles. But there is no
inherent reason why we cannot handle "--stdin-packs".
Improve the logic to not only check for the number of packs, but to also
verify that we are asked to generate a MIDX for the _same_ packs. This
allows us to also skip no-op rewrites for "--stdin-packs".
Helped-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The function `midx_preferred_pack()` returns the preferred pack for a
given multi-pack index. To compute the preferred pack we:
1. Take the first position indexed by the MIDX in pseudo-pack order.
2. Convert this pseudo-pack position into the MIDX position.
3. We then look up the pack that corresponds to this MIDX position.
This reliably returns the preferred pack given that all of its contained
objects will be up front in pseudo-pack order.
The second step that turns the pseudo-pack order into MIDX order
requires the reverse index though, which may not exist for example when
the MIDX does not have a bitmap. And in that case one may easily hit a
bug:
BUG: ../pack-revindex.c:491: pack_pos_to_midx: reverse index not yet loaded
In theory, `midx_preferred_pack()` already knows to handle the case
where no reverse index exists, as it calls `load_midx_revindex()` before
calling into `midx_preferred_pack()`. But we only check for negative
return values there, even though the function returns a positive error
code in case the reverse index does not exist.
Fix the issue by testing for a non-zero return value instead, same as
all the other callers of this function already do. While at it, document
the return value of `load_midx_revindex()`.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Fix a regression introduced with batched updates in 0e358de64a (fetch:
use batched reference updates, 2025-05-19) when fetching references. In
the `do_fetch()` function, we jump to cleanup if committing the
transaction fails, regardless of whether using batched or atomic
updates. This skips three subsequent operations:
- Update 'FETCH_HEAD' as part of `commit_fetch_head()`.
- Add upstream tracking information via `set_upstream()`.
- Setting remote 'HEAD' values when `do_set_head` is true.
For atomic updates, this is expected behavior. For batched updates,
we want to continue with these operations even if some refs fail to
update.
Skipping `commit_fetch_head()` isn't actually a regression because
'FETCH_HEAD' is already updated via `append_fetch_head()` when not
using '--atomic'. However, we add a test to validate this behavior.
Skipping the other two operations (upstream tracking and remote HEAD)
is a regression. Fix this by only jumping to cleanup when using
'--atomic', allowing batched updates to continue with post-fetch
operations. Add tests to prevent future regressions.
Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>