22055 Commits

Author SHA1 Message Date
Patrick Steinhardt
07658e9ce5 builtin/rev-parse: allow shortening to more than 40 hex characters
The `--short=` option for git-rev-parse(1) allows the user to specify
to how many characters object IDs should be shortened to. The option is
broken though for SHA256 repositories because we set the maximum allowed
hash size to `the_hash_algo->hexsz` before we have even set up the repo.
Consequently, `the_hash_algo` will always be SHA1 and thus we truncate
every hash after at most 40 characters.

Fix this by accessing `the_hash_algo` only after we have set up the
repo.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-06 22:50:49 -07:00
Patrick Steinhardt
bd455cec37 remote-curl: fix parsing of detached SHA256 heads
The dumb HTTP transport tries to read the remote HEAD reference by
downloading the "HEAD" file and then parsing it via `http_fetch_ref()`.
This function will either parse the file as an object ID in case it is
exactly `the_hash_algo->hexsz` long, or otherwise it will check whether
the reference starts with "ref :" and parse it as a symbolic ref.

This is broken when parsing detached HEADs of a remote SHA256 repository
because we never update `the_hash_algo` to the discovered remote object
hash. Consequently, `the_hash_algo` will always be the fallback SHA1
hash algorithm, which will cause us to fail parsing HEAD altogteher when
it contains a SHA256 object ID.

Fix this issue by setting up `the_hash_algo` via `repo_set_hash_algo()`.
While at it, let's make the expected SHA1 fallback explicit in our code,
which also addresses an upcoming issue where we are going to remove the
SHA1 fallback for `the_hash_algo`.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-06 22:50:49 -07:00
Patrick Steinhardt
813f17fd6b attr: fix BUG() when parsing attrs outside of repo
If either the `--attr-source` option or the `GIT_ATTR_SOURCE` envvar are
set, then `compute_default_attr_source()` will try to look up the value
as a treeish. It is possible to hit that function while outside of a Git
repository though, for example when using `git grep --no-index`. In that
case, Git will hit a bug because we try to look up the main ref store
outside of a repository.

Handle the case gracefully and detect when we try to look up an attr
source without a repository.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-06 22:50:49 -07:00
Patrick Steinhardt
b7afb46225 parse-options-cb: only abbreviate hashes when hash algo is known
The `OPT__ABBREV()` option can be used to add an option that abbreviates
object IDs. When given a length longer than `the_hash_algo->hexsz`, then
it will instead set the length to that maximum length.

It may not always be guaranteed that we have `the_hash_algo` initialized
properly as the hash algorithm can only be set up after we have set up
`the_repository`. In that case, the hash would always be truncated to
the hex length of SHA1, which may not be what the user desires.

In practice it's not a problem as all commands that use `OPT__ABBREV()`
also have `RUN_SETUP` set and thus cannot work without a repository.
Consequently, both `the_repository` and `the_hash_algo` would be
properly set up.

Regardless of that, harden the code to not truncate the length when we
didn't set up a repository.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-06 22:50:48 -07:00
Junio C Hamano
3452c8ab8a Merge branch 'ps/the-index-is-no-more' into ps/undecided-is-not-necessarily-sha1
* ps/the-index-is-no-more:
  repository: drop `initialize_the_repository()`
  repository: drop `the_index` variable
  builtin/clone: stop using `the_index`
  repository: initialize index in `repo_init()`
  builtin: stop using `the_index`
  t/helper: stop using `the_index`
2024-05-06 22:50:29 -07:00
Junio C Hamano
e9e8dd8801 Merge branch 'jc/no-default-attr-tree-in-bare' into ps/undecided-is-not-necessarily-sha1
* jc/no-default-attr-tree-in-bare:
  stop using HEAD for attributes in bare repository by default
2024-05-06 22:50:24 -07:00
Junio C Hamano
51441e6460 stop using HEAD for attributes in bare repository by default
With 23865355 (attr: read attributes from HEAD when bare repo,
2023-10-13), we started to use the HEAD tree as the default
attribute source in a bare repository.  One argument for such a
behaviour is that it would make things like "git archive" run in
bare and non-bare repositories for the same commit consistent.
This changes was merged to Git 2.43 but without an explicit mention
in its release notes.

It turns out that this change destroys performance of shallowly
cloning from a bare repository.  As the "server" installations are
expected to be mostly bare, and "git pack-objects", which is the
core of driving the other side of "git clone" and "git fetch" wants
to see if a path is set not to delta with blobs from other paths via
the attribute system, the change forces the server side to traverse
the tree of the HEAD commit needlessly to find if each and every
paths the objects it sends out has the attribute that controls the
deltification.  Given that (1) most projects do not configure such
an attribute, and (2) it is dubious for the server side to honor
such an end-user supplied attribute anyway, this was a poor choice
of the default.

To mitigate the current situation, let's revert the change that uses
the tree of HEAD in a bare repository by default as the attribute
source.  This will help most people who have been happy with the
behaviour of Git 2.42 and before.

Two things to note:

 * If you are stuck with versions of Git 2.43 or newer, that is
   older than the release this fix appears in, you can explicitly
   set the attr.tree configuration variable to point at an empty
   tree object, i.e.

	$ git config attr.tree 4b825dc642cb6eb9a060e54bf8d69288fbee4904

 * If you like the behaviour we are reverting, you can explicitly
   set the attr.tree configuration variable to HEAD, i.e.

	$ git config attr.tree HEAD

The right fix for this is to optimize the code paths that allow
accesses to attributes in tree objects, but that is a much more
involved change and is left as a longer-term project, outside the
scope of this "first step" fix.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-03 09:15:33 -07:00
Junio C Hamano
75b182d34e Merge branch 'js/for-each-repo-keep-going'
A scheduled "git maintenance" job is expected to work on all
repositories it knows about, but it stopped at the first one that
errored out.  Now it keeps going.

* js/for-each-repo-keep-going:
  maintenance: running maintenance should not stop on errors
  for-each-repo: optionally keep going on an error
2024-04-30 14:49:45 -07:00
Junio C Hamano
90f6b5a597 Merge branch 'aj/stash-staged-fix'
"git stash -S" did not handle binary files correctly, which has
been corrected.

* aj/stash-staged-fix:
  stash: fix "--staged" with binary files
2024-04-30 14:49:43 -07:00
Junio C Hamano
708e9257f8 Merge branch 'jc/format-patch-rfc-more'
The "--rfc" option of "git format-patch" learned to take an
optional string value to be used in place of "RFC" to tweak the
"[PATCH]" on the subject header.

* jc/format-patch-rfc-more:
  format-patch: "--rfc=-(WIP)" appends to produce [PATCH (WIP)]
  format-patch: allow --rfc to optionally take a value, like --rfc=WIP
2024-04-30 14:49:43 -07:00
Junio C Hamano
07fc8275e1 Merge branch 'ds/format-patch-rfc-and-k'
The "-k" and "--rfc" options of "format-patch" will now error out
when used together, as one tells us not to add anything to the
title of the commit, and the other one tells us to add "RFC" in
addition to "PATCH".

* ds/format-patch-rfc-and-k:
  format-patch: ensure that --rfc and -k are mutually exclusive
2024-04-30 14:49:42 -07:00
Junio C Hamano
55e5548a0f Merge branch 'xx/disable-replace-when-building-midx'
The procedure to build multi-pack-index got confused by the
replace-refs mechanism, which has been corrected by disabling the
latter.

* xx/disable-replace-when-building-midx:
  midx: disable replace objects
2024-04-30 14:49:42 -07:00
Junio C Hamano
c9f43012a1 Merge branch 'pw/rebase-m-signoff-fix'
"git rebase --signoff" used to forget that it needs to add a
sign-off to the resulting commit when told to continue after a
conflict stops its operation.

* pw/rebase-m-signoff-fix:
  rebase -m: fix --signoff with conflicts
  sequencer: store commit message in private context
  sequencer: move current fixups to private context
  sequencer: start removing private fields from public API
  sequencer: always free "struct replay_opts"
2024-04-30 14:49:41 -07:00
Junio C Hamano
e326e52010 Merge branch 'rj/add-i-leak-fix'
Leakfix.

* rj/add-i-leak-fix:
  add: plug a leak on interactive_add
  add-patch: plug a leak handling the '/' command
  add-interactive: plug a leak in get_untracked_files
  apply: plug a leak in apply_data
2024-04-25 10:34:24 -07:00
Johannes Schindelin
c75662bfc9 maintenance: running maintenance should not stop on errors
In https://github.com/microsoft/git/issues/623, it was reported that
maintenance stops on a missing repository, omitting the remaining
repositories that were scheduled for maintenance.

This is undesirable, as it should be a best effort type of operation.

It should still fail due to the missing repository, of course, but not
leave the non-missing repositories in unmaintained shapes.

Let's use `for-each-repo`'s shiny new `--keep-going` option that we just
introduced for that very purpose.

This change will be picked up when running `git maintenance start`,
which is run implicitly by `scalar reconfigure`.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-24 10:46:03 -07:00
Johannes Schindelin
12c2ee5fbd for-each-repo: optionally keep going on an error
In https://github.com/microsoft/git/issues/623, it was reported that
the regularly scheduled maintenance stops if one repo in the middle of
the list was found to be missing.

This is undesirable, and points out a gap in the design of `git
for-each-repo`: We need a mode where that command does not stop on an
error, but continues to try running the specified command with the other
repositories.

Imitating the `--keep-going` option of GNU make, this commit teaches
`for-each-repo` the same trick: to continue with the operation on all
the remaining repositories in case there was a problem with one
repository, still setting the exit code to indicate an error occurred.

Helped-by: Eric Sunshine <sunshine@sunshineco.com>
Helped-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-24 10:46:03 -07:00
Junio C Hamano
7b66f5dd8b Merge branch 'mr/rerere-crash-fix'
When .git/rr-cache/ rerere database gets corrupted or rerere is fed to
work on a file with conflicted hunks resolved incompletely, the rerere
machinery got confused and segfaulted, which has been corrected.

* mr/rerere-crash-fix:
  rerere: fix crashes due to unmatched opening conflict markers
2024-04-23 11:52:41 -07:00
Junio C Hamano
567293123d Merge branch 'ps/missing-btmp-fix'
GIt 2.44 introduced a regression that makes the updated code to
barf in repositories with multi-pack index written by older
versions of Git, which has been corrected.

* ps/missing-btmp-fix:
  pack-bitmap: gracefully handle missing BTMP chunks
2024-04-23 11:52:40 -07:00
Junio C Hamano
b258237f4d Merge branch 'dd/t9604-use-posix-timezones'
The cvsimport tests required that the platform understands
traditional timezone notations like CST6CDT, which has been
updated to work on those systems as long as they understand
POSIX notation with explicit tz transition dates.

* dd/t9604-use-posix-timezones:
  t9604: Fix test for musl libc and new Debian
2024-04-23 11:52:39 -07:00
Junio C Hamano
050e334979 Merge branch 'ta/fast-import-parse-path-fix'
The way "git fast-import" handles paths described in its input has
been tightened up and more clearly documented.

* ta/fast-import-parse-path-fix:
  fast-import: make comments more precise
  fast-import: forbid escaped NUL in paths
  fast-import: document C-style escapes for paths
  fast-import: improve documentation for path quoting
  fast-import: remove dead strbuf
  fast-import: allow unquoted empty path for root
  fast-import: directly use strbufs for paths
  fast-import: tighten path unquoting
2024-04-23 11:52:37 -07:00
Junio C Hamano
ce36894509 format-patch: "--rfc=-(WIP)" appends to produce [PATCH (WIP)]
In the previous step, the "--rfc" option of "format-patch" learned
to take an optional string value to prepend to the subject prefix,
so that --rfc=WIP can give "[WIP PATCH]".

There may be cases in which the extra string wants to come after the
subject prefix.  Extend the mechanism to allow "--rfc=-(WIP)" [*] to
signal that the extra string is to be appended instead of getting
prepended, resulting in "[PATCH (WIP)]".

In the documentation, discourage (ab)using "--rfc=-RFC" to say
"[PATCH RFC]" just to be different, when "[RFC PATCH]" is the norm.

[Footnote]

 * The syntax takes inspiration from Perl's open syntax that opens
   pipes "open fh, '|-', 'cmd'", where the dash signals "the other
   stuff comes here".

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-23 11:00:39 -07:00
Junio C Hamano
ce48fb2eab format-patch: allow --rfc to optionally take a value, like --rfc=WIP
With the "--rfc" option, we can tweak the "[PATCH]" (or whatever
string specified with the "--subject-prefix" option, instead of
"PATCH") that we prefix the title of the commit with into "[RFC
PATCH]", but some projects may want "[rfc PATCH]".  Adding a new
option, e.g., "--rfc-lowercase", to support such need every time
somebody wants to use different strings would lead to insanity of
accumulating unbounded number of such options.

Allow an optional value specified for the option, so that users can
use "--rfc=rfc" (think of "--rfc" without value as a short-hand for
"--rfc=RFC") if they wanted to.

This can of course be (ab)used to make the prefix "[WIP PATCH]" by
passing "--rfc=WIP".  Passing an empty string, i.e., "--rfc=", is
the same as "--no-rfc" to override an option given earlier on the
same command line.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-23 11:00:38 -07:00
Rubén Justo
16727404c4 add: plug a leak on interactive_add
Plug a leak we have since 5a76aff1a6 (add: convert to use
parse_pathspec, 2013-07-14).

This leak can be triggered with:
    $ git add -p anything

Fixing this leak allows us to mark as leak-free the following tests:

    + t3701-add-interactive.sh
    + t7514-commit-patch.sh

Mark them with "TEST_PASSES_SANITIZE_LEAK=true" to notice and fix
promply any new leak that may be introduced and triggered by them in the
future.

Signed-off-by: Rubén Justo <rjusto@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-22 16:27:43 -07:00
Rubén Justo
71c7916053 apply: plug a leak in apply_data
We have an execution path in apply_data that leaks the local struct
image.  Plug it.

This leak can be triggered with:

    $ echo foo >file
    $ git add file && git commit -m file
    $ echo bar >file
    $ git diff file >diff
    $ sed s/foo/frotz/ <diff >baddiff
    $ git apply --cached <baddiff

Fixing this leak allows us to mark as leak-free the following tests:

    + t2016-checkout-patch.sh
    + t4103-apply-binary.sh
    + t4104-apply-boundary.sh
    + t4113-apply-ending.sh
    + t4117-apply-reject.sh
    + t4123-apply-shrink.sh
    + t4252-am-options.sh
    + t4258-am-quoted-cr.sh

Mark them with "TEST_PASSES_SANITIZE_LEAK=true" to notice and fix
promply any new leak that may be introduced and triggered by them in the
future.

Signed-off-by: Rubén Justo <rjusto@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-22 16:27:42 -07:00
Adam Johnson
5fb7686409 stash: fix "--staged" with binary files
"git stash --staged" errors out when given binary files, after saving the
stash.

This behaviour dates back to the addition of the feature in 41a28eb6c1
(stash: implement '--staged' option for 'push' and 'save', 2021-10-18).
Adding the "--binary" option of "diff-tree" fixes this. The "diff-tree" call
in stash_patch() also omits "--binary", but that is fine since binary files
cannot be selected interactively.

Helped-By: Jeff King <peff@peff.net>
Helped-By: Randall S. Becker <randall.becker@nexbridge.ca>
Signed-off-by: Adam Johnson <me@adamj.eu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-22 13:57:18 -07:00
Dragan Simic
cadcf58085 format-patch: ensure that --rfc and -k are mutually exclusive
Fix a bug that allows the "--rfc" and "-k" options to be specified together
when "git format-patch" is executed, which was introduced in the commit
e0d7db7423a9 ("format-patch: --rfc honors what --subject-prefix sets").

Add a couple of additional tests to t4014, to cover additional cases of
the mutual exclusivity between different "git format-patch" options.

Signed-off-by: Dragan Simic <dsimic@manjaro.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-19 08:40:57 -07:00
Phillip Wood
a6c2654f83 rebase -m: fix --signoff with conflicts
When rebasing with "--signoff" the commit created by "rebase --continue"
after resolving conflicts or editing a commit fails to add the
"Signed-off-by:" trailer. This happens because the message from the
original commit is reused instead of the one that would have been used
if the sequencer had not stopped for the user interaction. The correct
message is stored in ctx->message and so with a couple of exceptions
this is written to rebase_path_message() when stopping for user
interaction instead. The exceptions are (i) "fixup" and "squash"
commands where the file is written by error_failed_squash() and (ii)
"edit" commands that are fast-forwarded where the original message is
still reused. The latter is safe because "--signoff" will never
fast-forward.

Note this introduces a change in behavior as the message file now
contains conflict comments. This is safe because commit_staged_changes()
passes an explicit cleanup flag when not editing the message and when
the message is being edited it will be cleaned up automatically. This
means user now sees the same message comments in editor with "rebase
--continue" as they would if they ran "git commit" themselves before
continuing the rebase. It also matches the behavior of "git
cherry-pick", "git merge" etc. which all list the files with merge
conflicts.

The tests are extended to check that all commits made after continuing a
rebase have a "Signed-off-by:" trailer. Sadly there are a couple of
leaks in apply.c which I've not been able to track down that mean this
test file is no-longer leak free when testing "git rebase --apply
--signoff" with conflicts.

Reported-by: David Bimmler <david.bimmler@isovalent.com>
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-18 13:33:41 -07:00
Junio C Hamano
2a60cb766e Merge branch 'pw/t3428-cleanup' into pw/rebase-m-signoff-fix
* pw/t3428-cleanup:
  t3428: restore coverage for "apply" backend
  t3428: use test_commit_message
  t3428: modernize test setup
2024-04-18 13:33:37 -07:00
Patrick Steinhardt
0c47355790 repository: drop initialize_the_repository()
Now that we have dropped `the_index`, `initialize_the_repository()`
doesn't really do a lot anymore except for setting up the pointer for
`the_repository` and then calling `initialize_repository()`. The former
can be replaced by statically initializing the pointer though, which
basically makes this function moot.

Convert callers to instead call `initialize_repository(the_repository)`
and drop `initialize_thee_repository()`.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-18 12:30:43 -07:00
Patrick Steinhardt
319ba14407 t/helper: stop using the_index
Convert test-helper tools to use `the_repository->index` instead of
`the_index`.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-18 12:30:41 -07:00
Xing Xin
93e2ae1c95 midx: disable replace objects
We observed a series of clone failures arose in a specific set of
repositories after we fully enabled the MIDX bitmap feature within our
Codebase service. These failures were accompanied with error messages
such as:

    Cloning into bare repository 'clone.git'...
    remote: Enumerating objects: 8, done.
    remote: Total 8 (delta 0), reused 0 (delta 0), pack-reused 8 (from 1)
    Receiving objects: 100% (8/8), done.
    fatal: did not receive expected object ...
    fatal: fetch-pack: invalid index-pack output

Temporarily disabling the MIDX feature eliminated the reported issues.
After some investigation we found that all repositories experiencing
failures contain replace references, which seem to be improperly
acknowledged by the MIDX bitmap generation logic.

A more thorough explanation about the root cause from Taylor Blau says:

Indeed, the pack-bitmap-write machinery does not itself call
disable_replace_refs(). So when it generates a reachability bitmap, it
is doing so with the replace refs in mind. You can see that this is
indeed the cause of the problem by looking at the output of an
instrumented version of Git that indicates what bits are being set
during the bitmap generation phase.

With replace refs (incorrectly) enabled, we get:

    [2, 4, 6, 8, 13, 3, 6, 7, 3, 4, 6, 8]

and doing the same after calling disable_replace_refs(), we instead get:

    [2, 5, 6, 13, 3, 6, 7, 3, 4, 6, 8]

Single pack bitmaps are unaffected by this issue because we generate
them from within pack-objects, which does call disable_replace_refs().

This patch updates the MIDX logic to disable replace objects within the
multi-pack-index builtin, and a test showing a clone (which would fail
with MIDX bitmap) is added to demonstrate the bug.

Helped-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Xing Xin <xingxin.xx@bytedance.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-17 12:35:41 -07:00
Junio C Hamano
93e3f9df7a Merge branch 'pw/t3428-cleanup'
Test cleanup.

* pw/t3428-cleanup:
  t3428: restore coverage for "apply" backend
  t3428: use test_commit_message
  t3428: modernize test setup
2024-04-16 14:50:31 -07:00
Junio C Hamano
82a31ec324 Merge branch 'jt/reftable-geometric-compaction'
The strategy to compact multiple tables of reftables after many
operations accumulate many entries has been improved to avoid
accumulating too many tables uncollected.

* jt/reftable-geometric-compaction:
  reftable/stack: use geometric table compaction
  reftable/stack: add env to disable autocompaction
  reftable/stack: expose option to disable auto-compaction
2024-04-16 14:50:30 -07:00
Junio C Hamano
107313eb11 Merge branch 'rs/date-mode-pass-by-value'
The codepaths that reach date_mode_from_type() have been updated to
pass "struct date_mode" by value to make them thread safe.

* rs/date-mode-pass-by-value:
  date: make DATE_MODE thread-safe
2024-04-16 14:50:29 -07:00
Junio C Hamano
2d642afb0a Merge branch 'sj/userdiff-c-sharp'
The userdiff patterns for C# has been updated.

Acked-by: Johannes Sixt <j6t@kdbg.org>
cf. <c2154457-3f2f-496e-9b8b-c8ea7257027b@kdbg.org>

* sj/userdiff-c-sharp:
  userdiff: better method/property matching for C#
2024-04-16 14:50:28 -07:00
Junio C Hamano
625ef1c6f1 Merge branch 'tb/t7700-fixup'
Test fix.

* tb/t7700-fixup:
  t/t7700-repack.sh: fix test breakages with `GIT_TEST_MULTI_PACK_INDEX=1 `
2024-04-16 14:50:28 -07:00
Junio C Hamano
92e8388bd3 Merge branch 'jc/local-extern-shell-rules'
Document and apply workaround for a buggy version of dash that
mishandles "local var=val" construct.

* jc/local-extern-shell-rules:
  t1016: local VAR="VAL" fix
  t0610: local VAR="VAL" fix
  t: teach lint that RHS of 'local VAR=VAL' needs to be quoted
  t: local VAR="VAL" (quote ${magic-reference})
  t: local VAR="VAL" (quote command substitution)
  t: local VAR="VAL" (quote positional parameters)
  CodingGuidelines: quote assigned value in 'local var=$val'
  CodingGuidelines: describe "export VAR=VAL" rule
2024-04-16 14:50:27 -07:00
Marcel Röthke
167395bb47 rerere: fix crashes due to unmatched opening conflict markers
When rerere handles a conflict with an unmatched opening conflict marker
in a file with other conflicts, it will fail create a preimage and also
fail allocate the status member of struct rerere_dir. Currently the
status member is allocated after the error handling. This will lead to a
SEGFAULT when the status member is accessed during cleanup of the failed
parse.

Additionally, in subsequent executions of rerere, after removing the
MERGE_RR.lock manually, rerere crashes for a similar reason. MERGE_RR
points to a conflict id that has no preimage, therefore the status
member is not allocated and a SEGFAULT happens when trying to check if a
preimage exists.

Solve this by making sure the status field is allocated correctly and add
tests to prevent the bug from reoccurring.

This does not fix the root cause, failing to parse stray conflict
markers, but I don't think we can do much better than recognizing it,
printing an error, and moving on gracefully.

Signed-off-by: Marcel Röthke <marcel@roethke.info>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-16 08:42:36 -07:00
Junio C Hamano
b415f15b49 Merge branch 'jc/unleak-core-excludesfile'
The variable that holds the value read from the core.excludefile
configuration variable used to leak, which has been corrected.

* jc/unleak-core-excludesfile:
  config: do not leak excludes_file
2024-04-15 14:11:44 -07:00
Junio C Hamano
372aabe912 Merge branch 'ps/t0610-umask-fix'
The "shared repository" test in the t0610 reftable test failed
under restrictive umask setting (e.g. 007), which has been
corrected.

* ps/t0610-umask-fix:
  t0610: execute git-pack-refs(1) with specified umask
  t0610: make `--shared=` tests reusable
2024-04-15 14:11:43 -07:00
Junio C Hamano
d75ec4c627 Merge branch 'gt/add-u-commit-i-pathspec-check'
"git add -u <pathspec>" and "git commit [-i] <pathspec>" did not
diagnose a pathspec element that did not match any files in certain
situations, unlike "git add <pathspec>" did.

* gt/add-u-commit-i-pathspec-check:
  builtin/add: error out when passing untracked path with -u
  builtin/commit: error out when passing untracked path with -i
  revision: optionally record matches with pathspec elements
2024-04-15 14:11:43 -07:00
Junio C Hamano
509cc1d413 Merge branch 'ma/win32-unix-domain-socket'
Windows binary used to decide the use of unix-domain socket at
build time, but it learned to make the decision at runtime instead.

* ma/win32-unix-domain-socket:
  Win32: detect unix socket support at runtime
2024-04-15 14:11:42 -07:00
Patrick Steinhardt
795006fff4 pack-bitmap: gracefully handle missing BTMP chunks
In 0fea6b73f1 (Merge branch 'tb/multi-pack-verbatim-reuse', 2024-01-12)
we have introduced multi-pack verbatim reuse of objects. This series has
introduced a new BTMP chunk, which encodes information about bitmapped
objects in the multi-pack index. Starting with dab60934e3 (pack-bitmap:
pass `bitmapped_pack` struct to pack-reuse functions, 2023-12-14) we use
this information to figure out objects which we can reuse from each of
the packfiles.

One thing that we glossed over though is backwards compatibility with
repositories that do not yet have BTMP chunks in their multi-pack index.
In that case, `nth_bitmapped_pack()` would return an error, which causes
us to emit a warning followed by another error message. These warnings
are visible to users that fetch from a repository:

```
$ git fetch
...
remote: error: MIDX does not contain the BTMP chunk
remote: warning: unable to load pack: 'pack-f6bb7bd71d345ea9fe604b60cab9ba9ece54ffbe.idx', disabling pack-reuse
remote: Enumerating objects: 40, done.
remote: Counting objects: 100% (40/40), done.
remote: Compressing objects: 100% (39/39), done.
remote: Total 40 (delta 5), reused 0 (delta 0), pack-reused 0 (from 0)
...
```

While the fetch succeeds the user is left wondering what they did wrong.
Furthermore, as visible both from the warning and from the reuse stats,
pack-reuse is completely disabled in such repositories.

What is quite interesting is that this issue can even be triggered in
case `pack.allowPackReuse=single` is set, which is the default value.
One could have expected that in this case we fall back to the old logic,
which is to use the preferred packfile without consulting BTMP chunks at
all. But either we fail with the above error in case they are missing,
or we use the first pack in the multi-pack-index. The former case
disables pack-reuse altogether, whereas the latter case may result in
reusing objects from a suboptimal packfile.

Fix this issue by partially reverting the logic back to what we had
before this patch series landed. Namely, in the case where we have no
BTMP chunks or when `pack.allowPackReuse=single` are set, we use the
preferred pack instead of consulting the BTMP chunks.

Helped-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-15 10:42:00 -07:00
Thalia Archibald
be4d6a371e fast-import: forbid escaped NUL in paths
NUL cannot appear in paths. Even disregarding filesystem path
limitations, the tree object format delimits with NUL, so such a path
cannot be encoded by Git.

When a quoted path is unquoted, it could possibly contain NUL from
"\000". Forbid it so it isn't truncated.

fast-import still has other issues with NUL, but those will be addressed
later.

Signed-off-by: Thalia Archibald <thalia@archibald.dev>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-15 10:06:18 -07:00
Thalia Archibald
a923a04b80 fast-import: document C-style escapes for paths
Simply saying “C-style” string quoting is imprecise, as only a subset of
C escapes are supported. Document the exact escapes.

Signed-off-by: Thalia Archibald <thalia@archibald.dev>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-15 10:06:18 -07:00
Thalia Archibald
b5062f752e fast-import: allow unquoted empty path for root
Ever since filerename was added in f39a946a1f (Support wholesale
directory renames in fast-import, 2007-07-09) and filecopy in b6f3481bb4
(Teach fast-import to recursively copy files/directories, 2007-07-15),
both have produced an error when the destination path is empty. Later,
when support for targeting the root directory with an empty string was
added in 2794ad5244 (fast-import: Allow filemodify to set the root,
2010-10-10), this had the effect of allowing the quoted empty string
(`""`), but forbidding its unquoted variant (``). This seems to have
been intended as simple data validation for parsing two paths, rather
than a syntax restriction, because it was not extended to the other
operations.

All other occurrences of paths (in filemodify, filedelete, the source of
filecopy and filerename, and ls) allow both.

For most of this feature's lifetime, the documentation has not
prescribed the use of quoted empty strings. In e5959106d6
(Documentation/fast-import: put explanation of M 040000 <dataref> "" in
context, 2011-01-15), its documentation was changed from “`<path>` may
also be an empty string (`""`) to specify the root of the tree” to “The
root of the tree can be represented by an empty string as `<path>`”.

Thus, we should assume that some front-ends have depended on this
behavior.

Remove this restriction for the destination paths of filecopy and
filerename and change tests targeting the root to test `""` and ``.

Signed-off-by: Thalia Archibald <thalia@archibald.dev>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-15 10:06:17 -07:00
Thalia Archibald
0df86b6689 fast-import: tighten path unquoting
Path parsing in fast-import is inconsistent and many unquoting errors
are suppressed or not checked.

<path> appears in the grammar in these places:

    filemodify ::= 'M' SP <mode> (<dataref> | 'inline') SP <path> LF
    filedelete ::= 'D' SP <path> LF
    filecopy   ::= 'C' SP <path> SP <path> LF
    filerename ::= 'R' SP <path> SP <path> LF
    ls         ::= 'ls' SP <dataref> SP <path> LF
    ls-commit  ::= 'ls' SP <path> LF

and fast-import.c parses them in five different ways:

1. For filemodify and filedelete:
   Try to unquote <path>. If it unquotes without errors, use the
   unquoted version; otherwise, treat it as literal bytes to the end of
   the line (including any number of SP).
2. For filecopy (source) and filerename (source):
   Try to unquote <path>. If it unquotes without errors, use the
   unquoted version; otherwise, treat it as literal bytes up to, but not
   including, the next SP.
3. For filecopy (dest) and filerename (dest):
   Like 1., but an unquoted empty string is forbidden.
4. For ls:
   If <path> starts with `"`, unquote it and report parse errors;
   otherwise, treat it as literal bytes to the end of the line
   (including any number of SP).
5. For ls-commit:
   Unquote <path> and report parse errors.
   (It must start with `"` to disambiguate from ls.)

In the first three, any errors from trying to unquote a string are
suppressed, so a quoted string that contains invalid escapes would be
interpreted as literal bytes. For example, `"\xff"` would fail to
unquote (because hex escapes are not supported), and it would instead be
interpreted as the byte sequence '"', '\\', 'x', 'f', 'f', '"', which is
certainly not intended. Some front-ends erroneously use their language's
standard quoting routine instead of matching Git's, which could silently
introduce escapes that would be incorrectly parsed due to this and lead
to data corruption.

The documentation states “To use a source path that contains SP the path
must be quoted.”, so it is expected that some implementations depend on
spaces being allowed in paths in the final position. Thus we have two
documented ways to parse paths, so simplify the implementation to that.

Now we have:

1. `parse_path_eol` for filemodify, filedelete, filecopy (dest),
   filerename (dest), ls, and ls-commit:

   If <path> starts with `"`, unquote it and report parse errors;
   otherwise, treat it as literal bytes to the end of the line
   (including any number of SP).

2. `parse_path_space` for filecopy (source) and filerename (source):

   If <path> starts with `"`, unquote it and report parse errors;
   otherwise, treat it as literal bytes up to, but not including, the
   next SP. It must be followed by SP.

There remain two special cases: The dest <path> in filecopy and rename
cannot be an unquoted empty string (this will be addressed subsequently)
and <path> in ls-commit must be quoted to disambiguate it from ls.

Signed-off-by: Thalia Archibald <thalia@archibald.dev>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-15 10:06:17 -07:00
Junio C Hamano
28dc93bab0 Merge branch 'rs/t-prio-queue-cleanup'
t-prio-queue test has been cleaned up by using C99 compound
literals; this is meant to also serve as a weather-balloon to smoke
out folks with compilers who have trouble compiling code that uses
the feature.

* rs/t-prio-queue-cleanup:
  t-prio-queue: simplify using compound literals
2024-04-12 11:31:39 -07:00
Junio C Hamano
847af43a3a Merge branch 'jc/checkout-detach-wo-tracking-report'
"git checkout/switch --detach foo", after switching to the detached
HEAD state, gave the tracking information for the 'foo' branch,
which was pointless.

Tested-by: M Hickford <mirth.hickford@gmail.com>
cf. <CAGJzqsmE9FDEBn=u3ge4LA3ha4fDbm4OWiuUbMaztwjELBd7ug@mail.gmail.com>

* jc/checkout-detach-wo-tracking-report:
  checkout: omit "tracking" information on a detached HEAD
2024-04-12 11:31:39 -07:00
Junio C Hamano
f43863e686 Merge branch 'jc/t2104-style-update'
Coding style fixes.

* jc/t2104-style-update:
  t2104: style fixes
2024-04-10 10:00:09 -07:00