23262 Commits

Author SHA1 Message Date
Junio C Hamano
7e3cb2e515 Merge branch 'en/object-name-with-funny-refname-fix'
Extended SHA-1 expression parser did not work well when a branch
with an unusual name (e.g. "foo{bar") is involved.

* en/object-name-with-funny-refname-fix:
  object-name: be more strict in parsing describe-like output
  object-name: fix resolution of object names containing curly braces
2025-01-23 15:07:01 -08:00
Junio C Hamano
85cf8801c8 Merge branch 'sk/unit-test-hash'
Test update.

* sk/unit-test-hash:
  t/unit-tests: convert hash to use clar test framework
2025-01-21 08:44:55 -08:00
Junio C Hamano
7b39a128c8 Merge branch 'ps/the-repository'
More code paths have a repository passed through the callchain,
instead of assuming the primary the_repository object.

* ps/the-repository:
  match-trees: stop using `the_repository`
  graph: stop using `the_repository`
  add-interactive: stop using `the_repository`
  tmp-objdir: stop using `the_repository`
  resolve-undo: stop using `the_repository`
  credential: stop using `the_repository`
  mailinfo: stop using `the_repository`
  diagnose: stop using `the_repository`
  server-info: stop using `the_repository`
  send-pack: stop using `the_repository`
  serve: stop using `the_repository`
  trace: stop using `the_repository`
  pager: stop using `the_repository`
  progress: stop using `the_repository`
2025-01-21 08:44:54 -08:00
Junio C Hamano
d6a7cace21 Merge branch 'jt/fsck-skiplist-parse-fix'
A misconfigured "fsck.skiplist" configuration variable was not
diagnosed as an error, which has been corrected.

* jt/fsck-skiplist-parse-fix:
  fsck: reject misconfigured fsck.skipList
2025-01-21 08:44:53 -08:00
Junio C Hamano
cb441e1ec3 Merge branch 'ps/reftable-get-random-fix'
The code to compute "unique" name used git_rand() which can fail or
get stuck; the callsite does not require cryptographic security.
Introduce the "insecure" mode and use it appropriately.

* ps/reftable-get-random-fix:
  reftable/stack: accept insecure random bytes
  wrapper: allow generating insecure random bytes
2025-01-21 08:44:53 -08:00
Junio C Hamano
57ebdd5af4 Merge branch 'jk/t7407-use-test-grep'
Test clean-up.

* jk/t7407-use-test-grep:
  t7407: use test_grep
2025-01-21 08:44:53 -08:00
Junio C Hamano
5a59d1e1a0 Merge branch 'jk/lsan-race-ignore-false-positive'
The code to check LSan results has been simplified and made more
robust.

* jk/lsan-race-ignore-false-positive:
  test-lib: add a few comments to LSan log checking
  test-lib: simplify lsan results check
  test-lib: invert return value of check_test_results_san_file_empty
2025-01-21 08:44:52 -08:00
Junio C Hamano
b9a6830836 Merge branch 'mb/t7110-use-test-path-helper'
Test modernization.

* mb/t7110-use-test-path-helper:
  t7110: replace `test -f` with `test_path_is_*` helpers
2025-01-16 16:35:14 -08:00
Junio C Hamano
564b907c8a Merge branch 'ps/more-sign-compare'
More -Wsign-compare fixes.

* ps/more-sign-compare:
  sign-compare: avoid comparing ptrdiff with an int/unsigned
  commit-reach: use `size_t` to track indices when computing merge bases
  shallow: fix -Wsign-compare warnings
  builtin/log: fix remaining -Wsign-compare warnings
  builtin/log: use `size_t` to track indices
  commit-reach: use `size_t` to track indices in `get_reachable_subset()`
  commit-reach: use `size_t` to track indices in `remove_redundant()`
  commit-reach: fix type of `min_commit_date`
  commit-reach: fix index used to loop through unsigned integer
  prio-queue: fix type of `insertion_ctr`
2025-01-16 16:35:14 -08:00
Junio C Hamano
65faad6d84 Git 2.47.2
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEE4fA2sf7nIh/HeOzvsLXohpav5ssFAmdkT1sACgkQsLXohpav
 5svdhRAAq0WoZIg+33vYNNVSTm3Ux9RJslmXs3lQuhuUJ61hK/28drSLU29GH7x7
 3nmmjp1cegnXRVLBAfoYDdzPprNNrQFQEHQEzgG/GDZw0OXn+WTZuNyrrUYoa+sd
 QSLlElRj2qrpHIMOsMIBKBSNB+qjJHOMGdxcBAS768TfnQpGIpc1KJa24TxsVBzC
 ScP4uvrFfPyQrqFUgiUhCeqLnO/6T5i/QAn/8cS5a1+zor5ZHSlw28TZTOxN2odo
 Rulp/FtehiDEzmRowgD3M4fImAPY6Ib6VORCYASqpJFFla30tu2bQqEi6raOMTec
 hg5Ibkmj6fHFONaYvoTMRkYHmtUnNgIPU/CYPwswNk8w1+PPQfJ+TYjBXOQgdTLW
 F0azHBHh7NRmEHVydiF9CqjgNVRzjO4IEZfGqXNFPPMvR6UUzDaIkrpYbwXBFMin
 GNPV3QISeXj9ROjJoCv0nclXETwWemykjZlD6b5krXn5TaJlFb+69qJvXrCLq5WY
 EoevSqKkB9HVK9si7P8Sh1cPGOr3kfiFPmMNKFVI8l0+iDFgBywOomWNS/JEzqu1
 nN142DKdL1W/rkeMUhbX2h11CZNvHKIOy3iaA4MTOing8/eMzyUUQ73Ck7odYs4f
 rZ0tTXKJhxojPvBpTxYe9SxM0bDLREiOv0zX76+sIuhbAQCmk0o=
 =MNNf
 -----END PGP SIGNATURE-----

Sync with Git 2.47.2
Git 2.47.2

# -----BEGIN PGP SIGNATURE-----
#
# iQIzBAABCAAdFiEE4fA2sf7nIh/HeOzvsLXohpav5ssFAmdkT1sACgkQsLXohpav
# 5svdhRAAq0WoZIg+33vYNNVSTm3Ux9RJslmXs3lQuhuUJ61hK/28drSLU29GH7x7
# 3nmmjp1cegnXRVLBAfoYDdzPprNNrQFQEHQEzgG/GDZw0OXn+WTZuNyrrUYoa+sd
# QSLlElRj2qrpHIMOsMIBKBSNB+qjJHOMGdxcBAS768TfnQpGIpc1KJa24TxsVBzC
# ScP4uvrFfPyQrqFUgiUhCeqLnO/6T5i/QAn/8cS5a1+zor5ZHSlw28TZTOxN2odo
# Rulp/FtehiDEzmRowgD3M4fImAPY6Ib6VORCYASqpJFFla30tu2bQqEi6raOMTec
# hg5Ibkmj6fHFONaYvoTMRkYHmtUnNgIPU/CYPwswNk8w1+PPQfJ+TYjBXOQgdTLW
# F0azHBHh7NRmEHVydiF9CqjgNVRzjO4IEZfGqXNFPPMvR6UUzDaIkrpYbwXBFMin
# GNPV3QISeXj9ROjJoCv0nclXETwWemykjZlD6b5krXn5TaJlFb+69qJvXrCLq5WY
# EoevSqKkB9HVK9si7P8Sh1cPGOr3kfiFPmMNKFVI8l0+iDFgBywOomWNS/JEzqu1
# nN142DKdL1W/rkeMUhbX2h11CZNvHKIOy3iaA4MTOing8/eMzyUUQ73Ck7odYs4f
# rZ0tTXKJhxojPvBpTxYe9SxM0bDLREiOv0zX76+sIuhbAQCmk0o=
# =MNNf
# -----END PGP SIGNATURE-----
# gpg: Signature made Thu 19 Dec 2024 08:52:43 AM PST
# gpg:                using RSA key E1F036B1FEE7221FC778ECEFB0B5E88696AFE6CB
# gpg: Good signature from "Junio C Hamano <gitster@pobox.com>" [ultimate]
# gpg:                 aka "Junio C Hamano <junio@pobox.com>" [ultimate]
# gpg:                 aka "Junio C Hamano <jch@google.com>" [ultimate]

* tag 'v2.47.2':
  Git 2.47.2
  Git 2.46.3
  Git 2.45.3
  Git 2.44.3
  Git 2.43.6
  Git 2.42.4
  Git 2.41.3
  Git 2.40.4
  credential: disallow Carriage Returns in the protocol by default
  credential: sanitize the user prompt
  credential_format(): also encode <host>[:<port>]
  t7300: work around platform-specific behaviour with long paths on MinGW
  compat/regex: fix argument order to calloc(3)
  mingw: drop bogus (and unneeded) declaration of `_pgmptr`
  ci: remove 'Upload failed tests' directories' step from linux32 jobs
2025-01-13 12:55:26 -08:00
Elijah Newren
191f0c8db2 object-name: be more strict in parsing describe-like output
From Documentation/revisions.txt:
    '<describeOutput>', e.g. 'v1.7.4.2-679-g3bee7fb'::
      Output from `git describe`; i.e. a closest tag, optionally
      followed by a dash and a number of commits, followed by a dash, a
      'g', and an abbreviated object name.
which means that output of the format
    ${REFNAME}-${INTEGER}-g${HASH}
should parse to fully expanded ${HASH}.  This is fine.  However, we
currently don't validate any of ${REFNAME}-${INTEGER}, we only parse
-g${HASH} and assume the rest is valid.  That is problematic, since it
breaks things like

    git cat-file -p branchname:path/to/file/named/i-gaffed

which, when commit (or tree or blob) affed exists, will not return us
information about the file we are looking for but will instead
erroneously tell us about object affed.

A few additional notes:
  - This is a slight backward incompatibility break, because we used
    to allow ${GARBAGE}-g${HASH} as a way to spell ${HASH}.  However,
    a backward incompatible break is necessary, because there is no
    other way for someone to be more specific and disambiguate that they
    want the blob master:path/to/who-gabbed instead of the object abbed.
  - There is a possibility that check_refname_format() rules change in
    the future.  However, we can only realistically loosen the rules
    for what that function accepts rather than tighten.  If we were to
    tighten the rules, some real world repositories may already have
    refnames that suddenly become unacceptable and we break those
    repositories.  As such, any describe-like syntax of the form
    ${VALID_FOR_A_REFNAME}-${INTEGER}-g${HASH} that is valid with the
    changes in this commit will remain valid in the future.
  - The fact that check_refname_format() rules could loosen in the
    future is probably also an important reason to make this change.  If
    the rules loosen, there might be additional cases within
    ${GARBAGE}-g${HASH} that become ambiguous in the future.  While
    abbreviated hashes can be disambiguated by abbreviating less, it may
    well be that these alternative object names have no way of being
    disambiguated (much like pathnames cannot be).  Accepting all random
    ${GARBAGE} thus makes it difficult for us to allow future
    extensions to object naming.

So, tighten up the parsing to make sure ${REFNAME} and ${INTEGER} are
present in the string, and would be considered a valid ref and
non-negative integer.

Also, add a few tests for git describe using object names of the form
    ${REVISION_NAME}${MODIFIERS}
since an early version of this patch failed on constructs like
    git describe v2.48.0-rc2-161-g6c2274cdbc^0

Reported-by: Gabriel Amaral <gabriel-amaral@github.com>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-13 11:48:43 -08:00
Elijah Newren
71e19a0031 object-name: fix resolution of object names containing curly braces
Given a branch name of 'foo{bar', commands like

    git cat-file -p foo{bar:README.md

should succeed (assuming that branch had a README.md file, of course).
However, the change in cce91a2caef9 (Change 'master@noon' syntax to
'master@{noon}'., 2006-05-19) presumed that curly braces would always
come after an '@' or '^' and be paired, causing e.g. 'foo{bar:README.md'
to entirely miss the ':' and assume there's no object being referenced.
In short, git would report:

    fatal: Not a valid object name foo{bar:README.md

Change the parsing to only make the assumption of paired curly braces
immediately after either a '@' or '^' character appears.

Add tests for this, as well as for a few other test cases that initial
versions of this patch broke:
  * 'foo@@{...}'
  * 'foo^{/${SEARCH_TEXT_WITH_COLON}}:${PATH}'

Note that we'd prefer not duplicating the special logic for "@^" characters
here, because if get_oid_basic() or interpret_nth_prior_checkout() or
get_oid_basic() or similar gain extra methods of using curly braces,
then the logic in get_oid_with_context_1() would need to be updated as
well.  But it's not clear how to refactor all of these to have a simple
common callpoint with the specialized logic.

Reported-by: Gabriel Amaral <gabriel-amaral@github.com>
Helped-by: Michael Haggerty <mhagger@github.com>
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-13 11:48:28 -08:00
Junio C Hamano
b28fb93e51 Merge branch 'ps/build-sign-compare'
Last-minute fix for a regression in "git blame --abbrev=<length>"
when insane <length> is specified; we used to correctly cap it to
the hash output length but broke it during the cycle.

* ps/build-sign-compare:
  builtin/blame: fix out-of-bounds write with blank boundary commits
  builtin/blame: fix out-of-bounds read with excessive `--abbrev`
2025-01-10 09:19:34 -08:00
Patrick Steinhardt
e7fb2ca945 builtin/blame: fix out-of-bounds write with blank boundary commits
When passing the `-b` flag to git-blame(1), then any blamed boundary
commits which were marked as uninteresting will not get their actual
commit ID printed, but will instead be replaced by a couple of spaces.

The flag can lead to an out-of-bounds write as though when combined with
`--abbrev=` when the abbreviation length is longer than `GIT_MAX_HEXSZ`
as we simply use memset(3p) on that array with the user-provided length
directly. The result is most likely that we segfault.

An obvious fix would be to cull `length` to `GIT_MAX_HEXSZ` many bytes.
But when the underlying object ID is SHA1, and if the abbreviated length
exceeds the SHA1 length, it would cause us to print more bytes than
desired, and the result would be misaligned.

Instead, fix the bug by computing the length via strlen(3p). This makes
us write as many bytes as the formatted object ID requires and thus
effectively limits the length of what we may end up printing to the
length of its hash. If `--abbrev=` asks us to abbreviate to something
shorter than the full length of the underlying hash function it would be
handled by the call to printf(3p) correctly.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-10 06:56:55 -08:00
Patrick Steinhardt
1fbb8d7ecb builtin/blame: fix out-of-bounds read with excessive --abbrev
In 6411a0a896 (builtin/blame: fix type of `length` variable when
emitting object ID, 2024-12-06) we have fixed the type of the `length`
variable. In order to avoid a cast from `size_t` to `int` in the call to
printf(3p) with the "%.*s" formatter we have converted the code to
instead use fwrite(3p), which accepts the length as a `size_t`.

It was reported though that this makes us read over the end of the OID
array when the provided `--abbrev=` length exceeds the length of the
object ID. This is because fwrite(3p) of course doesn't stop when it
sees a NUL byte, whereas printf(3p) does.

Fix the bug by reverting back to printf(3p) and culling the provided
length to `GIT_MAX_HEXSZ` to keep it from overflowing when cast to an
`int`.

Reported-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-10 06:56:54 -08:00
Seyi Kuforiji
43850dcf9c t/unit-tests: convert hash to use clar test framework
Adapt the hash test functions to clar framework by using clar
assertions where necessary. Following the consensus to convert
the unit-tests scripts found in the t/unit-tests folder to clar driven by
Patrick Steinhardt. Test functions are structured as a standalone to
test individual hash string and literal case.

Mentored-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Seyi Kuforiji <kuforiji98@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-09 07:55:00 -08:00
Junio C Hamano
a60673e925 Merge branch 'js/reftable-realloc-errors-fix'
Last-minute fix to a recent update.

* js/reftable-realloc-errors-fix:
  t-reftable-basics: allow for `malloc` to be `#define`d
2025-01-08 14:10:27 -08:00
Johannes Schindelin
d02c37c3e6 t-reftable-basics: allow for malloc to be #defined
As indicated by the `#undef malloc` line in `reftable/basics.h`, it is
quite common to use allocators other than the default one by defining
`malloc` constants and friends.

This pattern is used e.g. in Git for Windows, which uses the powerful
and performant `mimalloc` allocator.

Furthermore, in `reftable/basics.c` this `#undef malloc` is
_specifically_ disabled by virtue of defining the
`REFTABLE_ALLOW_BANNED_ALLOCATORS` constant before including
`reftable/basic.h`, to ensure that such a custom allocator is also used
in the reftable code.

However, in 8db127d43f5b (reftable: avoid leaks on realloc error,
2024-12-28) and in 2cca185e8517 (reftable: fix allocation count on
realloc error, 2024-12-28), `reftable_set_alloc()` function calls were
introduced that pass `malloc`, `realloc` and `free` function pointers as
parameters _after_ `reftable/basics.h` ensured that they were no longer
`#define`d. This would override the custom allocator and re-set it to
the default allocator provided by, say, libc or MSVCRT.

This causes problems because those calls happen after the initial
allocator has already been used to initialize an array, which is
subsequently resized using the overridden default `realloc()` allocator.

You cannot mix and match allocators like that, which leads to a
`STATUS_HEAP_CORRUPTION` (C0000374) on Windows, and when running this
unit test through shell and/or `prove` (which only support 7-bit status
codes), it surfaces as exit code 127.

It is actually unnecessary to use those function pointers to
`malloc`/`realloc`/`free`, though: The `reftable` code goes out of its
way to fall back to the initial allocator when passing `NULL` parameters
instead. So let's do that instead of causing heap corruptions.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Acked-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-08 09:41:52 -08:00
Justin Tobler
ca7158076f fsck: reject misconfigured fsck.skipList
In Git, fsck operations can ignore known broken objects via the
`fsck.skipList` configuration. This option expects a path to a file with
the list of object names. When the configuration is specified without a
path, an error message is printed, but the command continues as if the
configuration was not set. Configuring `fsck.skipList` without a value
is a misconfiguration so config parsing should be more strict and reject
it.

Update `git_fsck_config()` to no longer ignore misconfiguration of
`fsck.skipList`. The same behavior is also present for
`fetch.fsck.skipList` and `receive.fsck.skipList` so the configuration
parsers for these are updated to ensure the related operations remain
consistent.

Signed-off-by: Justin Tobler <jltobler@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-07 09:22:25 -08:00
Patrick Steinhardt
1568d1562e wrapper: allow generating insecure random bytes
The `csprng_bytes()` function generates randomness and writes it into a
caller-provided buffer. It abstracts over a couple of implementations,
where the exact one that is used depends on the platform.

These implementations have different guarantees: while some guarantee to
never fail (arc4random(3)), others may fail. There are two significant
failures to distinguish from one another:

  - Systemic failure, where e.g. opening "/dev/urandom" fails or when
    OpenSSL doesn't have a provider configured.

  - Entropy failure, where the entropy pool is exhausted, and thus the
    function cannot guarantee strong cryptographic randomness.

While we cannot do anything about the former, the latter failure can be
acceptable in some situations where we don't care whether or not the
randomness can be predicted.

Introduce a new `CSPRNG_BYTES_INSECURE` flag that allows callers to opt
into weak cryptographic randomness. The exact behaviour of the flag
depends on the underlying implementation:

    - `arc4random_buf()` never returns an error, so it doesn't change.

    - `getrandom()` pulls from "/dev/urandom" by default, which never
      blocks on modern systems even when the entropy pool is empty.

    - `getentropy()` seems to block when there is not enough randomness
      available, and there is no way of changing that behaviour.

    - `GtlGenRandom()` doesn't mention anything about its specific
      failure mode.

    - The fallback reads from "/dev/urandom", which also returns bytes in
      case the entropy pool is drained in modern Linux systems.

That only leaves OpenSSL with `RAND_bytes()`, which returns an error in
case the returned data wouldn't be cryptographically safe. This function
is replaced with a call to `RAND_pseudo_bytes()`, which can indicate
whether or not the returned data is cryptographically secure via its
return value. If it is insecure, and if the `CSPRNG_BYTES_INSECURE` flag
is set, then we ignore the insecurity and return the data regardless.

It is somewhat questionable whether we really need the flag in the first
place, or whether we wouldn't just ignore the potentially-insecure data.
But the risk of doing that is that we might have or grow callsites that
aren't aware of the potential insecureness of the data in places where
it really matters. So using a flag to opt-in to that behaviour feels
like the more secure choice.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-07 09:04:18 -08:00
Jeff King
ddb5287894 t7407: use test_grep
There are a few grep calls here that can benefit from test_grep, which
produces more user-friendly output when it fails.

One of these calls also passes "-sq", which is curious. The "-q" option
suppresses the matched output. But test output is either already
redirected to /dev/null in non-verbose mode, and in verbose mode it's
better to see the output. The "-s" option suppresses errors opening
files, but we are just grepping in the "expected" file we just
generated, so it should not be needed. Neither of these was really
hurting anything, but they are not a style we'd like to see emulated. So
get rid of them.

(It is also curious to grep in the expected file in the first place, but
that is because we are auto-generating the expectation from a Git
command. So this is double-checking it did what we wanted).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-07 08:31:45 -08:00
Jeff King
164a2516eb test-lib: add a few comments to LSan log checking
Commit b119a687d4 (test-lib: ignore leaks in the sanitizer's thread
code, 2025-01-01) added code to suppress a false positive in the leak
checker. But if you're just reading the code, the obscure grep call is a
bit of a head-scratcher. Let's add a brief comment explaining what's
going on (and anybody digging further can find this commit or that one
for all the details).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-07 08:18:15 -08:00
Jeff King
b9a9df93a3 test-lib: simplify lsan results check
We want to know if there are any leaks logged by LSan in the results
directory, so we run "find" on the containing directory and pipe it to
xargs. We can accomplish the same thing by just globbing in the shell
and passing the result to grep, which has a few advantages:

  - it's one fewer process to run

  - we can glob on the TEST_RESULTS_SAN_FILE pattern, which is what we
    checked at the beginning of the function, and is the same glob used
    to show the logs in check_test_results_san_file_

  - this correctly handles the case where TEST_OUTPUT_DIRECTORY has a
    space in it. For example doing:

       mkdir "/tmp/foo bar"
       TEST_OUTPUT_DIRECTORY="/tmp/foo bar" make SANITIZE=leak test

    would yield a lot of:

      grep: /tmp/foo: No such file or directory
      grep: bar/test-results/t0006-date.leak/trace.test-tool.582311: No such file or directory

    when there are leaks. We could do the same thing with "xargs
    --null", but that isn't portable.

We are now subject to command-line length limits, but that is also true
of the globbing cat used to show the logs themselves. This hasn't been a
problem in practice.

We do need to use "grep -s" for the case that the glob does not expand
(i.e., there are not any log files at all). This option is in POSIX, and
has been used in t7407 for several years without anybody complaining.
This also also naturally handles the case where the surrounding
directory has already been removed (in which case there are likewise no
files!), dropping the need to comment about it.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-07 08:17:54 -08:00
Jeff King
8d24d56ce1 test-lib: invert return value of check_test_results_san_file_empty
We have a function to check whether LSan logged any leaks. It returns
success for no leaks, and non-zero otherwise. This is the simplest thing
for its callers, who want to say "if no leaks then return early". But
because it's implemented as a shell pipeline, you end up with the
awkward:

  ! find ... |
  xargs grep leaks |
  grep -v false-positives

where the "!" is actually negating the final grep. Switch the return
value (and name) to return success when there are leaks. This should
make the code a little easier to read, and the negation in the callers
still reads pretty naturally.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-07 08:09:14 -08:00
Matteo Bagnolini
866ea87703 t7110: replace test -f with test_path_is_* helpers
`test -f` and `! test -f` do not provide clear error messages when they fail.
To enhance debuggability, use `test_path_is_file` and `test_path_is_missing`,
which instead provide more informative error messages.

Note that `! test -f` checks if a path is not a file, while
`test_path_is_missing` verifies that a path does not exist. In this specific
case the tests are meant to check the absence of the path, making
`test_path_is_missing` a valid replacement.

Signed-off-by: Matteo Bagnolini <matteobagnolini2003@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-03 10:35:13 -08:00
Junio C Hamano
1b4e9a5f8b Merge branch 'ps/build-meson-html'
The build procedure based on meson learned to generate HTML
documention pages.

* ps/build-meson-html:
  Documentation: wire up sanity checks for Meson
  t/Makefile: make "check-meson" work with Dash
  meson: install static files for HTML documentation
  meson: generate articles
  Documentation: refactor "howto-index.sh" for out-of-tree builds
  Documentation: refactor "api-index.sh" for out-of-tree builds
  meson: generate user manual
  Documentation: inline user-manual.conf
  meson: generate HTML pages for all man page categories
  meson: fix generation of merge tools
  meson: properly wire up dependencies for our docs
  meson: wire up support for AsciiDoctor
2025-01-02 13:37:08 -08:00
Junio C Hamano
effbef2beb Merge branch 'jk/lsan-race-ignore-false-positive'
CI jobs that run threaded programs under LSan has been giving false
positives from time to time, which has been worked around.

This is an alternative to the jk/lsan-race-with-barrier topic with
much smaller change to the production code.

* jk/lsan-race-ignore-false-positive:
  test-lib: ignore leaks in the sanitizer's thread code
  test-lib: check leak logs for presence of DEDUP_TOKEN
  test-lib: simplify leak-log checking
  test-lib: rely on logs to detect leaks
  Revert barrier-based LSan threading race workaround
2025-01-02 13:37:08 -08:00
Jeff King
b119a687d4 test-lib: ignore leaks in the sanitizer's thread code
Our CI jobs sometimes see false positive leaks like this:

        =================================================================
        ==3904583==ERROR: LeakSanitizer: detected memory leaks

        Direct leak of 32 byte(s) in 1 object(s) allocated from:
            #0 0x7fa790d01986 in __interceptor_realloc ../../../../src/libsanitizer/lsan/lsan_interceptors.cpp:98
            #1 0x7fa790add769 in __pthread_getattr_np nptl/pthread_getattr_np.c:180
            #2 0x7fa790d117c5 in __sanitizer::GetThreadStackTopAndBottom(bool, unsigned long*, unsigned long*) ../../../../src/libsanitizer/sanitizer_common/sanitizer_linux_libcdep.cpp:150
            #3 0x7fa790d11957 in __sanitizer::GetThreadStackAndTls(bool, unsigned long*, unsigned long*, unsigned long*, unsigned long*) ../../../../src/libsanitizer/sanitizer_common/sanitizer_linux_libcdep.cpp:598
            #4 0x7fa790d03fe8 in __lsan::ThreadStart(unsigned int, unsigned long long, __sanitizer::ThreadType) ../../../../src/libsanitizer/lsan/lsan_posix.cpp:51
            #5 0x7fa790d013fd in __lsan_thread_start_func ../../../../src/libsanitizer/lsan/lsan_interceptors.cpp:440
            #6 0x7fa790adc3eb in start_thread nptl/pthread_create.c:444
            #7 0x7fa790b5ca5b in clone3 ../sysdeps/unix/sysv/linux/x86_64/clone3.S:81

This is not a leak in our code, but appears to be a race between one
thread calling exit() while another one is in LSan's stack setup code.
You can reproduce it easily by running t0003 or t5309 with --stress
(these trigger it because of the threading in git-grep and index-pack
respectively).

This may be a bug in LSan, but regardless of whether it is eventually
fixed, it is useful to work around it so that we stop seeing these false
positives.

We can recognize it by the mention of the sanitizer functions in the
DEDUP_TOKEN line. With this patch, the scripts mentioned above should
run with --stress indefinitely.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-01 14:17:05 -08:00
Jeff King
6fb8cb3d68 test-lib: check leak logs for presence of DEDUP_TOKEN
When we check the leak logs, our original strategy was to check for any
non-empty log file produced by LSan. We later amended that to ignore
noisy lines in 370ef7e40d (test-lib: ignore uninteresting LSan output,
2023-08-28).

This makes it hard to ignore noise which is more than a single line;
we'd have to actually parse the file to determine the meaning of each
line.

But there's an easy line-oriented solution. Because we always pass the
dedup_token_length option, the output will contain a DEDUP_TOKEN line
for each leak that has been found. So if we invert our strategy to stop
ignoring useless lines and only look for useful ones, we can just count
the number of DEDUP_TOKEN lines. If it's non-zero, then we found at
least one leak (it would even give us a count of unique leaks, but we
really only care if it is non-zero).

This should yield the same outcome, but will help us build more false
positive detection on top.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-01 14:17:05 -08:00
Jeff King
373a432696 test-lib: simplify leak-log checking
We have a function to count the number of leaks found (actually, it is
the number of processes which produced a log file). Once upon a time we
cared about seeing if this number increased between runs. But we
simplified that away in 95c679ad86 (test-lib: stop showing old leak
logs, 2024-09-24), and now we only care if it returns any results or
not.

In preparation for refactoring it further, let's drop the counting
function entirely, and roll it into the "is it empty" check. The outcome
should be the same, but we'll be free to return a boolean "did we find
anything" without worrying about somebody adding a new call to the
counting function.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-01 14:17:05 -08:00
Jeff King
5fa0c4dd29 test-lib: rely on logs to detect leaks
When we run with sanitizers, we set abort_on_error=1 so that the tests
themselves can detect problems directly (when the buggy program exits
with SIGABRT). This has one blind spot, though: we don't always check
the exit codes for all programs (e.g., helpers like upload-pack invoked
behind the scenes).

For ASan and UBSan this is mostly fine; they exit as soon as they see an
error, so the unexpected abort of the program causes the test to fail
anyway.

But for LSan, the program runs to completion, since we can only check
for leaks at the end. And in that case we could miss leak reports. And
thus we started checking LSan logs in faececa53f (test-lib: have the
"check" mode for SANITIZE=leak consider leak logs, 2022-07-28).
Originally the logs were optional, but logs are generated (and checked)
always as of 8c1d6691bc (test-lib: GIT_TEST_SANITIZE_LEAK_LOG enabled by
default, 2024-07-11). And we even check them for each test snippet, as
of cf1464331b (test-lib: check for leak logs after every test,
2024-09-24).

So now aborting on error is superfluous for LSan! We can get everything
we need by checking the logs. And checking the logs is actually
preferable, since it gives us more control over silencing false
positives (something we do not yet do, but will soon).

So let's tell LSan to just exit normally, even if it finds leaks. We can
do so with exitcode=0, which also suppresses the abort_on_error flag.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-01-01 14:17:05 -08:00
Junio C Hamano
d893741e02 Merge branch 'jk/lsan-race-with-barrier'
CI jobs that run threaded programs under LSan has been giving false
positives from time to time, which has been worked around.

* jk/lsan-race-with-barrier:
  grep: work around LSan threading race with barrier
  index-pack: work around LSan threading race with barrier
  thread-utils: introduce optional barrier type
  Revert "index-pack: spawn threads atomically"
  test-lib: use individual lsan dir for --stress runs
2025-01-01 09:21:15 -08:00
Junio C Hamano
73e35b172a Merge branch 'rs/reftable-realloc-errors'
The custom allocator code in the reftable library did not handle
failing realloc() very well, which has been addressed.

* rs/reftable-realloc-errors:
  t-reftable-merged: handle realloc errors
  reftable: handle realloc error in parse_names()
  reftable: fix allocation count on realloc error
  reftable: avoid leaks on realloc error
2025-01-01 09:21:13 -08:00
Junio C Hamano
e1d34f36ea Merge branch 'ms/t7611-test-path-is-file'
Test modernization.

* ms/t7611-test-path-is-file:
  t7611: replace test -f with test_path_is* helpers
2024-12-30 06:56:28 -08:00
Jeff King
d601aee605 test-lib: use individual lsan dir for --stress runs
When storing output in test-results/, we usually give each numbered run
in a --stress set its own output file. But we don't do that for storing
LSan logs, so something like:

  ./t0003-attributes.sh --stress

will have many scripts simultaneously creating, writing to, and deleting
the test-results/t0003-attributes.leak directory. This can cause logs
from one run to be attributed to another, spurious failures when
creation and deletion race, and so on.

This has always been broken, but nobody noticed because it's rare to do
a --stress run with LSan (since the point is for the code to run quickly
many times in order to hit races). But if you're trying to find a race
in the leak sanitizing code, it makes sense to use these together.

We can fix it by using $TEST_RESULTS_BASE, which already incorporates
the stress job suffix.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-12-30 06:18:57 -08:00
René Scharfe
1e78120928 t-reftable-merged: handle realloc errors
Check reallocation errors in unit tests, like everywhere else.

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-12-28 08:00:45 -08:00
René Scharfe
2cca185e85 reftable: fix allocation count on realloc error
When realloc(3) fails, it returns NULL and keeps the original allocation
intact.  REFTABLE_ALLOC_GROW overwrites both the original pointer and
the allocation count variable in that case, simultaneously leaking the
original allocation and misrepresenting the number of storable items.

parse_names() avoids the leak by keeping the original pointer if
reallocation fails, but still increase the allocation count in such a
case as if it succeeded.  That's OK, because the error handling code
just frees everything and doesn't look at names_cap anymore.

reftable_buf_add() does the same, but here it is a problem as it leaves
the reftable_buf in a broken state, with ->alloc being roughly twice as
big as the actually allocated memory, allowing out-of-bounds writes in
subsequent calls.

Reimplement REFTABLE_ALLOC_GROW to avoid leaks, keep allocation counts
in sync and still signal failures to callers while avoiding code
duplication in callers.  Make it an expression that evaluates to 0 if no
reallocation is needed or it succeeded and 1 on failure while keeping
the original pointer and allocation counter values.

Adjust REFTABLE_ALLOC_GROW_OR_NULL to the new calling convention for
REFTABLE_ALLOC_GROW, but keep its support for non-size_t alloc variables
for now.

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-12-28 08:00:44 -08:00
René Scharfe
8db127d43f reftable: avoid leaks on realloc error
When realloc(3) fails, it returns NULL and keeps the original allocation
intact.  REFTABLE_ALLOC_GROW overwrites both the original pointer and
the allocation count variable in that case, simultaneously leaking the
original allocation and misrepresenting the number of storable items.

parse_names() and reftable_buf_add() avoid leaking by restoring the
original pointer value on failure, but all other callers seem to be OK
with losing the old allocation.  Add a new variant of the macro,
REFTABLE_ALLOC_GROW_OR_NULL, which plugs the leak and zeros the
allocation counter.  Use it for those callers.

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-12-28 08:00:44 -08:00
Patrick Steinhardt
d8af27d309 t/Makefile: make "check-meson" work with Dash
The "check-meson" target uses process substitution to check whether
extracted contents from "meson.build" match expected contents. Process
substitution is unportable though and thus the target will fail when
using for example Dash.

Fix this by writing data into a temporary directory.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-12-27 08:28:11 -08:00
Patrick Steinhardt
cbcc2f7911 GIT-BUILD-OPTIONS: wire up NO_GITWEB option
Building our "gitweb" interface is optional in our Makefile and in Meson
and not wired up at all with CMake, but disabling it causes a couple of
tests in the t950* range that pull in "t/lib-gitweb.sh". This is because
the test library knows to execute gitweb-tests based on whether or not
Perl is available, but we may have Perl available and still end up not
building gitweb e.g. with `make test NO_GITWEB=YesPlease`.

Fix this issue by wiring up a new "NO_GITWEB" build option so that we
can skip these tests in case gitweb is not built.

Note that this new build option requires us to move the configuration of
GIT-BUILD-OPTIONS to a later point in our Meson build instructions. But
as that file is only consumed by our tests at runtime this change does
not cause any issues.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-12-27 08:17:19 -08:00
Meet Soni
cef3d4a89f t7611: replace test -f with test_path_is* helpers
Replace `test -f` and `test ! -f` with `test_path_is_file` and
`test_path_is_missing` for better debuggability.

While `test -f` ensures that the file exists and is a regular file,
`test_path_is_file` provides clearer error messages on failure.

On the other hand, `test ! -f` checks either the absence of a regular
file or the presence of any other filesystem object, but looking at
them in the test individually, all of them should've said `test ! -e`,
i.e. "there shouldn't be anything at given path on filesystem."

Replace these cases with `test_path_is_missing` for better
debuggability.

Helped-by: karthik nayak <karthik.188@gmail.com>
Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Meet Soni <meetsoni3017@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-12-27 08:13:59 -08:00
Patrick Steinhardt
5e7fe8a7b8 commit-reach: use size_t to track indices when computing merge bases
The functions `repo_get_merge_bases_many()` and friends accepts an array
of commits as well as a parameter that indicates how large that array
is. This parameter is using a signed integer, which leads to a couple of
warnings with -Wsign-compare.

Refactor the code to use `size_t` to track indices instead and adapt
callers accordingly. While most callers are trivial, there are two
callers that require a bit more scrutiny:

  - builtin/merge-base.c:show_merge_base() subtracts `1` from the
    `rev_nr` before calling `repo_get_merge_bases_many_dirty()`, so if
    the variable was `0` it would wrap. This code is fine though because
    its only caller will execute that code only when `argc >= 2`, and it
    follows that `rev_nr >= 2`, as well.

  - bisect.ccheck_merge_bases() similarly subtracts `1` from `rev_nr`.
    Again, there is only a single caller that populates `rev_nr` with
    `good_revs.nr`. And because a bisection always requires at least one
    good revision it follws that `rev_nr >= 1`.

Mark the file as -Wsign-compare-clean.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-12-27 08:12:40 -08:00
Junio C Hamano
6f8ae955bd Merge branch 'kn/reflog-migration'
"git refs migrate" learned to also migrate the reflog data across
backends.

* kn/reflog-migration:
  refs: mark invalid refname message for translation
  refs: add support for migrating reflogs
  refs: allow multiple reflog entries for the same refname
  refs: introduce the `ref_transaction_update_reflog` function
  refs: add `committer_info` to `ref_transaction_add_update()`
  refs: extract out refname verification in transactions
  refs/files: add count field to ref_lock
  refs: add `index` field to `struct ref_udpate`
  refs: include committer info in `ref_update` struct
2024-12-23 09:32:29 -08:00
Junio C Hamano
83c8f76235 Merge branch 'ps/ci-meson'
The meson-build procedure is integrated into CI to catch and
prevent bitrotting.

* ps/ci-meson:
  ci: wire up Meson builds
  t: introduce compatibility options to clar-based tests
  t: fix out-of-tree tests for some git-p4 tests
  Makefile: detect missing Meson tests
  meson: detect missing tests at configure time
  t/unit-tests: rename clar-based unit tests to have a common prefix
  Makefile: drop -DSUPPRESS_ANNOTATED_LEAKS
  ci/lib: support custom output directories when creating test artifacts
2024-12-23 09:32:25 -08:00
Junio C Hamano
88e59f8027 Merge branch 'js/range-diff-diff-merges'
"git range-diff" learned to optionally show and compare merge
commits in the ranges being compared, with the --diff-merges
option.

* js/range-diff-diff-merges:
  range-diff: introduce the convenience option `--remerge-diff`
  range-diff: optionally include merge commits' diffs in the analysis
2024-12-23 09:32:17 -08:00
Junio C Hamano
002a8a9d36 Merge branch 'as/show-index-uninitialized-hash'
Regression fix for 'show-index' when run outside of a repository.

* as/show-index-uninitialized-hash:
  t5300: add test for 'show-index --object-format'
  show-index: fix uninitialized hash function
2024-12-23 09:32:12 -08:00
Junio C Hamano
4156b6a741 Merge branch 'ps/build-sign-compare'
Start working to make the codebase buildable with -Wsign-compare.

* ps/build-sign-compare:
  t/helper: don't depend on implicit wraparound
  scalar: address -Wsign-compare warnings
  builtin/patch-id: fix type of `get_one_patchid()`
  builtin/blame: fix type of `length` variable when emitting object ID
  gpg-interface: address -Wsign-comparison warnings
  daemon: fix type of `max_connections`
  daemon: fix loops that have mismatching integer types
  global: trivial conversions to fix `-Wsign-compare` warnings
  pkt-line: fix -Wsign-compare warning on 32 bit platform
  csum-file: fix -Wsign-compare warning on 32-bit platform
  diff.h: fix index used to loop through unsigned integer
  config.mak.dev: drop `-Wno-sign-compare`
  global: mark code units that generate warnings with `-Wsign-compare`
  compat/win32: fix -Wsign-compare warning in "wWinMain()"
  compat/regex: explicitly ignore "-Wsign-compare" warnings
  git-compat-util: introduce macros to disable "-Wsign-compare" warnings
2024-12-23 09:32:11 -08:00
Junio C Hamano
f7c607fac3 Merge branch 'kn/reftable-writer-log-write-verify'
Reftable backend adds check for upper limit of log's update_index.

* kn/reftable-writer-log-write-verify:
  reftable/writer: ensure valid range for log's update_index
2024-12-23 09:32:08 -08:00
Junio C Hamano
1df37ef81a Merge branch 'tc/bundle-with-tag-remove-workaround'
"git bundle create" with an annotated tag on the positive end of
the revision range had a workaround code for older limitation in
the revision walker, which has become unnecessary.

* tc/bundle-with-tag-remove-workaround:
  bundle: remove unneeded code
2024-12-19 10:58:34 -08:00
Junio C Hamano
cb89eebf3b Merge branch 'js/log-remerge-keep-ancestry'
"git log -p --remerge-diff --reverse" was completely broken.

* js/log-remerge-keep-ancestry:
  log: --remerge-diff needs to keep around commit parents
2024-12-19 10:58:31 -08:00