eyecatchu/git - git - KM Solution Bank Gitea

mirror of https://github.com/git/git.git synced 2026-01-11 13:23:12 +09:00

Author	SHA1	Message	Date
Junio C Hamano	88134a8417	Merge branch 'ds/path-walk-2' "git pack-objects" learns to find delta bases from blobs at the same path, using the --path-walk API. * ds/path-walk-2: pack-objects: allow --shallow and --path-walk path-walk: add new 'edge_aggressive' option pack-objects: thread the path-based compression pack-objects: refactor path-walk delta phase scalar: enable path-walk during push via config pack-objects: enable --path-walk via config repack: add --path-walk option t5538: add tests to confirm deltas in shallow pushes pack-objects: introduce GIT_TEST_PACK_PATH_WALK p5313: add performance tests for --path-walk pack-objects: update usage to match docs pack-objects: add --path-walk option pack-objects: extract should_attempt_deltas()	2025-06-17 10:44:38 -07:00
Derrick Stolee	861d4bc292	pack-objects: introduce GIT_TEST_PACK_PATH_WALK There are many tests that validate whether 'git pack-objects' works as expected. Instead of duplicating these tests, add a new test environment variable, GIT_TEST_PACK_PATH_WALK, that implies --path-walk by default when specified. This was useful in testing the implementation of the --path-walk implementation, helping to find tests that are overly specific to the default object walk. These include: - t0411-clone-from-partial.sh : One test fetches from a repo that does not have the boundary objects. This causes the path-based walk to fail. Disable the variable for this test. - t5306-pack-nobase.sh : Similar to t0411, one test fetches from a repo without a boundary object. - t5310-pack-bitmaps.sh : One test compares the case when packing with bitmaps to the case when packing without them. Since we disable the test variable when writing bitmaps, this causes a difference in the object list (the --path-walk option adds an extra object). Specify --no-path-walk in both processes for the comparison. Another test checks for a specific delta base, but when computing dynamically without using bitmaps, the base object it too small to be considered in the delta calculations so no base is used. - t5316-pack-delta-depth.sh : This script cares about certain delta choices and their chain lengths. The --path-walk option changes how these chains are selected, and thus changes the results of this test. - t5322-pack-objects-sparse.sh : This demonstrates the effectiveness of the --sparse option and how it combines with --path-walk. - t5332-multi-pack-reuse.sh : This test verifies that the preferred pack is used for delta reuse when possible. The --path-walk option is not currently aware of the preferred pack at all, so finds a different delta base. - t7406-submodule-update.sh : When using the variable, the --depth option collides with the --path-walk feature, resulting in a warning message. Disable the variable so this warning does not appear. I want to call out one specific test change that is only temporary: - t5530-upload-pack-error.sh : One test cares specifically about an "unable to read" error message. Since the current implementation performs delta calculations within the path-walk API callback, a different "unable to get size" error message appears. When this is changed in a future refactoring, this test change can be reverted. Similar to GIT_TEST_NAME_HASH_VERSION, we do not add this option to the linux-TEST-vars CI build as that's already an overloaded build. Signed-off-by: Derrick Stolee <stolee@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2025-05-16 12:15:39 -07:00
Patrick Steinhardt	88bef8db84	t5316: refactor `max_chain()` to not depend on Perl The `max_chain()` helper function is used to extract the maximum delta chain of a packfile as printed by git-index-pack(1). The script uses Perl to extract that data, but it can be trivially refactored to use awk(1) instead. Refactor the helper accordingly so that we can drop a couple of PERL_TEST_HELPERS prerequisites. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2025-04-07 14:47:41 -07:00
Patrick Steinhardt	23e21a58d5	t: introduce PERL_TEST_HELPERS prerequisite In the early days of Git, Perl was used quite prominently throughout the project. This has changed significantly as almost all of the executables we ship nowadays have eventually been rewritten in C. Only a handful of subsystems remain that require Perl: - gitweb, a read-only web interface. - A couple of scripts that allow importing repositories from GNU Arch, CVS and Subversion. - git-send-email(1), which can be used to send mails. - git-request-pull(1), which is used to request somebody to pull from a URL by sending an email. - git-filter-branch(1), which uses Perl with the `--state-branch` option. This command is typically recommended against nowadays in favor of git-filter-repo(1). - Our Perl bindings for Git. - The netrc Git credential helper. None of these subsystems can really be considered to be part of the "core" of Git, and an installation without them is fully functional. It is more likely than not that an end user wouldn't even notice that any features are missing if those tools weren't installed. But while Perl nowadays very much is an optional dependency of Git, there is a significant limitation when Perl isn't available: developers cannot run our test suite. Preceding commits have started to lift this restriction by removing the strict dependency on Perl in many central parts of the test library. But there are still many tests that rely on small Perl helpers to do various different things. Introduce a new PERL_TEST_HELPERS prerequisite that guards all tests that require Perl. This prerequisite is explicitly different than the preexisting PERL prerequisite: - PERL records whether or not features depending on the Perl interpreter are built. - PERL_TEST_HELPERS records whether or not a Perl interpreter is available for our tests. By having these two separate prerequisites we can thus distinguish between tests that inherently depend on Perl because the underlying feature does, and those tests that depend on Perl because the test itself is using Perl. Adapt all tests to set the PERL_TEST_HELPERS prerequisite as needed. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2025-04-07 14:47:37 -07:00
Patrick Steinhardt	fc1ddf42af	t: remove TEST_PASSES_SANITIZE_LEAK annotations Now that the default value for TEST_PASSES_SANITIZE_LEAK is `true` there is no longer a need to have that variable declared in all of our tests. Drop it. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-11-21 08:23:48 +09:00
Ævar Arnfjörð Bjarmason	e75d2f7f73	revisions API: have release_revisions() release "filter" Extend the the release_revisions() function so that it frees the "filter" in the "struct rev_info". This in combination with a preceding change to free "cmdline" means that we can mark another set of tests as passing under "TEST_PASSES_SANITIZE_LEAK=true". The "filter" member was added recently in ffaa137f646 (revision: put object filter into struct rev_info, 2022-03-09), and this fixes leaks intruded in the subsequent leak 7940941de1f (pack-objects: use rev.filter when possible, 2022-03-09) and 105c6f14ad3 (bundle: parse filter capability, 2022-03-09). The "builtin/pack-objects.c" leak in 7940941de1f was effectively with us already, but the variable was referred to by a "static" file-scoped variable. The "bundle.c " leak in 105c6f14ad3 was newly introduced with the new "filter" feature for bundles. The "t5600-clone-fail-cleanup.sh" change here to add "TEST_PASSES_SANITIZE_LEAK=true" is one of the cases where run-command.c in not carrying the abort() exit code upwards would have had that test passing before, but now it actually passes[1]. We should fix the lack of 1=1 mapping of SANITIZE=leak testing to actual leaks some other time, but it's an existing edge case, let's just mark the really-passing test as passing for now. 1. https://lore.kernel.org/git/220303.86fsnz5o9w.gmgdl@evledraar.gmail.com/ Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-04-13 23:56:09 -07:00
Junio C Hamano	a47fcfe871	Merge branch 'ab/only-single-progress-at-once' Further tweaks on progress API. * ab/only-single-progress-at-once: pack-bitmap-write.c: don't return without stop_progress() progress API: unify stop_progress{,_msg}(), fix trace2 bug progress.c: refactor stop_progress{,_msg}() to use helpers progress.c: use dereferenced "progress" variable, not "(*p_progress)" progress.h: format and be consistent with progress.c naming progress.c tests: test some invalid usage progress.c tests: make start/stop commands on stdin progress.c test helper: add missing braces leak tests: fix a memory leak in "test-progress" helper	2022-02-25 15:47:35 -08:00
Ævar Arnfjörð Bjarmason	74900a6b35	progress API: unify stop_progress{,_msg}(), fix trace2 bug Fix a bug that's been with us ever since 98a13647408 (trace2: log progress time and throughput, 2020-05-12), when the stop_progress_msg() API was used we didn't log a "region_leave" for the "region_enter" we start in "start_progress_delay()". The only user of the "stop_progress_msg()" function is "index-pack". Let's add a previously failing test to check that we have the same number of "region_enter" and "region_leave" events, with "-v" we'll log progress even in the test environment. In addition to that we've had a submarine bug here introduced with 9d81ecb52b5 (progress: add sparse mode to force 100% complete message, 2019-03-21). The "start_sparse_progress()" API would only do the right thing if the progress was ended with "stop_progress()", not "stop_progress_msg()". The only user of that API uses "stop_progress()", but let's still fix that along with the trace2 issue by making "stop_progress()" a trivial wrapper for "stop_progress_msg()". We can also drop the "if (progress)" test from "finish_if_sparse()". It's now a helper for the small "stop_progress_msg()" function. We'll already have returned from it if "progress" is "NULL". Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-02-03 15:39:59 -08:00
Eric Sunshine	7abcbcb7ea	tests: fix broken &&-chains in `{...}` groups The top-level &&-chain checker built into t/test-lib.sh causes tests to magically exit with code 117 if the &&-chain is broken. However, it has the shortcoming that the magic does not work within `{...}` groups, `(...)` subshells, `$(...)` substitutions, or within bodies of compound statements, such as `if`, `for`, `while`, `case`, etc. `chainlint.sed` partly fills in the gap by catching broken &&-chains in `(...)` subshells, but bugs can still lurk behind broken &&-chains in the other cases. Fix broken &&-chains in `{...}` groups in order to reduce the number of possible lurking bugs. Signed-off-by: Eric Sunshine <sunshine@sunshineco.com> Reviewed-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-12-13 10:29:48 -08:00
Jeff King	6d52b6a5df	pack-objects: clamp negative depth to 0 A negative delta depth makes no sense, and the code is not prepared to handle it. If passed "--depth=-1" on the command line, then this line from break_delta_chains(): cur->depth = (total_depth--) % (depth + 1); triggers a divide-by-zero. This is undefined behavior according to the C standard, but on POSIX systems results in SIGFPE killing the process. This is certainly one way to inform the use that the command was invalid, but it's a bit friendlier to just treat it as "don't allow any deltas", which we already do for --depth=0. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-05-03 14:30:46 +09:00
Jeff King	49ac1d33bb	t5316: check behavior of pack-objects --depth=0 We'd expect this to cleanly produce no deltas at all (as opposed to getting confused by an out-of-bounds value), and it does. Note we have to adjust our max_chain test helper, which expected to find at least one delta. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-05-03 14:29:56 +09:00
Ævar Arnfjörð Bjarmason	1108cea7f8	tests: remove most uses of test_i18ncmp As a follow-up to d162b25f956 (tests: remove support for GIT_TEST_GETTEXT_POISON, 2021-01-20) remove most uses of test_i18ncmp via a simple s/test_i18ncmp/test_cmp/g search-replacement. I'm leaving t6300-for-each-ref.sh out due to a conflict with in-flight changes between "master" and "seen", as well as the prerequisite itself due to other changes between "master" and "next/seen" which add new test_i18ncmp uses. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-02-10 23:48:27 -08:00
Nguyễn Thái Ngọc Duy	c680668d1a	t/helper: merge test-genrandom into test-tool Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-03-27 08:45:47 -07:00
Ævar Arnfjörð Bjarmason	0d75bfe67b	tests: fix tests broken under GETTEXT_POISON=YesPlease The GETTEXT_POISON=YesPlease compile-time testing option added in my bb946bba76 ("i18n: add GETTEXT_POISON to simulate unfriendly translator", 2011-02-22) has been slowly bitrotting as strings have been marked for translation, and new tests have been added without running it. I brought this up on the list ("[BUG] test suite broken with GETTEXT_POISON=YesPlease", [1]) asking whether this mode was useful at all anymore. At least one person occasionally uses it, and Lars Schneider offered to change one of the the Travis builds to run in this mode, so fix up the failing ones. My test setup runs most of the tests, with the notable exception of skipping all the p4 tests, so it's possible that there's still some lurking regressions I haven't fixed. 1. <CACBZZX62+acvi1dpkknadTL827mtCm_QesGSZ=6+UnyeMpg8+Q@mail.gmail.com> Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-05-11 18:44:38 +09:00
Jeff King	7dbabbbebe	pack-objects: enforce --depth limit in reused deltas Since 898b14c (pack-objects: rework check_delta_limit usage, 2007-04-16), we check the delta depth limit only when figuring out whether we should make a new delta. We don't consider it at all when reusing deltas, which means that packing once with --depth=250, and then again with --depth=50, the second pack may still contain chains larger than 50. This is generally considered a feature, as the results of earlier high-depth repacks are carried forward, used for serving fetches, etc. However, since we started using cross-pack deltas in c9af708b1 (pack-objects: use mru list when iterating over packs, 2016-08-11), we are no longer bounded by the length of an existing delta chain in a single pack. Here's one particular pathological case: a sequence of N packs, each with 2 objects, the base of which is stored as a delta in a previous pack. If we chain all the deltas together, we have a cycle of length N. We break the cycle, but the tip delta is still at depth N-1. This is less unlikely than it might sound. See the included test for a reconstruction based on real-world actions. I ran into such a case in the wild, where a client was rapidly sending packs, and we had accumulated 10,000 before doing a server-side repack. The pack that "git repack" tried to generate had a very deep chain, which caused pack-objects to run out of stack space in the recursive write_one(). This patch bounds the length of delta chains in the output pack based on --depth, regardless of whether they are caused by cross-pack deltas or existed in the input packs. This fixes the problem, but does have two possible downsides: 1. High-depth aggressive repacks followed by "normal" repacks will throw away the high-depth chains. In the long run this is probably OK; investigation showed that high-depth repacks aren't actually beneficial, and we dropped the aggressive depth default to match the normal case in 07e7dbf0d (gc: default aggressive depth to 50, 2016-08-11). 2. If you really do want to store high-depth deltas on disk, they may be discarded and new delta computed when serving a fetch, unless you set pack.depth to match your high-depth size. The implementation uses the existing search for delta cycles. That lets us compute the depth of any node based on the depth of its base, because we know the base is DFS_DONE by the time we look at it (modulo any cycles in the graph, but we know there cannot be any because we break them as we see them). There is some subtlety worth mentioning, though. We record the depth of each object as we compute it. It might seem like we could save the per-object storage space by just keeping track of the depth of our traversal (i.e., have break_delta_chains() report how deep it went). But we may visit an object through multiple delta paths, and on subsequent paths we want to know its depth immediately, without having to walk back down to its final base (doing so would make our graph walk quadratic rather than linear). Likewise, one could try to record the depth not from the base, but from our starting point (i.e., start recursion_depth at 0, and pass "recursion_depth + 1" to each invocation of break_delta_chains()). And then when recursion_depth gets too big, we know that we must cut the delta chain. But that technique is wrong if we do not visit the nodes in topological order. In a chain A->B->C, it if we visit "C", then "B", then "A", we will never recurse deeper than 1 link (because we see at each node that we have already visited it). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-01-27 16:24:44 -08:00

15 Commits