eyecatchu/git - git - KM Solution Bank Gitea

mirror of https://github.com/git/git.git synced 2026-01-21 14:27:19 +09:00

Author	SHA1	Message	Date
Patrick Steinhardt	e5901df2f3	odb: drop unused `for_each_{loose,packed}_object()` functions We have converted all callers of `for_each_loose_object()` and `for_each_packed_object()` to use their new replacement functions instead. We can thus remove them now. Do so and inline `packfile_store_for_each_object_internal()` now that it only has a single callsite again. This makes it a bit easier to follow the callback indirection that is happening there. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2026-01-15 05:50:29 -08:00
Patrick Steinhardt	9a9a207cb3	reachable: convert to use `odb_for_each_object()` To figure out which objects expired objects we enumerate all loose and packed objects individually so that we can figure out their respective mtimes. Refactor the code to instead use `odb_for_each_object()` with a request that ask for the object mtime instead. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2026-01-15 05:50:29 -08:00
Patrick Steinhardt	b591d25a5d	builtin/pack-objects: use `packfile_store_for_each_object()` When enumerating objects that are supposed to be stored in a new cruft pack we use `for_each_packed_object()` and then derive each object's mtime individually. Refactor this logic to instead use the new `packfile_store_for_each_object()` function with an object info request that asks for the respective mtimes. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2026-01-15 05:50:29 -08:00
Patrick Steinhardt	16a2043a67	odb: introduce mtime fields for object info requests There are some use cases where we need to figure out the mtime for objects. Most importantly, this is the case when we want to prune unreachable objects. But getting at that data requires users to manually derive the info either via the loose object's mtime, the packfiles' mtime or via the ".mtimes" file. Introduce a new `struct object_info::mtimep` pointer that allows callers to request an object's mtime. This new field will be used in a subsequent commit. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2026-01-15 05:50:29 -08:00
Patrick Steinhardt	fe718f8e98	treewide: drop uses of `for_each_{loose,packed}_object()` We're using `for_each_loose_object()` and `for_each_packed_object()` at a couple of callsites to enumerate all loose and packed objects, respectively. These functions will be removed in a subsequent commit in favor of the newly introduced `odb_source_loose_for_each_object()` and `packfile_store_for_each_object()` replacements. Prepare for this by refactoring the sites accordingly. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2026-01-15 05:50:29 -08:00
Patrick Steinhardt	98f6927c60	treewide: enumerate promisor objects via `odb_for_each_object()` We have multiple callsites where we enumerate all promisor objects in the object database via `for_each_packed_object()`. This is done by passing the `ODB_FOR_EACH_OBJECT_PROMISOR_ONLY` flag, which causes us to skip over all non-promisor objects. These callsites can be trivially converted to `odb_for_each_object()` as we know to skip enumeration of loose objects in case the `PROMISOR_ONLY` flag was passed by the caller. Refactor the sites accordingly. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2026-01-15 05:50:28 -08:00
Patrick Steinhardt	7305e6d60e	builtin/fsck: refactor to use `odb_for_each_object()` In git-fsck(1) we have two callsites where we iterate over all objects via `for_each_loose_object()` and `for_each_packed_object()`. Both of these are trivially convertible with `odb_for_each_object()`. Refactor these callsites accordingly. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2026-01-15 05:50:28 -08:00
Patrick Steinhardt	85f8f1c2fa	odb: introduce `odb_for_each_object()` Introduce a new function `odb_for_each_object()` that knows to iterate through all objects part of a given object database. This function is essentially a simple wrapper around the object database sources. Subsequent commits will adapt callers to use this new function. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2026-01-15 05:50:28 -08:00
Patrick Steinhardt	b2aa629132	packfile: introduce function to iterate through objects Introduce a new function `packfile_store_for_each_object()`. This function is the equivalent to `odb_source_loose_for_each_object()` in that it: - Works on a single packfile store and thus per object source. - Passes a `struct object_info` to the callback function. As such, it provides the same callback interface as we already provide for loose objects now. These functions will be used in a subsequent step to implement `odb_for_each_object()`. The `for_each_packed_object()` function continues to exist for now, but it will be removed at the end of this patch series. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2026-01-15 05:50:28 -08:00
Patrick Steinhardt	900d897436	packfile: extract function to iterate through objects of a store In the next commit we're about to introduce a new function that knows to iterate through objects of a given packfile store. Same as with the equivalent function for loose objects, this new function will also be agnostic of backends by using a `struct object_info`. Prepare for this by extracting a new shared function to iterate through a single packfile store. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2026-01-15 05:50:28 -08:00
Patrick Steinhardt	b8578bfc57	object-file: introduce function to iterate through objects We have multiple divergent interfaces to iterate through objects of a specific backend: - `for_each_loose_object()` yields all loose objects. - `for_each_packed_object()` (somewhat obviously) yields all packed objects. These functions have different function signatures, which makes it hard to create a common abstraction layer that covers both of these. Introduce a new function `odb_source_loose_for_each_object()` to plug this gap. This function doesn't take any data specific to loose objects, but instead it accepts a `struct object_info` that will be populated the exact same as if `odb_source_loose_read_object()` was called. The benefit of this new interface is that we can continue to pass backend-specific data, as `struct object_info` contains a union for these exact use cases. This will allow us to unify how we iterate through objects across both loose and packed objects in a subsequent commit. The `for_each_loose_object()` function continues to exist for now, but it will be removed at the end of this patch series. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2026-01-15 05:50:28 -08:00
Patrick Steinhardt	46732b8ee6	object-file: extract function to read object info from path Extract a new function that allows us to read object info for a specific loose object via a user-supplied path. This function will be used in a subsequent commit. Note that this also allows us to drop `stat_loose_object()`, which is a simple wrapper around `odb_loose_path()` plus lstat(3p). Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2026-01-15 05:50:28 -08:00
Patrick Steinhardt	710c9c431e	odb: fix flags parameter to be unsigned The `flags` parameter accepted by various `for_each_object()` functions is a bitfield of multiple flags. Such parameters are typically unsigned in the Git codebase, but we use `enum odb_for_each_object_flags` in some places. Adapt these function signatures to use the correct type. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2026-01-15 05:50:27 -08:00
Patrick Steinhardt	fa087f57c7	odb: rename `FOR_EACH_OBJECT_` flags Rename the `FOR_EACH_OBJECT_` flags to have an `ODB_` prefix. This prepares us for a new upcoming `odb_for_each_object()` function and ensures that both the function and its flags have the same prefix. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2026-01-15 05:50:27 -08:00
Junio C Hamano	ec16dde5c8	Merge branch 'ps/packfile-store-in-odb-source' into ps/odb-for-each-object * ps/packfile-store-in-odb-source: packfile: move MIDX into packfile store packfile: refactor `find_pack_entry()` to work on the packfile store packfile: inline `find_kept_pack_entry()` packfile: only prepare owning store in `packfile_store_prepare()` packfile: only prepare owning store in `packfile_store_get_packs()` packfile: move packfile store into object source packfile: refactor misleading code when unusing pack windows packfile: refactor kept-pack cache to work with packfile stores packfile: pass source to `prepare_pack()` packfile: create store via its owning source odb: properly close sources before freeing them builtin/gc: fix condition for whether to write commit graphs	2026-01-15 05:50:16 -08:00
Junio C Hamano	c8e1706e8d	Merge branch 'ps/read-object-info-improvements' into ps/odb-for-each-object * ps/read-object-info-improvements: packfile: drop repository parameter from `packed_object_info()` packfile: skip unpacking object header for disk size requests packfile: disentangle return value of `packed_object_info()` packfile: always populate pack-specific info when reading object info packfile: extend `is_delta` field to allow for "unknown" state packfile: always declare object info to be OI_PACKED object-file: always set OI_LOOSE when reading object info	2026-01-15 05:47:47 -08:00
Patrick Steinhardt	12d3b58b55	packfile: drop repository parameter from `packed_object_info()` The function `packed_object_info()` takes a packfile and offset and returns the object info for the corresponding object. Despite these two parameters though it also takes a repository pointer. This is redundant information though, as `struct packed_git` already has a repository pointer that is always populated. Drop the redundant parameter. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2026-01-12 06:51:15 -08:00
Patrick Steinhardt	5ff29698e0	packfile: skip unpacking object header for disk size requests While most of the object info requests for a packed object require us to unpack its headers, reading its disk size doesn't. We still unpack the object header in that case though, which is unnecessary work. Skip reading the header if only the disk size is requested. This leads to a small speedup when reading disk size, only. The following benchmark was done in the Git repository: Benchmark 1: ./git rev-list --disk-usage HEAD (rev = HEAD~) Time (mean ± σ): 105.2 ms ± 0.6 ms [User: 91.4 ms, System: 13.3 ms] Range (min … max): 103.7 ms … 106.0 ms 27 runs Benchmark 2: ./git rev-list --disk-usage HEAD (rev = HEAD) Time (mean ± σ): 96.7 ms ± 0.4 ms [User: 86.2 ms, System: 10.0 ms] Range (min … max): 96.2 ms … 98.1 ms 30 runs Summary ./git rev-list --disk-usage HEAD (rev = HEAD) ran 1.09 ± 0.01 times faster than ./git rev-list --disk-usage HEAD (rev = HEAD~) Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2026-01-12 06:51:14 -08:00
Patrick Steinhardt	8908c303da	packfile: disentangle return value of `packed_object_info()` The `packed_object_info()` function returns the type of the packed object. While we use an `enum object_type` to store the return value, this type is not to be confused with the actual object type. It _may_ contain the object type, but it may just as well encode that the given packed object is stored as a delta. We have removed the only caller that relied on this returned object type in the preceding commit, so let's simplify semantics and return either 0 on success or a negative error code otherwise. This unblocks a small optimization where we can skip reading the object type altogether. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2026-01-12 06:51:14 -08:00
Patrick Steinhardt	57c168dc38	packfile: always populate pack-specific info when reading object info When reading object information via `packed_object_info()` we may not populate the object info's packfile-specific fields. This leads to inconsistent object info depending on whether the info was populated via `packfile_store_read_object_info()` or `packed_object_info()`. Fix this inconsistency so that we can always assume the pack info to be populated when reading object info from a pack. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2026-01-12 06:51:14 -08:00
Patrick Steinhardt	27d9486cbc	packfile: extend `is_delta` field to allow for "unknown" state The `struct object_info::u::packed::is_delta` field determines whether or not a specific object is stored as a delta. It only stores whether or not the object is stored as delta, so it is treated as a boolean value. This boolean is insufficient though: when reading a packed object via `packfile_store_read_object_info()` we know to skip parsing the actual object when the user didn't request any object-specific data. In that case we won't read the object itself, but will only look up its position in the packfile. Consequently, we do not know whether it is a delta or not. This isn't really an issue right now, as the check for an empty request is broken. But a subsequent commit will fix it, and once we do we will have the need to also represent an "unknown" delta state. Prepare for this change by introducing a new enum that encodes the object type. We don't use the "unknown" state just yet, but will start to do so in a subsequent commit. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2026-01-12 06:51:14 -08:00
Patrick Steinhardt	7a4bd1b836	packfile: always declare object info to be OI_PACKED When reading object info via a packfile we yield one of two types: - The object can either be OI_PACKED, which is what a caller would typically expect. - Or it can be OI_DBCACHED if it is stored in the delta base cache. The latter really is an implementation detail though, and callers typically don't care at all about the difference. Furthermore, the information whether or not it is part of the delta base cache can already be derived via the `is_delta` field, so the fact that we discern between OI_PACKED and OI_DBCACHED only further complicates the interface. There aren't all that many callers that care about the `whence` field in the first place. In fact, there's only three: - `packfile_store_read_object_info()` checks for `whence == OI_PACKED` and then populates the packfile information of the object info structure. We now start to do this also for deltified objects, which gives its callers strictly more information. - `repack_local_links()` wants to determine whether the object is part of a promisor pack and checks for `whence == OI_PACKED`. If so, it verifies that the packfile is a promisor pack. It's arguably wrong to declare that an object is not part of a promisor pack only because it is stored in the delta base cache. - `is_not_in_promisor_pack_obj()` does the same, but checks that a specific object is _not_ part of a promisor pack. The same reasoning as above applies. Drop the OI_DBCACHED enum completely. None of the callers seem to care about the distinction. Note that this also fixes a segfault introduced in 8c1b84bc97 (streaming: move logic to read packed objects streams into backend, 2025-11-23), which refactors how we stream packed objects. The intent is to only read packed objects in case they are stored non-deltified as we'd otherwise have to deflate them first. But the check for whether or not the object is stored as a delta was unconditionally done via `oi.u.packed.is_delta`, which is only valid in case `oi.whence` is `OI_PACKED`. But under some circumstances we got `OI_DBCACHED` here, which means that none of the `oi.u.packed` fields were initialized at all. Consequently, we assumed the object was not stored as a delta, and then try to read the object from `oi.u.packed.pack`, which is a `NULL` pointer and thus causes a segfault. Add a test case for this issue so that this cannot regress in the future anymore. Reported-by: Matt Smiley <msmiley@gitlab.com> Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2026-01-12 06:51:14 -08:00
Patrick Steinhardt	123e8186a7	object-file: always set OI_LOOSE when reading object info There are some early returns in `odb_source_loose_read_object_info()` in cases where we don't have to open the loose object. These return paths do not set `struct object_info::whence` to `OI_LOOSE` though, so it becomes impossible for the caller to tell the format of such an object. The root cause of this really is that we have so many different return paths in the function. As a consequence, it's harder than necessary to make sure that all successful exit paths sot up the `whence` field as expected. Address this by refactoring the function to have a single exit path. Like this, we can trivially set up the `whence` field when we exit successfully from the function. Note that we also: - Rename `status` to `ret` to match our usual coding style, but also to show that the old `status` variable is now always getting the expected value. Furthermore, the value is not initialized anymore, which has the consequence that most compilers will warn for exit paths where we forgot to set it. - Move the setup of scratch pointers closer to `parse_loose_header()` to show where it's needed. - Guard a couple of variables on cleanup so that they only get released in case they have been set up. - Reset `oi->delta_base_oid` towards the end of the function, together with all the other object info pointers. Overall, all these changes result in a diff that is somewhat hard to read. But the end result is significantly easier to read and reason about, so I'd argue this one-time churn is worth it. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2026-01-12 06:51:14 -08:00
Junio C Hamano	8745eae506	The 17th batch Signed-off-by: Junio C Hamano <gitster@pobox.com>	2026-01-12 05:19:53 -08:00
Junio C Hamano	5323fafdf4	Merge branch 'js/mailmap-karsten-blees' Mailmap update for Karsten * js/mailmap-karsten-blees: .mailmap: replace Karsten Blees' default address	2026-01-12 05:19:52 -08:00
Junio C Hamano	6506be0304	Merge branch 'ps/t1300-2021-use-test-path-is-helpers' Test updates. * ps/t1300-2021-use-test-path-is-helpers: t1300: use test helpers instead of `test` command	2026-01-12 05:19:52 -08:00
Junio C Hamano	3235ef374e	Merge branch 'rs/commit-stack' Code clean-up, unifying various hand-rolled "list of commit objects" and use the commit_stack API. * rs/commit-stack: commit-reach: use commit_stack commit-graph: use commit_stack commit: add commit_stack_grow() shallow: use commit_stack pack-bitmap-write: use commit_stack commit: add commit_stack_init() test-reach: use commit_stack remote: use commit_stack for src_commits remote: use commit_stack for sent_tips remote: use commit_stack for local_commits name-rev: use commit_stack midx: use commit_stack log: use commit_stack revision: export commit_stack	2026-01-12 05:19:52 -08:00
Junio C Hamano	0320bcd743	Merge branch 'sb/bundle-uri-without-uri' Diagnose invalid bundle-URI that lack the URI entry, instead of crashing. * sb/bundle-uri-without-uri: bundle-uri: validate that bundle entries have a uri	2026-01-12 05:19:52 -08:00
Junio C Hamano	6a4b4e7880	Merge branch 'ja/doc-synopsis-style-more' More doc style updates. * ja/doc-synopsis-style-more: doc: convert git-remote to synopsis style doc: convert git stage to use synopsis block doc: convert git-status tables to AsciiDoc format doc: convert git-status to synopsis style doc: fix t0450-txt-doc-vs-help to select only first synopsis block	2026-01-12 05:19:52 -08:00
Johannes Schindelin	e97678c4ef	.mailmap: replace Karsten Blees' default address As per a recent email by Karsten, the @dcon.de address no longer works: https://lore.kernel.org/git/77e768b2-6693-454f-9e11-fb0acdec703c@gmail.com Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2026-01-10 09:32:51 -08:00
Patrick Steinhardt	a282a8f163	packfile: move MIDX into packfile store The multi-pack index still is tracked as a member of the object database source, but ultimately the MIDX is always tied to one specific packfile store. Move the structure into `struct packfile_store` accordingly. This ensures that the packfile store now keeps track of all data related to packfiles. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2026-01-09 06:40:08 -08:00
Patrick Steinhardt	a593373b09	packfile: refactor `find_pack_entry()` to work on the packfile store The function `find_pack_entry()` doesn't work on a specific packfile store, but instead works on the whole repository. This causes a bit of a conceptual mismatch in its callers: - `packfile_store_freshen_object()` supposedly acts on a store, and its callers know to iterate through all sources already. - `packfile_store_read_object_info()` behaves likewise. The only exception that doesn't know to handle iteration through sources is `has_object_pack()`, but that function is trivial to adapt. Refactor the code so that `find_pack_entry()` works on the packfile store level instead. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2026-01-09 06:40:07 -08:00
Patrick Steinhardt	6acefa0d2c	packfile: inline `find_kept_pack_entry()` The `find_kept_pack_entry()` function is only used in `has_object_kept_pack()`, which is only a trivial wrapper itself. Inline the latter into the former. Furthermore, reorder the code so that we can drop the declaration of the function in "packfile.h". This allows us to make the function file-local. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2026-01-09 06:40:07 -08:00
Patrick Steinhardt	8384cbcb4c	packfile: only prepare owning store in `packfile_store_prepare()` When calling `packfile_store_prepare()` we prepare not only the provided packfile store, but also all those of all other sources part of the same object database. This was required when the store was still sitting on the object database level. But now that it sits on the source level it's not anymore. Refactor the code so that we only prepare the single packfile store passed by the caller. Adapt callers accordingly. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2026-01-09 06:40:07 -08:00
Patrick Steinhardt	7b330a11de	packfile: only prepare owning store in `packfile_store_get_packs()` When calling `packfile_store_get_packs()` we prepare not only the provided packfile store, but also all those of all other sources part of the same object database. This was required when the store was still sitting on the object database level. But now that it sits on the source level it's not anymore. Adapt the code so that we only prepare the MIDX of the provided store. All callers only work in the context of a single store or call the function in a loop over all sources, so this change shouldn't have any practical effects. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2026-01-09 06:40:07 -08:00
Patrick Steinhardt	84f0e60b28	packfile: move packfile store into object source The packfile store is a member of `struct object_database`, which means that we have a single store per database. This doesn't really make much sense though: each source connected to the database has its own set of packfiles, so there is a conceptual mismatch here. This hasn't really caused much of a problem in the past, but with the advent of pluggable object databases this is becoming more of a problem because some of the sources may not even use packfiles in the first place. Move the packfile store down by one level from the object database into the object database source. This ensures that each source now has its own packfile store, and we can eventually start to abstract it away entirely so that the caller doesn't even know what kind of store it uses. Note that we only need to adjust a relatively small number of callers, way less than one might expect. This is because most callers are using `repo_for_each_pack()`, which handles enumeration of all packfiles that exist in the repository. So for now, none of these callers need to be adapted. The remaining callers that iterate through the packfiles directly and that need adjustment are those that are a bit more tangled with packfiles. These will be adjusted over time. Note that this patch only moves the packfile store, and there is still a bunch of functions that seemingly operate on a packfile store but that end up iterating over all sources. These will be adjusted in subsequent commits. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2026-01-09 06:40:07 -08:00
Patrick Steinhardt	eb9ec52d95	packfile: refactor misleading code when unusing pack windows The function `unuse_one_window()` is responsible for unmapping one of the packfile windows, which is done when we have exceeded the allowed number of window. The function receives a `struct packed_git` as input, which serves as an additional packfile that should be considered to be closed. If not given, we seemingly skip that and instead go through all of the repository's packfiles. The conditional that checks whether we have a packfile though does not make much sense anymore, as we dereference the packfile regardless of whether or not it is a `NULL` pointer to derive the repository's packfile store. The function was originally introduced via f0e17e86e1 (pack: move release_pack_memory(), 2017-08-18), and here we indeed had a caller that passed a `NULL` pointer. That caller was later removed via 9827d4c185 (packfile: drop release_pack_memory(), 2019-08-12), so starting with that commit we always pass a `struct packed_git`. In 9c5ce06d74 (packfile: use `repository` from `packed_git` directly, 2024-12-03) we then inadvertently started to rely on the fact that the pointer is never `NULL` because we use it now to identify the repository. Arguably, it didn't really make sense in the first place that the caller provides a packfile, as the selected window would have been overridden anyway by the subsequent loop over all packfiles if there was an older window. So the overall logic is quite misleading overall. The only case where it _could_ make a difference is when there were two packfiles with the same `last_used` value, but that case doesn't ever happen because the `pack_used_ctr` is strictly increasing. Refactor the code so that we instead pass in the object database to help make the code less misleading. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2026-01-09 06:40:06 -08:00
Patrick Steinhardt	085de91b95	packfile: refactor kept-pack cache to work with packfile stores The kept pack cache is a cache of packfiles that are marked as kept either via an accompanying ".kept" file or via an in-memory flag. The cache can be retrieved via `kept_pack_cache()`, where one needs to pass in a repository. Ultimately though the kept-pack cache is a property of the packfile store, and this causes problems in a subsequent commit where we want to move down the packfile store to be a per-object-source entity. Prepare for this and refactor the kept-pack cache to work on top of a packfile store instead. While at it, rename both the function and flags specific to the kept-pack cache so that they can be properly attributed to the respective subsystems. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2026-01-09 06:40:06 -08:00
Patrick Steinhardt	0316c63ca4	packfile: pass source to `prepare_pack()` When preparing a packfile we pass various pieces attached to the pack's object database source via the `struct prepare_pack_data`. Refactor this code to instead pass in the source directly. This reduces the number of variables we need to pass and allows for a subsequent refactoring where we start to prepare the pack via the source. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2026-01-09 06:40:06 -08:00
Patrick Steinhardt	480336a9ce	packfile: create store via its owning source In subsequent patches we're about to move the packfile store from the object database layer into the object database source layer. Once done, we'll have one packfile store per source, where the source is owning the store. Prepare for this future and refactor `packfile_store_new()` to be initialized via an object database source instead of via the object database itself. This refactoring leads to a weird in-between state where the store is owned by the object database but created via the source. But this makes subsequent refactorings easier because we can now start to access the owning source of a given store. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2026-01-09 06:40:06 -08:00
Junio C Hamano	d529f3a197	The 16th batch Signed-off-by: Junio C Hamano <gitster@pobox.com>	2026-01-08 16:40:12 +09:00
Junio C Hamano	2db806d817	Merge branch 'en/ort-recursive-d-f-conflict-fix' The ort merge machinery hit an assertion failure in a history with criss-cross merges renamed a directory and a non-directory, which has been corrected. * en/ort-recursive-d-f-conflict-fix: merge-ort: fix corner case recursive submodule/directory conflict handling	2026-01-08 16:40:12 +09:00
Junio C Hamano	512351f2a8	Merge branch 'dd/t5403-modernise' Test micro-clean-up. * dd/t5403-modernise: t5403: use test_path_is_file instead of test -f	2026-01-08 16:40:12 +09:00
Junio C Hamano	c0754dc423	Merge branch 'ds/diff-lazy-fetch-with-name-only-fix' Running "git diff" with "--name-only" and other options that allows us not to look at the blob contents, while objects that are lazily fetched from a promisor remote, caused use-after-free, which has been corrected. * ds/diff-lazy-fetch-with-name-only-fix: diff: avoid segfault with freed entries	2026-01-08 16:40:11 +09:00
Junio C Hamano	d28d2be5f2	Merge branch 'rs/tag-wo-the-repository' Code clean-up. * rs/tag-wo-the-repository: tag: stop using the_repository tag: support arbitrary repositories in parse_tag() tag: support arbitrary repositories in gpg_verify_tag() tag: use algo of repo parameter in parse_tag_buffer()	2026-01-08 16:40:11 +09:00
Junio C Hamano	f1ec43d4d2	Merge branch 'ps/odb-misc-fixes' into ps/packfile-store-in-odb-source * ps/odb-misc-fixes: odb: properly close sources before freeing them builtin/gc: fix condition for whether to write commit graphs	2026-01-07 09:37:29 +09:00
Patrick Steinhardt	3d09968656	odb: properly close sources before freeing them It is possible to hit a memory leak when reading data from a submodule via git-grep(1): Direct leak of 192 byte(s) in 1 object(s) allocated from: #0 0x55555562e726 in calloc (git+0xda726) #1 0x555555964734 in xcalloc ../wrapper.c:154:8 #2 0x555555835136 in load_multi_pack_index_one ../midx.c:135:2 #3 0x555555834fd6 in load_multi_pack_index ../midx.c:382:6 #4 0x5555558365b6 in prepare_multi_pack_index_one ../midx.c:716:17 #5 0x55555586c605 in packfile_store_prepare ../packfile.c:1103:3 #6 0x55555586c90c in packfile_store_reprepare ../packfile.c:1118:2 #7 0x5555558546b3 in odb_reprepare ../odb.c:1106:2 #8 0x5555558539e4 in do_oid_object_info_extended ../odb.c:715:4 #9 0x5555558533d1 in odb_read_object_info_extended ../odb.c:862:8 #10 0x5555558540bd in odb_read_object ../odb.c:920:6 #11 0x55555580a330 in grep_source_load_oid ../grep.c:1934:12 #12 0x55555580a13a in grep_source_load ../grep.c:1986:10 #13 0x555555809103 in grep_source_is_binary ../grep.c:2014:7 #14 0x555555807574 in grep_source_1 ../grep.c:1625:8 #15 0x555555807322 in grep_source ../grep.c:1837:10 #16 0x5555556a5c58 in run ../builtin/grep.c:208:10 #17 0x55555562bb42 in void* ThreadStartFunc<false>(void*) lsan_interceptors.cpp.o #18 0x7ffff7a9a979 in start_thread (/nix/store/xx7cm72qy2c0643cm1ipngd87aqwkcdp-glibc-2.40-66/lib/libc.so.6+0x9a979) (BuildId: cddea92d6cba8333be952b5a02fd47d61054c5ab) #19 0x7ffff7b22d2b in __GI___clone3 (/nix/store/xx7cm72qy2c0643cm1ipngd87aqwkcdp-glibc-2.40-66/lib/libc.so.6+0x122d2b) (BuildId: cddea92d6cba8333be952b5a02fd47d61054c5ab) The root caues of this leak is the way we set up and release the submodule: 1. We use `repo_submodule_init()` to initialize a new repository. This repository is stored in `repos_to_free`. 2. We now read data from the submodule repository. 3. We then call `repo_clear()` on the submodule repositories. 4. `repo_clear()` calls `odb_free()`. 5. `odb_free()` calls `odb_free_sources()` followed by `odb_close()`. The issue here is the 5th step: we call `odb_free_sources()` _before_ we call `odb_close()`. But `odb_free_sources()` already frees all sources, so the logic that closes them in `odb_close()` now becomes a no-op. As a consequence, we never explicitly close sources at all. Fix the leak by closing the store before we free the sources. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2026-01-07 09:16:51 +09:00
Patrick Steinhardt	b3449b1517	builtin/gc: fix condition for whether to write commit graphs When performing auto-maintenance we check whether commit graphs need to be generated by counting the number of commits that are reachable by any reference, but not covered by a commit graph. This search is performed by iterating through all references and then doing a depth-first search until we have found enough commits that are not present in the commit graph. This logic has a memory leak though: Direct leak of 16 byte(s) in 1 object(s) allocated from: #0 0x55555562e433 in malloc (git+0xda433) #1 0x555555964322 in do_xmalloc ../wrapper.c:55:8 #2 0x5555559642e6 in xmalloc ../wrapper.c:76:9 #3 0x55555579bf29 in commit_list_append ../commit.c:1872:35 #4 0x55555569f160 in dfs_on_ref ../builtin/gc.c:1165:4 #5 0x5555558c33fd in do_for_each_ref_iterator ../refs/iterator.c:431:12 #6 0x5555558af520 in do_for_each_ref ../refs.c:1828:9 #7 0x5555558ac317 in refs_for_each_ref ../refs.c:1833:9 #8 0x55555569e207 in should_write_commit_graph ../builtin/gc.c:1188:11 #9 0x55555569c915 in maintenance_is_needed ../builtin/gc.c:3492:8 #10 0x55555569b76a in cmd_maintenance ../builtin/gc.c:3542:9 #11 0x55555575166a in run_builtin ../git.c:506:11 #12 0x5555557502f0 in handle_builtin ../git.c:779:9 #13 0x555555751127 in run_argv ../git.c:862:4 #14 0x55555575007b in cmd_main ../git.c:984:19 #15 0x5555557523aa in main ../common-main.c:9:11 #16 0x7ffff7a2a4d7 in __libc_start_call_main (/nix/store/xx7cm72qy2c0643cm1ipngd87aqwkcdp-glibc-2.40-66/lib/libc.so.6+0x2a4d7) (BuildId: cddea92d6cba8333be952b5a02fd47d61054c5ab) #17 0x7ffff7a2a59a in __libc_start_main@GLIBC_2.2.5 (/nix/store/xx7cm72qy2c0643cm1ipngd87aqwkcdp-glibc-2.40-66/lib/libc.so.6+0x2a59a) (BuildId: cddea92d6cba8333be952b5a02fd47d61054c5ab) #18 0x5555555f0934 in _start (git+0x9c934) The root cause of this memory leak is our use of `commit_list_append()`. This function expects as parameters the item to append and the _tail_ of the list to append. This tail will then be overwritten with the new tail of the list so that it can be used in subsequent calls. But we call it with `commit_list_append(parent->item, &stack)`, so we end up losing everything but the new item. This issue only surfaces when counting merge commits. Next to being a memory leak, it also shows that we're in fact miscounting as we only respect children of the last parent. All previous parents are discarded, so their children will be disregarded unless they are hit via another reference. While crafting a test case for the issue I was puzzled that I couldn't establish the proper border at which the auto-condition would be fulfilled. As it turns out, there's another bug: if an object is at the tip of any reference we don't mark it as seen. Consequently, if it is the tip of or reachable via another ref, we'd count that object multiple times. Fix both of these bugs so that we properly count objects without leaking any memory. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2026-01-07 09:16:50 +09:00
Junio C Hamano	e0bfec3dfc	The 15th batch Signed-off-by: Junio C Hamano <gitster@pobox.com>	2026-01-06 16:33:53 +09:00
Junio C Hamano	d39e3ed716	Merge branch 'rs/parse-config-expiry-simplify' Code clean-up. * rs/parse-config-expiry-simplify: config: use git_parse_int() in git_config_get_expiry_in_days()	2026-01-06 16:33:53 +09:00

1 2 3 4 5 ...

79446 Commits