eyecatchu/git - git - KM Solution Bank Gitea

mirror of https://github.com/git/git.git synced 2026-01-12 05:43:12 +09:00

Author	SHA1	Message	Date
Junio C Hamano	69b737999c	Merge branch 'ps/reftable-stack-compaction' The code paths to compact multiple reftable files have been updated to correctly deal with multiple compaction triggering at the same time. * ps/reftable-stack-compaction: reftable/stack: handle locked tables during auto-compaction reftable/stack: fix corruption on concurrent compaction reftable/stack: use lock_file when adding table to "tables.list" reftable/stack: do not die when fsyncing lock file files reftable/stack: simplify tracking of table locks reftable/stack: update stats on failed full compaction reftable/stack: test compaction with already-locked tables reftable/stack: extract function to setup stack with N tables reftable/stack: refactor function to gather table sizes	2024-08-15 13:22:13 -07:00
Junio C Hamano	7b11e20bff	Merge branch 'cp/unit-test-reftable-tree' A test in reftable library has been rewritten using the unit test framework. * cp/unit-test-reftable-tree: t-reftable-tree: improve the test for infix_walk() t-reftable-tree: add test for non-existent key t-reftable-tree: split test_tree() into two sub-test functions t: move reftable/tree_test.c to the unit testing framework reftable: remove unnecessary curly braces in reftable/tree.c	2024-08-14 14:54:56 -07:00
Junio C Hamano	d65332f241	Merge branch 'cp/unit-test-reftable-pq' The tests for "pq" part of reftable library got rewritten to use the unit test framework. * cp/unit-test-reftable-pq: t-reftable-pq: add tests for merged_iter_pqueue_top() t-reftable-pq: add test for index based comparison t-reftable-pq: make merged_iter_pqueue_check() callable by reference t-reftable-pq: make merged_iter_pqueue_check() static t: move reftable/pq_test.c to the unit testing framework reftable: change the type of array indices to 'size_t' in reftable/pq.c reftable: remove unnecessary curly braces in reftable/pq.c	2024-08-14 14:54:48 -07:00
Patrick Steinhardt	f234df07f6	reftable/stack: handle locked tables during auto-compaction When compacting tables, it may happen that we want to compact a set of tables which are already locked by a concurrent process that compacts them. In the case where we wanted to perform a full compaction of all tables it is sensible to bail out in this case, as we cannot fulfill the requested action. But when performing auto-compaction it isn't necessarily in our best interest of us to abort the whole operation. For example, due to the geometric compacting schema that we use, it may be that process A takes a lot of time to compact the bulk of all tables whereas process B appends a bunch of new tables to the stack. B would in this case also notice that it has to compact the tables that process A is compacting already and thus also try to compact the same range, probably including the new tables it has appended. But because those tables are locked already, it will fail and thus abort the complete auto-compaction. The consequence is that the stack will grow longer and longer while A isn't yet done with compaction, which will lead to a growing performance impact. Instead of aborting auto-compaction altogether, let's gracefully handle this situation by instead compacting tables which aren't locked. To do so, instead of locking from the beginning of the slice-to-be-compacted, we start locking tables from the end of the slice. Once we hit the first table that is locked already, we abort. If we succeeded to lock two or more tables, then we simply reduce the slice of tables that we're about to compact to those which we managed to lock. This ensures that we can at least make some progress for compaction in said scenario. It also helps in other scenarios, like for example when a process died and left a stale lockfile behind. In such a case we can at least ensure some compaction on a best-effort basis. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-08-08 10:14:43 -07:00
Patrick Steinhardt	ed1ad6b44d	reftable/stack: fix corruption on concurrent compaction The locking employed by compaction uses the following schema: 1. Lock "tables.list" and verify that it matches the version we have loaded in core. 2. Lock each of the tables in the user-supplied range of tables that we are supposed to compact. These locks prohibit any concurrent process to compact those tables while we are doing that. 3. Unlock "tables.list". This enables concurrent processes to add new tables to the stack, but also allows them to compact tables outside of the range of tables that we have locked. 4. Perform the compaction. 5. Lock "tables.list" again. 6. Move the compacted table into place. 7. Write the new order of tables, including the compacted table, into the lockfile. 8. Commit the lockfile into place. Letting concurrent processes modify the "tables.list" file while we are doing the compaction is very much part of the design and thus expected. After all, it may take some time to compact tables in the case where we are compacting a lot of very large tables. But there is a bug in the code. Suppose we have two processes which are compacting two slices of the table. Given that we lock each of the tables before compacting them, we know that the slices must be disjunct from each other. But regardless of that, compaction performed by one process will always impact what the other process needs to write to the "tables.list" file. Right now, we do not check whether the "tables.list" has been changed after we have locked it for the second time in (5). This has the consequence that we will always commit the old, cached in-core tables to disk without paying to respect what the other process has written. This scenario would then lead to data loss and corruption. This can even happen in the simpler case of one compacting process and one writing process. The newly-appended table by the writing process would get discarded by the compacting process because it never sees the new table. Fix this bug by re-checking whether our stack is still up to date after locking for the second time. If it isn't, then we adjust the indices of tables to replace in the updated stack. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-08-08 10:14:43 -07:00
Patrick Steinhardt	128b9aa3e9	reftable/stack: use lock_file when adding table to "tables.list" When modifying "tables.list", we need to lock the list before updating it to ensure that no concurrent writers modify the list at the same point in time. While we do this via the `lock_file` subsystem when compacting the stack, we manually handle the lock when adding a new table to it. While not wrong, it is at least inconsistent. Refactor the code to consistently lock "tables.list" via the `lock_file` subsytem. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-08-08 10:14:43 -07:00
Patrick Steinhardt	7ee307da1b	reftable/stack: do not die when fsyncing lock file files We use `fsync_component_or_die()` when committing an addition to the "tables.list" lock file, which unsurprisingly dies in case the fsync fails. Given that this is part of the reftable library, we should never die and instead let callers handle the error. Adapt accordingly and use `fsync_component()` instead. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-08-08 10:14:43 -07:00
Patrick Steinhardt	558f6fbeb1	reftable/stack: simplify tracking of table locks When compacting tables, we store the locks of all tables we are about to compact in the `table_locks` array. As we currently only ever compact all tables in the user-provided range or none, we simply track those locks via the indices of the respective tables in the merged stack. This is about to change though, as we will introduce a mode where auto compaction gracefully handles the case of already-locked files. In this case, it may happen that we only compact a subset of the user-supplied range of tables. In this case, the indices will not necessarily match the lock indices anymore. Refactor the code such that we track the number of locks via a separate variable. The resulting code is expected to perform the same, but will make it easier to perform the described change. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-08-08 10:14:42 -07:00
Patrick Steinhardt	5f0ed603a1	reftable/stack: update stats on failed full compaction When auto-compaction fails due to a locking error, we update the statistics to indicate this failure. We're not doing the same when performing a full compaction. Fix this inconsistency by using `stack_compact_range_stats()`, which handles the stat update for us. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-08-08 10:14:42 -07:00
Patrick Steinhardt	8030100bda	reftable/stack: test compaction with already-locked tables We're lacking test coverage for compacting tables when some of the tables that we are about to compact are locked. Add two tests that exercise this, one for auto-compaction and one for full compaction. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-08-08 10:14:42 -07:00
Patrick Steinhardt	9a833ca35d	reftable/stack: extract function to setup stack with N tables We're about to add two tests, and both of them will want to initialize the reftable stack with a set of N tables. Introduce a new function that handles this and refactor existing tests that use such a setup to use it. Note that this changes the exact records contained in the preexisting tests. This is fine though as we only care about the shape of the stack here, not the shape of each table. Furthermore, with this change we now start to disable auto compaction when writing the tables, as otherwise we might not end up with the expected amount of new tables added. This also slightly changes the behaviour of these tests, but the properties we care for remain intact. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-08-08 10:14:42 -07:00
Patrick Steinhardt	ed7d2f4770	reftable/stack: refactor function to gather table sizes Refactor the function that gathers table sizes to be more idiomatic. For one, use `REFTABLE_CALLOC_ARRAY()` instead of `reftable_calloc()`. Second, avoid using an integer to iterate through the tables in the reftable stack given that `stack_len` itself is using a `size_t`. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-08-08 10:14:41 -07:00
Chandra Pratap	ec9c0704fc	t: move reftable/tree_test.c to the unit testing framework reftable/tree_test.c exercises the functions defined in reftable/tree.{c, h}. Migrate reftable/tree_test.c to the unit testing framework. Migration involves refactoring the tests to use the unit testing framework instead of reftable's test framework and renaming the tests to align with unit-tests' standards. Also add a comment to help understand the test routine. Note that this commit mostly moves the test from reftable/ to t/unit-tests/ and most of the refactoring is performed by the trailing commits. Mentored-by: Patrick Steinhardt <ps@pks.im> Mentored-by: Christian Couder <chriscool@tuxfamily.org> Signed-off-by: Chandra Pratap <chandrapratap3519@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-08-04 09:50:26 -07:00
Chandra Pratap	e5a0f7076f	reftable: remove unnecessary curly braces in reftable/tree.c According to Documentation/CodingGuidelines, single-line control-flow statements must omit curly braces (except for some special cases). Make reftable/tree.c adhere to this guideline. Mentored-by: Patrick Steinhardt <ps@pks.im> Mentored-by: Christian Couder <chriscool@tuxfamily.org> Signed-off-by: Chandra Pratap <chandrapratap3519@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-08-04 09:50:18 -07:00
Chandra Pratap	2e707447e1	t-reftable-pq: make merged_iter_pqueue_check() static merged_iter_pqueue_check() is a function previously defined in reftable/pq_test.c (now t/unit-tests/t-reftable-pq.c) and used in the testing of a priority queue as defined by reftable/pq.{c, h}. As such, this function is only called by reftable/pq_test.c and it makes little sense to expose it to non-testing code via reftable/pq.h. Hence, make this function static and remove its prototype from reftable/pq.h. Mentored-by: Patrick Steinhardt <ps@pks.im> Mentored-by: Christian Couder <chriscool@tuxfamily.org> Signed-off-by: Chandra Pratap <chandrapratap3519@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-08-01 09:07:29 -07:00
Chandra Pratap	a08ea27cd0	t: move reftable/pq_test.c to the unit testing framework reftable/pq_test.c exercises a priority queue defined by reftable/pq.{c, h}. Migrate reftable/pq_test.c to the unit testing framework. Migration involves refactoring the tests to use the unit testing framework instead of reftable's test framework, and renaming the tests to align with unit-tests' standards. Mentored-by: Patrick Steinhardt <ps@pks.im> Mentored-by: Christian Couder <chriscool@tuxfamily.org> Signed-off-by: Chandra Pratap <chandrapratap3519@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-08-01 09:07:29 -07:00
Chandra Pratap	2a85906348	reftable: change the type of array indices to 'size_t' in reftable/pq.c The variables 'i', 'j', 'k' and 'min' are used as indices for 'pq->heap', which is an array. Additionally, 'pq->len' is of type 'size_t' and is often used to assign values to these variables. Hence, change the type of these variables from 'int' to 'size_t'. Mentored-by: Patrick Steinhardt <ps@pks.im> Mentored-by: Christian Couder <chriscool@tuxfamily.org> Signed-off-by: Chandra Pratap <chandrapratap3519@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-08-01 09:07:28 -07:00
Chandra Pratap	f1b60b7c66	reftable: remove unnecessary curly braces in reftable/pq.c According to Documentation/CodingGuidelines, control-flow statements with a single line as their body must omit curly braces. Make reftable/pq.c conform to this guideline. Besides that, remove unnecessary newlines and variable assignment. Mentored-by: Patrick Steinhardt <ps@pks.im> Mentored-by: Christian Couder <chriscool@tuxfamily.org> Signed-off-by: Chandra Pratap <chandrapratap3519@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-08-01 09:07:28 -07:00
Junio C Hamano	6c70d65712	Merge branch 'cp/unit-test-reftable-merged' Another reftable test has been ported to use the unit test framework. * cp/unit-test-reftable-merged: t-reftable-merged: add test for REFTABLE_FORMAT_ERROR t-reftable-merged: use reftable_ref_record_equal to compare ref records t-reftable-merged: add tests for reftable_merged_table_max_update_index t-reftable-merged: improve the const-correctness of helper functions t-reftable-merged: improve the test t_merged_single_record() t: harmonize t-reftable-merged.c with coding guidelines t: move reftable/merged_test.c to the unit testing framework	2024-07-31 13:34:19 -07:00
Junio C Hamano	9118e46e81	Merge branch 'cp/unit-test-reftable-record' A test in reftable library has been rewritten using the unit test framework. * cp/unit-test-reftable-record: t-reftable-record: add tests for reftable_log_record_compare_key() t-reftable-record: add tests for reftable_ref_record_compare_name() t-reftable-record: add index tests for reftable_record_is_deletion() t-reftable-record: add obj tests for reftable_record_is_deletion() t-reftable-record: add log tests for reftable_record_is_deletion() t-reftable-record: add ref tests for reftable_record_is_deletion() t-reftable-record: add comparison tests for obj records t-reftable-record: add comparison tests for index records t-reftable-record: add comparison tests for ref records t-reftable-record: add reftable_record_cmp() tests for log records t: move reftable/record_test.c to the unit testing framework	2024-07-15 10:11:44 -07:00
Chandra Pratap	9cdfd1d7df	t: move reftable/merged_test.c to the unit testing framework reftable/merged_test.c exercises the functions defined in reftable/merged.{c, h}. Migrate reftable/merged_test.c to the unit testing framework. Migration involves refactoring the tests to use the unit testing framework instead of reftable's test framework and renaming the tests according to unit-tests' naming conventions. Also, move strbuf_add_void() and noop_flush() from reftable/test_framework.c to the ported test. This is because both these functions are used in the merged tests and reftable/test_framework.{c, h} is not #included in the ported test. Mentored-by: Patrick Steinhardt <ps@pks.im> Mentored-by: Christian Couder <chriscool@tuxfamily.org> Signed-off-by: Chandra Pratap <chandrapratap3519@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-07-12 09:55:39 -07:00
Junio C Hamano	7b472da915	Merge branch 'ps/use-the-repository' A CPP macro USE_THE_REPOSITORY_VARIABLE is introduced to help transition the codebase to rely less on the availability of the singleton the_repository instance. * ps/use-the-repository: hex: guard declarations with `USE_THE_REPOSITORY_VARIABLE` t/helper: remove dependency on `the_repository` in "proc-receive" t/helper: fix segfault in "oid-array" command without repository t/helper: use correct object hash in partial-clone helper compat/fsmonitor: fix socket path in networked SHA256 repos replace-object: use hash algorithm from passed-in repository protocol-caps: use hash algorithm from passed-in repository oidset: pass hash algorithm when parsing file http-fetch: don't crash when parsing packfile without a repo hash-ll: merge with "hash.h" refs: avoid include cycle with "repository.h" global: introduce `USE_THE_REPOSITORY_VARIABLE` macro hash: require hash algorithm in `empty_tree_oid_hex()` hash: require hash algorithm in `is_empty_{blob,tree}_oid()` hash: make `is_null_oid()` independent of `the_repository` hash: convert `oidcmp()` and `oideq()` to compare whole hash global: ensure that object IDs are always padded hash: require hash algorithm in `oidread()` and `oidclr()` hash: require hash algorithm in `hasheq()`, `hashcmp()` and `hashclr()` hash: drop (mostly) unused `is_empty_{blob,tree}_sha1()` functions	2024-07-02 09:59:00 -07:00
Chandra Pratap	ba9661b457	t: move reftable/record_test.c to the unit testing framework reftable/record_test.c exercises the functions defined in reftable/record.{c, h}. Migrate reftable/record_test.c to the unit testing framework. Migration involves refactoring the tests to use the unit testing framework instead of reftable's test framework, and renaming the tests to fit unit-tests' naming scheme. While at it, change the type of index variable 'i' to 'size_t' from 'int'. This is because 'i' is used in comparison against 'ARRAY_SIZE(x)' which is of type 'size_t'. Also, use set_hash() which is defined locally in the test file instead of set_test_hash() which is defined by reftable/test_framework.{c, h}. This is fine to do as both these functions are similarly implemented, and reftable/test_framework.{c, h} is not #included in the ported test. Get rid of reftable_record_print() from the tests as well, because it clutters the test framework's output and we have no way of verifying the output. Mentored-by: Patrick Steinhardt <ps@pks.im> Mentored-by: Christian Couder <chriscool@tuxfamily.org> Signed-off-by: Chandra Pratap <chandrapratap3519@gmail.com> Acked-by: Karthik Nayak <karthik.188@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-07-02 08:12:24 -07:00
Junio C Hamano	4216329457	Merge branch 'ps/no-writable-strings' Building with "-Werror -Wwrite-strings" is now supported. * ps/no-writable-strings: (27 commits) config.mak.dev: enable `-Wwrite-strings` warning builtin/merge: always store allocated strings in `pull_twohead` builtin/rebase: always store allocated string in `options.strategy` builtin/rebase: do not assign default backend to non-constant field imap-send: fix leaking memory in `imap_server_conf` imap-send: drop global `imap_server_conf` variable mailmap: always store allocated strings in mailmap blob revision: always store allocated strings in output encoding remote-curl: avoid assigning string constant to non-const variable send-pack: always allocate receive status parse-options: cast long name for OPTION_ALIAS http: do not assign string constant to non-const field compat/win32: fix const-correctness with string constants pretty: add casts for decoration option pointers object-file: make `buf` parameter of `index_mem()` a constant object-file: mark cached object buffers as const ident: add casts for fallback name and GECOS entry: refactor how we remove items for delayed checkouts line-log: always allocate the output prefix line-log: stop assigning string constant to file parent buffer ...	2024-06-17 15:55:58 -07:00
Junio C Hamano	40a163f217	Merge branch 'ps/ref-storage-migration' A new command has been added to migrate a repository that uses the files backend for its ref storage to use the reftable backend, with limitations. * ps/ref-storage-migration: builtin/refs: new command to migrate ref storage formats refs: implement logic to migrate between ref storage formats refs: implement removal of ref storages worktree: don't store main worktree twice reftable: inline `merged_table_release()` refs/files: fix NULL pointer deref when releasing ref store refs/files: extract function to iterate through root refs refs/files: refactor `add_pseudoref_and_head_entries()` refs: allow to skip creation of reflog entries refs: pass storage format to `ref_store_init()` explicitly refs: convert ref storage format to an enum setup: unset ref storage when reinitializing repository version	2024-06-17 15:55:55 -07:00
Patrick Steinhardt	8a676bdc5c	hash-ll: merge with "hash.h" The "hash-ll.h" header was introduced via d1cbe1e6d8 (hash-ll.h: split out of hash.h to remove dependency on repository.h, 2023-04-22) to make explicit the split between hash-related functions that rely on the global `the_repository`, and those that don't. This split is no longer necessary now that we we have removed the reliance on `the_repository`. Merge "hash-ll.h" back into "hash.h". This causes some code units to not include "repository.h" anymore, which requires us to add some forward declarations. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-06-14 10:26:33 -07:00
Junio C Hamano	092b33da2b	Merge branch 'ps/ref-storage-migration' into ps/use-the-repository * ps/ref-storage-migration: builtin/refs: new command to migrate ref storage formats refs: implement logic to migrate between ref storage formats refs: implement removal of ref storages worktree: don't store main worktree twice reftable: inline `merged_table_release()` refs/files: fix NULL pointer deref when releasing ref store refs/files: extract function to iterate through root refs refs/files: refactor `add_pseudoref_and_head_entries()` refs: allow to skip creation of reflog entries refs: pass storage format to `ref_store_init()` explicitly refs: convert ref storage format to an enum setup: unset ref storage when reinitializing repository version	2024-06-13 09:39:08 -07:00
Junio C Hamano	56346ba24e	Merge branch 'cp/reftable-unit-test' Basic unit tests for reftable have been reimplemented under the unit test framework. * cp/reftable-unit-test: t: improve the test-case for parse_names() t: add test for put_be16() t: move tests from reftable/record_test.c to the new unit test t: move tests from reftable/stack_test.c to the new unit test t: move reftable/basics_test.c to the unit testing framework	2024-06-12 13:37:14 -07:00
Patrick Steinhardt	66f892bb07	reftable: cast away constness when assigning constants to records The reftable records are used in multiple ways throughout the reftable library. In many of those cases they merely act as input to a function without getting modified by it at all. Most importantly, this happens when writing records and when querying for records. We rely on this in our tests and thus assign string constants to those fields, which is about to generate warnings as those fields are of type `char `. While we could go through the process and instead allocate those strings in all of our tests, this feels quite unnecessary. Instead, add casts to `char ` for all of those strings. As this is part of our tests, this also nicely serves as a demonstration that nothing writes or frees those string constants, which would otherwise lead to segfaults. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-06-07 10:30:49 -07:00
Patrick Steinhardt	b567004b4b	global: improve const correctness when assigning string constants We're about to enable `-Wwrite-strings`, which changes the type of string constants to `const char[]`. Fix various sites where we assign such constants to non-const variables. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-06-07 10:30:48 -07:00
Patrick Steinhardt	b5d7db9e83	reftable: inline `merged_table_release()` The function `merged_table_release()` releases a merged table, whereas `reftable_merged_table_free()` releases a merged table and then also free's its pointer. But all callsites of `merged_table_release()` are in fact followed by `reftable_merged_table_free()`, which is redundant. Inline `merged_table_release()` into `reftable_merged_table_free()` to get rid of this redundance. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-06-06 09:04:32 -07:00
Junio C Hamano	67ce50ba26	Merge branch 'ps/reftable-reusable-iterator' Code clean-up to make the reftable iterator closer to be reusable. * ps/reftable-reusable-iterator: reftable/merged: adapt interface to allow reuse of iterators reftable/stack: provide convenience functions to create iterators reftable/reader: adapt interface to allow reuse of iterators reftable/generic: adapt interface to allow reuse of iterators reftable/generic: move seeking of records into the iterator reftable/merged: simplify indices for subiterators reftable/merged: split up initialization and seeking of records reftable/reader: set up the reader when initializing table iterator reftable/reader: inline `reader_seek_internal()` reftable/reader: separate concerns of table iter and reftable reader reftable/reader: unify indexed and linear seeking reftable/reader: avoid copying index iterator reftable/block: use `size_t` to track restart point index	2024-05-30 14:15:12 -07:00
Chandra Pratap	afe5b9e7ec	t: move tests from reftable/record_test.c to the new unit test common_prefix_size(), get_be24() and put_be24() are functions defined in reftable/basics.{c, h}. Move the tests for these functions from reftable/record_test.c to the newly ported test. Mentored-by: Patrick Steinhardt <ps@pks.im> Mentored-by: Christian Couder <chriscool@tuxfamily.org> Signed-off-by: Chandra Pratap <chandrapratap3519@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-05-30 07:30:10 -07:00
Chandra Pratap	f74e1865fe	t: move tests from reftable/stack_test.c to the new unit test parse_names() and names_equal() are functions defined in reftable/basics.{c, h}. Move the tests for these functions from reftable/stack_test.c to the newly ported test. Mentored-by: Patrick Steinhardt <ps@pks.im> Mentored-by: Christian Couder <chriscool@tuxfamily.org> Signed-off-by: Chandra Pratap <chandrapratap3519@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-05-30 07:30:10 -07:00
Chandra Pratap	b34116a30c	t: move reftable/basics_test.c to the unit testing framework reftable/basics_test.c exercise the functions defined in reftable/basics.{c, h}. Migrate reftable/basics_test.c to the unit testing framework. Migration involves refactoring the tests to use the unit testing framework instead of reftable's test framework. Mentored-by: Patrick Steinhardt <ps@pks.im> Mentored-by: Christian Couder <chriscool@tuxfamily.org> Signed-off-by: Chandra Pratap <chandrapratap3519@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-05-30 07:30:10 -07:00
Patrick Steinhardt	369b84196e	reftable/merged: adapt interface to allow reuse of iterators Refactor the interfaces exposed by `struct reftable_merged_table` and `struct merged_iter` such that they support iterator reuse. This is done by separating initialization of the iterator and seeking on it. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-05-13 17:04:19 -07:00
Patrick Steinhardt	08efe69212	reftable/stack: provide convenience functions to create iterators There exist a bunch of call sites in the reftable backend that want to create iterators for a reftable stack. This is rather convoluted right now, where you always have to go via the merged table. And it is about to become even more convoluted when we split up iterator initialization and seeking in the next commit. Introduce convenience functions that allow the caller to create an iterator from a reftable stack directly without going through the merged table. Adapt callers accordingly. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-05-13 17:04:19 -07:00
Patrick Steinhardt	0e7be2b3ea	reftable/reader: adapt interface to allow reuse of iterators Refactor the interfaces exposed by `struct reftable_reader` and `struct table_iterator` such that they support iterator reuse. This is done by separating initialization of the iterator and seeking on it. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-05-13 17:04:18 -07:00
Patrick Steinhardt	d76f0d3f57	reftable/generic: adapt interface to allow reuse of iterators Refactor the interfaces exposed by `struct reftable_table` and `struct reftable_iterator` such that they support iterator reuse. This is done by separating initialization of the iterator and seeking on it. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-05-13 17:04:18 -07:00
Patrick Steinhardt	5bf96e0c39	reftable/generic: move seeking of records into the iterator Reftable iterators are created by seeking on the parent structure of a corresponding record. For example, to create an iterator for the merged table you would call `reftable_merged_table_seek_ref()`. Most notably, it is not posible to create an iterator and then seek it afterwards. While this may be a bit easier to reason about, it comes with two significant downsides. The first downside is that the logic to find records is split up between the parent data structure and the iterator itself. Conceptually, it is more straight forward if all that logic was contained in a single place, which should be the iterator. The second and more significant downside is that it is impossible to reuse iterators for multiple seeks. Whenever you want to look up a record, you need to re-create the whole infrastructure again, which is quite a waste of time. Furthermore, it is impossible to optimize seeks, such as when seeking the same record multiple times. To address this, we essentially split up the concerns properly such that the parent data structure is responsible for setting up the iterator via a new `init_iter()` callback, whereas the iterator handles seeks via a new `seek()` callback. This will eventually allow us to call `seek()` on the iterator multiple times, where every iterator can potentially optimize for certain cases. Note that at this point in time we are not yet ready to reuse the iterators. This will be left for a future patch series. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-05-13 17:04:18 -07:00
Patrick Steinhardt	701713a254	reftable/merged: simplify indices for subiterators When seeking on a merged table, we perform the seek for each of the subiterators. If the subiterator has the desired record we add it to the priority queue, otherwise we skip it and don't add it to the stack of subiterators hosted by the merged table. The consequence of this is that the index of the subiterator in the merged table does not necessarily correspond to the index of it in the merged iterator. Next to being potentially confusing, it also means that we won't easily be able to re-seek the merged iterator because we have no clear connection between both of the data structures. Refactor the code so that the index stays the same in both structures. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-05-13 17:04:18 -07:00
Patrick Steinhardt	e08f49a4f5	reftable/merged: split up initialization and seeking of records To initialize a `struct merged_iter`, we need to seek all subiterators to the wanted record and then add their results to the priority queue used to sort the records. This logic is split up across two functions, `merged_table_seek_record()` and `merged_iter_init()`. The scope of these functions is somewhat weird though, where `merged_iter_init()` is only responsible for adding the records of the subiterators to the priority queue. Clarify the scope of those functions such that `merged_iter_init()` is only responsible for initializing the iterator's structure. Performing the subiterator seeks are now part of `merged_table_seek_record()`. This step is required to move seeking of records into the generic `struct reftable_iterator` infrastructure. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-05-13 17:04:17 -07:00
Patrick Steinhardt	c82692f755	reftable/reader: set up the reader when initializing table iterator All the seeking functions accept a `struct reftable_reader` as input such that they can use the reader to look up the respective blocks. Refactor the code to instead set up the reader as a member of `struct table_iter` during initialization such that we don't have to pass the reader on every single call. This step is required to move seeking of records into the generic `struct reftable_iterator` infrastructure. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-05-13 17:04:17 -07:00
Patrick Steinhardt	f1e3c12196	reftable/reader: inline `reader_seek_internal()` We have both `reader_seek()` and `reader_seek_internal()`, where the former function only exists so that we can exit early in case the given table has no records of the sought-after type. Merge these two functions into one. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-05-13 17:04:17 -07:00
Patrick Steinhardt	81a03a3236	reftable/reader: separate concerns of table iter and reftable reader In "reftable/reader.c" we implement two different interfaces: - The reftable reader contains the logic to read reftables. - The table iterator is used to iterate through a single reftable read by the reader. The way those two types are used in the code is somewhat confusing though because seeking inside a table is implemented as if it was part of the reftable reader, even though it is ultimately more of a detail implemented by the table iterator. Make the boundary between those two types clearer by renaming functions that seek records in a table such that they clearly belong to the table iterator's logic. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-05-13 17:04:17 -07:00
Patrick Steinhardt	dfdd1455bb	reftable/reader: unify indexed and linear seeking In `reader_seek_internal()` we either end up doing an indexed seek when there is one or a linear seek otherwise. These two code paths are disjunct without a good reason, where the indexed seek will cause us to exit early. Refactor the two code paths such that it becomes possible to share a bit more code between them. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-05-13 17:04:16 -07:00
Patrick Steinhardt	9a59b65dba	reftable/reader: avoid copying index iterator When doing an indexed seek we need to walk down the multi-level index until we finally hit a record of the desired indexed type. This loop performs a copy of the index iterator on every iteration, which is both hard to understand and completely unnecessary. Refactor the code so that we use a single iterator to walk down the indices, only. Note that while this should improve performance, the improvement is negligible in all but the most unreasonable repositories. This is because the effect is only really noticeable when we have to walk down many levels of indices, which is not something that a repository would typically have. So the motivation for this change is really only about readability. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-05-13 17:04:16 -07:00
Patrick Steinhardt	d537ce6b9e	reftable/block: use `size_t` to track restart point index The function `block_reader_restart_offset()` gets the offset of the `i`th restart point. `i` is a signed integer though, which is certainly not the correct type to track indices like this. Furthermore, both callers end up passing a `size_t`. Refactor the code to use a `size_t` instead. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-05-13 17:04:16 -07:00
Patrick Steinhardt	f663d34306	reftable: make the compaction factor configurable When auto-compacting, the reftable library packs references such that the sizes of the tables form a geometric sequence. The factor for this geometric sequence is hardcoded to 2 right now. We're about to expose this as a config option though, so let's expose the factor via write options. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-05-13 17:02:39 -07:00
Patrick Steinhardt	8e9e136d61	reftable: use `uint16_t` to track restart interval The restart interval can at most be `UINT16_MAX` as specified in the technical documentation of the reftable format. Furthermore, it cannot ever be negative. Regardless of that we use an `int` to track the restart interval. Change the type to use an `uint16_t` instead. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-05-13 17:02:38 -07:00

1 2 3 4 5

235 Commits