Code to maintain the mapping between object names computed with
multiple hash functions is being added, written in Rust.
* bc/sha1-256-interop-02:
object-file-convert: always make sure object ID algo is valid
rust: add a small wrapper around the hashfile code
rust: add a new binary object map format
rust: add functionality to hash an object
rust: add a build.rs script for tests
hash: expose hash context functions to Rust
write-or-die: add an fsync component for the object map
csum-file: define hashwrite's count as a uint32_t
rust: add additional helpers for ObjectID
hash: add a function to look up hash algo structs
rust: add a hash algorithm abstraction
rust: add a ObjectID struct
hash: use uint32_t for object_id algorithm
conversion: don't crash when no destination algo
repository: require Rust support for interoperability
"git history" history rewriting UI.
* ps/history:
builtin/history: implement "reword" subcommand
builtin: add new "history" command
wt-status: provide function to expose status for trees
replay: yield the object ID of the final rewritten commit
replay: small set of cleanups
builtin/replay: move core logic into "libgit.a"
builtin/replay: extract core logic to replay revisions
"auto filter" logic for large-object promisor remote.
Comments?
* cc/lop-filter-auto:
fetch-pack: wire up and enable auto filter logic
promisor-remote: keep advertised filter in memory
list-objects-filter-options: implement auto filter resolution
list-objects-filter-options: support 'auto' mode for --filter
doc: fetch: document `--filter=<filter-spec>` option
fetch: make filter_options local to cmd_fetch()
clone: make filter_options local to cmd_clone()
promisor-remote: allow a client to store fields
promisor-remote: refactor initialising field lists
Preparation of the xdiff/ codebase to work with Rust.
Comments?
* en/xdiff-cleanup-3:
SQUASH??? cocci
xdiff: move xdl_cleanup_records() from xprepare.c to xdiffi.c
xdiff: remove dependence on xdlclassifier from xdl_cleanup_records()
xdiff: replace xdfile_t.dend with xdfenv_t.delta_end
xdiff: replace xdfile_t.dstart with xdfenv_t.delta_start
xdiff: cleanup xdl_trim_ends()
xdiff: use xdfenv_t in xdl_trim_ends() and xdl_cleanup_records()
xdiff: let patience and histogram benefit from xdl_trim_ends()
xdiff: don't waste time guessing the number of lines
xdiff: make classic diff explicit by creating xdl_do_classic_diff()
ivec: introduce the C side of ivec
The iconv library on macOS fails to correctly handle stateful
ISO/IEC 2022 encoded strings. Work it around instead of replacing
it wholesale with the one from Homebrew.
* tb/macos-iconv-workarounds:
utf8.c: Enable workaround for iconv under macOS 14/15
utf8.c: Prepare workaround for iconv under macOS 14/15
Introduce a more robust way to parse a decimal integer stored in a
piece of memory that is not necessarily terminated with NUL (about
which ASan's strict-string-check complains even when the use of
strtol() is safe due to the verified existence of whitespace after
the digits).
* jk/parse-int:
fsck: use parse_unsigned_from_buf() for parsing timestamp
cache-tree: use parse_int_from_buf()
parse: add functions for parsing from non-string buffers
parse: prefer bool to int for boolean returns
When rewriting history via git-rebase(1) there are a few very common use
cases:
- The ordering of two commits should be reversed.
- A commit should be split up into two commits.
- A commit should be dropped from the history completely.
- Multiple commits should be squashed into one.
- An existing commit that is not the tip of the current branch
should be edited.
While these operations are all doable, it often feels needlessly kludgy
to do so with an interactive rebase, using the editor to say what one
wants, and then performing the actions. Also, some operations, like
splitting a commit in two, are far more involved than that and require
a whole series of commands.
Rebases also do not update dependent branches. The use of stacked
branches has grown quite common with competing version control systems
like Jujutsu, though, so it is clearly a need that users have. While
rebases _can_ serve this use case if one always works on the latest
stacked branch, it is somewhat awkward and very easy to get wrong.
Add a new "history" command to plug these gaps. This command will have
several different subcommands to imperatively rewrite history for common
use cases like the above.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Move the core logic used to replay commits into "libgit.a" so that it
can be easily reused by other commands. It will be used in a subsequent
commit that introduces a new git-history(1) command.
Note that with this change we have no sign-comparison warnings anymore,
and neither do we depend on `the_repository`.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The previous commit introduced a workaround in utf8.c to deal
with broken iconv implementations.
It is enabled when a macOS version with a buggy iconv library is used
and no external library from either MacPorts or Homebrew is provided
(and linked against).
Signed-off-by: Torsten Bögershausen <tboegi@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Work around the "iconv" shipped as part of macOS, which is broken
when handling stateful ISO/IEC 2022 encoded strings.
* rs/macos-iconv-workaround:
macOS: use iconv from Homebrew if needed and present
macOS: make Homebrew use configurable
Trying to use Rust's Vec in C, or git's ALLOC_GROW() macros (via
wrapper functions) in Rust, is painful because:
* C doesn't define its own vector type, and even though Rust does
have Vec, it's painful to use on the C side (more on that below).
It is also not viable to rely on Rust's Vec type because Git needs
to be able to compile without Rust. So ivec was created expressly
to be interoperable between C and Rust without requiring Rust.
* C doing vector things the Rust way would require wrapper functions,
and Rust doing vector things the C way would require wrapper
functions, so ivec was created to ensure a consistent contract
between the two languages for how to manipulate a vector.
* Currently, Rust defines its own 'Vec' type that is generic, but its
memory allocator and struct layout weren't designed for
interoperability with C (or any language for that matter), meaning
that the C side cannot push to or expand a 'Vec' without defining
wrapper functions in Rust that C can call. Without special care,
the two languages might use different allocators (malloc/free on
the C side, and possibly something else in Rust), which would make
it difficult for a function in one language to free elements
allocated by a call from a function in the other language.
* Similarly, git defines ALLOC_GROW() and related macros in
git-compat-util.h. While we could add functions allowing Rust to
invoke something similar to those macros, passing three variables
(pointer, length, allocated_size) instead of a single variable
(vector) across the language boundary requires more cognitive
overhead for readers to keep track of and makes it easier to make
mistakes. Further, for low-level components that we want to
eventually convert to pure Rust, such triplets would feel very out
of place.
To address these issues, introduce a new type, ivec -- short for
interoperable vector. (We refer to it as 'ivec' generally, though on
the Rust side the struct is called IVec to match Rust style.) This new
type is specifically designed for FFI purposes, so that both languages
handle the vector in the same way, though it could be used on either
side independently. This type is designed such that it can easily be
replaced by a Rust 'Vec' once interoperability is no longer a concern.
One particular item to note is that Git's macros to handle vec
operations infer how much a vec needs to grow from the type of the
pointer passed in, but that makes the approach specific to macros used
from C. To avoid defining every ivec function as a macro, I opted to
also include an element_size field that allows concrete functions like
push() to know how much to grow the memory. This element_size also
helps in verifying that the ivec is consistent when it is passed from
C to Rust.
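To make the contract concrete, here is a minimal C sketch of what such
an interoperable vector could look like. The field and function names
below are illustrative assumptions, not the actual definitions from
this series; on the Rust side, a #[repr(C)] struct with the same layout
would mirror it.

    #include <stdlib.h>
    #include <string.h>

    /* Illustrative only; the real ivec definition may differ. */
    struct ivec {
            void *ptr;           /* backing storage */
            size_t len;          /* number of elements in use */
            size_t alloc;        /* number of elements allocated */
            size_t element_size; /* size of one element, checked across FFI */
    };

    /*
     * Append one element, growing the storage if needed. Because the
     * element size is carried in the struct, this can be a plain
     * function instead of a type-inferring macro. (Error handling
     * omitted; Git would use xrealloc() and die on failure.)
     */
    static void ivec_push(struct ivec *v, const void *element)
    {
            if (v->len == v->alloc) {
                    v->alloc = v->alloc ? 2 * v->alloc : 16;
                    v->ptr = realloc(v->ptr, v->alloc * v->element_size);
            }
            memcpy((char *)v->ptr + v->len * v->element_size,
                   element, v->element_size);
            v->len++;
    }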
Signed-off-by: Ezekiel Newren <ezekielnewren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The library function iconv(3) supplied with macOS versions 15.7.2
(Sequoia) and 26.1 (Tahoe) is unreliable when doing conversions from
ISO-2022-JP to UTF-8 in multiple steps; t3900 reports this breakage:
not ok 17 - ISO-2022-JP should be shown in UTF-8 now
not ok 25 - ISO-2022-JP should be shown in UTF-8 now
not ok 38 - commit --fixup into ISO-2022-JP from UTF-8
As a workaround, use libiconv from Homebrew, if available. Search for
it in its default locations: /opt/homebrew for Apple Silicon and
/usr/local for Intel Macs, with the former taking precedence. Respect
ICONVDIR if already set by the user, though.
Helped-by: Koji Nakamaru <koji.nakamaru@gree.net>
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
On macOS we opportunistically use Homebrew-installed versions of
gettext(3) and msgfmt(1). Make that behavior configurable by providing
make variables to disable Homebrew usage (NO_HOMEBREW) and to allow
using a non-default installation location (HOMEBREW_PREFIX).
Include and link only the gettext keg via the opt/gettext symlink,
which points to its installed version, instead of using the Homebrew
prefix. This is simpler and prevents accidentally including other
libraries.
Suggested-by: Carlo Marcelo Arenas Belón <carenas@gmail.com>
Suggested-by: Torsten Bögershausen <tboegi@web.de>
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In a following commit, we are going to allow passing "auto" as a
<filterspec> to the `--filter=<filterspec>` option, but only for some
commands. Other commands that support the `--filter=<filterspec>`
option should still die() when 'auto' is passed.
Let's set up the "list-objects-filter-options.{c,h}" infrastructure to
support that:
- Add a new `unsigned int allow_auto_filter : 1;` flag to
`struct list_objects_filter_options` which specifies if "auto" is
accepted or not.
- Change gently_parse_list_objects_filter() to parse "auto" if it's
accepted.
- Make sure we die() if "auto" is combined with another filter.
- Update list_objects_filter_release() to preserve the
allow_auto_filter flag, as this function is often called (via
opt_parse_list_objects_filter) to reset the struct before parsing a
new value.
Let's also update `list-objects-filter.c` to recognize the new
`LOFC_AUTO` choice. Since "auto" must be resolved to a concrete filter
before filtering actually begins, initializing a filter with
`LOFC_AUTO` is invalid and will trigger a BUG().
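To illustrate the parsing behavior described above, here is a small
self-contained sketch; the type and function names are stand-ins for
illustration only (the real definitions live in
list-objects-filter-options.{c,h} and differ in detail):

    #include <stdio.h>
    #include <string.h>

    enum filter_choice { CHOICE_NONE, CHOICE_AUTO, CHOICE_OTHER };

    struct filter_opts {
            unsigned int allow_auto_filter : 1;
            enum filter_choice choice;
    };

    /*
     * Returns 0 on success, -1 on error: "auto" is only accepted where
     * allow_auto_filter is set, and cannot be combined with another
     * filter that was parsed earlier.
     */
    static int parse_filter(struct filter_opts *opts, const char *arg)
    {
            if (!strcmp(arg, "auto")) {
                    if (!opts->allow_auto_filter) {
                            fprintf(stderr, "invalid filter-spec 'auto'\n");
                            return -1;
                    }
                    if (opts->choice != CHOICE_NONE) {
                            fprintf(stderr, "cannot combine 'auto' with another filter\n");
                            return -1;
                    }
                    opts->choice = CHOICE_AUTO;
                    return 0;
            }
            opts->choice = CHOICE_OTHER; /* real code parses blob:none etc. */
            return 0;
    }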
Note that ideally combining "auto" with "auto" could be allowed, but in
practice it's probably not worth the added code complexity. And if we
really want it, nothing prevents us from allowing it in future work.
If we ever want to give a meaning to combining "auto" with a different
filter too, nothing prevents us from doing that in future work either.
While at it, let's add a new "u-list-objects-filter-options.c" file for
`struct list_objects_filter_options` related unit tests. For now it
only tests gently_parse_list_objects_filter() though.
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Further application of the MEMZERO_ARRAY() macro to the rest of the
code base.
* jc/memzero-array:
cocci: use MEMZERO_ARRAY() a bit more
coccicheck: emit the contents of cocci patch
The MEMZERO_ARRAY() helper is introduced to avoid clearing only the
first N bytes of an N-element array whose elements are larger than
a byte (a sketch of the pitfall follows the commit list below).
* tc/memzero-array:
contrib/coccinelle: pass include paths to spatch(1)
git-compat-util: introduce MEMZERO_ARRAY() macro
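The sketch below shows the pitfall being guarded against; the macro
shape here is an assumption, and the actual definition in
git-compat-util.h may differ:

    #include <string.h>

    /* Hypothetical shape of the helper; see git-compat-util.h for the
     * real definition. */
    #define MEMZERO_ARRAY(ptr, n) memset((ptr), 0, (n) * sizeof(*(ptr)))

    void clear_counts(void)
    {
            int counts[16];

            /* Classic mistake: clears only the first 16 bytes, i.e.
             * only four of the sixteen ints on typical platforms. */
            memset(counts, 0, 16);

            /* Clears all 16 elements regardless of element size. */
            MEMZERO_ARRAY(counts, 16);
    }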
Rewrite the only use of "mktemp()" that is subject to a TOCTOU race,
and stop using the insecure "mktemp()" function.
* rs/ban-mktemp:
compat: remove gitmkdtemp()
banned.h: ban mktemp(3)
compat: remove mingw_mktemp()
compat: use git_mkdtemp()
wrapper: add git_mkdtemp()
The "git_istream" abstraction has been revamped to make it easier
to interface with pluggable object database design.
* ps/object-read-stream:
streaming: drop redundant type and size pointers
streaming: move into object database subsystem
streaming: refactor interface to be object-database-centric
streaming: move logic to read packed objects streams into backend
streaming: move logic to read loose objects streams into backend
streaming: make the `odb_read_stream` definition public
streaming: get rid of `the_repository`
streaming: rely on object sources to create object stream
packfile: introduce function to read object info from a store
streaming: move zlib stream into backends
streaming: create structure for filtered object streams
streaming: create structure for packed object streams
streaming: create structure for loose object streams
streaming: create structure for in-core object streams
streaming: allocate stream inside the backend-specific logic
streaming: explicitly pass packfile info when streaming a packed object
streaming: propagate final object type via the stream
streaming: drop the `open()` callback function
streaming: rename `git_istream` into `odb_read_stream`
Telling the user "you got some error messages" without showing what
the errors are is almost useless in a CI environment, as the errors
cannot be examined without downloading build artifacts.
Arrange for the output to be shown when it fails.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In August 2006, the DarwinPorts project renamed itself to MacPorts.
For those who are not intimately familiar with the open-source
ecosystem around macOS from the olden days, the name DarwinPorts may
not ring a bell, even when they are using MacPorts.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Reviewed-by: Carlo Marcelo Arenas Belón <carenas@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In the previous commit a new coccinelle rule was added. But neither
`make coccicheck` nor `meson compile coccicheck` detected a case in
builtin/last-modified.c.
This case involves the field `scratch` in `struct last_modified`. This
field is of type `struct bitmap`, and that struct has a member
`eword_t *words`. Both are defined in `ewah/ewok.h`. Now, while
builtin/last-modified.c does include that header (with the subdir in the
#include directive), it seems coccinelle does not process it. So it's
unaware of the type of `words` in the bitmap, and it doesn't recognize
the rule from the previous commit that uses:
type T;
T *ptr;
Fix coccicheck by passing all possible include paths inside the Git
project so spatch(1) can find the headers and can determine the types.
Signed-off-by: Toon Claes <toon@iotcl.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
gitmkdtemp() has become a trivial wrapper around git_mkdtemp(). Remove
this now unnecessary layer of indirection.
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Various issues detected by ASan have been corrected.
* jk/asan-bonanza:
t: enable ASan's strict_string_checks option
fsck: avoid parse_timestamp() on buffer that isn't NUL-terminated
fsck: remove redundant date timestamp check
fsck: avoid strcspn() in fsck_ident()
fsck: assert newline presence in fsck_ident()
cache-tree: avoid strtol() on non-string buffer
Makefile: turn on NO_MMAP when building with ASan
pack-bitmap: handle name-hash lookups in incremental bitmaps
compat/mmap: mark unused argument in git_munmap()
If you have a buffer that is not NUL-terminated but want to parse an
integer, there aren't many good options. If you use strtol() and
friends, you risk running off the end of the buffer if there is no
non-digit terminating character. And even if you carefully make sure
that there is such a character, ASan's strict-string-check mode will
still complain.
You can copy bytes into a temporary buffer, terminate it, and then call
strtol(), but doing so adds some pitfalls (like making sure you soak up
whitespace and leading +/- signs, and reporting overflow for overly long
input). Or you can hand-parse the digits, but then you need to take some
care to handle overflow (and again, whitespace and +/- signs).
These things aren't impossible to do right, but it's error-prone to have
to do them in every spot that wants to do such parsing. So let's add
some functions which can be used across the code base.
There are a few choices regarding the interface and the implementation.
First, the implementation:
- I went with parsing the digits (rather than buffering and
passing to libc functions). It ends up being a similar amount of
code because we have to do some parsing either way. And likewise
overflow detection depends on the exact type the caller wants, so we
either have to do it by hand or write a separate wrapper for
strtol(), strtoumax(), and so on.
- Unsigned overflow detection is done using the same techniques as in
unsigned_add_overflows(), etc. We can't use those macros directly
because our core function is type-agnostic (so the caller passes in
the max value, rather than us deriving it on the fly). This is
similar to how git_parse_int(), etc, work. (A sketch of this
technique follows the list below.)
- Signed overflow detection assumes that we can express a negative
value with magnitude one larger than our maximum positive value
(e.g., -128..127 for a signed 8-bit value). I doubt this is
guaranteed by the standard, but it should hold in practice, and we
make the same assumption in git_parse_int(), etc. The nice thing
about this is that we can derive the range from the number of bits
in the type. For ints, you obviously could use INT_MIN..INT_MAX, but
for an arbitrary type, we can use maximum_signed_value_of_type().
- I didn't bother with handling bases other than 10. It would
complicate the code, and I suspect it won't be needed. We could
probably retro-fit it later without too much work, if need be.
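Here is a minimal sketch of the digit-accumulation and overflow-check
approach those bullets describe; the function name and signature are
made up for illustration and are not the actual internal helper:

    #include <stddef.h>
    #include <stdint.h>

    /*
     * Accumulate decimal digits from an unterminated buffer without
     * exceeding a caller-supplied maximum. Returns 1 on success and
     * reports how far it got via "end"; returns 0 on overflow or if
     * no digits were found.
     */
    static int accumulate_decimal(const char *buf, size_t len,
                                  uintmax_t max, uintmax_t *out,
                                  const char **end)
    {
            uintmax_t value = 0;
            size_t i;

            if (!len || buf[0] < '0' || buf[0] > '9')
                    return 0;

            for (i = 0; i < len && buf[i] >= '0' && buf[i] <= '9'; i++) {
                    uintmax_t digit = buf[i] - '0';

                    /* would value * 10 + digit exceed max? */
                    if (digit > max || value > (max - digit) / 10)
                            return 0;
                    value = value * 10 + digit;
            }
            *out = value;
            *end = buf + i;
            return 1;
    }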
For the interface:
- What do we call it? We have git_parse_int() and friends, which aim
to make parsing less error-prone. And in some ways, these are just
buffer (rather than string) versions of those functions. But not
entirely. Those functions are aimed at parsing a single user-facing
value. So they accept a unit prefix (e.g., "10k"), which we won't
always want. And they insist that the whole string is consumed
(rather than passing back an "end" pointer).
We also have strtol_i() and strtoul_ui() wrappers, which try to make
error handling simpler (especially around overflow), but mostly
behave like their libc counterparts. These also don't pass out an
end pointer, though.
So I started a new namespace, "parse_<type>_from_buf".
- Like those other functions above, we use an out-parameter to store
the result, which lets us return an error code directly. This avoids
the complicated errno dance for detecting overflow that you get with
strtol().
What should the error code look like? git_parse_int() uses a bool
for success/failure. But strtol_ui() uses the syscall-like "0 is
success, -1 is error" convention.
I went with the bool approach here. Since the names are closest to
those functions, I thought it would cause the least confusion.
- Unlike git_parse_signed() and friends, we do not insist that the
entire buffer be consumed. For parsing a specific standalone string
that makes sense, but within an unterminated buffer you are much
more likely to be parsing multiple fields from a larger data set.
We pass out an "end" pointer the same way strtol() does. Another
option is to accept the input as an in-out parameter and advance the
pointer ourselves (and likewise shrink the length pointer). That
would let you do something like:
if (!parse_int_from_buf(&p, &len, &out))
return error(...);
/* "p" and "len" were adjusted automatically */
if (!len || *p++ != ' ')
return error(...);
That saves a few lines of code in some spots, but requires a few
more in others (depending on whether the caller has a length in the
first place or is using an end pointer). Of the two callers I intend
to immediately convert, we have one of each type!
I went with the strtol() approach, as it is flexible and time-tested.
- We could likewise take the input buffer as two pointers (start and
end) rather than a pointer and a length. That again makes life
easier for some callers and harder for others. I stuck with pointer
and length as the more usual interface.
- What happens when a caller passes in a NULL end pointer? This is
allowed by strtol(). But I think it's often a sign of a lurking bug,
because there's no way to know how much was consumed (and even if a
caller wants to assume everything is consumed, you have no way to
verify it). So it is simply an error in this interface (you'd get a
segfault).
I am tempted to say that if the end pointer is NULL the functions
could confirm that the entire buffer was consumed, as a convenience.
But that felt a bit magical and surprising.
Like git_parse_*(), there is a generic signed/unsigned helper, and then
we can add type-specific helpers on top. I've added an int helper here
to start, and we'll add more as we convert callers.
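As a sketch of how a caller of the chosen interface might look (the
prototype below is an assumption based on the description above, not a
quote of the actual header, and the snippet will not link without the
real implementation):

    #include <stdbool.h>
    #include <stddef.h>
    #include <stdio.h>

    /* Hypothetical prototype: bool-style return, out-parameter for the
     * result, strtol-like end pointer. */
    bool parse_int_from_buf(const char *buf, size_t len, int *out,
                            const char **end);

    /* Parsing "<int> <rest...>" out of an unterminated buffer. */
    static int read_counted_field(const char *buf, size_t len, int *value)
    {
            const char *end;

            if (!parse_int_from_buf(buf, len, value, &end)) {
                    fprintf(stderr, "not a valid integer\n");
                    return -1;
            }
            if (end == buf + len || *end != ' ') {
                    fprintf(stderr, "expected a space after the integer\n");
                    return -1;
            }
            return 0;
    }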
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The "streaming" terminology is somewhat generic, so it may not be
immediately obvious that "streaming.{c,h}" is specific to the object
database. Rectify this by moving it into the "odb/" directory so that it
can be immediately attributed to the object subsystem.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Git often uses mmap() to access on-disk files. This leaves a blind spot
in our SANITIZE=address builds, since ASan does not seem to handle mmap
at all. Nor does the OS notice most out-of-bounds accesses, since
mappings tend to be rounded up to the nearest page size (so depending
on how big the map is, you might have to overrun it by up to 4095 bytes
to trigger a segfault).
The previous commit demonstrates a memory bug that we missed. We could
have made a new test where the out-of-bounds access was much larger, or
where the mapped file ended closer to a page boundary. But the point of
running the test suite with sanitizers is to catch these problems
without having to construct specific tests.
Let's enable NO_MMAP for our ASan builds by default, which should give
us better coverage. This does increase the memory usage of Git, since
we're copying from the filesystem into heap. But the repositories in the
test suite tend to be small, so the overhead isn't really noticeable
(and ASan already has quite a performance penalty).
There are a few other known bugs that this patch will help flush out.
However, they aren't directly triggered in the test suite (yet). So
it's safe to turn this on now without breaking the test suite, which
will help us add new tests to demonstrate those other bugs as we fix
them.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Our new binary object map code avoids needing to be intimately involved
with file handling by simply writing data to an object that implements
Write. This makes it very easy to test by writing to a Cursor wrapping
a Vec, and it decouples the code from intimate knowledge about how we
handle files.
However, we will actually want to write our data to an actual file,
since that's the most practical way to persist data. Implement a
wrapper around the hashfile code that implements the Write trait so that
we can write our object map into a file.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Our current loose object format has a few problems. First, it is not
efficient: the list of object IDs is not sorted and even if it were,
there would not be an efficient way to look up objects in both
algorithms.
Second, we need to store mappings for things which are not technically
loose objects but are not packed objects, either, and so cannot be
stored in a pack index. These kinds of things include shallows, their
parents, and their trees, as well as submodules. Yet we also need to
implement a sensible way to store the kind of object so that we can
prune unneeded entries. For instance, if the user has updated the
shallows, we can remove the old values.
For these reasons, introduce a new binary object map format. The
careful reader will notice that it very closely resembles the pack
index v3 format. Add an in-memory object map as well, and allow
writing to a
batched map, which can then be written later as one of the binary object
maps. Include several tests for round tripping and data lookup across
algorithms.
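Conceptually, each mapping entry needs to carry the object's name under
both algorithms plus the kind of entry so that stale ones can be
pruned. The struct below is purely a conceptual C sketch for
illustration; it is neither the on-disk format nor the actual Rust
types from this series:

    #include <stdint.h>

    #define GIT_MAX_RAWSZ 32  /* large enough for a SHA-256 hash */

    /* Conceptual sketch only. */
    struct object_map_entry {
            unsigned char sha1[GIT_MAX_RAWSZ];   /* name under one algorithm */
            unsigned char sha256[GIT_MAX_RAWSZ]; /* name under the other */
            uint32_t kind; /* loose object, shallow, submodule, ... */
    };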
Note that the use of this code elsewhere in Git will involve some C code
and some C-compatible code in Rust that will be introduced in a future
commit. Thus, for example, we ignore the fact that if there is no
current batch and the caller asks for data to be written, this code does
nothing, mostly because this code also does not involve itself with
opening or manipulating files. The C code that we will add later will
implement this functionality at a higher level and take care of this,
since the code which is necessary for writing to the object store is
deeply involved with our C abstractions and it would require extensive
work (which would not be especially valuable at this point) to port
those to Rust.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Cargo uses the build.rs script to determine how to compile and link a
binary. The only binary we're generating, however, is for our tests,
but in a future commit, we're going to link against libgit.a for some
functionality and we'll need to make sure the test binaries are
complete.
Add a build.rs file for this case and specify the files we're going to
be linking against. Because we cannot specify different dependencies
when building our static library versus our tests, update the Makefile
to specify these dependencies for our static library to avoid race
conditions during build.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We'd like to be able to write some Rust code that can work with object
IDs. Add a structure here that's identical to struct object_id in C,
for easy use in sharing across the FFI boundary. We will use this
structure in several places in hot paths, such as index-pack or
pack-objects when converting between algorithms, so prioritize efficient
interchange over a more idiomatic Rust approach.
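For reference, the C structure being mirrored looks roughly like this
(a sketch based on hash.h, with the algorithm field as a uint32_t per
this series, rather than a verbatim quote); a #[repr(C)] Rust struct
with the same two fields can then be exchanged directly across the FFI
boundary:

    #include <stdint.h>

    #define GIT_MAX_RAWSZ 32  /* enough room for a SHA-256 hash */

    struct object_id {
            unsigned char hash[GIT_MAX_RAWSZ];
            uint32_t algo; /* which hash algorithm "hash" belongs to */
    };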
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When Scalar was made a canonical part of Git in 7b5c93c6c68 (scalar:
include in standard Git build & installation, 2022-09-02), it was added
to all relevant Makefile targets except for the `strip` target.
Let's correct that.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The file "builtin/repo.c" uses utf8_strwidth() to calculate the display
width of UTF-8 characters in a table, but the resulting output is still
misaligned. Add test cases for both utf8_strwidth and utf8_strnwidth to
verify that they correctly compute the display width for UTF-8
characters.
Also updated the build configuration in Makefile and meson.build to
include the new test suite in the build process.
Signed-off-by: Jiang Xin <worldhello.net@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Clean-up "git repack" machinery to prepare for incremental update
of midx files.
* tb/incremental-midx-part-3.1: (49 commits)
builtin/repack.c: clean up unused `#include`s
repack: move `write_cruft_pack()` out of the builtin
repack: move `write_filtered_pack()` out of the builtin
repack: move `pack_kept_objects` to `struct pack_objects_args`
repack: move `finish_pack_objects_cmd()` out of the builtin
builtin/repack.c: pass `write_pack_opts` to `finish_pack_objects_cmd()`
repack: extract `write_pack_opts_is_local()`
repack: move `find_pack_prefix()` out of the builtin
builtin/repack.c: use `write_pack_opts` within `write_cruft_pack()`
builtin/repack.c: introduce `struct write_pack_opts`
repack: 'write_midx_included_packs' API from the builtin
builtin/repack.c: inline packs within `write_midx_included_packs()`
builtin/repack.c: pass `repack_write_midx_opts` to `midx_included_packs`
builtin/repack.c: inline `remove_redundant_bitmaps()`
builtin/repack.c: reorder `remove_redundant_bitmaps()`
repack: keep track of MIDX pack names using existing_packs
builtin/repack.c: use a string_list for 'midx_pack_names'
builtin/repack.c: extract opts struct for 'write_midx_included_packs()'
builtin/repack.c: remove ref snapshotting from builtin
repack: remove pack_geometry API from the builtin
...
CI improvements to handle the recent Rust integration better.
* ps/ci-rust:
rust: support for Windows
ci: verify minimum supported Rust version
ci: check for common Rust mistakes via Clippy
rust/varint: add safety comments
ci: check formatting of our Rust code
ci: deduplicate calls to `apt-get update`
Instead of three library archives (one for git, one for reftable,
and one for xdiff), roll everything into a single libgit.a archive.
This will help the later effort to FFI into Rust.
* en/make-libgit-a:
make: delete REFTABLE_LIB, add reftable to LIB_OBJS
make: delete XDIFF_LIB, add xdiff to LIB_OBJS
In an identical fashion to the previous commit, move the function
`write_cruft_pack()` into its own compilation unit, and make the
function visible through the repack.h API.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In a similar fashion as in previous commits, move the function
`write_filtered_pack()` out of the builtin and into its own compilation
unit.
This function is now part of the repack.h API, but implemented in its
own "repack-filtered.c" unit as it is a separate component from other
kinds of repacking operations.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When writing a MIDX, 'git repack' takes a snapshot of the repository's
references and writes the result out to a file, which it then passes to
'git multi-pack-index write' via the '--refs-snapshot' option.
This is done in order to make bitmap selections with respect to what we
are packing, thus avoiding a race where an incoming reference update
causes us to try and write a bitmap for a commit not present in the
MIDX.
Extract this functionality out into a new repack-midx.c compilation
unit, and expose the necessary functions via the repack.h API.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Now that the pack_geometry API is fully factored and isolated from the
rest of the builtin, declare it within repack.h and move its
implementation to "repack-geometry.c" as a separate component.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Now that we have properly factored the portion of the builtin which is
responsible for repacking promisor objects, we can move that function
(and associated dependencies) out of the builtin entirely.
Similar to previous extractions, this function is declared in repack.h,
but implemented in a separate repack-promisor.c file. This is done to
separate promisor-specific repacking functionality from generic repack
utilities (like "existing_packs", and "generated_pack" APIs).
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Over the years, builtin/repack.c has turned into a grab-bag of
functionality powering the 'git repack' builtin. Among its many
capabilities, it:
- can build and spawn 'git pack-objects' commands, which in turn
generate new packs
- has infrastructure to manage the set of existing packs in a
repository
- has infrastructure to split a sequence of packs into a geometric
progression based on object size
- can manage both generating and combining cruft packs together
- can write new MIDXs
to name a few.
As a result, this builtin has accumulated a lot of code, making adding
new functionality difficult. In the future, 'repack' will learn how to
manage a chain of incremental MIDXs, adding yet more functionality into
the builtin.
As a prerequisite step, let's first move some of the functionality in
the builtin into its own repack.[ch].
This will be done over the course of many steps, since there are many
individual components, some of which will end up in other, yet-to-exist
compilation units of their own. Some of the code movement here is also
non-trivial, so performing it in individual steps will make it easier to
verify.
Let's start by migrating 'struct pack_objects_args' (and the related
corresponding pack_objects_args_release() function) into repack.h, and
teach both the Makefile and Meson how to build the new compilation unit.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The initial patch series that introduced Rust into the core of Git only
cared about macOS and Linux. This specifically leaves out Windows, which
indeed fails to build right now due to two issues:
- The Rust runtime requires `GetUserProfileDirectoryW()`, but we don't
link against "userenv.dll".
- The path of the Rust library built on Windows is different than on
most other systems.
Fix both of these issues to support Windows.
Note that this commit fixes the Meson-based job in GitHub's CI. Meson
auto-detects the availability of Rust, and as the Windows runner has
Rust installed by default it already enabled Rust support there. But due
to the above issues that job fails consistently.
Install Rust on GitLab CI, as well, to improve test coverage there.
Based-on-patch-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Based-on-patch-by: Ezekiel Newren <ezekielnewren@gmail.com>
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The reftable backend learned to sanity check its on-disk data more
carefully.
* kn/reftable-consistency-checks:
refs/reftable: add fsck check for checking the table name
reftable: add code to facilitate consistency checks
fsck: order 'fsck_msg_type' alphabetically
Documentation/fsck-msgids: remove duplicate msg id
reftable: check for trailing newline in 'tables.list'
refs: move consistency check msg to generic layer
refs: remove unused headers
Dip our toes a bit into (optionally) using a Rust-implemented helper
called from our C code.
* ps/rust-balloon:
ci: enable Rust for breaking-changes jobs
ci: convert "pedantic" job into full build with breaking changes
BreakingChanges: announce Rust becoming mandatory
varint: reimplement as test balloon for Rust
varint: use explicit width for integers
help: report on whether or not Rust is enabled
Makefile: introduce infrastructure to build internal Rust library
Makefile: reorder sources after includes
meson: add infrastructure to build internal Rust library
The `git refs verify` command is used to run consistency checks on the
reference backends. This command is also invoked when users run 'git
fsck'. While the files-backend has some fsck checks added, the reftable
backend lacks such checks. Let's add the required infrastructure and a
check to test for the files present in the reftable directory.
Since the reftable library is treated as an independent library we
should ensure that the library code works independently without
knowledge about Git's internals. To do this, add both 'reftable/fsck.c'
and 'reftable/reftable-fsck.h', which provide an entry point
'reftable_fsck_check' for running fsck checks over a provided reftable
stack. The caller provides the function with callbacks to handle issue
and information reporting.
The added check goes over all tables in the reftable stack and
validates that they have a valid name. If not, it raises an error.
While here, move 'reftable/error.o' in the Makefile to retain
lexicographic ordering.
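A hypothetical illustration of the caller-provided-callback shape
described above; the actual signature of reftable_fsck_check() lives in
reftable/reftable-fsck.h and may differ substantially from this sketch:

    #include <stdio.h>

    typedef void (*reftable_fsck_report_fn)(const char *msg, void *cb_data);

    /* Hypothetical prototype for illustration only. */
    int reftable_fsck_check(const char *reftable_dir,
                            reftable_fsck_report_fn report_error,
                            reftable_fsck_report_fn report_info,
                            void *cb_data);

    static void print_issue(const char *msg, void *cb_data)
    {
            (void)cb_data;
            fprintf(stderr, "reftable fsck: %s\n", msg);
    }

    /* A caller such as "git refs verify" would wire in its own reporting: */
    static int verify_reftable_dir(const char *dir)
    {
            return reftable_fsck_check(dir, print_issue, print_issue, NULL);
    }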
Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Same idea as the previous commit except that I don't know when or if
reftable will be turned into a Rust crate.
Signed-off-by: Ezekiel Newren <ezekielnewren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>