Jeff King c4c9089584 cache-tree: avoid strtol() on non-string buffer
A cache-tree extension entry in the index looks like this:

  <name> NUL <entry_nr> SPACE <subtree_nr> NEWLINE <binary_oid>

where the "_nr" items are human-readable base-10 ASCII. We parse them
with strtol(), even though we do not have a NUL-terminated string (we'd
generally have an mmap() of the on-disk index file). For a well-formed
entry, this is not a problem; strtol() will stop when it sees the
newline. But there are two problems:

  1. A corrupted entry could omit the newline, causing us to read
     further. You'd mostly get stopped by seeing non-digits in the oid
     field (and if it is likewise truncated, there will still be 20 or
     more bytes of the index checksum). So it's possible, though
     unlikely, to read off the end of the mmap'd buffer. Of course a
     malicious index file can fake the oid and the index checksum to all
     (ASCII) 0's.

     This is further complicated by the fact that mmap'd buffers tend to
     be zero-padded up to the page boundary. So to run off the end, the
     index size also has to be a multiple of the page size. This is also
     unlikely, though you can construct a malicious index file that
     matches this.

     The security implications aren't too interesting. The index file is
     a local file anyway (so you can't attack somebody via a clone; you
     would have to convince them to operate in a .git directory you made,
     at which point attacking .git/config is much easier). And it's just
     a read overflow via strtol(), which is unlikely to buy you much
     beyond a crash.

  2. ASan has a strict_string_checks option, which tells it to make sure
     that arguments to string functions (like strtol) have some eventual
     NUL, without regard to what the function would actually do (like
     stopping at a newline here). This option sometimes has false
     positives, but it can point to sketchy areas (like this one) where
     the input we use doesn't exhibit a problem, but different input
     _could_ cause us to misbehave.

Let's fix it by just parsing the values ourselves with a helper function
that is careful not to go past the end of the buffer. There are a few
behavior changes here that should not matter:

  - We do not consider overflow, as strtol() would. But nor did the
    original code. However, we don't trust the value we get from the
    on-disk file, and if it says to read 2^30 entries, we would notice
    that we do not have that many and bail before reading off the end of
    the buffer.

  - Our helper does not skip past extra leading whitespace as strtol()
    would, but according to gitformat-index(5) there should not be any.

  - The original quit parsing at a newline or a NUL byte, but now we
    insist on a newline (which is what the documentation says, and what
    Git has always produced).
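
To make the last two points concrete, here is a small sketch of the
strtol() behavior being given up. The wrapper name strtol_consumed() is
made up for this illustration, and the inputs are deliberately
NUL-terminated so that calling strtol() on them is well-defined:

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/*
 * Hypothetical wrapper for illustration only: parse a base-10 number
 * with strtol() and report how many bytes it consumed. The input must
 * be NUL-terminated.
 */
static long strtol_consumed(const char *s, size_t *consumed)
{
	char *end;
	long v = strtol(s, &end, 10);

	*consumed = (size_t)(end - s);
	return v;
}
```

Given "  12\n", this consumes four bytes (two spaces plus two digits),
because strtol() skips leading whitespace. Given "12" followed directly
by the terminating NUL, it stops at the NUL exactly as it would at a
newline, which is why the old parser accepted either one.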

Since we are providing our own helper function, we can tweak the
interface a bit to make our lives easier. The original code does not use
strtol's "end" pointer to find the end of the parsed data, but rather
uses a separate loop to advance our "buf" pointer to the trailing
newline. We can instead provide a helper that advances "buf" as it
parses, letting us read strictly left-to-right through the buffer.
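
A minimal sketch of such a helper might look like the following. This
is not the code from the patch: the name parse_decimal(), its signature,
and the "term" parameter are assumptions made for illustration. It also
accepts a single leading '-' (on the assumption that the documented
format uses -1 as an entry count for invalidated entries) and, unlike
the description above, rejects overflow so that the sketch itself has no
undefined behavior:

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/*
 * Hypothetical helper (not the one in the patch): parse a base-10
 * integer from [*buf, end) that must be followed by the byte "term".
 * On success, store the value in *out, advance *buf past "term", and
 * return 0. On malformed or truncated input, return -1 and leave *buf
 * untouched. Never reads at or beyond "end".
 */
static int parse_decimal(const char **buf, const char *end, char term, int *out)
{
	const char *p = *buf;
	int neg = 0;
	int val = 0;

	if (p < end && *p == '-') {
		neg = 1;
		p++;
	}
	/* Require at least one digit; leading whitespace is not skipped. */
	if (p >= end || *p < '0' || *p > '9')
		return -1;
	while (p < end && *p >= '0' && *p <= '9') {
		int digit = *p - '0';

		/* Overflow guard: an addition made by this sketch. */
		if (val > (INT_MAX - digit) / 10)
			return -1;
		val = val * 10 + digit;
		p++;
	}
	/* Insist on the terminator, and on it being inside the buffer. */
	if (p >= end || *p != term)
		return -1;

	*buf = p + 1;
	*out = neg ? -val : val;
	return 0;
}
```

A caller would invoke it twice per entry, once with ' ' as the
terminator for entry_nr and once with '\n' for subtree_nr, leaving the
pointer at the first byte of the binary oid.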

I didn't add a new test here. It's surprisingly difficult to construct
an index of exactly the right size due to the way we pad entries. But it
is easy to trigger the problem in existing tests when using ASan's
strict string checking, coupled with a recent change to use NO_MMAP with
ASan builds. So:

  make SANITIZE=address
  cd t
  ASAN_OPTIONS=strict_string_checks=1 ./t0090-cache-tree.sh

triggers it reliably. Technically it is not deterministic because there
is a ~8% chance (it's 1-(255/256)^20, or ^32 for sha256) that the trailing
checksum hash has a NUL byte in it. But we compute enough cache-trees in
the course of that script that we are very likely to hit the problem in
one of them.
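
The arithmetic can be checked with a few lines of C. This sketch assumes
the checksum bytes are independent and uniformly distributed; the
function name chance_of_nul() is made up for illustration:

```c
#include <assert.h>

/*
 * Probability that at least one of "nbytes" independent, uniformly
 * random bytes is NUL: 1 - (255/256)^nbytes.
 */
static double chance_of_nul(int nbytes)
{
	double none = 1.0; /* probability that no byte so far is NUL */

	for (int i = 0; i < nbytes; i++)
		none *= 255.0 / 256.0;
	return 1.0 - none;
}
```

Under that assumption, chance_of_nul(20) comes out at roughly 7.5% for
a 20-byte sha1 checksum (the "~8%" above) and chance_of_nul(32) at
roughly 11.8% for sha256.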

We can look at making strict_string_checks the default for ASan builds,
but there are some other cases we'd want to fix first.

Reported-by: correctmost <cmlists@sent.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2025-11-18 09:36:06 -08:00

Git - fast, scalable, distributed revision control system

Git is a fast, scalable, distributed revision control system with an unusually rich command set that provides both high-level operations and full access to internals.

Git is an Open Source project covered by the GNU General Public License version 2 (some parts of it are under different licenses, compatible with the GPLv2). It was originally written by Linus Torvalds with the help of a group of hackers around the net.

Please read the file INSTALL for installation instructions.

Many Git online resources are accessible from https://git-scm.com/ including full documentation and Git related tools.

See Documentation/gittutorial.adoc to get started, then see Documentation/giteveryday.adoc for a useful minimum set of commands, and Documentation/git-<commandname>.adoc for documentation of each command. If git has been correctly installed, then the tutorial can also be read with man gittutorial or git help tutorial, and the documentation of each command with man git-<commandname> or git help <commandname>.

CVS users may also want to read Documentation/gitcvs-migration.adoc (man gitcvs-migration or git help cvs-migration if git is installed).

The user discussion and development of Git take place on the Git mailing list -- everyone is welcome to post bug reports, feature requests, comments and patches to git@vger.kernel.org (read Documentation/SubmittingPatches for instructions on patch submission and Documentation/CodingGuidelines).

Those wishing to help with error message, usage and informational message string translations (localization, l10n) should see po/README.md (a po file is a Portable Object file that holds the translations).

To subscribe to the list, send an email to git+subscribe@vger.kernel.org (see https://subspace.kernel.org/subscribing.html for details). The mailing list archives are available at https://lore.kernel.org/git/, https://marc.info/?l=git and other archival sites.

Issues which are security relevant should be disclosed privately to the Git Security mailing list git-security@googlegroups.com.

The maintainer frequently sends the "What's cooking" reports that list the current status of various development topics to the mailing list. The discussion following them gives a good reference for project status, development direction and remaining tasks.

The name "git" was given by Linus Torvalds when he wrote the very first version. He described the tool as "the stupid content tracker" and the name as (depending on your mood):

  • random three-letter combination that is pronounceable, and not actually used by any common UNIX command. The fact that it is a mispronunciation of "get" may or may not be relevant.
  • stupid. contemptible and despicable. simple. Take your pick from the dictionary of slang.
  • "global information tracker": you're in a good mood, and it actually works for you. Angels sing, and a light suddenly fills the room.
  • "goddamn idiotic truckload of sh*t": when it breaks