Phillip Wood 1f2e05f0b7 wildmatch: fix exponential behavior
When dowild() cannot match a '*' or '/**/' wildcard then it must return
WM_ABORT_TO_STARSTAR or WM_ABORT_ALL respectively. Failure to observe
this results in unnecessary backtracking and the time taken for a failed
match increases exponentially with the number of wildcards in the
pattern [1]. Unfortunately in some instances dowild() returns WM_NOMATCH
for a failed match resulting in long match times for patterns containing
multiple wildcards as can be seen in the following benchmark.
(Note that the timings in the Benchmark 1 are really measuring the time
to execute test-tool rather than the time to match the pattern)

Benchmark 1: t/helper/test-tool wildmatch wildmatch aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaab "*a"
  Time (mean ± σ):      22.8 ms ±   1.7 ms    [User: 12.1 ms, System: 10.6 ms]
  Range (min … max):    19.4 ms …  26.9 ms    113 runs

  Warning: Ignoring non-zero exit code.

Benchmark 2: t/helper/test-tool wildmatch wildmatch aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaab "*a*a*a*a*a*a*a*a*a"
  Time (mean ± σ):      5.244 s ±  0.228 s    [User: 5.229 s, System: 0.010 s]
  Range (min … max):    4.969 s …  5.707 s    10 runs

  Warning: Ignoring non-zero exit code.

Summary
  't/helper/test-tool wildmatch wildmatch aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaab "*a"' ran
  230.37 ± 20.04 times faster than 't/helper/test-tool wildmatch wildmatch aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaab "*a*a*a*a*a*a*a*a*a"'

The security implications are limited as it only affects operations that
are potentially DoS vectors. For example by creating a blob containing
such a pattern a malicious user can exploit this behavior to use large
amounts of CPU time on a remote server by pushing the blob and then
creating a new clone with --filter=sparse:oid. However this filter type
is usually disabled as it is known to consume large amounts of CPU time
even without this bug.

The WM_MATCH changed in the first hunk of this patch comes from the
original implementation imported from rsync in 5230f605e1 (Import
wildmatch from rsync, 2012-10-15). Compared to the others converted here
it is fairly harmless as it only triggers at the end of the pattern and
so will only cause a single unnecessary backtrack. The others introduced
by 6f1a31f0aa (wildmatch: advance faster in <asterisk> + <literal>
patterns, 2013-01-01) and 46983441ae (wildmatch: make a special case for
"*/" with FNM_PATHNAME, 2013-01-01) are more pernicious and will cause
exponential behavior.

A new test is added to protect against future regressions.

[1] https://research.swtch.com/glob

Helped-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-20 10:58:53 -07:00
2022-10-07 17:19:59 -07:00
2022-12-11 01:27:25 +08:00
2022-07-10 14:43:34 -07:00
2023-03-20 10:58:53 -07:00
2023-01-19 13:48:26 -08:00
2022-08-03 13:36:09 -07:00
2022-10-12 09:13:25 -07:00
2022-10-30 21:04:44 -04:00
2021-03-13 16:00:09 -08:00
2022-07-19 12:45:31 -07:00
2021-03-13 16:00:09 -08:00
2023-01-05 11:14:28 +09:00
2022-03-28 10:25:52 -07:00
2022-12-13 21:25:15 +09:00
2023-02-06 09:38:31 +01:00
2023-02-06 09:43:39 +01:00
2023-02-06 09:43:28 +01:00
2022-04-06 15:21:59 -07:00
2023-02-06 09:43:39 +01:00
2022-05-02 09:50:37 -07:00
2022-09-14 12:56:39 -07:00
2021-03-13 16:00:09 -08:00
2022-09-14 12:56:39 -07:00
2022-12-15 09:09:38 +09:00
2022-09-14 12:56:39 -07:00
2022-10-30 21:04:41 -04:00
2022-08-29 14:55:11 -07:00
2022-08-22 15:08:30 -07:00
2022-12-13 21:23:36 +09:00
2021-08-24 15:32:37 -07:00
2023-02-14 14:15:57 -08:00
2023-02-06 09:43:28 +01:00
2022-08-03 13:36:09 -07:00
2022-12-15 09:09:38 +09:00
2022-04-06 09:42:12 -07:00
2022-12-01 18:38:07 +09:00
2022-10-30 14:04:51 -04:00
2022-08-30 14:16:49 -07:00
2022-10-24 12:45:26 -07:00
2022-10-24 12:45:26 -07:00
2022-10-30 21:04:44 -04:00
2022-12-15 09:09:38 +09:00
2022-12-13 21:09:40 +09:00
2022-12-13 21:09:40 +09:00

Build status

Git - fast, scalable, distributed revision control system

Git is a fast, scalable, distributed revision control system with an unusually rich command set that provides both high-level operations and full access to internals.

Git is an Open Source project covered by the GNU General Public License version 2 (some parts of it are under different licenses, compatible with the GPLv2). It was originally written by Linus Torvalds with help of a group of hackers around the net.

Please read the file INSTALL for installation instructions.

Many Git online resources are accessible from https://git-scm.com/ including full documentation and Git related tools.

See Documentation/gittutorial.txt to get started, then see Documentation/giteveryday.txt for a useful minimum set of commands, and Documentation/git-<commandname>.txt for documentation of each command. If git has been correctly installed, then the tutorial can also be read with man gittutorial or git help tutorial, and the documentation of each command with man git-<commandname> or git help <commandname>.

CVS users may also want to read Documentation/gitcvs-migration.txt (man gitcvs-migration or git help cvs-migration if git is installed).

The user discussion and development of Git take place on the Git mailing list -- everyone is welcome to post bug reports, feature requests, comments and patches to git@vger.kernel.org (read Documentation/SubmittingPatches for instructions on patch submission and Documentation/CodingGuidelines).

Those wishing to help with error message, usage and informational message string translations (localization l10) should see po/README.md (a po file is a Portable Object file that holds the translations).

To subscribe to the list, send an email with just "subscribe git" in the body to majordomo@vger.kernel.org (not the Git list). The mailing list archives are available at https://lore.kernel.org/git/, http://marc.info/?l=git and other archival sites.

Issues which are security relevant should be disclosed privately to the Git Security mailing list git-security@googlegroups.com.

The maintainer frequently sends the "What's cooking" reports that list the current status of various development topics to the mailing list. The discussion following them give a good reference for project status, development direction and remaining tasks.

The name "git" was given by Linus Torvalds when he wrote the very first version. He described the tool as "the stupid content tracker" and the name as (depending on your mood):

  • random three-letter combination that is pronounceable, and not actually used by any common UNIX command. The fact that it is a mispronunciation of "get" may or may not be relevant.
  • stupid. contemptible and despicable. simple. Take your pick from the dictionary of slang.
  • "global information tracker": you're in a good mood, and it actually works for you. Angels sing, and a light suddenly fills the room.
  • "goddamn idiotic truckload of sh*t": when it breaks
Description
No description provided
Readme 581 MiB
Languages
C 50.5%
Shell 38.7%
Perl 4.5%
Tcl 3.2%
Python 0.8%
Other 2.1%