From f421e029aef755c6db4f8662e2b6f01c95700959 Mon Sep 17 00:00:00 2001 From: Jeff King Date: Tue, 7 Jul 2020 17:41:51 -0400 Subject: [PATCH 1/6] t6000: use test_tick consistently The first two commits created in t6000 are done without test_tick, meaning they use the current system clock. After that, we create one with test_tick, which means it uses a deterministic time in the past. The result of the "symleft flag bit is propagated down from tag" test relies on the output order of commits from git-log, which in turn depends on these timestamps. So this test is technically dependent on the system clock time, though in practice it would only matter if your system clock was set before test_tick's default time (which is in 2005). However, let's use test_tick consistently for those early commits (and update the expected output to match). This makes the test deterministic, which is in turn easier to reason about and debug. Note that there's also a fourth commit here, and it does not use test_tick. It does have a deterministic timestamp because of the prior use of test_tick in the script, but it will always be the same time as the third commit. Let's use test_tick here, too, for consistency. The matching timestamps between the third and fourth commit are not an important part of the test. We could also use test_commit in all of these cases, as it runs test_tick under the hood. But it would be awkward to do so, as these tests diverge from the usual test_commit patterns (e.g., by creating multiple files in a single commit). Signed-off-by: Jeff King Signed-off-by: Junio C Hamano --- t/t6000-rev-list-misc.sh | 7 +++++-- 1 file changed, 5 insertions(+), 2 deletions(-) diff --git a/t/t6000-rev-list-misc.sh b/t/t6000-rev-list-misc.sh index 3dc1ad8f71..3bb0e4ff8f 100755 --- a/t/t6000-rev-list-misc.sh +++ b/t/t6000-rev-list-misc.sh @@ -8,6 +8,7 @@ test_expect_success setup ' echo content1 >wanted_file && echo content2 >unwanted_file && git add wanted_file unwanted_file && + test_tick && git commit -m one ' @@ -21,6 +22,7 @@ test_expect_success 'rev-list --objects with pathspecs and deeper paths' ' mkdir foo && >foo/file && git add foo/file && + test_tick && git commit -m two && git rev-list --objects HEAD -- foo >output && @@ -69,6 +71,7 @@ test_expect_success '--no-object-names and --object-names are last-one-wins' ' ' test_expect_success 'rev-list A..B and rev-list ^A B are the same' ' + test_tick && git commit --allow-empty -m another && git tag -a -m "annotated" v1.0 && git rev-list --objects ^v1.0^ v1.0 >expect && @@ -84,10 +87,10 @@ test_expect_success 'propagate uninteresting flag down correctly' ' test_expect_success 'symleft flag bit is propagated down from tag' ' git log --format="%m %s" --left-right v1.0...master >actual && cat >expect <<-\EOF && - > two - > one < another < that + > two + > one EOF test_cmp expect actual ' From fccf41e35aed528b7257a4e966437ccd92672bb1 Mon Sep 17 00:00:00 2001 From: Jeff King Date: Thu, 9 Jul 2020 16:35:50 -0400 Subject: [PATCH 2/6] t9700: loosen ident timezone regex A few of the perl tests in t9700 ask for the author and committer ident, and then make sure we get something sensible. For the timestamp portion, we just match [0-9]+, because the actual value will depend on when the test is run. However, we do require that the timezone be "+0000". This works reliably because we set $TZ in test-lib.sh. But in preparation for changing the default timezone, let's be a bit more flexible. We don't actually care about the exact value here, just that we were able to get a sensible output from the perl module's access methods. Signed-off-by: Jeff King Signed-off-by: Junio C Hamano --- t/t9700/test.pl | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/t/t9700/test.pl b/t/t9700/test.pl index 34cd01366f..c59a015f89 100755 --- a/t/t9700/test.pl +++ b/t/t9700/test.pl @@ -59,15 +59,15 @@ ok($@, "config_bool: non-boolean values fail"); open STDERR, ">&", $tmpstderr or die "cannot restore STDERR"; # ident -like($r->ident("aUthor"), qr/^A U Thor [0-9]+ \+0000$/, +like($r->ident("aUthor"), qr/^A U Thor [0-9]+ [+-]\d{4}$/, "ident scalar: author (type)"); -like($r->ident("cOmmitter"), qr/^C O Mitter [0-9]+ \+0000$/, +like($r->ident("cOmmitter"), qr/^C O Mitter [0-9]+ [+-]\d{4}$/, "ident scalar: committer (type)"); is($r->ident("invalid"), "invalid", "ident scalar: invalid ident string (no parsing)"); my ($name, $email, $time_tz) = $r->ident('author'); is_deeply([$name, $email], ["A U Thor", "author\@example.com"], "ident array: author"); -like($time_tz, qr/[0-9]+ \+0000/, "ident array: author"); +like($time_tz, qr/[0-9]+ [+-]\d{4}/, "ident array: author"); is_deeply([$r->ident("Name 123 +0000")], ["Name", "email", "123 +0000"], "ident array: ident string"); is_deeply([$r->ident("invalid")], [], "ident array: invalid ident string"); From f4eec0ba842a599649bb9a2abf06bf227c087d65 Mon Sep 17 00:00:00 2001 From: Jeff King Date: Thu, 9 Jul 2020 16:39:09 -0400 Subject: [PATCH 3/6] t5539: make timestamp requirements more explicit The test for "no shallow lines after receiving ACK ready" is very sensitive to the timestamps of the commits we create. It's looking for the fetch negotiation to send a "ready", which in turn depends on the order in which we traverse commits during the negotiation. It works reliably now because the base commit "7" is created without test_commit, and thus gets a commit time matching the current system clock. Whereas the new commits created in this test do use test_commit, and get the usual test_tick time from 2005. So the fetch into the "clone" repository results in a commit graph like this (I omitted some of the "unrelated" commits for clarity; they're all just a sequence of test_ticks): $ git log --graph --format='%ct %s %d' * 1112912953 new (origin/master, origin/HEAD) * 1594322236 7 (grafted, master) * 1112912893 unrelated15 (origin/unrelated15, unrelated15) [...] * 1112912053 unrelated1 (origin/unrelated1, unrelated1) * 1112911993 new-too (HEAD -> newnew, tag: new-too) The important things to see are: - "7" is way in the future compared to the other commits - "new-too" in the fetching repo is older than "new" (and its "unrelated" ancestors) in the shallow repo If we change our "setup shallow clone" step to use test_tick, too (and get rid of the dependency on the system clock), then the test will fail. The resulting graph looks like this: $ git log --graph --format='%ct %s %d' * 1112913373 new (origin/master, origin/HEAD) * 1112912353 7 (grafted, master) * 1112913313 unrelated15 (origin/unrelated15, unrelated15) [...] * 1112912473 unrelated1 (origin/unrelated1, unrelated1) * 1112912413 new-too (HEAD -> newnew, tag: new-too) Our "new-too" is still older than "new" and "unrelated", but now "7" is older than all of them (because it advanced test_tick, which the other tests built on top of). In the original, we advertised "7" as the first "have" before anything else, but now "new-too" is more recent. You'd see the same thing in the unlikely event that the system clock was set before our test_tick default in 2005. Let's make the timing requirements more explicit. The important thing is that the client advertise all of its shared commits first, before presenting its unique "new-too" commit. We can do that and get rid of the system clock dependency at the same time by creating all of the shared commits around time X (using test_tick), and then creating "new-too" with some time long before X. The resulting graph looks like this: $ git log --graph --format='%ct %s %d' * 1500001380 new (origin/master, origin/HEAD) * 1500000420 7 (grafted, master) * 1500001320 unrelated15 (origin/unrelated15, unrelated15) [...] * 1500000480 unrelated1 (origin/unrelated1, unrelated1) * 1400000060 new-too (HEAD -> newnew, tag: new-too) That also lets us get rid of the hacky test_tick added by f0e802ca20 (t5539: update a flaky test, 2014-07-14). That was clearly dancing around the same problem, but only addressed the relationship between commits created in the two subshells (which did use test_tick, but overlapped because increments of test_tick in subshells are lost). Now that we're using consistent and well-placed times for both lines of history, we don't have to care about a one-tick difference between the two sides. Signed-off-by: Jeff King Signed-off-by: Junio C Hamano --- t/t5539-fetch-http-shallow.sh | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/t/t5539-fetch-http-shallow.sh b/t/t5539-fetch-http-shallow.sh index c0d02dee89..82aa99ae87 100755 --- a/t/t5539-fetch-http-shallow.sh +++ b/t/t5539-fetch-http-shallow.sh @@ -9,10 +9,12 @@ start_httpd commit() { echo "$1" >tracked && git add tracked && + test_tick && git commit -m "$1" } test_expect_success 'setup shallow clone' ' + test_tick=1500000000 && commit 1 && commit 2 && commit 3 && @@ -48,7 +50,6 @@ EOF test_expect_success 'no shallow lines after receiving ACK ready' ' ( cd shallow && - test_tick && for i in $(test_seq 15) do git checkout --orphan unrelated$i && @@ -66,6 +67,7 @@ test_expect_success 'no shallow lines after receiving ACK ready' ' ( cd clone && git checkout --orphan newnew && + test_tick=1400000000 && test_commit new-too && # NEEDSWORK: If the overspecification of the expected result is reduced, we # might be able to run this test in all protocol versions. From 96ac26fd056b175a9ac36cccc07da2906a15cc5d Mon Sep 17 00:00:00 2001 From: Jeff King Date: Tue, 14 Jul 2020 08:31:42 -0400 Subject: [PATCH 4/6] t9100: explicitly unset GIT_COMMITTER_DATE The early part of t9100 creates an unusual "doubled" history in the "git-svn" ref. When we get to t9100.17, it looks like this: $ git log --oneline --graph git-svn [...] * efd0303 detect node change from file to directory #2 |\ * | 3e727c0 detect node change from file to directory #2 |/ * 3b00468 try a deep --rmdir with a commit |\ * | b4832d8 try a deep --rmdir with a commit |/ * f0d7bd5 import for git svn Each commit we make with "git commit" is paired with one from "git svn set-tree", with the latter as a merge of the first and its grandparent. Later, t9100.17 wants to check that "git svn fetch" gets the same trees. And it does, but just one copy of each. So it uses rev-list to get the tree of each commit and pipes it to "uniq" to drop the duplicates. Our input isn't sorted, but it will find adjacent duplicates. This works reliably because the order of commits from rev-list always shows the duplicates next to each other. For any one of those merges, we could choose to show its duplicate or the grandparent first. But barring clocks running backwards, the duplicate will always have a time equal to or greater than the grandparent. Even if equal, we break ties by showing the first-parent first, so the duplicates remain adjacent. But this would break if the timestamps stopped moving in chronological order. Normally we would rely on test_tick for this, but we have _two_ sources of time here: - "git commit" creates one commit based on GIT_COMMITTER_DATE (which respects test_tick) - the "svn set-tree" one is based on subversion, which does not have an easy way to specify a timestamp So using test_tick actually breaks the test, because now the duplicates are far in the past, and we'll show the grandparent before the duplicate. And likewise, a proposed change to set GIT_COMMITTER_DATE in all scripts will break it. We _could_ fix this by sorting before removing duplicates, but presumably it's a useful part of the test to make sure the trees appear in the same order in both spots. Likewise, we could use something like: perl -ne 'print unless $seen{$_}++' to remove duplicates without impacting the order. But that doesn't work either, because there are actually multiple (non-duplicate) commits with the same trees (we change a file mode and then change it back). So we'd actually have to de-duplicate the combination of subject and tree. Which then further throws off t9100.18, which compares the tree hashes exactly; we'd have to strip the result back down. Since this test _isn't_ buggy, the simplest thing is to just work around the proposed change by documenting our expectation that git-created commits are correctly interleaved using the current time. Signed-off-by: Jeff King Signed-off-by: Junio C Hamano --- t/t9100-git-svn-basic.sh | 4 ++++ 1 file changed, 4 insertions(+) diff --git a/t/t9100-git-svn-basic.sh b/t/t9100-git-svn-basic.sh index 2c309a57d9..a89d338aa7 100755 --- a/t/t9100-git-svn-basic.sh +++ b/t/t9100-git-svn-basic.sh @@ -8,6 +8,10 @@ GIT_SVN_LC_ALL=${LC_ALL:-$LANG} . ./lib-git-svn.sh +# Make sure timestamps of commits created by Git interleave +# with those created by "svn set-tree". +unset GIT_COMMITTER_DATE + case "$GIT_SVN_LC_ALL" in *.UTF-8) test_set_prereq UTF8 From f2e3937d94c0859c8616861b1d6c87e4efe0790d Mon Sep 17 00:00:00 2001 From: Jeff King Date: Thu, 9 Jul 2020 16:42:04 -0400 Subject: [PATCH 5/6] test-lib: set deterministic default author/committer date We always set the name and email for committer and author idents to make the test suite more deterministic, but not timestamps. Many scripts use test_tick to get consistent and sensibly incrementing timestamps as they create commits. But other scripts don't particularly care about the timestamp, and are happy to use whatever the current system time is. This non-determinism can be annoying: - when debugging a test, comparing results between two runs can be difficult, because the commit ids change - this can sometimes cause tests to be racy. E.g., traversal order depends on timestamp order. Even in a well-ordered set of commands, because our timestamp granularity is one second, two commits might sometimes have the same timestamp and sometimes differ. Let's set a default timestamp for all scripts to use. Any that use test_tick already will be unaffected (because their first test_tick call will overwrite our default), but it will make things a bit more deterministic for those that don't. We should be able to choose any time we want here. I picked this one because: - it differs from the initial test_tick default, which may make it easier to distinguish when debugging tests. I picked "April 1st 13:14:15" in the hope that it might stand out. - it's slightly before the test_tick default. Some tests create some commits before the first call to test_tick, so using an older timestamps for those makes sense chronologically. Note that this isn't how things currently work (where system times are usually more recent than test_tick), but that also allows us to flush out a few hidden timestamp dependencies (like the one recently fixed in t5539). - we could likewise pick any timezone we want. Choosing +0000 would have required fixing up fewer tests, but we're more likely to turn up interesting cases by not matching $TZ exactly. And since test_tick already checks "-0700", let's try something in the "+" zone range for variety. It's possible that the non-deterministic times could help flush out bugs (e.g., if something broke when the clock flipped over to 2021, our test suite would let us know). But historically that hasn't been the case; all time-dependent outcomes we've seen turned out to be accidentally flaky tests (which we fixed by using test_tick). If we do want to cover handling the current time, we should dedicate one script to doing so, and have it unset GIT_COMMITTER_DATE explicitly. Signed-off-by: Jeff King Signed-off-by: Junio C Hamano --- t/test-lib.sh | 3 +++ 1 file changed, 3 insertions(+) diff --git a/t/test-lib.sh b/t/test-lib.sh index 0ea1e5a05e..6a76d8fbdc 100644 --- a/t/test-lib.sh +++ b/t/test-lib.sh @@ -408,15 +408,18 @@ TEST_AUTHOR_LOCALNAME=author TEST_AUTHOR_DOMAIN=example.com GIT_AUTHOR_EMAIL=${TEST_AUTHOR_LOCALNAME}@${TEST_AUTHOR_DOMAIN} GIT_AUTHOR_NAME='A U Thor' +GIT_AUTHOR_DATE='1112354055 +0200' TEST_COMMITTER_LOCALNAME=committer TEST_COMMITTER_DOMAIN=example.com GIT_COMMITTER_EMAIL=${TEST_COMMITTER_LOCALNAME}@${TEST_COMMITTER_DOMAIN} GIT_COMMITTER_NAME='C O Mitter' +GIT_COMMITTER_DATE='1112354055 +0200' GIT_MERGE_VERBOSITY=5 GIT_MERGE_AUTOEDIT=no export GIT_MERGE_VERBOSITY GIT_MERGE_AUTOEDIT export GIT_AUTHOR_EMAIL GIT_AUTHOR_NAME export GIT_COMMITTER_EMAIL GIT_COMMITTER_NAME +export GIT_COMMITTER_DATE GIT_AUTHOR_DATE export EDITOR # Tests using GIT_TRACE typically don't want : output From 180a4d76ac8f7c3e97be023e5943f719f3efb55b Mon Sep 17 00:00:00 2001 From: Jeff King Date: Wed, 15 Jul 2020 03:42:50 -0400 Subject: [PATCH 6/6] t9100: stop depending on commit timestamps An earlier "fix" to this script gave up updating it not to rely on the current time because we cannot control what timestamp subversion gives its commits. We however could solve the issue in a different way and still use deterministic timestamps on Git commits. One fix would be to sort the list of trees before removing duplicates, but that loses information: - we do care that the fetched history is in the same order - there's a tree which appears twice in the history, and we'd want to make sure that it's there both times So instead, let's de-duplicate using a hash (preserving the order), and drop only lines with identical trees and subjects (preserving the tree which appears twice, since it has different subjects each time). Signed-off-by: Jeff King Acked-by: Eric Wong Signed-off-by: Junio C Hamano --- t/t9100-git-svn-basic.sh | 9 +++------ 1 file changed, 3 insertions(+), 6 deletions(-) diff --git a/t/t9100-git-svn-basic.sh b/t/t9100-git-svn-basic.sh index a89d338aa7..8dd9645ce5 100755 --- a/t/t9100-git-svn-basic.sh +++ b/t/t9100-git-svn-basic.sh @@ -8,10 +8,6 @@ GIT_SVN_LC_ALL=${LC_ALL:-$LANG} . ./lib-git-svn.sh -# Make sure timestamps of commits created by Git interleave -# with those created by "svn set-tree". -unset GIT_COMMITTER_DATE - case "$GIT_SVN_LC_ALL" in *.UTF-8) test_set_prereq UTF8 @@ -204,8 +200,9 @@ GIT_SVN_ID=alt export GIT_SVN_ID test_expect_success "$name" \ 'git svn init "$svnrepo" && git svn fetch && - git rev-list --pretty=raw remotes/git-svn | grep ^tree | uniq > a && - git rev-list --pretty=raw remotes/alt | grep ^tree | uniq > b && + git log --format="tree %T %s" remotes/git-svn | + awk "!seen[\$0]++ { print \$1, \$2 }" >a && + git log --format="tree %T" alt >b && test_cmp a b' name='check imported tree checksums expected tree checksums'