Junio C Hamano [Fri, 30 Apr 2021 04:50:26 +0000 (13:50 +0900)]
Merge branch 'ds/sparse-index-protections'
Builds on top of the sparse-index infrastructure to mark operations
that are not ready to mark with the sparse index, causing them to
fall back on fully-populated index that they always have worked with.
* ds/sparse-index-protections: (47 commits)
name-hash: use expand_to_path()
sparse-index: expand_to_path()
name-hash: don't add directories to name_hash
revision: ensure full index
resolve-undo: ensure full index
read-cache: ensure full index
pathspec: ensure full index
merge-recursive: ensure full index
entry: ensure full index
dir: ensure full index
update-index: ensure full index
stash: ensure full index
rm: ensure full index
merge-index: ensure full index
ls-files: ensure full index
grep: ensure full index
fsck: ensure full index
difftool: ensure full index
commit: ensure full index
checkout: ensure full index
...
Junio C Hamano [Fri, 30 Apr 2021 04:50:25 +0000 (13:50 +0900)]
Merge branch 'ds/maintenance-prefetch-fix'
The prefetch task in "git maintenance" assumed that "git fetch"
from any remote would fetch all its local branches, which would
fetch too much if the user is interested in only a subset of
branches there.
* ds/maintenance-prefetch-fix:
maintenance: respect remote.*.skipFetchAll
maintenance: use 'git fetch --prefetch'
fetch: add --prefetch option
maintenance: simplify prefetch logic
Junio C Hamano [Fri, 30 Apr 2021 04:50:24 +0000 (13:50 +0900)]
Merge branch 'ow/push-quiet-set-upstream'
"git push --quiet --set-upstream" was not quiet when setting the
upstream branch configuration, which has been corrected.
* ow/push-quiet-set-upstream:
transport: respect verbosity when setting upstream
Junio C Hamano [Fri, 30 Apr 2021 04:50:24 +0000 (13:50 +0900)]
Merge branch 'mt/pkt-write-errors'
When packet_write() fails, we gave an extra error message
unnecessarily, which has been corrected.
* mt/pkt-write-errors:
pkt-line: do not report packet write errors twice
Junio C Hamano [Fri, 30 Apr 2021 04:50:24 +0000 (13:50 +0900)]
Merge branch 'jk/promisor-optim'
Handling of "promisor packs" that allows certain objects to be
missing and lazily retrievable has been optimized (a bit).
* jk/promisor-optim:
revision: avoid parsing with --exclude-promisor-objects
lookup_unknown_object(): take a repository argument
is_promisor_object(): free tree buffer after parsing
Junio C Hamano [Wed, 21 Apr 2021 00:21:10 +0000 (17:21 -0700)]
The twelfth batch
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Junio C Hamano [Wed, 21 Apr 2021 00:23:37 +0000 (17:23 -0700)]
Merge branch 'js/access-nul-emulation-on-windows'
Portability fix.
* js/access-nul-emulation-on-windows:
msvc: avoid calling `access("NUL", flags)`
Junio C Hamano [Wed, 21 Apr 2021 00:23:37 +0000 (17:23 -0700)]
Merge branch 'sg/bugreport-fixes'
The dependencies for config-list.h and command-list.h were broken
when the former was split out of the latter, which has been
corrected.
* sg/bugreport-fixes:
Makefile: add missing dependencies of 'config-list.h'
Junio C Hamano [Wed, 21 Apr 2021 00:23:36 +0000 (17:23 -0700)]
Merge branch 'jc/doc-do-not-capitalize-clarification'
Doc update for developers.
* jc/doc-do-not-capitalize-clarification:
doc: clarify "do not capitalize the first word" rule
Junio C Hamano [Wed, 21 Apr 2021 00:23:36 +0000 (17:23 -0700)]
Merge branch 'ab/usage-error-docs'
Documentation updates, with unrelated comment updates, too.
* ab/usage-error-docs:
api docs: document that BUG() emits a trace2 error event
api docs: document BUG() in api-error-handling.txt
usage.c: don't copy/paste the same comment three times
Junio C Hamano [Wed, 21 Apr 2021 00:23:36 +0000 (17:23 -0700)]
Merge branch 'ab/detox-gettext-tests'
Test clean-up.
* ab/detox-gettext-tests:
tests: remove all uses of test_i18cmp
Junio C Hamano [Wed, 21 Apr 2021 00:23:36 +0000 (17:23 -0700)]
Merge branch 'jt/fetch-pack-request-fix'
* jt/fetch-pack-request-fix:
fetch-pack: buffer object-format with other args
Junio C Hamano [Wed, 21 Apr 2021 00:23:35 +0000 (17:23 -0700)]
Merge branch 'hn/reftable-tables-doc-update'
Doc updte.
* hn/reftable-tables-doc-update:
reftable: document an alternate cleanup method on Windows
Junio C Hamano [Wed, 21 Apr 2021 00:23:35 +0000 (17:23 -0700)]
Merge branch 'jk/pack-objects-bitmap-progress-fix'
When "git pack-objects" makes a literal copy of a part of existing
packfile using the reachability bitmaps, its update to the progress
meter was broken.
* jk/pack-objects-bitmap-progress-fix:
pack-objects: update "nr_seen" progress based on pack-reused count
Junio C Hamano [Wed, 21 Apr 2021 00:23:34 +0000 (17:23 -0700)]
Merge branch 'ab/userdiff-tests'
A bit of code clean-up and a lot of test clean-up around userdiff
area.
* ab/userdiff-tests:
blame tests: simplify userdiff driver test
blame tests: don't rely on t/t4018/ directory
userdiff: remove support for "broken" tests
userdiff tests: list builtin drivers via test-tool
userdiff tests: explicitly test "default" pattern
userdiff: add and use for_each_userdiff_driver()
userdiff style: normalize pascal regex declaration
userdiff style: declare patterns with consistent style
userdiff style: re-order drivers in alphabetical order
Junio C Hamano [Wed, 21 Apr 2021 00:23:34 +0000 (17:23 -0700)]
Merge branch 'ar/userdiff-scheme'
Userdiff patterns for "Scheme" has been added.
* ar/userdiff-scheme:
userdiff: add support for Scheme
Junio C Hamano [Fri, 16 Apr 2021 20:53:00 +0000 (13:53 -0700)]
The eleventh (aka "ort") batch
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Junio C Hamano [Fri, 16 Apr 2021 20:53:34 +0000 (13:53 -0700)]
Merge branch 'ah/merge-ort-ubsan-fix'
Code clean-up for merge-ort backend.
* ah/merge-ort-ubsan-fix:
merge-ort: only do pointer arithmetic for non-empty lists
Junio C Hamano [Fri, 16 Apr 2021 20:53:34 +0000 (13:53 -0700)]
Merge branch 'en/ort-readiness'
Plug the ort merge backend throughout the rest of the system, and
start testing it as a replacement for the recursive backend.
* en/ort-readiness:
Add testing with merge-ort merge strategy
t6423: mark remaining expected failure under merge-ort as such
Revert "merge-ort: ignore the directory rename split conflict for now"
merge-recursive: add a bunch of FIXME comments documenting known bugs
merge-ort: write $GIT_DIR/AUTO_MERGE whenever we hit a conflict
t: mark several submodule merging tests as fixed under merge-ort
merge-ort: implement CE_SKIP_WORKTREE handling with conflicted entries
t6428: new test for SKIP_WORKTREE handling and conflicts
merge-ort: support subtree shifting
merge-ort: let renormalization change modify/delete into clean delete
merge-ort: have ll_merge() use a special attr_index for renormalization
merge-ort: add a special minimal index just for renormalization
merge-ort: use STABLE_QSORT instead of QSORT where required
Junio C Hamano [Fri, 16 Apr 2021 20:53:33 +0000 (13:53 -0700)]
Merge branch 'en/ort-perf-batch-10'
Various rename detection optimization to help "ort" merge strategy
backend.
* en/ort-perf-batch-10:
diffcore-rename: determine which relevant_sources are no longer relevant
merge-ort: record the reason that we want a rename for a file
diffcore-rename: add computation of number of unknown renames
diffcore-rename: check if we have enough renames for directories early on
diffcore-rename: only compute dir_rename_count for relevant directories
merge-ort: record the reason that we want a rename for a directory
merge-ort, diffcore-rename: tweak dirs_removed and relevant_source type
diffcore-rename: take advantage of "majority rules" to skip more renames
Derrick Stolee [Fri, 16 Apr 2021 12:49:59 +0000 (12:49 +0000)]
maintenance: respect remote.*.skipFetchAll
If a remote has the skipFetchAll setting enabled, then that remote is
not intended for frequent fetching. It makes sense to not fetch that
data during the 'prefetch' maintenance task. Skip that remote in the
iteration without error. The skip_default_update member is initialized
in remote.c:handle_config() as part of initializing the 'struct remote'.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Derrick Stolee [Fri, 16 Apr 2021 12:49:58 +0000 (12:49 +0000)]
maintenance: use 'git fetch --prefetch'
The 'prefetch' maintenance task previously forced the following refspec
for each remote:
+refs/heads/*:refs/prefetch/<remote>/*
If a user has specified a more strict refspec for the remote, then this
prefetch task downloads more objects than necessary.
The previous change introduced the '--prefetch' option to 'git fetch'
which manipulates the remote's refspec to place all resulting refs into
refs/prefetch/, with further partitioning based on the destinations of
those refspecs.
Update the documentation to be more generic about the destination refs.
Do not mention custom refspecs explicitly, as that does not need to be
highlighted in this documentation. The important part of placing refs in
refs/prefetch/ remains.
Reported-by: Tom Saeger <tom.saeger@oracle.com>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Derrick Stolee [Fri, 16 Apr 2021 12:49:57 +0000 (12:49 +0000)]
fetch: add --prefetch option
The --prefetch option will be used by the 'prefetch' maintenance task
instead of sending refspecs explicitly across the command-line. The
intention is to modify the refspec to place all results in
refs/prefetch/ instead of anywhere else.
Create helper method filter_prefetch_refspec() to modify a given refspec
to fit the rules expected of the prefetch task:
* Negative refspecs are preserved.
* Refspecs without a destination are removed.
* Refspecs whose source starts with "refs/tags/" are removed.
* Other refspecs are placed within "refs/prefetch/".
Finally, we add the 'force' option to ensure that prefetch refs are
replaced as necessary.
There are some interesting cases that are worth testing.
An earlier version of this change dropped the "i--" from the loop that
deletes a refspec item and shifts the remaining entries down. This
allowed some refspecs to not be modified. The subtle part about the
first --prefetch test is that the "refs/tags/*" refspec appears directly
before the "refs/heads/bogus/*" refspec. Without that "i--", this
ordering would remove the "refs/tags/*" refspec and leave the last one
unmodified, placing the result in "refs/heads/*".
It is possible to have an empty refspec. This is typically the case for
remotes other than the origin, where users want to fetch a specific tag
or branch. To correctly test this case, we need to further remove the
upstream remote for the local branch. Thus, we are testing a refspec
that will be deleted, leaving nothing to fetch.
Helped-by: Tom Saeger <tom.saeger@oracle.com>
Helped-by: Ramsay Jones <ramsay@ramsayjones.plus.com>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Johannes Schindelin [Fri, 16 Apr 2021 11:21:01 +0000 (13:21 +0200)]
msvc: avoid calling `access("NUL", flags)`
Apparently this is not supported with Microsoft's Universal C Runtime.
So let's not actually do that.
Instead, just return success because we _know_ that we expect the `NUL`
device to be present.
Side note: it is possible to turn off the "Null device driver" and
thereby disable `NUL`. Too many things are broken if this driver is
disabled, therefore it is not worth bothering to try to detect its
presence when `access()` is called.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Matheus Tavares [Thu, 15 Apr 2021 21:57:52 +0000 (18:57 -0300)]
pkt-line: do not report packet write errors twice
On write() errors, packet_write() dies with the same error message that
is already printed by its callee, packet_write_gently(). This produces
an unnecessarily verbose and repetitive output:
error: packet write failed
fatal: packet write failed: <strerror() message>
In addition to that, packet_write_gently() does not always fulfill its
caller expectation that errno will be properly set before a non-zero
return. In particular, that is not the case for a "data exceeds max
packet size" error. So, in this case, packet_write() will call
die_errno() and print an strerror(errno) message that might be totally
unrelated to the actual error.
Fix both those issues by turning packet_write() and
packet_write_gently() into wrappers to a common lower level function
that doesn't print the error message, but instead returns it on a buffer
for the caller to die() or error() as appropriate.
Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Junio C Hamano [Thu, 15 Apr 2021 20:35:41 +0000 (13:35 -0700)]
The tenth batch
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Junio C Hamano [Thu, 15 Apr 2021 20:36:01 +0000 (13:36 -0700)]
Merge branch 'jz/apply-3way-cached'
"git apply" now takes "--3way" and "--cached" at the same time, and
work and record results only in the index.
* jz/apply-3way-cached:
git-apply: allow simultaneous --cached and --3way options
Junio C Hamano [Thu, 15 Apr 2021 20:36:01 +0000 (13:36 -0700)]
Merge branch 'ab/complete-cherry-pick-head'
The command line completion (in contrib/) has learned that
CHERRY_PICK_HEAD is a possible pseudo-ref.
* ab/complete-cherry-pick-head:
bash completion: complete CHERRY_PICK_HEAD
Junio C Hamano [Thu, 15 Apr 2021 20:36:00 +0000 (13:36 -0700)]
Merge branch 'jz/apply-run-3way-first'
"git apply --3way" has always been "to fall back to 3-way merge
only when straight application fails". Swap the order of falling
back so that 3-way is always attempted first (only when the option
is given, of course) and then straight patch application is used as
a fallback when it fails.
* jz/apply-run-3way-first:
git-apply: try threeway first when "--3way" is used
Øystein Walle [Thu, 15 Apr 2021 12:33:53 +0000 (14:33 +0200)]
transport: respect verbosity when setting upstream
A command such as `git push -qu origin feature` will print "Branch
'feature' set up to track remote branch 'feature' from 'origin'." even
when --quiet is passed. In this case it's because install_branch_config() is
always called with BRANCH_CONFIG_VERBOSE.
struct transport keeps track of the desired verbosity. Fix the above
issue by passing BRANCH_CONFIG_VERBOSE conditionally based on that.
Signed-off-by: Øystein Walle <oystwa@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Junio C Hamano [Wed, 14 Apr 2021 23:51:17 +0000 (16:51 -0700)]
doc: clarify "do not capitalize the first word" rule
The same "do not capitalize the first word" rule is applied to both
our patch titles and error messages, but the existing description
was fuzzy in two aspects.
* For error messages, it was not said that this was only about the
first word that begins the sentence.
* For both, it was not clear when a capital letter there was not an
error. We avoid capitalizing the first word when the only reason
you would capitalize it is because it happens to be the first
word in the sentence. If a proper noun, which is usually spelled
in capital letters, happens to come at the beginning of the
sentence, it should be kept in capital letters.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Derrick Stolee [Mon, 12 Apr 2021 21:08:17 +0000 (21:08 +0000)]
name-hash: use expand_to_path()
A sparse-index loads the name-hash data for its entries, including the
sparse-directory entries. If a caller asks for a path that is contained
within a sparse-directory entry, we need to expand to a full index and
recalculate the name hash table before returning the result. Insert
calls to expand_to_path() to protect against this case.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Derrick Stolee [Mon, 12 Apr 2021 21:08:16 +0000 (21:08 +0000)]
sparse-index: expand_to_path()
Some users of the index API have a specific path they are looking for,
but choose to use index_file_exists() to rely on the name-hash hashtable
instead of doing binary search with index_name_pos(). These users only
need to know a yes/no answer, not a position within the cache array.
When the index is sparse, the name-hash hash table does not contain the
full list of paths within sparse directories. It _does_ contain the
directory names for the sparse-directory entries.
Create a helper function, expand_to_path(), for intended use with the
name-hash hashtable functions. The integration with name-hash.c will
follow in a later change.
The solution here is to use ensure_full_index() when we determine that
the requested path is within a sparse directory entry. This will
populate the name-hash hashtable as the index is recomputed from
scratch.
There may be cases where the caller is trying to find an untracked path
that is not in the index but also is not within a sparse directory
entry. We want to minimize the overhead for these requests. If we used
index_name_pos() to find the insertion order of the path, then we could
determine from that position if a sparse-directory exists. (In fact,
just calling index_name_pos() in that case would lead to expanding the
index to a full index.) However, this takes O(log N) time where N is the
number of cache entries.
To keep the performance of this call based mostly on the input string,
use index_file_exists() to look for the ancestors of the path. Using the
heuristic that a sparse directory is likely to have a small number of
parent directories, we start from the bottom and build up. Use a string
buffer to allow mutating the path name to terminate after each slash for
each hashset test.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Derrick Stolee [Mon, 12 Apr 2021 21:08:15 +0000 (21:08 +0000)]
name-hash: don't add directories to name_hash
Sparse directory entries represent a directory that is outside the
sparse-checkout definition. These are not paths to blobs, so should not
be added to the name_hash table. Instead, they should be added to the
directory hashtable when 'ignore_case' is true.
Add a condition to avoid placing sparse directories into the name_hash
hashtable. This avoids filling the table with extra entries that will
never be queried.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Derrick Stolee [Thu, 1 Apr 2021 01:50:00 +0000 (01:50 +0000)]
revision: ensure full index
Before iterating over all index entries, ensure that a sparse index is
expanded to a full index to avoid unexpected behavior. This case could
be integrated later by ensuring that we walk the tree in the
sparse-directory entry, but the current behavior is only expecting
blobs. Save this integration for later when it can be properly tested.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Derrick Stolee [Thu, 1 Apr 2021 01:49:59 +0000 (01:49 +0000)]
resolve-undo: ensure full index
Before iterating over all cache entries, ensure that a sparse index is
expanded to a full index to avoid unexpected behavior.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Derrick Stolee [Thu, 1 Apr 2021 01:49:58 +0000 (01:49 +0000)]
read-cache: ensure full index
Before iterating over all cache entries, ensure that a sparse index is
expanded to a full index to avoid unexpected behavior.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Derrick Stolee [Thu, 1 Apr 2021 01:49:57 +0000 (01:49 +0000)]
pathspec: ensure full index
Before iterating over all cache entries, ensure that a sparse index is
expanded to a full index to avoid unexpected behavior.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Derrick Stolee [Thu, 1 Apr 2021 01:49:56 +0000 (01:49 +0000)]
merge-recursive: ensure full index
Before iterating over all cache entries, ensure that a sparse index is
expanded to a full index to avoid unexpected behavior.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Derrick Stolee [Thu, 1 Apr 2021 01:49:55 +0000 (01:49 +0000)]
entry: ensure full index
Before iterating over all cache entries, ensure that a sparse index is
expanded to a full index to avoid unexpected behavior.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Derrick Stolee [Thu, 1 Apr 2021 01:49:54 +0000 (01:49 +0000)]
dir: ensure full index
Before iterating over all cache entries, ensure that a sparse index is
expanded to a full index to avoid unexpected behavior.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Derrick Stolee [Thu, 1 Apr 2021 01:49:53 +0000 (01:49 +0000)]
update-index: ensure full index
Before iterating over all cache entries, ensure that a sparse index is
expanded to a full index to avoid unexpected behavior.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Derrick Stolee [Thu, 1 Apr 2021 01:49:52 +0000 (01:49 +0000)]
stash: ensure full index
Before iterating over all cache entries, ensure that a sparse index is
expanded to a full index to avoid unexpected behavior.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Derrick Stolee [Thu, 1 Apr 2021 01:49:51 +0000 (01:49 +0000)]
rm: ensure full index
Before iterating over all cache entries, ensure that a sparse index is
expanded to a full index to avoid unexpected behavior.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Derrick Stolee [Thu, 1 Apr 2021 01:49:50 +0000 (01:49 +0000)]
merge-index: ensure full index
Before iterating over all cache entries, ensure that a sparse index is
expanded to a full one to avoid unexpected behavior.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Derrick Stolee [Thu, 1 Apr 2021 01:49:49 +0000 (01:49 +0000)]
ls-files: ensure full index
Before iterating over all cache entries, ensure that a sparse index is
expanded to a full one to avoid missing files.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Derrick Stolee [Thu, 1 Apr 2021 01:49:48 +0000 (01:49 +0000)]
grep: ensure full index
Before iterating over all cache entries, ensure that a sparse index is
expanded to a full one so we do not miss blobs to scan. Later, this can
integrate more carefully with sparse indexes with proper testing.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Derrick Stolee [Thu, 1 Apr 2021 01:49:47 +0000 (01:49 +0000)]
fsck: ensure full index
When verifying all blobs reachable from the index, ensure that a sparse
index has been expanded to a full one to avoid missing some blobs.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Derrick Stolee [Thu, 1 Apr 2021 01:49:46 +0000 (01:49 +0000)]
difftool: ensure full index
Before iterating over all cache entries, ensure that a sparse index has
been expanded to a full one to avoid unexpected behavior.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Derrick Stolee [Thu, 1 Apr 2021 01:49:45 +0000 (01:49 +0000)]
commit: ensure full index
These two loops iterate over all cache entries, so ensure that a sparse
index is expanded to a full index before we do so.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Derrick Stolee [Thu, 1 Apr 2021 01:49:44 +0000 (01:49 +0000)]
checkout: ensure full index
Before iterating over all cache entries in the checkout builtin, ensure
that we have a full index to avoid any unexpected behavior.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Derrick Stolee [Thu, 1 Apr 2021 01:49:43 +0000 (01:49 +0000)]
checkout-index: ensure full index
Before we iterate over all cache entries, ensure that the index is not
sparse. This loop in checkout_all() might be safe to iterate over a
sparse index, but let's put this protection here until it can be
carefully tested.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Derrick Stolee [Thu, 1 Apr 2021 01:49:42 +0000 (01:49 +0000)]
add: ensure full index
Before iterating over all cache entries, ensure that a sparse index is
expanded to a full index to avoid unexpected behavior.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Derrick Stolee [Thu, 1 Apr 2021 01:49:41 +0000 (01:49 +0000)]
cache: move ensure_full_index() to cache.h
Soon we will insert ensure_full_index() calls across the codebase.
Instead of also adding include statements for sparse-index.h, let's just
use the fact that anything that cares about the index already has
cache.h in its includes.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Derrick Stolee [Thu, 1 Apr 2021 01:49:40 +0000 (01:49 +0000)]
read-cache: expand on query into sparse-directory entry
Callers to index_name_pos() or index_name_stage_pos() have a specific
path in mind. If that happens to be a path with an ancestor being a
sparse-directory entry, it can lead to unexpected results.
In the case that we did not find the requested path, check to see if the
position _before_ the inserted position is a sparse directory entry that
matches the initial segment of the input path (including the directory
separator at the end of the directory name). If so, then expand the
index to be a full index and search again. This expansion will only
happen once per index read.
Future enhancements could be more careful to expand only the necessary
sparse directory entry, but then we would have a special "not fully
sparse, but also not fully expanded" mode that could affect writing the
index to file. Since this only occurs if a specific file is requested
outside of the sparse checkout definition, this is unlikely to be a
common situation.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Derrick Stolee [Thu, 1 Apr 2021 01:49:39 +0000 (01:49 +0000)]
*: remove 'const' qualifier for struct index_state
Several methods specify that they take a 'struct index_state' pointer
with the 'const' qualifier because they intend to only query the data,
not change it. However, we will be introducing a step very low in the
method stack that might modify a sparse-index to become a full index in
the case that our queries venture inside a sparse-directory entry.
This change only removes the 'const' qualifiers that are necessary for
the following change which will actually modify the implementation of
index_name_stage_pos().
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Derrick Stolee [Thu, 1 Apr 2021 01:49:38 +0000 (01:49 +0000)]
sparse-index: API protection strategy
Edit and expand the sparse-index design document with the plan for
guarding index operations with ensure_full_index().
Notably, the plan has changed to not have an expand_to_path() method in
favor of checking for a sparse-directory hit inside of the
index_path_pos() API.
The changes that follow this one will incrementally add
ensure_full_index() guards to iterations over all cache entries. Some
iterations over the cache entries are not protected due to a few
categories listed in the document. Since these are not being modified,
here is a short list of the files and methods that will not receive
these guards:
Looking for non-zero stage:
* builtin/add.c:chmod_pathspec()
* builtin/merge.c:count_unmerged_entries()
* merge-ort.c:record_conflicted_index_entries()
* read-cache.c:unmerged_index()
* rerere.c:check_one_conflict(), find_conflict(), rerere_remaining()
* revision.c:prepare_show_merge()
* sequencer.c:append_conflicts_hint()
* wt-status.c:wt_status_collect_changes_initial()
Looking for submodules:
* builtin/submodule--helper.c:module_list_compute()
* submodule.c: several methods
* worktree.c:validate_no_submodules()
Part of the index API:
* name-hash.c: lazy init methods
* preload-index.c:preload_thread(), preload_index()
* read-cache.c: file format methods
Checking for correct order of cache entries:
* read-cache.c:check_ce_order()
Ignores SKIP_WORKTREE entries or already aware:
* unpack-trees.c:mark_new_skip_worktree()
* wt-status.c:wt_status_check_sparse_checkout()
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Junio C Hamano [Tue, 13 Apr 2021 22:27:31 +0000 (15:27 -0700)]
The ninth batch
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Junio C Hamano [Tue, 13 Apr 2021 22:28:53 +0000 (15:28 -0700)]
Merge branch 'vs/completion-with-set-u'
The command-line completion script (in contrib/) had a couple of
references that would have given a warning under the "-u" (nounset)
option.
* vs/completion-with-set-u:
completion: audit and guard $GIT_* against unset use
Junio C Hamano [Tue, 13 Apr 2021 22:28:52 +0000 (15:28 -0700)]
Merge branch 'ab/detox-config-gettext'
The last remnant of gettext-poison has been removed.
* ab/detox-config-gettext:
config.c: remove last remnant of GIT_TEST_GETTEXT_POISON
Junio C Hamano [Tue, 13 Apr 2021 22:28:52 +0000 (15:28 -0700)]
Merge branch 'gk/gitweb-redacted-email'
"gitweb" learned "e-mail privacy" feature to redact strings that
look like e-mail addresses on various pages.
* gk/gitweb-redacted-email:
gitweb: add "e-mail privacy" feature to redact e-mail addresses
Junio C Hamano [Tue, 13 Apr 2021 22:28:52 +0000 (15:28 -0700)]
Merge branch 'cc/test-helper-bloom-usage-fix'
Usage message fix for a test helper.
* cc/test-helper-bloom-usage-fix:
test-bloom: fix missing 'bloom' from usage string
Junio C Hamano [Tue, 13 Apr 2021 22:28:51 +0000 (15:28 -0700)]
Merge branch 'ab/send-email-validate-errors'
Clean-up codepaths that implements "git send-email --validate"
option and improves the message from it.
* ab/send-email-validate-errors:
git-send-email: improve --validate error output
git-send-email: refactor duplicate $? checks into a function
git-send-email: test full --validate output
Junio C Hamano [Tue, 13 Apr 2021 22:28:51 +0000 (15:28 -0700)]
Merge branch 'tb/precompose-prefix-simplify'
Streamline the codepath to fix the UTF-8 encoding issues in the
argv[] and the prefix on macOS.
* tb/precompose-prefix-simplify:
macOS: precompose startup_info->prefix
precompose_utf8: make precompose_string_if_needed() public
Junio C Hamano [Tue, 13 Apr 2021 22:28:51 +0000 (15:28 -0700)]
Merge branch 'fm/user-manual-use-preface'
Doc update to improve git.info
* fm/user-manual-use-preface:
user-manual.txt: assign preface an id and a title
Junio C Hamano [Tue, 13 Apr 2021 22:28:50 +0000 (15:28 -0700)]
Merge branch 'ab/perl-do-not-abuse-map'
Perl critique.
* ab/perl-do-not-abuse-map:
git-send-email: replace "map" in void context with "for"
Junio C Hamano [Tue, 13 Apr 2021 22:28:50 +0000 (15:28 -0700)]
Merge branch 'tb/pack-preferred-tips-to-give-bitmap'
A configuration variable has been added to force tips of certain
refs to be given a reachability bitmap.
* tb/pack-preferred-tips-to-give-bitmap:
builtin/pack-objects.c: respect 'pack.preferBitmapTips'
t/helper/test-bitmap.c: initial commit
pack-bitmap: add 'test_bitmap_commits()' helper
Junio C Hamano [Tue, 13 Apr 2021 22:28:50 +0000 (15:28 -0700)]
Merge branch 'jk/ref-filter-segfault-fix'
A NULL-dereference bug has been corrected in an error codepath in
"git for-each-ref", "git branch --list" etc.
* jk/ref-filter-segfault-fix:
ref-filter: fix NULL check for parse object failure
Ævar Arnfjörð Bjarmason [Tue, 13 Apr 2021 09:08:21 +0000 (11:08 +0200)]
api docs: document that BUG() emits a trace2 error event
Correct documentation added in
e544221d97a (trace2:
Documentation/technical/api-trace2.txt, 2019-02-22) to state that
calling BUG() also emits an "error" event. See
ee4512ed481 (trace2:
create new combined trace facility, 2019-02-22) for the initial
implementation.
The BUG() function did not emit an event then however, that was only
changed later in
0a9dde4a04c (usage: trace2 BUG() invocations,
2021-02-05), that commit changed the code, but didn't update any of
the docs.
Let's also add a cross-reference from api-error-handling.txt.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Ævar Arnfjörð Bjarmason [Tue, 13 Apr 2021 09:08:20 +0000 (11:08 +0200)]
api docs: document BUG() in api-error-handling.txt
When the BUG() function was added in
d8193743e08 (usage.c: add BUG()
function, 2017-05-12) these docs added in
1f23cfe0ef5 (doc: document
error handling functions and conventions, 2014-12-03) were not
updated. Let's do that.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Ævar Arnfjörð Bjarmason [Tue, 13 Apr 2021 09:08:19 +0000 (11:08 +0200)]
usage.c: don't copy/paste the same comment three times
In
ee4512ed481 (trace2: create new combined trace facility,
2019-02-22) we started with two copies of this comment,
0ee10fd1296 (usage: add trace2 entry upon warning(), 2020-11-23) added
a third. Let's instead add an earlier comment that applies to all
these mostly-the-same functions.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Ævar Arnfjörð Bjarmason [Tue, 13 Apr 2021 12:19:51 +0000 (14:19 +0200)]
tests: remove all uses of test_i18cmp
Finish the removal I started in
1108cea7f8e (tests: remove most uses
of test_i18ncmp, 2021-02-11). At that time the function wasn't removed
due to disruption with in-flight changes, remove the occurrences that
have landed since then.
As of writing this there are no test_i18ncmp uses between "master" and
"seen", so let's also remove the function to finally put it to rest.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Jeff King [Tue, 13 Apr 2021 07:17:48 +0000 (03:17 -0400)]
revision: avoid parsing with --exclude-promisor-objects
When --exclude-promisor-objects is given, before traversing any objects
we iterate over all of the objects in any promisor packs, marking them
as UNINTERESTING and SEEN. We turn the oid we get from iterating the
pack into an object with parse_object(), but this has two problems:
- it's slow; we are zlib inflating (and reconstructing from deltas)
every byte of every object in the packfile
- it leaves the tree buffers attached to their structs, which means
our heap usage will grow to store every uncompressed tree
simultaneously. This can be gigabytes.
We can obviously fix the second by freeing the tree buffers after we've
parsed them. But we can observe that the function doesn't look at the
object contents at all! The only reason we call parse_object() is that
we need a "struct object" on which to set the flags. There are two
options here:
- we can look up just the object type via oid_object_info(), and then
call the appropriate lookup_foo() function
- we can call lookup_unknown_object(), which gives us an OBJ_NONE
struct (which will get auto-converted later by object_as_type() via
calls to lookup_commit(), etc).
The first one is closer to the current code, but we do pay the price to
look up the type for each object. The latter should be more efficient in
CPU, though it wastes a little bit of memory (the "unknown" object
structs are a union of all object types, so some of the structs are
bigger than they need to be). It also runs the risk of triggering a
latent bug in code that calls lookup_object() directly but isn't ready
to handle OBJ_NONE (such code would already be buggy, but we use
lookup_unknown_object() infrequently enough that it might be hiding).
I went with the second option here. I don't think the risk is high (and
we'd want to find and fix any such bugs anyway), and it should be more
efficient overall.
The new tests in p5600 show off the improvement (this is on git.git):
Test HEAD^ HEAD
-------------------------------------------------------------------------------
5600.5: count commits 0.37(0.37+0.00) 0.38(0.38+0.00) +2.7%
5600.6: count non-promisor commits 11.74(11.37+0.37) 0.04(0.03+0.00) -99.7%
The improvement is particularly big in this script because _every_
object in the newly-cloned partial repo is a promisor object. So after
marking them all, there's nothing left to traverse.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Jeff King [Tue, 13 Apr 2021 07:16:36 +0000 (03:16 -0400)]
lookup_unknown_object(): take a repository argument
All of the other lookup_foo() functions take a repository argument, but
lookup_unknown_object() was never converted, and it uses the_repository
internally. Let's fix that.
We could leave a wrapper that uses the_repository, but there aren't that
many calls, so we'll just convert them all. I looked briefly at each
site to see if we had a repository struct (besides the_repository) we
could pass, but none of them do (so this conversion to pass
the_repository is a pure noop in each case, though it does take us one
step closer to eventually getting rid of the_repository).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Jeff King [Tue, 13 Apr 2021 07:15:54 +0000 (03:15 -0400)]
is_promisor_object(): free tree buffer after parsing
To get the list of all promisor objects, we not only include all objects
in promisor packs, but also parse each of those objects to see which
objects they reference. After parsing a tree object, the tree->buffer
field will remain populated until we explicitly free it. So in a partial
clone of blob:none, for example, we are essentially reading every tree
in the repository (since they're all in the initial promisor pack), and
keeping all of their uncompressed contents in memory at once.
This patch frees the tree buffers after we've finished marking all of
their reachable objects. We shouldn't need to do this for any other
object type. While we are using some extra memory to store the structs,
no other object type stores the whole contents in its parsed form (we do
sometimes hold on to commit buffers, but less so these days due to
commit graphs, plus most commands which care about promisor objects turn
off the save_commit_buffer global).
Even for a moderate-sized repository like git.git, this patch drops the
peak heap (as measured by massif) for git-fsck from ~1.7GB to ~138MB.
Fsck is a good candidate for measuring here because it doesn't interact
with the promisor code except to call is_promisor_object(), so we can
isolate just this problem.
The added perf test shows only a tiny improvement on my machine for
git.git, since 1.7GB isn't enough to cause any real memory pressure:
Test HEAD^ HEAD
--------------------------------------------------------------------------------
5600.4: fsck 21.26(20.90+0.35) 20.84(20.79+0.04) -2.0%
With linux.git the absolute change is a bit bigger, though still a small
percentage:
Test HEAD^ HEAD
-----------------------------------------------------------------------------
5600.4: fsck 262.26(259.13+3.12) 254.92(254.62+0.29) -2.8%
I didn't have the patience to run it under massif with linux.git, but
it's probably on the order of about 14GB improvement, since that's the
sum of the sizes of all of the uncompressed trees (but still isn't
enough to create memory pressure on this particular machine, which has
64GB of RAM). Smaller machines would probably see a bigger effect on
runtime (and sadly our perf suite does not measure peak heap).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Han-Wen Nienhuys [Mon, 12 Apr 2021 19:12:36 +0000 (19:12 +0000)]
reftable: document an alternate cleanup method on Windows
The new method uses the update_index counter, which isn't susceptible to clock
inaccuracies.
Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Jeff King [Mon, 12 Apr 2021 03:41:18 +0000 (23:41 -0400)]
pack-objects: update "nr_seen" progress based on pack-reused count
When serving a clone or fetch with bitmaps, after deciding which objects
need to be sent our "pack reuse" mechanism kicks in: we try to send
more-or-less verbatim a bunch of objects from the beginning of the
bitmapped packfile without even adding them to the to_pack.objects
array.
After deciding which objects will be in the "reused" portion, we update
nr_result to account for those, and then trigger display_progress() to
show the user (who is undoubtedly dazzled that we managed to enumerate
so many objects so quickly).
But then something confusing happens: the "Enumerating objects" progress
meter jumps _backwards_, counting up from zero the number of objects we
actually add into to_pack.objects.
This worked correctly once upon a time, but was broken in
5af050437a
(pack-objects: show some progress when counting kept objects,
2018-04-15), when the latter half of that progress meter switched to
using a separate nr_seen counter, rather than nr_result. Nobody noticed
for two reasons:
- prior to the pack-reuse fixes from
a14aebeac3 (Merge branch
'jk/packfile-reuse-cleanup', 2020-02-14), the reuse code almost
never kicked in anyway
- the output looks _kind of_ correct. The "backwards" moment is hard
to catch, because we overwrite the old progress number with the new
one, and the larger number is displayed only for a second. So unless
you look at that exact second, you just see the much smaller value,
counting up to the number of non-reused objects (though of course if
you catch it in stderr, or look at GIT_TRACE_PACKET from a server
with bitmaps, you can see both values).
This smaller output isn't wrong per se, but isn't counting what we ever
intended to. We should give the user the whole number of objects we
considered (which, as per
5af050437a's original purpose, is already
_not_ a count of what goes into to_pack.objects). The follow-on
"Counting objects" meter shows the actual number of objects we feed into
that array.
We can easily fix this by bumping (and showing) nr_seen for the
pack-reused objects. When the included test is run without this patch,
the second pack-objects invocation produces "Enumerating objects: 1" to
show the one loose object, even though the resulting pack has hundreds
of objects in it. With it, we jump to "Enumerating objects: 674" after
deciding on reuse, and then "675" when we add in the loose object.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Andrzej Hunt [Sun, 11 Apr 2021 11:05:06 +0000 (11:05 +0000)]
merge-ort: only do pointer arithmetic for non-empty lists
versions could be an empty string_list. In that case, versions->items is
NULL, and we shouldn't be trying to perform pointer arithmetic with it (as
that results in undefined behaviour).
Moreover we only use the results of this calculation once when calling
QSORT. Therefore we choose to skip creating relevant_entries and call
QSORT directly with our manipulated pointers (but only if there's data
requiring sorting). This lets us avoid abusing the string_list API,
and saves us from having to explain why this abuse is OK.
Finally, an assertion is added to make sure that write_tree() is called
with a valid offset.
This issue has probably existed since:
ee4012dcf9 (merge-ort: step 2 of tree writing -- function to create tree object, 2020-12-13)
But it only started occurring during tests since tests started using
merge-ort:
f3b964a07e (Add testing with merge-ort merge strategy, 2021-03-20)
For reference - here's the original UBSAN commit that implemented this
check, it sounds like this behaviour isn't actually likely to cause any
issues (but we might as well fix it regardless):
https://reviews.llvm.org/D67122
UBSAN output from t3404 or t5601:
merge-ort.c:2669:43: runtime error: applying zero offset to null pointer
#0 0x78bb53 in write_tree merge-ort.c:2669:43
#1 0x7856c9 in process_entries merge-ort.c:3303:2
#2 0x782317 in merge_ort_nonrecursive_internal merge-ort.c:3744:2
#3 0x77feef in merge_incore_nonrecursive merge-ort.c:3853:2
#4 0x6f6a5c in do_recursive_merge sequencer.c:640:3
#5 0x6f6a5c in do_pick_commit sequencer.c:2221:9
#6 0x6ef055 in single_pick sequencer.c:4814:9
#7 0x6ef055 in sequencer_pick_revisions sequencer.c:4867:10
#8 0x4fb392 in run_sequencer revert.c:225:9
#9 0x4fa5b0 in cmd_revert revert.c:235:8
#10 0x42abd7 in run_builtin git.c:453:11
#11 0x429531 in handle_builtin git.c:704:3
#12 0x4282fb in run_argv git.c:771:4
#13 0x4282fb in cmd_main git.c:902:19
#14 0x524b63 in main common-main.c:52:11
#15 0x7fc2ca340349 in __libc_start_main (/lib64/libc.so.6+0x24349)
#16 0x4072b9 in _start start.S:120
SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior merge-ort.c:2669:43 in
Signed-off-by: Andrzej Hunt <ajrhunt@google.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Jonathan Tan [Fri, 9 Apr 2021 01:09:58 +0000 (18:09 -0700)]
fetch-pack: buffer object-format with other args
In send_fetch_request(), "object-format" is written directly to the file
descriptor, as opposed to the other arguments, which are buffered.
Buffer "object-format" as well. "object-format" must be buffered; in
particular, it must appear after "command=fetch" in the request.
This divergence was introduced in
4b831208bb ("fetch-pack: parse and
advertise the object-format capability", 2020-05-27), perhaps as an
oversight (the surrounding code at the point of this commit has already
been using a request buffer.)
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Georgios Kontaxis [Sun, 28 Mar 2021 23:26:03 +0000 (23:26 +0000)]
gitweb: add "e-mail privacy" feature to redact e-mail addresses
Gitweb extracts content from the Git log and makes it accessible
over HTTP. As a result, e-mail addresses found in commits are
exposed to web crawlers and they may not respect robots.txt.
This can result in unsolicited messages.
Introduce an 'email-privacy' feature which redacts e-mail addresses
from the generated HTML content. Specifically, obscure addresses
retrieved from the the author/committer and comment sections of the
Git log. The feature is off by default.
This feature does not prevent someone from downloading the
unredacted commit log, e.g., by cloning the repository, and
extracting information from it. It aims to hinder the low-
effort, bulk collection of e-mail addresses by web crawlers.
Signed-off-by: Georgios Kontaxis <geko1702+commits@99rst.org>
Acked-by: Eric Wong <e@80x24.org>
Acked-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
SZEDER Gábor [Thu, 8 Apr 2021 21:29:15 +0000 (23:29 +0200)]
Makefile: add missing dependencies of 'config-list.h'
We auto-generate the list of supported configuration variables from
'Documentation/config/*.txt', and that list used to be created by the
'generate-cmdlist.sh' helper script and stored in the 'command-list.h'
header. Commit
709df95b78 (help: move list_config_help to
builtin/help, 2020-04-16) extracted this into a dedicated
'generate-configlist.sh' script and 'config-list.h' header, and added
a new target in the 'Makefile' as well, but while doing so it forgot
to extract the dependencies of the latter. Consequently, since then
'config-list.h' is not re-generated when 'Documentation/config/*.txt'
is updated, while 'command-list.h' is re-generated unnecessarily:
$ touch Documentation/config/log.txt
$ make -j4
GEN command-list.h
CC help.o
AR libgit.a
Fix this and list all config-related documentation files as
dependencies of 'config-list.h' and remove them from the dependencies
of 'command-list.h'.
$ touch Documentation/config/log.txt
$ make
GEN config-list.h
CC builtin/help.o
LINK git
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Atharva Raykar [Thu, 8 Apr 2021 09:14:43 +0000 (14:44 +0530)]
userdiff: add support for Scheme
Add a diff driver for Scheme-like languages which recognizes top level
and local `define` forms, whether it is a function definition, binding,
syntax definition or a user-defined `define-xyzzy` form.
Also supports R6RS `library` forms, `module` forms along with class and
struct declarations used in Racket (PLT Scheme).
Alternate "def" syntax such as those in Gerbil Scheme are also
supported, like defstruct, defsyntax and so on.
The rationale for picking `define` forms for the hunk headers is because
it is usually the only significant form for defining the structure of
the program, and it is a common pattern for schemers to have local
function definitions to hide their visibility, so it is not only the top
level `define`'s that are of interest. Schemers also extend the language
with macros to provide their own define forms (for example, something
like a `define-test-suite`) which is also captured in the hunk header.
Since it is common practice to extend syntax with variants of a form
like `module+`, `class*` etc, those have been supported as well.
The word regex is a best-effort attempt to conform to R7RS[1] valid
identifiers, symbols and numbers.
[1] https://small.r7rs.org/attachment/r7rs.pdf (section 2.1)
Signed-off-by: Atharva Raykar <raykar.ath@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Junio C Hamano [Thu, 8 Apr 2021 20:22:33 +0000 (13:22 -0700)]
The eighth batch
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Junio C Hamano [Thu, 8 Apr 2021 20:23:26 +0000 (13:23 -0700)]
Merge branch 'ab/make-tags-quiet'
Generate [ec]tags under $(QUIET_GEN).
* ab/make-tags-quiet:
Makefile: add QUIET_GEN to "tags" and "TAGS" targets
Junio C Hamano [Thu, 8 Apr 2021 20:23:26 +0000 (13:23 -0700)]
Merge branch 'rs/daemon-sanitize-dir-sep'
"git daemon" has been tightened against systems that take backslash
as directory separator.
* rs/daemon-sanitize-dir-sep:
daemon: sanitize all directory separators
Junio C Hamano [Thu, 8 Apr 2021 20:23:26 +0000 (13:23 -0700)]
Merge branch 'en/ort-perf-batch-9'
The ort merge backend has been optimized by skipping irrelevant
renames.
* en/ort-perf-batch-9:
diffcore-rename: avoid doing basename comparisons for irrelevant sources
merge-ort: skip rename detection entirely if possible
merge-ort: use relevant_sources to filter possible rename sources
merge-ort: precompute whether directory rename detection is needed
merge-ort: introduce wrappers for alternate tree traversal
merge-ort: add data structures for an alternate tree traversal
merge-ort: precompute subset of sources for which we need rename detection
diffcore-rename: enable filtering possible rename sources
Junio C Hamano [Thu, 8 Apr 2021 20:23:25 +0000 (13:23 -0700)]
Merge branch 'en/sequencer-edit-upon-conflict-fix'
"git cherry-pick/revert" with or without "--[no-]edit" did not spawn
the editor as expected (e.g. "revert --no-edit" after a conflict
still asked to edit the message), which has been corrected.
* en/sequencer-edit-upon-conflict-fix:
sequencer: fix edit handling for cherry-pick and revert messages
Junio C Hamano [Thu, 8 Apr 2021 20:23:25 +0000 (13:23 -0700)]
Merge branch 'll/clone-reject-shallow'
"git clone --reject-shallow" option fails the clone as soon as we
notice that we are cloning from a shallow repository.
* ll/clone-reject-shallow:
builtin/clone.c: add --reject-shallow option
Junio C Hamano [Thu, 8 Apr 2021 20:23:25 +0000 (13:23 -0700)]
Merge branch 'tb/reverse-midx'
An on-disk reverse-index to map the in-pack location of an object
back to its object name across multiple packfiles is introduced.
* tb/reverse-midx:
midx.c: improve cache locality in midx_pack_order_cmp()
pack-revindex: write multi-pack reverse indexes
pack-write.c: extract 'write_rev_file_order'
pack-revindex: read multi-pack reverse indexes
Documentation/technical: describe multi-pack reverse indexes
midx: make some functions non-static
midx: keep track of the checksum
midx: don't free midx_name early
midx: allow marking a pack as preferred
t/helper/test-read-midx.c: add '--show-objects'
builtin/multi-pack-index.c: display usage on unrecognized command
builtin/multi-pack-index.c: don't enter bogus cmd_mode
builtin/multi-pack-index.c: split sub-commands
builtin/multi-pack-index.c: define common usage with a macro
builtin/multi-pack-index.c: don't handle 'progress' separately
builtin/multi-pack-index.c: inline 'flags' with options
Ævar Arnfjörð Bjarmason [Thu, 8 Apr 2021 15:04:24 +0000 (17:04 +0200)]
blame tests: simplify userdiff driver test
Simplify the test added in
9466e3809d (blame: enable funcname blaming
with userdiff driver, 2020-11-01) to use the --author support recently
added in
999cfc4f45 (test-lib functions: add --author support to
test_commit, 2021-01-12).
We also did not need the full fortran-external-function content. Let's
cut it down to just the important parts.
I'm modifying it to demonstrate that the fortran-specific userdiff
function is in effect by adding "DO NOT MATCH ..." and "AS THE ..."
lines surrounding the "RIGHT" one.
This is to check that we're using the userdiff "fortran" driver, as
opposed to the default driver which would match on those lines as part
of the general heuristic of matching a line that doesn't begin with
whitespace.
The test had also been leaving behind a .gitattributes file for later
tests to possibly trip over, let's clean it up with
"test_when_finished".
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Ævar Arnfjörð Bjarmason [Thu, 8 Apr 2021 15:04:23 +0000 (17:04 +0200)]
blame tests: don't rely on t/t4018/ directory
Refactor a test added in
9466e3809d (blame: enable funcname blaming
with userdiff driver, 2020-11-01) so that the blame tests don't rely
on stealing the contents of "t/t4018/fortran-external-function".
I have another patch series that'll possibly (or not) refactor that
file, but having this test inter-dependency makes things simple in any
case by making this test more readable.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Ævar Arnfjörð Bjarmason [Thu, 8 Apr 2021 15:04:22 +0000 (17:04 +0200)]
userdiff: remove support for "broken" tests
There have been no "broken" tests since
75c3b6b2e8 (userdiff: improve
Fortran xfuncname regex, 2020-08-12). Let's remove the test support
for them.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Ævar Arnfjörð Bjarmason [Thu, 8 Apr 2021 15:04:21 +0000 (17:04 +0200)]
userdiff tests: list builtin drivers via test-tool
Change the userdiff test to list the builtin drivers via the
test-tool, using the new for_each_userdiff_driver() API function.
This gets rid of the need to modify this part of the test every time a
new pattern is added, see
2ff6c34612 (userdiff: support Bash,
2020-10-22) and
09dad9256a (userdiff: support Markdown, 2020-05-02)
for two recent examples.
I only need the "list-builtin-drivers "argument here, but let's add
"list-custom-drivers" and "list-drivers" too, just because it's easy.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Ævar Arnfjörð Bjarmason [Thu, 8 Apr 2021 15:04:20 +0000 (17:04 +0200)]
userdiff tests: explicitly test "default" pattern
Since
122aa6f9c0 (diff: introduce diff.<driver>.binary, 2008-10-05)
the internals of the userdiff.c code have understood a "default" name,
which is invoked as userdiff_find_by_name("default") and present in
the "builtin_drivers" struct. Let's test for this special case.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Ævar Arnfjörð Bjarmason [Thu, 8 Apr 2021 15:04:19 +0000 (17:04 +0200)]
userdiff: add and use for_each_userdiff_driver()
Refactor the userdiff_find_by_namelen() function so that a new
for_each_userdiff_driver() API function does most of the work.
This will be useful for the same reason we've got other for_each_*()
API functions as part of various APIs, and will be used in a follow-up
commit.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Ævar Arnfjörð Bjarmason [Thu, 8 Apr 2021 15:04:18 +0000 (17:04 +0200)]
userdiff style: normalize pascal regex declaration
Declare the pascal pattern consistently with how we declare the
others, not having "\n" on one line by itself, but as part of the
pattern, and when there are alterations have the "|" at the start, not
end of the line.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Ævar Arnfjörð Bjarmason [Thu, 8 Apr 2021 15:04:17 +0000 (17:04 +0200)]
userdiff style: declare patterns with consistent style
Change those patterns which were declared with a regex on the same
line as the "PATTERNS()" line to put that regex on the next line, and
add missing "/* -- */" separator comments between the pattern and
word_regex.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Ævar Arnfjörð Bjarmason [Thu, 8 Apr 2021 15:04:16 +0000 (17:04 +0200)]
userdiff style: re-order drivers in alphabetical order
Address some old code smell and move around the built-in userdiff
drivers so they're both in alphabetical order, and now in the same
order they appear in the gitattributes(5) documentation.
The two started drifting in
be58e70dba (diff: unify external diff and
funcname parsing code, 2008-10-05), and then even further in
80c49c3de2 (color-words: make regex configurable via attributes,
2009-01-17) when the "cpp" pattern was added.
There are no functional changes here, and as --color-moved will show
only moved existing lines.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Ævar Arnfjörð Bjarmason [Thu, 8 Apr 2021 13:25:55 +0000 (15:25 +0200)]
config.c: remove last remnant of GIT_TEST_GETTEXT_POISON
Remove a use of GIT_TEST_GETTEXT_POISON added in
f276e2a4694 (config:
improve error message for boolean config, 2021-02-11).
This was simultaneously in-flight with my
d162b25f956 (tests: remove
support for GIT_TEST_GETTEXT_POISON, 2021-01-20) which removed the
rest of the GIT_TEST_GETTEXT_POISON code.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Ville Skyttä [Thu, 8 Apr 2021 07:06:41 +0000 (10:06 +0300)]
completion: audit and guard $GIT_* against unset use
$GIT_COMPLETION_SHOW_ALL and $GIT_TESTING_ALL_COMMAND_LIST were used
without guarding against them being unset, causing errors in nounset
(set -u) mode.
No other nounset-unsafe $GIT_* usages were found.
While at it, remove a superfluous (duplicate) unset guard from $GIT_DIR
in __git_find_repo_path.
Signed-off-by: Ville Skyttä <ville.skytta@iki.fi>
Signed-off-by: Junio C Hamano <gitster@pobox.com>