Commit | Line | Data |
---|---|---|
927a13fe JK |
1 | diff-highlight |
2 | ============== | |
3 | ||
4 | Line-oriented diffs are great for reviewing code, because for most | |
5 | hunks, you want to see the old and the new segments of code next to each | |
6 | other. Sometimes, though, when an old line and a new line are very | |
7 | similar, it's hard to immediately see the difference. | |
8 | ||
9 | You can use "--color-words" to highlight only the changed portions of | |
10 | lines. However, this can often be hard to read for code, as it loses | |
11 | the line structure, and you end up with oddly formatted bits. | |
12 | ||
13 | Instead, this script post-processes the line-oriented diff, finds pairs | |
14 | of lines, and highlights the differing segments. It's currently very | |
15 | simple and stupid about doing these tasks. In particular: | |
16 | ||
34d9819e JK |
17 | 1. It will only highlight hunks in which the number of removed and |
18 | added lines is the same, and it will pair lines within the hunk by | |
19 | position (so the first removed line is compared to the first added | |
20 | line, and so forth). This is simple and tends to work well in | |
21 | practice. More complex changes don't highlight well, but they are | |
22 | usually excluded anyway by the "same number of removed and added | |
23 | lines" restriction. Even when such a hunk does qualify, its lines | |
24 | tend to go unhighlighted because of the "don't highlight if the | |
25 | whole line would be highlighted" rule. | |
927a13fe JK |
26 | |
27 | 2. It will find the common prefix and suffix of two lines, and | |
28 | consider everything in the middle to be "different". It could | |
29 | instead do a real diff of the characters between the two lines and | |
30 | find common subsequences. However, the point of the highlight is to | |
31 | call attention to a certain area. Even if some small subset of the | |
32 | highlighted area actually didn't change, that's OK. In practice it | |
33 | ends up being more readable to just have a single blob on the line | |
34 | showing the interesting bit. | |
35 | ||
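The two rules above can be sketched in a few lines of Python. This is an
illustration only, not the script's actual code: the function names are
hypothetical, color escapes are ignored, and the "-{}" and "+{}" markers
stand in for the real highlighting.

```python
def split_common(old, new):
    """Return (prefix_len, suffix_len) shared by two strings."""
    limit = min(len(old), len(new))
    prefix = 0
    while prefix < limit and old[prefix] == new[prefix]:
        prefix += 1
    suffix = 0
    # Stop early so the suffix never overlaps the prefix in either string.
    while suffix < limit - prefix and old[-1 - suffix] == new[-1 - suffix]:
        suffix += 1
    return prefix, suffix


def mark(line, prefix, suffix, sign):
    """Wrap the differing middle of `line` in sign{...}."""
    end = len(line) - suffix
    if prefix == 0 and suffix == 0:
        return line  # the whole line would be highlighted: skip it
    if prefix >= end:
        return line  # nothing differs on this side of the pair
    return f"{line[:prefix]}{sign}{{{line[prefix:end]}}}{line[end:]}"


def highlight_hunk(removed, added):
    """Highlight one hunk; lines are given without their -/+ markers."""
    if len(removed) != len(added):
        return removed, added  # rule 1: unequal counts are left untouched
    out_old, out_new = [], []
    for old, new in zip(removed, added):  # pair lines by position
        prefix, suffix = split_common(old, new)
        out_old.append(mark(old, prefix, suffix, "-"))  # rule 2
        out_new.append(mark(new, prefix, suffix, "+"))
    return out_old, out_new
```

Passing the lines without their leading "-" or "+" keeps the diff marker
itself from being compared as part of the prefix.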
36 | The goal of the script is therefore not to be exact about highlighting | |
37 | changes, but to call attention to areas of interest without being | |
38 | visually distracting. Non-diff lines and existing diff coloration are | |
39 | preserved; the intent is that the output should look exactly the same as | |
40 | the input, except for the occasional highlight. | |
41 | ||
42 | Use | |
43 | --- | |
44 | ||
45 | You can try out the diff-highlight program with: | |
46 | ||
47 | --------------------------------------------- | |
48 | git log -p --color | /path/to/diff-highlight | |
49 | --------------------------------------------- | |
50 | ||
51 | If you want to use it all the time, drop it in your $PATH and put the | |
52 | following in your git configuration: | |
53 | ||
54 | --------------------------------------------- | |
55 | [pager] | |
56 | log = diff-highlight | less | |
57 | show = diff-highlight | less | |
58 | diff = diff-highlight | less | |
59 | --------------------------------------------- | |
a0b676aa | 60 | |
bca45fbc JK |
61 | |
62 | Color Config | |
63 | ------------ | |
64 | ||
65 | You can configure the highlight colors and attributes using git's | |
66 | config. The colors for "old" and "new" lines can be specified | |
67 | independently. There are two "modes" of configuration: | |
68 | ||
69 | 1. You can specify a "highlight" color and a matching "reset" color. | |
70 | This will retain any existing colors in the diff, and apply the | |
71 | "highlight" and "reset" colors before and after the highlighted | |
72 | portion. | |
73 | ||
74 | 2. You can specify a "normal" color and a "highlight" color. In this | |
75 | case, existing colors are dropped from that line. The non-highlighted | |
76 | bits of the line get the "normal" color, and the highlights get the | |
77 | "highlight" color. | |
78 | ||
79 | If no "new" colors are specified, they default to the "old" colors. If | |
80 | no "old" colors are specified, the default is to reverse the foreground | |
81 | and background for highlighted portions. | |
82 | ||
83 | Examples: | |
84 | ||
85 | --------------------------------------------- | |
86 | # Underline highlighted portions | |
87 | [color "diff-highlight"] | |
88 | oldHighlight = ul | |
89 | oldReset = noul | |
90 | --------------------------------------------- | |
91 | ||
92 | --------------------------------------------- | |
93 | # Varying background intensities | |
94 | [color "diff-highlight"] | |
95 | oldNormal = "black #f8cbcb" | |
96 | oldHighlight = "black #ffaaaa" | |
97 | newNormal = "black #cbeecb" | |
98 | newHighlight = "black #aaffaa" | |
99 | --------------------------------------------- | |
100 | ||
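The two modes and the defaulting rules can be modeled roughly as follows.
This is a sketch under stated assumptions, not the script's code: the
function names are hypothetical, the config values are taken as
already-resolved escape strings rather than git color names, each "new"
key is assumed to fall back to its "old" counterpart individually, and
the default reverse-video escapes are the standard ANSI codes 7 and 27.

```python
import re

ANSI = re.compile(r"\x1b\[[0-9;]*m")  # matches ANSI color escapes
RESET_ALL = "\x1b[m"
REVERSE, NO_REVERSE = "\x1b[7m", "\x1b[27m"


def resolve_theme(config):
    """Apply the defaulting rules to a dict of optional color keys.

    Keys follow the README: oldNormal, oldHighlight, oldReset,
    newNormal, newHighlight, newReset.
    """
    old = {
        "normal": config.get("oldNormal"),  # None selects mode 1
        "highlight": config.get("oldHighlight", REVERSE),
        "reset": config.get("oldReset", NO_REVERSE),
    }
    new = {
        # Assumption: each "new" key falls back to its "old" value.
        "normal": config.get("newNormal", old["normal"]),
        "highlight": config.get("newHighlight", old["highlight"]),
        "reset": config.get("newReset", old["reset"]),
    }
    return old, new


def paint(before, middle, after, theme):
    """Color one line that has been split around its highlighted part."""
    if theme["normal"] is None:
        # Mode 1: keep existing colors, wrap only the highlighted part.
        return before + theme["highlight"] + middle + theme["reset"] + after
    # Mode 2: drop existing colors, then color every part explicitly.
    before, middle, after = (ANSI.sub("", s) for s in (before, middle, after))
    return (theme["normal"] + before + theme["highlight"] + middle
            + theme["normal"] + after + RESET_ALL)
```

With an empty config, both themes reverse the highlighted portion and
leave the rest of the line, including git's own colors, untouched.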
101 | ||
a0b676aa JK |
102 | Bugs |
103 | ---- | |
104 | ||
105 | Because diff-highlight relies on heuristics to guess which parts of | |
106 | changes are important, there are some cases where the highlighting is | |
107 | more distracting than useful. Fortunately, these cases are rare in | |
108 | practice, and when they do occur, the worst case is simply a little | |
109 | extra highlighting. This section documents some cases known to be | |
110 | sub-optimal, in case somebody feels like working on improving the | |
111 | heuristics. | |
112 | ||
113 | 1. Two changes on the same line get highlighted in a blob. For example, | |
114 | highlighting: | |
115 | ||
116 | ---------------------------------------------- | |
117 | -foo(buf, size); | |
118 | +foo(obj->buf, obj->size); | |
119 | ---------------------------------------------- | |
120 | ||
121 | yields (where the inside of "+{}" would be highlighted): | |
122 | ||
123 | ---------------------------------------------- | |
124 | -foo(buf, size); | |
125 | +foo(+{obj->buf, obj->}size); | |
126 | ---------------------------------------------- | |
127 | ||
128 | whereas a more semantically meaningful output would be: | |
129 | ||
130 | ---------------------------------------------- | |
131 | -foo(buf, size); | |
132 | +foo(+{obj->}buf, +{obj->}size); | |
133 | ---------------------------------------------- | |
134 | ||
135 | Note that doing this right would probably involve a set of | |
136 | content-specific boundary patterns, similar to word-diff. Otherwise | |
137 | you get junk like: | |
138 | ||
139 | ----------------------------------------------------- | |
140 | -this line has some -{i}nt-{ere}sti-{ng} text on it | |
141 | +this line has some +{fa}nt+{a}sti+{c} text on it | |
142 | ----------------------------------------------------- | |
143 | ||
144 | which is less readable than the current output. | |
145 | ||
146 | 2. The multi-line matching assumes that lines in the pre- and post-image | |
147 | match by position. This is often the case, but can be fooled when a | |
148 | line is removed from the top and a new one added at the bottom (or | |
149 | vice versa). Unless the lines in the middle are also changed, diffs | |
150 | will show this as two hunks, and it will not get highlighted at all | |
151 | (which is good). But if the lines in the middle are changed, the | |
152 | highlighting can be misleading. Here's a pathological case: | |
153 | ||
154 | ----------------------------------------------------- | |
155 | -one | |
156 | -two | |
157 | -three | |
158 | -four | |
159 | +two 2 | |
160 | +three 3 | |
161 | +four 4 | |
162 | +five 5 | |
163 | ----------------------------------------------------- | |
164 | ||
165 | which gets highlighted as: | |
166 | ||
167 | ----------------------------------------------------- | |
168 | -one | |
169 | -t-{wo} | |
170 | -three | |
171 | -f-{our} | |
172 | +two 2 | |
173 | +t+{hree 3} | |
174 | +four 4 | |
175 | +f+{ive 5} | |
176 | ----------------------------------------------------- | |
177 | ||
178 | because it matches "two" to "three 3", and so forth. It would be | |
179 | nicer as: | |
180 | ||
181 | ----------------------------------------------------- | |
182 | -one | |
183 | -two | |
184 | -three | |
185 | -four | |
186 | +two +{2} | |
187 | +three +{3} | |
188 | +four +{4} | |
189 | +five 5 | |
190 | ----------------------------------------------------- | |
191 | ||
192 | which would probably involve pre-matching the lines into pairs | |
193 | according to some heuristic. |
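
Both items can be explored with Python's difflib. The sketch below is an
experiment aid, not a proposed patch: the function names, the 0.5
similarity threshold, and the greedy matching strategy are all
assumptions made for illustration. The first function does a
character-level diff without boundary patterns (item 1); the second
pre-matches lines by similarity instead of by position (item 2).

```python
from difflib import SequenceMatcher


def char_marks(old, new):
    """Mark every changed character run with -{...} and +{...}."""
    out_old, out_new = [], []
    matcher = SequenceMatcher(None, old, new)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            out_old.append(old[i1:i2])
            out_new.append(new[j1:j2])
            continue
        if i2 > i1:  # "replace" or "delete": text removed from old
            out_old.append("-{" + old[i1:i2] + "}")
        if j2 > j1:  # "replace" or "insert": text added in new
            out_new.append("+{" + new[j1:j2] + "}")
    return "".join(out_old), "".join(out_new)


def prematch(removed, added, threshold=0.5):
    """Greedily pair each removed line with its most similar added line.

    Returns (removed_index, added_index) tuples. Lines whose best
    candidate scores below `threshold` stay unpaired.
    """
    pairs, used = [], set()
    for i, old in enumerate(removed):
        best_j, best_score = None, 0.0
        for j, new in enumerate(added):
            if j in used:
                continue
            score = SequenceMatcher(None, old, new).ratio()
            if score >= threshold and score > best_score:
                best_j, best_score = j, score
        if best_j is not None:
            used.add(best_j)
            pairs.append((i, best_j))
    return pairs
```

On the pathological hunk above, prematch() pairs "two", "three", and
"four" with their counterparts and leaves "one" and "five 5" unpaired.
The greedy loop is quadratic in the hunk size and does not prevent pairs
from crossing, so a real heuristic would need more care.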