1 [[!template id=gitbranch branch=smcv/ready/glob-cache
2 author="[[KathrynAndersen]], [[smcv]]"]]
5 I've been profiling my IkiWiki to try to improve speed (with many pages makes speed even more important) and I've written a patch to improve the speed of match_glob. This matcher is a good one to improve the speed of, because it gets called so many times.
7 Here's my patch - please consider it! -- [[KathrynAndersen]]
9 > It seems to me as though changing `glob2re` to return qr/$re/, and calling
10 > `memoize(glob2re)` next to the other memoize calls, would be a less
11 > verbose way to do this? --[[smcv]]
13 >> I think so, yeah. Anyway, do you have any benchmark results handy,
14 >> Kathryn? --[[Joey]]
17 >>> Also, would it make more sense for glob2re to return qr/^$re$/i rather than qr/$re/? Everything that uses glob2re seems to use
19 >>> rather than /$re/ so I think that would make sense.
20 >>> -- [[KathrynAndersen]]
22 >>>> Git branch `smcv/ka-glob-cache` has Kathryn's patch. Git
23 >>>> branch `smcv/memoize-glob2re` does as I suggested, which
24 >>>> is less verbose than Kathryn's patch but also not as
25 >>>> fast; I'm not sure why, tbh. --[[smcv]]
27 >>>>> I think it's because my patch focuses on match_glob while the memoize patch focuses on `glob2re`, and `glob2re` is called in `filecheck`, `meta` and `po` as well as in `match_glob` and `match_user`; thus the memoized `glob2re` is dealing with a bigger set of globs to look up, and thus could be just that little bit slower. -- [[KathrynAndersen]]
29 >>>>>> What may be going on is that glob2re is already a fairly fast
30 >>>>>> function, so the overhead of memoizing it with the very generic
31 >>>>>> `_memoizer` (see its source) swamps the memoization gain. Note
32 >>>>>> that the few functions memoized with the Memoizer before were much
33 >>>>>> more expensive, so that little overhead was acceptable then.
35 >>>>>> It also may be that Kathryn's patch is slightly faster due to using
36 >>>>>> the construct `$foo =~ $regexp` rather than `$foo =~ /$regexp/`
37 >>>>>> (probably avoids a copy or something like that internally) --
38 >>>>>> this despite checking both `exists` and `defined` on the hash, which
39 >>>>>> should be reundant AFAICS.
41 >>>>>> My guess is that the best of both worlds would be to move
42 >>>>>> the byhand memoization to glob2re and have it return a compiled
43 >>>>>> `/^/i` regexp that can be used without further modifiction in most
44 >>>>>> cases. --[[Joey]]
46 >>>>>>> Done, see `smcv/ready/glob-cache` and `smcv/glob-cache-too-far`.
48 >>>>>>> Kathryn's patch is a significant improvement; my first patch on top of
49 >>>>>>> that is a trivial cleanup that speeds it up a little, and the next two
50 >>>>>>> patches (using precompiled regexes) have surprisingly little effect
51 >>>>>>> (they don't slow it down either though, so either omit them or merge
52 >>>>>>> them, whichever). Detailed benchmark results below.
54 >>>>>>> Moving the memoization to `glob2re` actually seems to slow things down
55 >>>>>>> again - I suspect the docwiki has few enough mentions of `user()` etc.
56 >>>>>>> that caching them is a waste of time, but perhaps it's not the most
57 >>>>>>> representative.
62 --------------------------------------------------------------
64 [[!toggle id="smcv-benchmark" text="current benchmarks"]]
66 [[!toggleable id="smcv-benchmark" text="""
67 master at time of branch:
69 time elapsed (wall): 29.6348
70 time running program: 24.9212 (84.09%)
71 time profiling (est.): 4.7136 (15.91%)
72 number of calls: 1360181
73 number of exceptions: 13
75 %Time Sec. #calls sec/call F name
76 13.24 3.2986 3408 0.000968 Text::Balanced::_match_tagged
77 10.94 2.7253 79514 0.000034 IkiWiki::PageSpec::match_glob
78 3.19 0.7952 59454 0.000013 <anon>:IkiWiki/Plugin/inline.pm:223
80 `Improve the speed of match_glob`:
82 time elapsed (wall): 27.9755
83 time running program: 23.5293 (84.11%)
84 time profiling (est.): 4.4461 (15.89%)
85 number of calls: 1280875
86 number of exceptions: 13
88 %Time Sec. #calls sec/call F name
89 14.56 3.4257 3408 0.001005 Text::Balanced::_match_tagged
90 7.82 1.8403 79514 0.000023 IkiWiki::PageSpec::match_glob
91 3.27 0.7698 59454 0.000013 <anon>:IkiWiki/Plugin/inline.pm:223
93 `match_glob: streamline glob cache slightly`:
95 time elapsed (wall): 27.5753
96 time running program: 23.1714 (84.03%)
97 time profiling (est.): 4.4039 (15.97%)
98 number of calls: 1280875
99 number of exceptions: 13
101 %Time Sec. #calls sec/call F name
102 14.09 3.2637 3408 0.000958 Text::Balanced::_match_tagged
103 7.74 1.7926 79514 0.000023 IkiWiki::PageSpec::match_glob
104 3.30 0.7646 59454 0.000013 <anon>:IkiWiki/Plugin/inline.pm:223
106 `glob2re: return a precompiled, anchored case-insensitiv...`:
108 time elapsed (wall): 27.5656
109 time running program: 23.1464 (83.97%)
110 time profiling (est.): 4.4192 (16.03%)
111 number of calls: 1282189
112 number of exceptions: 13
114 %Time Sec. #calls sec/call F name
115 14.21 3.2891 3408 0.000965 Text::Balanced::_match_tagged
116 7.72 1.7872 79514 0.000022 IkiWiki::PageSpec::match_glob
117 3.32 0.7678 59454 0.000013 <anon>:IkiWiki/Plugin/inline.pm:223
119 `make use of precompiled regex objects`:
121 time elapsed (wall): 27.5357
122 time running program: 23.1289 (84.00%)
123 time profiling (est.): 4.4068 (16.00%)
124 number of calls: 1281981
125 number of exceptions: 13
127 %Time Sec. #calls sec/call F name
128 14.17 3.2776 3408 0.000962 Text::Balanced::_match_tagged
129 7.70 1.7814 79514 0.000022 IkiWiki::PageSpec::match_glob
130 3.35 0.7756 59454 0.000013 <anon>:IkiWiki/Plugin/inline.pm:223
132 `move memoization from match_glob to glob2re`:
134 time elapsed (wall): 28.7677
135 time running program: 23.9473 (83.24%)
136 time profiling (est.): 4.8205 (16.76%)
137 number of calls: 1360181
138 number of exceptions: 13
140 %Time Sec. #calls sec/call F name
141 13.98 3.3469 3408 0.000982 Text::Balanced::_match_tagged
142 8.85 2.1194 79514 0.000027 IkiWiki::PageSpec::match_glob
143 3.24 0.7750 59454 0.000013 <anon>:IkiWiki/Plugin/inline.pm:223
148 --------------------------------------------------------------
150 [[!toggle id="ka-benchmarks" text="Kathryn's benchmarks"]]
152 [[!toggleable id="ka-benchmarks" text="""
153 Benchmarks done with Devel::Profile on the same testbed IkiWiki setup. I'm just showing the start of the profile output, since that's what's relevant.
157 time elapsed (wall): 27.4173
158 time running program: 22.5909 (82.40%)
159 time profiling (est.): 4.8264 (17.60%)
160 number of calls: 1314729
161 number of exceptions: 65
163 %Time Sec. #calls sec/call F name
164 11.05 2.4969 62333 0.000040 IkiWiki::PageSpec::match_glob
165 4.10 0.9261 679 0.001364 Text::Balanced::_match_tagged
166 2.72 0.6139 59812 0.000010 IkiWiki::SuccessReason::merge_influences
171 time elapsed (wall): 26.1843
172 time running program: 21.5673 (82.37%)
173 time profiling (est.): 4.6170 (17.63%)
174 number of calls: 1252433
175 number of exceptions: 65
177 %Time Sec. #calls sec/call F name
178 7.66 1.6521 62333 0.000027 IkiWiki::PageSpec::match_glob
179 4.33 0.9336 679 0.001375 Text::Balanced::_match_tagged
180 2.81 0.6057 59812 0.000010 IkiWiki::SuccessReason::merge_influences
183 Note that the seconds per call for match_glob in the "after" case has gone down by about a third.
188 --------------------------------------------------------------
190 [[!toggle id="ka-patch" text="Kathryn's original patch"]]
192 [[!toggleable id="ka-patch" text="""
195 diff --git a/IkiWiki.pm b/IkiWiki.pm
196 index 08a3d78..c187b98 100644
199 @@ -2482,6 +2482,8 @@ sub derel ($$) {
205 sub match_glob ($$;@) {
208 @@ -2489,8 +2491,15 @@ sub match_glob ($$;@) {
210 $glob=derel($glob, $params{location});
212 - my $regexp=IkiWiki::glob2re($glob);
213 - if ($page=~/^$regexp$/i) {
214 + # Instead of converting the glob to a regex every time,
215 + # cache the compiled regex to save time.
216 + if (!exists $glob_cache{$glob}
217 + or !defined $glob_cache{$glob})
219 + my $re=IkiWiki::glob2re($glob);
220 + $glob_cache{$glob} = qr/^$re$/i;
222 + if ($page =~ $glob_cache{$glob}) {
223 if (! IkiWiki::isinternal($page) || $params{internal}) {
224 return IkiWiki::SuccessReason->new("$glob matches $page");
228 --------------------------------------------------------------