
Thread: Macroize char class tests in case to remove duplicated code.




Macroize char class tests in case to remove duplicated code.
user name
2006-12-27 10:37:36
Attached patch removes a few pages worth of duplicated code from the
internals of the char class compilation code.
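
For readers without the patch to hand, here is a minimal sketch of the
general technique, not the patch itself. It assumes each class has a
positive and a negated case that differ only in the test applied. The
names CLASS_ALPHA, CLASS_CASES, BITMAP_SET and add_class are hypothetical
and are not taken from regcomp.c.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical class ids, for illustration only. */
enum {
    CLASS_ALPHA, CLASS_NALPHA,
    CLASS_DIGIT, CLASS_NDIGIT,
    CLASS_SPACE, CLASS_NSPACE
};

#define BITMAP_BYTES 32   /* 256 bits, one per byte value */
#define BITMAP_SET(bm, c) ((bm)[(c) >> 3] |= (unsigned char)(1u << ((c) & 7)))

/* One macro expands to both the positive and the negated case,
 * so each class is written once instead of twice. */
#define CLASS_CASES(pos, neg, test)               \
    case pos:                                     \
        for (c = 0; c < 256; c++)                 \
            if (test(c)) BITMAP_SET(bm, c);       \
        break;                                    \
    case neg:                                     \
        for (c = 0; c < 256; c++)                 \
            if (!test(c)) BITMAP_SET(bm, c);      \
        break

/* Adds the members of one class to bm.
 * Returns 0 on success, -1 for an unknown class id. */
int add_class(unsigned char bm[BITMAP_BYTES], int class_id)
{
    int c;
    assert(bm != NULL);
    switch (class_id) {
    CLASS_CASES(CLASS_ALPHA, CLASS_NALPHA, isalpha);
    CLASS_CASES(CLASS_DIGIT, CLASS_NDIGIT, isdigit);
    CLASS_CASES(CLASS_SPACE, CLASS_NSPACE, isspace);
    default:
        return -1;
    }
    return 0;
}
```

Adding another class under this scheme is one more CLASS_CASES line
rather than two more hand-written loops.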

Yves

-- 
perl -Mre=debug -e "/just|another|perl|hacker/"
Macroize char class tests in case to remove duplicated code.
user name
2006-12-27 15:31:18
On Wed, Dec 27, 2006 at 11:37:36AM +0100, demerphq wrote:
> Attached patch removes a few pages worth of duplicated code from the
> internals of the char class compilation code.

Thanks, applied (change 29626)

I notice from the coverage report that some of the cases had no test
coverage:
http://www.maddingue.net/perlcover/cover1/cover_db/--regcomp-c.html
(and now that will be harder to see)

Was this the logic that on IRC you were commenting might be better with
pre-generated lookup tables? The code doesn't seem to be that frequently
called, so I doubt there's a potential speed win, but would there be a
size win?

Nicholas Clark
Macroize char class tests in case to remove duplicated code.
user name
2006-12-27 16:15:53
On 12/27/06, Nicholas Clark <nickccl4.org> wrote:
> On Wed, Dec 27, 2006 at 11:37:36AM +0100, demerphq wrote:
> > Attached patch removes a few pages worth of duplicated code from the
> > internals of the char class compilation code.
>
> Thanks applied (change 29626)
>
> I notice from the coverage report that some of the cases had no test
> coverage:
> http://www.maddingue.net/perlcover/cover1/cover_db/--regcomp-c.html
> (and now that will be harder to see)

Ah, we can improve the test coverage of that code by making sure that
we test all of the [:name:]'ed char classes. We can also improve it by
testing things like [\s\w] and stuff like that.

> Was this the logic that on IRC you were commenting might be better with
> pre-generated lookup tables? The code doesn't seem to be that frequently
> called, so I doubt there's a potential speed win, but would there be a
> size win?

Actually, that's an interesting question. It turned out that the code
that I was so horrified about in regexec is involved with handling
named char class tests under use locale '...', which is the reason
that the tests are done at run time. And since locales can change and
stuff like that, it's probably not worth changing much. (Although it
probably would be worth caching for the duration of the match, at
least for users of use locale.)

However, the data that I was interested in statically generating could
be used to simplify this routine. Each of the 15 named charclasses
could have a statically generated bitmap for ASCII, and then the logic
in compiling could be made to be operations on these bitvectors
instead of looping from 0..255 and doing the appropriate tests. I
suspect that there might be a size win as well as a speed win from it,
however, as only 15 * 32 bytes (480 bytes) would be needed to store all
the data statically, and I bet the code being generated currently is
larger than that.
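
A rough sketch of the bitvector idea, under stated assumptions: the
type class_bitmap and the helpers bitmap_fill, bitmap_or, bitmap_or_not
and bitmap_test are hypothetical names, and the tables are filled at
run time from <ctype.h> here only to keep the example short. In the
scheme described above they would be emitted ahead of time as static
const data.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

#define BITMAP_BYTES 32   /* 256 bits, one per byte value */

typedef struct { unsigned char bits[BITMAP_BYTES]; } class_bitmap;

/* Build one class bitmap by running a ctype-style predicate over
 * 0..255. A generator would run this once and print a static table. */
void bitmap_fill(class_bitmap *bm, int (*test)(int))
{
    int c;
    assert(bm != NULL && test != NULL);
    memset(bm->bits, 0, BITMAP_BYTES);
    for (c = 0; c < 256; c++)
        if (test(c))
            bm->bits[c >> 3] |= (unsigned char)(1u << (c & 7));
}

/* Compile-time step for [:class:]: OR the class into the char class. */
void bitmap_or(class_bitmap *dst, const class_bitmap *src)
{
    size_t i;
    for (i = 0; i < BITMAP_BYTES; i++)
        dst->bits[i] |= src->bits[i];
}

/* Compile-time step for [:^class:]: OR in the complement instead. */
void bitmap_or_not(class_bitmap *dst, const class_bitmap *src)
{
    size_t i;
    for (i = 0; i < BITMAP_BYTES; i++)
        dst->bits[i] |= (unsigned char)~src->bits[i];
}

/* Nonzero when byte value c is a member of the bitmap. */
int bitmap_test(const class_bitmap *bm, unsigned char c)
{
    return (bm->bits[c >> 3] >> (c & 7)) & 1;
}
```

With tables like these, adding a named class to a char class costs 32
byte-wise ORs rather than 256 predicate calls, and 15 such tables
occupy 15 * sizeof(class_bitmap) = 480 bytes.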

Anyway, the stuff that tsee was looking into is more or less the
inverse of what we need for this problem. He was looking at mapping
char -> properties. This version would require a mapping from property
to chars. IOW, the original problem was to build an array of 256
elements, each a bitvector 30 bits long; this problem is to build an
array of 15 bitvectors that are each 256 bits long.
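
To make the relationship between the two layouts concrete, here is a
small sketch that transposes one into the other. The function name
transpose_props and the bit numbering are assumptions for illustration.
How the 30 property bits relate to the 15 classes is not spelled out
here, so the number of properties is left as a parameter.

```c
#include <assert.h>
#include <stdint.h>
#include <stddef.h>
#include <string.h>

#define BITMAP_BYTES 32   /* 256 bits, one per byte value */

/* props[c] has bit p set when byte value c has property p
 * (the char -> properties layout).
 * On return, by_class[p] has bit c set in exactly those cases
 * (the property -> chars layout). */
void transpose_props(const uint32_t props[256],
                     unsigned char (*by_class)[BITMAP_BYTES],
                     int nprops)
{
    int c, p;
    assert(props != NULL && by_class != NULL);
    assert(nprops >= 0 && nprops <= 32);
    memset(by_class, 0, (size_t)nprops * BITMAP_BYTES);
    for (c = 0; c < 256; c++)
        for (p = 0; p < nprops; p++)
            if (props[c] & ((uint32_t)1 << p))
                by_class[p][c >> 3] |= (unsigned char)(1u << (c & 7));
}
```

Either table can therefore be generated from the other, so one data
source could in principle serve both problems.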

Cheers,
Yves


-- 
perl -Mre=debug -e "/just|another|perl|hacker/"