Thread: Re: a" + /$.../ = panic: sv_len_utf8 cache




Re: a" + /$.../ = panic: sv_len_utf8 cache
user name
2007-09-10 17:19:35
On Mon, Sep 10, 2007 at 10:55:38PM +0100, Dave Mitchell wrote:
> On Mon, Sep 10, 2007 at 01:04:59PM -0700, webmasters  ctosonline. org wrote:
> > The following snippet works in 5.9.4 and earlier but fails in 5.10.0
> > patchlevel 31832:
> > 
> > $ perl -e'$s="[a]a"; utf8::upgrade $s; /$s/; print "ok\n"'
> > ok
> > $ perl5.10.0 -e'$s="[a]a"; utf8::upgrade $s; /$s/; print "ok\n"'
> > panic: sv_len_utf8 cache 3 real 2 for aa at -e line 1.
> > 
> > Apparently this error occurs when a string to be interpolated in a
> > regexp has the utf8 flag on and contains a character class followed
> > by literal text with a fixed quantifier > 1.
> >
> > A binary search points to change #31246 as being the culprit.
> 
> A closer look reveals that that change (one of mine) accidentally
> unconditionally enabled the utf8 cache debugging code, which then flagged
> up the error. That's now fixed (#31842).

Um, interesting.
Does that change of 2007/05/20 explain any regexp benchmarking performance
issues?

Having the cache debugging is even worse than having no caching in the first
place, IIRC, as it will do the linear scan and then cross-check with the
cache code.
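
A rough sketch of why the debugging mode costs more than having no cache at all: every length request rescans the string and then also compares the result with the cached value. Everything here (toy_str, count_chars, toy_len_utf8, the check_cache flag) is a hypothetical stand-in for illustration, not the actual code in sv.c.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for a string with a cached character count;
 * this is not the real SV layout. */
typedef struct {
    const unsigned char *bytes;  /* UTF-8 encoded data */
    size_t byte_len;             /* length in bytes */
    long cached_chars;           /* cached character count, -1 if unset */
} toy_str;

/* Linear scan: count every byte that is not a UTF-8 continuation
 * byte (bit pattern 10xxxxxx). */
size_t count_chars(const unsigned char *s, size_t n)
{
    size_t chars = 0;
    for (size_t i = 0; i < n; i++) {
        if ((s[i] & 0xC0) != 0x80)
            chars++;
    }
    return chars;
}

/* Return the length in characters. With check_cache set, the cache is
 * never trusted: the string is rescanned on every call and the result
 * is compared with the cached value. Returns -1 on a mismatch. */
long toy_len_utf8(toy_str *s, int check_cache)
{
    assert(s != NULL);

    if (s->cached_chars >= 0 && !check_cache)
        return s->cached_chars;          /* fast path: trust the cache */

    long real = (long)count_chars(s->bytes, s->byte_len);
    if (s->cached_chars >= 0 && s->cached_chars != real) {
        fprintf(stderr, "panic: cache %ld real %ld\n",
                s->cached_chars, real);
        return -1;
    }
    s->cached_chars = real;              /* fill or confirm the cache */
    return real;
}
```

With a stale cache of 3 on the two-byte string "aa", the fast path hands back 3 without complaint, while the checking path rescans, finds 2, and reports the mismatch.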

> The regex/utf8 bug itself is older. I'll try to have a look at it sometime
> if someone else doesn't beat me to it (hint hint).

I hope someone beats me to it too.

Nicholas Clark

Re: a" + /$.../ = panic: sv_len_utf8 cache
user name
2007-09-10 17:37:11
On Mon, Sep 10, 2007 at 11:19:35PM +0100, Nicholas Clark wrote:
> On Mon, Sep 10, 2007 at 10:55:38PM +0100, Dave Mitchell wrote:
> > On Mon, Sep 10, 2007 at 01:04:59PM -0700, webmasters  ctosonline. org wrote:
> > > The following snippet works in 5.9.4 and earlier but fails in 5.10.0
> > > patchlevel 31832:
> > > 
> > > $ perl -e'$s="[a]a"; utf8::upgrade $s; /$s/; print "ok\n"'
> > > ok
> > > $ perl5.10.0 -e'$s="[a]a"; utf8::upgrade $s; /$s/; print "ok\n"'
> > > panic: sv_len_utf8 cache 3 real 2 for aa at -e line 1.
> > > 
> > > Apparently this error occurs when a string to be interpolated in a
> > > regexp has the utf8 flag on and contains a character class followed
> > > by literal text with a fixed quantifier > 1.
> > >
> > > A binary search points to change #31246 as being the culprit.
> > 
> > A closer look reveals that that change (one of mine) accidentally
> > unconditionally enabled the utf8 cache debugging code, which then flagged
> > up the error. That's now fixed (#31842).
> 
> Um, interesting.
> Does that change of 2007/05/20 explain any regexp benchmarking performance
> issues?

Um, no idea.

> Having the cache debugging is even worse than having no caching in the first
> place, IIRC, as it will do the linear scan and then cross-check with the
> cache code.

I rather wonder why this is a run-time rather than build-time option?
It means we carry the extra code in the binary, and have to do a
(PL_utf8cache < 0) test every time for all users. Shouldn't this debugging
stuff just become an assertion that's compiled-in and run only on
DEBUGGING builds? It also means we're more likely to spot problems
earlier.
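
To make the trade-off concrete, here is a minimal sketch of the two approaches side by side. The names (toy_utf8cache, TOY_DEBUGGING, scan_chars, len_runtime, len_buildtime) are made up for illustration; only the shape of the (PL_utf8cache < 0) test is taken from the paragraph above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical run-time switch, modelled on the (PL_utf8cache < 0)
 * test: a negative value means "cross-check the cache". */
int toy_utf8cache = 1;

/* Linear scan of a NUL-terminated UTF-8 string: count the bytes that
 * are not continuation bytes (10xxxxxx). */
size_t scan_chars(const char *s)
{
    size_t chars = 0;
    for (; *s != '\0'; s++) {
        if (((unsigned char)*s & 0xC0) != 0x80)
            chars++;
    }
    return chars;
}

/* Run-time option: the checking code is always in the binary, and
 * every call pays for the comparison against the flag. */
size_t len_runtime(const char *s, size_t cached)
{
    if (toy_utf8cache < 0) {
        size_t real = scan_chars(s);
        if (real != cached)
            return real;   /* a real build would panic here instead */
    }
    return cached;
}

/* Build-time option: the check exists only when the (hypothetical)
 * TOY_DEBUGGING macro is defined, and costs nothing otherwise. */
size_t len_buildtime(const char *s, size_t cached)
{
#ifdef TOY_DEBUGGING
    assert(scan_chars(s) == cached);
#else
    (void)s;               /* unused in non-debugging builds */
#endif
    return cached;
}
```

Compiled without TOY_DEBUGGING, len_buildtime reduces to returning the cached value, whereas len_runtime keeps both the branch and the scanning code whatever the flag is set to.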

-- 
You live and learn (although usually you just live).
