Thread: Regexp failure with utf8-flagged string and byte-flagged pattern

Regexp failure with utf8-flagged string and byte-flagged pattern
2007-09-22 07:27:29
Moin,

On Saturday 22 September 2007 14:09:11 Tels wrote:
> Moin,
> > I meant to say earlier that the win32/Makefile has some targets
> > defined that make it easier to make minor changes to the regex engine.
> > If you have a look at the 'reonly' and 'test-reonly' you should be
> > able to put them in your *nix Makefile and use them. Then instead of
> > doing a full test right off you can do a
> >
> > make test-reonly
> >
> > and make sure that all regex related tests pass, and only do a full
> > make test cycle once everything is working ok. Ive been muttering
> > about getting these targets added to the "normal" makefile for a while
> > but ive not got around to it yet and nobody else has either.
>
> Ah, I might look into this.

Sorry, developing a headache so skipping this for now.

> > Also, the test file that needs to be updated for this bug is
> > t/op/pat.t, and note that the test count is at the BOTTOM of file, the
> > new test should go about a page above the bottom (tests that have
> > caused SEGV's in the past are kept last).
>
> And this, too. After it compiled fully and works, of course :-P

Attached is a patch that does what Yves suggested, passes the test from the
bug report, as well the test I added at t/op/pat.t - however, I am *not*
sure that the test I added really tests what it should - as the file
t/op/pat.t is in UTF-8 according to file so I had to use xd6 and I just
hope thats ok

In any event, that should resolve this issue.

All the best,

Tels

-- 
Signed on Sat Sep 22 14:25:19 2007 with key 0x93B84C15.
Get one of my photo posters: http://bloodgate.com/posters
PGP key on http://bloodgate.com/tels.asc or per email.

Miko: "Detect evil!" Belkar, holding up check-warding sheet of lead:
"Too slow, sister." -- The Order of the Stick
  Approximate file size 1877 bytes
Re: Regexp failure with utf8-flagged string and byte-flagged pattern
2007-09-22 07:36:56
On 9/22/07, Tels <nospam-abusebloodgate.com> wrote:
> Moin,
>
> On Saturday 22 September 2007 14:09:11 Tels wrote:
> > Moin,
> > > I meant to say earlier that the win32/Makefile has some targets
> > > defined that make it easier to make minor changes to the regex engine.
> > > If you have a look at the 'reonly' and 'test-reonly' you should be
> > > able to put them in your *nix Makefile and use them. Then instead of
> > > doing a full test right off you can do a
> > >
> > > make test-reonly
> > >
> > > and make sure that all regex related tests pass, and only do a full
> > > make test cycle once everything is working ok. Ive been muttering
> > > about getting these targets added to the "normal" makefile for a while
> > > but ive not got around to it yet and nobody else has either.
> >
> > Ah, I might look into this.
>
> Sorry, developing a headache so skipping this for now.

Sorry to hear that. Hope you feel better.

> > > Also, the test file that needs to be updated for this bug is
> > > t/op/pat.t, and note that the test count is at the BOTTOM of file, the
> > > new test should go about a page above the bottom (tests that have
> > > caused SEGV's in the past are kept last).
> >
> > And this, too. After it compiled fully and works, of course :-P
>
> Attached is a patch that does what Yves suggested, passes the test from the
> bug report, as well the test I added at t/op/pat.t - however, I am *not*
> sure that the test I added really tests what it should - as the file
> t/op/pat.t is in UTF-8 according to file

Hmm, thats a little surprising. Wouldnt have predicted that at all.

> so I had to use xd6 and I just hope thats ok 

That looks fine to me. So does the patch.

> In any event, that should resolve this issue.

Nice one Tels. Thanks.

Yves
-- 
perl -Mre=debug -e "/just|another|perl|hacker/"

Re: Regexp failure with utf8-flagged string and byte-flagged pattern
2007-09-22 14:42:11
Tels <nospam-abusebloodgate.com> writes:

> Moin,
> 
> On Saturday 22 September 2007 14:09:11 Tels wrote:
> > Moin,
> > > I meant to say earlier that the win32/Makefile has some targets
> > > defined that make it easier to make minor changes to the regex engine.
> > > If you have a look at the 'reonly' and 'test-reonly' you should be
> > > able to put them in your *nix Makefile and use them. Then instead of
> > > doing a full test right off you can do a
> > >
> > > make test-reonly
> > >
> > > and make sure that all regex related tests pass, and only do a full
> > > make test cycle once everything is working ok. Ive been muttering
> > > about getting these targets added to the "normal" makefile for a while
> > > but ive not got around to it yet and nobody else has either.
> >
> > Ah, I might look into this.
> 
> Sorry, developing a headache so skipping this for now.
> 
> > > Also, the test file that needs to be updated for this bug is
> > > t/op/pat.t, and note that the test count is at the BOTTOM of file, the
> > > new test should go about a page above the bottom (tests that have
> > > caused SEGV's in the past are kept last).
> >
> > And this, too. After it compiled fully and works, of course :-P
> 
> Attached is a patch that does what Yves suggested, passes the test from the
> bug report, as well the test I added at t/op/pat.t - however, I am *not*
> sure that the test I added really tests what it should - as the file
> t/op/pat.t is in UTF-8 according to file so I had to use xd6 and I just
> hope thats ok
> 

The test just has byte sequences which form valid utf-8. Most of the
script is really latin1.

And if you want to be sure, Dump() from Devel::Peek is your friend.

Regards,
        Slaven

-- 
Slaven Rezic - slaven <at> rezic <dot> de

    tksm - Perl/Tk program for searching and replacing in multiple files
    http://ptktools.sourceforge.net/#tksm

Re: Regexp failure with utf8-flagged string and byte-flagged pattern
2007-09-25 03:57:35
On 22/09/2007, Tels <nospam-abusebloodgate.com> wrote:
> Attached is a patch that does what Yves suggested, passes the test from the
> bug report, as well the test I added at t/op/pat.t - however, I am *not*
> sure that the test I added really tests what it should - as the file
> t/op/pat.t is in UTF-8 according to file so I had to use xd6 and I just
> hope thats ok
>
> In any event, that should resolve this issue.

Thanks, applied as #31961.
