Moin,

On Friday 21 September 2007 23:56:56 demerphq wrote:
> On 9/21/07, demerphq <demerphq gmail.com> wrote:
> > But we need to make sure this is fixed before 5.10 is released.
>
> Just to expand on this, somewhere in or around the make_trie code is
> some logic that turns on a bit in a bit vector for every start byte in
> the trie. In the branch for handling non unicode data it needs to do
> something like the following pseudo code.
>
> /* store first byte of utf8 representation of codepoints in the
>    127 < cp < 256 range */
> if (127 < cp && cp < 192) {
>     setbit(charclass,194)
> } else if (191 < cp && cp < 256) {
>     setbit(charclass,195)
> }

Neither "setbit" nor "vector" appears in the source. In the end, grepping
for "bitfield" leads to line 1392, which looks like:

    if ( set_bit ) /* bitmap only alloced when !(UTF&&Folding) */
        TRIE_BITMAP_SET(trie,*uc); /* store the raw first byte
                                      regardless of encoding */

    for ( ; uc < e ; uc += len ) {
        TRIE_CHARCOUNT(trie)++;
        TRIE_READ_CHAR;
        chars++;
        if ( uvc < 256 ) {
            if ( !trie->charmap[ uvc ] ) {
                trie->charmap[ uvc ]=( ++trie->uniquecharcount );
                if ( folder )
                    trie->charmap[ folder[ uvc ] ] = trie->charmap[ uvc ];
                TRIE_STORE_REVCHAR;
            }
            if ( set_bit ) {
                /* store the codepoint in the bitmap, and if its ascii
                   also store its folded equivelent. */
                TRIE_BITMAP_SET(trie,uvc);
                if ( folder )
                    TRIE_BITMAP_SET(trie,folder[ uvc ]);
                set_bit = 0; /* We've done our bit */
            }
        } else {
            SV** svpp;
            if ( !widecharmap )
                widecharmap = newHV();
            svpp = hv_fetch( widecharmap, (char*)&uvc, sizeof( UV ), 1 );
            if ( !svpp )
                Perl_croak( aTHX_ "error creating/fetching widecharmap entry for 0x%"UVXf, uvc );
            if ( !SvTRUE( *svpp ) ) {
                sv_setiv( *svpp, ++trie->uniquecharcount );
                TRIE_STORE_REVCHAR;
            }
        }

I believe the modification needs to be made in the first branch (the
uvc < 256 case). However, I am not sure what to insert where.
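
To make the question concrete, here is a minimal standalone sketch of what the quoted pseudo code seems to ask for. It is not a patch against regcomp.c: start_bitmap, bitmap_set, bitmap_test, utf8_start_byte and record_start_char are made-up names for illustration, and the real code would presumably go through TRIE_BITMAP_SET instead. It only relies on the fact that codepoints 128..191 encode in UTF-8 with first byte 0xC2 (194) and codepoints 192..255 with first byte 0xC3 (195).

```c
#include <assert.h>

/* Hypothetical 256-bit bitmap, one bit per possible first byte.
   It stands in for the trie's bitmap; it is NOT the regcomp.c structure. */
typedef struct { unsigned char bits[32]; } start_bitmap;

static void bitmap_set(start_bitmap *bm, unsigned int byte)
{
    bm->bits[(byte >> 3) & 31] |= (unsigned char)(1u << (byte & 7));
}

static int bitmap_test(const start_bitmap *bm, unsigned int byte)
{
    return (bm->bits[(byte >> 3) & 31] >> (byte & 7)) & 1;
}

/* First byte of the UTF-8 encoding of a codepoint below 256.
   ASCII encodes as itself; 128..255 encode as two bytes whose
   first byte is 0xC0 | (cp >> 6), i.e. 194 or 195. */
static unsigned int utf8_start_byte(unsigned int cp)
{
    assert(cp < 256);
    if (cp < 128)
        return cp;
    return 0xC0 | (cp >> 6);
}

/* Record a start character taken from non-UTF-8 pattern data so that
   both a byte string and a UTF-8 string can pass the bitmap check:
   set the raw byte, and for 128..255 also the UTF-8 start byte. */
static void record_start_char(start_bitmap *bm, unsigned int cp)
{
    bitmap_set(bm, cp);
    if (cp > 127)
        bitmap_set(bm, utf8_start_byte(cp));
}
```

With a zeroed start_bitmap, calling record_start_char(&bm, 0xE9) sets bit 233 (the raw byte) and bit 195 (the UTF-8 start byte) and leaves bit 194 clear. If this is the intended behaviour, the equivalent of the second bitmap_set would go next to the existing TRIE_BITMAP_SET(trie,uvc) call, but that placement is my guess.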

All the best,

Tels

--
Signed on Sat Sep 22 09:58:36 2007 with key 0x93B84C15.
View my photo gallery: http://bloodgate.com/photos
PGP key on http://bloodgate.com/tels.asc or per email.
"I am soo clumsy today." *crash*