Thread: Format violation with UTF-8 characters and '~~' continuation operator




Format violation with UTF-8 characters and '~~' continuation operator
user name
2007-09-10 07:51:36
# New Ticket Created by  Dmitry Samborskiy 
# Please include the string:  [perl #45325]
# in the subject line of all future correspondence about this issue. 
# <URL: http://rt.perl.org/rt3/Ticket/Display.html?id=45325 >


This is a bug report for perl from samborsky_dyahoo.com,
generated with the help of perlbug 1.35 running under perl
v5.8.8.


-----------------------------------------------------------------
The following code demonstrates the bug:

#====== cut here === cut here =====================
#! /usr/bin/perl -w

use strict;
use utf8;
binmode STDOUT, ":utf8";

my $str;
format = 
| ^<<<<<<<<<<<<<|
$str,
| ^<<<<<<<<<<<~~|
$str
.

for my $i (1 .. 25) {
    # 'Я' - russian 'ja' letter, could be any utf-8 character:
    $str = sprintf "(%s)", ('Я' x $i); 
    write;
}

=pod The output is broken:
| (Я)           |
| )             |
| (ЯЯ)          |
| ¯)            |
| (ЯЯЯ)         |
| Я)            |
| )             |
| (ЯЯЯЯ)        |
| �Я)           |
| )             |
| (ЯЯЯЯЯ)       |
| ЯЯ)           |
| ¯)            |
. . .
=cut

#====== cut here === cut here =====================
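
One way to sidestep the format fields is to do the wrapping by hand on the
character string. This is only a sketch under stated assumptions: wrap_chars
is a hypothetical helper (not part of Perl or of the report), and it cuts at a
fixed width of 14 characters instead of breaking at the characters listed in
$: the way a ^<<< field does.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use utf8;                                # string literals below are characters
binmode STDOUT, ':encoding(UTF-8)';

# Hypothetical helper: split $text into chunks of at most $width characters.
sub wrap_chars {
    my ($text, $width) = @_;
    my @chunks;
    while (length $text) {
        # 4-argument substr returns the leading chunk and removes it from $text
        push @chunks, substr($text, 0, $width, '');
    }
    return @chunks;
}

for my $i (1, 12, 13, 25) {
    my $str = sprintf "(%s)", ('Я' x $i);
    # %-14s pads by characters because $str is a decoded character string
    printf "| %-14s |\n", $_ for wrap_chars($str, 14);
}
```

The width of 14 matches the two picture lines in the report, each of which is
14 columns wide.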

-----------------------------------------------------------------
---
Flags:
    category=core
    severity=low
---
This perlbug was built using Perl v5.8.8 in the Red Hat build system.
It is being executed now by Perl v5.8.8 - Tue Oct  3 11:01:05 EDT 2006.

Site configuration information for perl v5.8.8:

Configured by Red Hat, Inc. at Tue Oct  3 11:01:05 EDT 2006.

Summary of my perl5 (revision 5 version 8 subversion 8) configuration:
  Platform:
    osname=linux, osvers=2.6.9-34.elsmp, archname=i386-linux-thread-multi
    uname='linux hs20-bc2-2.build.redhat.com 2.6.9-34.elsmp #1 smp fri feb 24 16:56:28 est 2006 i686 i686 i386 gnulinux '
    config_args='-des -Doptimize=-O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector --param=ssp-buffer-size=4 -m32 -march=i386 -mtune=generic -fasynchronous-unwind-tables -Dversion=5.8.8 -Dmyhostname=localhost -Dperladmin=rootlocalhost -Dcc=gcc -Dcf_by=Red Hat, Inc. -Dinstallprefix=/usr -Dprefix=/usr -Darchname=i386-linux -Dvendorprefix=/usr -Dsiteprefix=/usr -Duseshrplib -Dusethreads -Duseithreads -Duselargefiles -Dd_dosuid -Dd_semctl_semun -Di_db -Ui_ndbm -Di_gdbm -Di_shadow -Di_syslog -Dman3ext=3pm -Duseperlio -Dinstallusrbinperl=n -Ubincompat5005 -Uversiononly -Dpager=/usr/bin/less -isr -Dd_gethostent_r_proto -Ud_endhostent_r_proto -Ud_sethostent_r_proto -Ud_endprotoent_r_proto -Ud_setprotoent_r_proto -Ud_endservent_r_proto -Ud_setservent_r_proto -Dinc_version_list=5.8.7 5.8.6 5.8.5 -Dscriptdir=/usr/bin'
    hint=recommended, useposix=true, d_sigaction=define
    usethreads=define use5005threads=undef useithreads=define usemultiplicity=define
    useperlio=define d_sfio=undef uselargefiles=define usesocks=undef
    use64bitint=undef use64bitall=undef uselongdouble=undef
    usemymalloc=n, bincompat5005=undef
  Compiler:
    cc='gcc', ccflags ='-D_REENTRANT -D_GNU_SOURCE -fno-strict-aliasing -pipe -Wdeclaration-after-statement -I/usr/local/include -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64 -I/usr/include/gdbm',
    optimize='-O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector --param=ssp-buffer-size=4 -m32 -march=i386 -mtune=generic -fasynchronous-unwind-tables',
    cppflags='-D_REENTRANT -D_GNU_SOURCE -fno-strict-aliasing -pipe -Wdeclaration-after-statement -I/usr/local/include -I/usr/include/gdbm'
    ccversion='', gccversion='4.1.1 20060928 (Red Hat 4.1.1-28)', gccosandvers=''
    intsize=4, longsize=4, ptrsize=4, doublesize=8, byteorder=1234
    d_longlong=define, longlongsize=8, d_longdbl=define, longdblsize=12
    ivtype='long', ivsize=4, nvtype='double', nvsize=8, Off_t='off_t', lseeksize=8
    alignbytes=4, prototype=define
  Linker and Libraries:
    ld='gcc', ldflags =' -L/usr/local/lib'
    libpth=/usr/local/lib /lib /usr/lib
    libs=-lresolv -lnsl -lgdbm -ldb -ldl -lm -lcrypt -lutil -lpthread -lc
    perllibs=-lresolv -lnsl -ldl -lm -lcrypt -lutil -lpthread -lc
    libc=/lib/libc-2.5.so, so=so, useshrplib=true, libperl=libperl.so
    gnulibc_version='2.5'
  Dynamic Linking:
    dlsrc=dl_dlopen.xs, dlext=so, d_dlsymun=undef, ccdlflags='-Wl,-E -Wl,-rpath,/usr/lib/perl5/5.8.8/i386-linux-thread-multi/CORE'
    cccdlflags='-fPIC', lddlflags='-shared -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector --param=ssp-buffer-size=4 -m32 -march=i386 -mtune=generic -fasynchronous-unwind-tables -L/usr/local/lib'

Locally applied patches:

 
---
@INC for perl v5.8.8:
    /usr/lib/perl5/site_perl/5.8.8/i386-linux-thread-multi
    /usr/lib/perl5/site_perl/5.8.7/i386-linux-thread-multi
    /usr/lib/perl5/site_perl/5.8.6/i386-linux-thread-multi
    /usr/lib/perl5/site_perl/5.8.5/i386-linux-thread-multi
    /usr/lib/perl5/site_perl/5.8.8
    /usr/lib/perl5/site_perl/5.8.7
    /usr/lib/perl5/site_perl/5.8.6
    /usr/lib/perl5/site_perl/5.8.5
    /usr/lib/perl5/site_perl
    /usr/lib/perl5/vendor_perl/5.8.8/i386-linux-thread-multi
    /usr/lib/perl5/vendor_perl/5.8.7/i386-linux-thread-multi
    /usr/lib/perl5/vendor_perl/5.8.6/i386-linux-thread-multi
    /usr/lib/perl5/vendor_perl/5.8.5/i386-linux-thread-multi
    /usr/lib/perl5/vendor_perl/5.8.8
    /usr/lib/perl5/vendor_perl/5.8.7
    /usr/lib/perl5/vendor_perl/5.8.6
    /usr/lib/perl5/vendor_perl/5.8.5
    /usr/lib/perl5/vendor_perl
    /usr/lib/perl5/5.8.8/i386-linux-thread-multi
    /usr/lib/perl5/5.8.8
    .

---
Environment for perl v5.8.8:
    HOME=/home/dsambor
    LANG=en_US.UTF-8
    LANGUAGE (unset)
    LD_LIBRARY_PATH (unset)
    LOGDIR (unset)
    PATH=/usr/lib/qt-3.3/bin:/usr/kerberos/bin:/usr/local/bin:/usr/bin:/bin:/usr/X11R6/bin:/dss/bin::/home/dsambor/bin
    PERL_BADLANG (unset)
    SHELL=/bin/bash



     



Re: Format violation with UTF-8 characters and '~~' continuation operator
user name
2007-09-10 08:36:35
Dmitry Samborskiy wrote 2007-09-10  5:51 (-0700):
>     # 'Я' - russian 'ja' letter, could be any utf-8 character:
>     $str = sprintf "(%s)", ('Я' x $i); 

Please note that it is a *Unicode* character.

It is encoded to UTF-8, resulting in a UTF-8 byte sequence.

The term "UTF-8 character" is very confusing and should probably be
avoided.
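
To make the distinction concrete, here is a small sketch (the variable names
and the hex_bytes helper are illustrative assumptions, not taken from the
report). It shows that 'Я' is one character whose UTF-8 encoding is two bytes:

```perl
#!/usr/bin/perl
use strict;
use warnings;
use utf8;                                # the literal below is one character
use Encode qw(encode);

binmode STDOUT, ':encoding(UTF-8)';

# Hypothetical helper: render a byte string as space-separated hex values.
sub hex_bytes {
    my ($bytes) = @_;
    return join ' ', map { sprintf '%02X', ord($_) } split //, $bytes;
}

my $char  = 'Я';                         # Unicode character U+042F
my $bytes = encode('UTF-8', $char);      # its UTF-8 encoding, a byte string

printf "characters:  %d\n", length $char;     # 1
printf "bytes:       %d\n", length $bytes;    # 2
printf "byte values: %s\n", hex_bytes($bytes); # D0 AF
```

The second byte, 0xAF (decimal 175), would be consistent with the stray
character in the reported output if the field were being cut between the two
bytes of one character, though that reading is an inference and not something
the report states.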
-- 
Met vriendelijke groet,  Kind regards,  Korajn salutojn,

  Juerd Waalboer:  Perl hacker  <#####juerd.nl>  <http://juerd.nl/sig>
  Convolution:     ICT solutions and consultancy <salesconvolution.nl>
