Thread: Perl, XML and UTF-8

Perl, XML and UTF-8
user name
2006-05-30 13:53:36
On Monday 29 May 2006 at 19:14 -0400, stinneysas.upenn.edu wrote:
> I don't see you setting utf8 on STDOUT in that snippet:
> 
>  binmode STDOUT, ':utf8';
> 
> I mostly make utf8 work with Perl and XML; most often:
> 
>  use open ':utf8';
> 
> and the binmode STDOUT and/or STDERR are enough to fix things.
> 

Thanks to Steve (and Tim), who pointed out that the binmode was
missing.

After that, though, the gettext output came out wrong. This is odd,
because I do call bind_textdomain_codeset("mypackage", "UTF-8").
I had to wrap each gettext call like this:
decode("utf8", gettext("string")).
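
A minimal sketch of that workaround, assuming gettext hands back UTF-8
encoded bytes rather than a Perl character string (which the need for
decode suggests). fake_gettext and translated are hypothetical names
used only for illustration; fake_gettext stands in for the real
gettext so the example runs without a message catalog.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Hypothetical stand-in for gettext(): returns UTF-8 encoded *bytes*,
# which is the behaviour assumed in this sketch.
sub fake_gettext {
    my ($msgid) = @_;
    return encode('UTF-8', "Pr\x{e9}f\x{e9}rences");
}

# Decode the bytes into a character string, so it can be mixed safely
# with the character strings that XML::LibXML returns.
sub translated {
    my ($msgid) = @_;
    return decode('UTF-8', fake_gettext($msgid));
}

# Encode characters to UTF-8 exactly once, on output.
binmode STDOUT, ':encoding(UTF-8)';
print translated('Preferences'), "\n";
```

With the real module, translated would call gettext instead of
fake_gettext; everything else stays the same.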

Thanks again.

Claude


> 
> Quoting Claude Paroz <parozemail.ch>:
> 
> > Hi,
> >
> > I have some Perl (5.8.7) code that reads XML (UTF-8 encoded) with
> > XML::Simple or XML::LibXML and writes the content back to an HTML
> > page through CGI.
> >
> > Snippet:
> >
> > use XML::LibXML;
> > use CGI qw/:standard/;
> > use Locale::gettext;
> >
> > my $q = new CGI;
> >
> > # Parse the XML file and collect the elements of interest
> > my $xml = XML::LibXML->new();
> > my $data = $xml->parse_file($xmlfile);
> > my $root = $data->getDocumentElement;
> > my @lines = $root->getElementsByTagName('sometag');
> >
> > # Emit the HTML page, declaring UTF-8 in the header and the document
> > print $q->header(-type=>'text/html', -charset=>'UTF-8',
> >                  -encoding=>"UTF-8");
> > print $q->start_html(-title => gettext("My title"),
> >                      -encoding=>"UTF-8");
> > print $q->h1(
> >     $lines[0]->getElementsByTagName('subtag')->item(0)->textContent);
> > print $q->end_html;
> >
> > ************* End of Code ***************
> >
> > My problem is that special characters (accented letters) aren't
> > encoded correctly in the HTML output. Each special character is
> > displayed as a question mark inside a square. However, the
> > utf8::is_utf8 function returns 1 for these strings.
> >
> > I also noticed that when a string in the XML file contains certain
> > special characters (e.g. ™, the trademark sign), that string is
> > encoded correctly in the resulting HTML. Weird...
> > What could be the problem?
> >
> > Regards.
> >
> > Claude
> >
> > _______________________________________________
> > Perl-XML mailing list
> > Perl-XMLlistserv.ActiveState.com
> > To unsubscribe: http://listserv.ActiveState.com/mailman/mysubs
> >
> 
> 
