
Thread: encoding fails




encoding fails
user name
2006-05-25 08:29:29
<?xml version="1.0" encoding="ISO-8859-1"?>
<rss version="2.0">
   <channel>
      <title>Aftonbladet &#246;jesliv</title>
   </channel>
</rss>

I am trying to extract the title element from the above, but the
encoding is not recognised. What I get is this:
Aftonbladet öjesliv

here is the code that gets the item:

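// xmlNodeListGetString returns a newly allocated UTF-8 string (or NULL)
xmlChar * content_ptr = xmlNodeListGetString(_doc, cur_node->xmlChildrenNode, 0);
if ( content_ptr && !std::strcmp((const char*)cur_node->name, "title") )
{
    std::size_t len = std::strlen((const char*)content_ptr);
    char * t = new char[len+1];
    std::memcpy( t, content_ptr, len+1 );  // len+1 also copies the terminating '\0'
    FeedInformation->feedContents()->title = t;
}
if ( content_ptr )
    xmlFree(content_ptr);  // the caller owns the returned string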

advice much appreciated,
thanks,
Greger

_______________________________________________
xml mailing list, project page  http://xmlsoft.org/
xmlgnome.org
http://mail.gnome.org/mailman/listinfo/xml
encoding fails
user name
2006-05-25 13:14:42
On 5/25/06, bossgregerhaga.net <bossgregerhaga.net> wrote:
> <?xml version="1.0" encoding="ISO-8859-1"?>
> <rss version="2.0">
>    <channel>
>       <title>Aftonbladet &#246;jesliv</title>
>    </channel>
> </rss>
>
> I try to extract the title element from the above. But the encoding is not
> recognised. What i get is this:
> Aftonbladet öjesliv

What do you mean the encoding is not recognized? That looks like a
perfectly valid result. &#246; is U+00F6 LATIN SMALL LETTER O WITH
DIAERESIS.

Regards,
Aron

encoding fails
user name
2006-05-25 13:33:33
* bossgregerhaga.net <bossgregerhaga.net> [2006-05-25 10:35]:
> I try to extract the title element from the above. But the
> encoding is not recognised.

Nothing to do with the encoding. Numeric Character References in
XML always represent Unicode code points, regardless of what
encoding literal characters in the file are in.

Read http://xmlsoft.org/encoding.html for an explanation of all
the issues.

Regards,
-- 
Aristotle Pagaltzis // <http://plasmasturm.org/>
encoding fails
user name
2006-05-25 13:19:02
* Aron Stansvik wrote:
>On 5/25/06, bossgregerhaga.net <bossgregerhaga.net> wrote:
>> <?xml version="1.0" encoding="ISO-8859-1"?>
>> <rss version="2.0">
>>    <channel>
>>       <title>Aftonbladet &#246;jesliv</title>
>>    </channel>
>> </rss>
>>
>> I try to extract the title element from the above. But the encoding is not
>> recognised. What i get is this:
>> Aftonbladet öjesliv
>
>What do you mean the encoding is not recognized? That looks like a
>perfectly valid result. &#246; is U+00F6 LATIN SMALL LETTER O WITH
>DIAERESIS.

This appears to be a defect in your mail user agent: the message you
responded to was declared as ISO-8859-1 but had the o-umlaut encoded as
two octets (C3 B6, which is the proper UTF-8 sequence). The original
problem appears to be the usual "API gives UTF-8 but I expect
something else".
-- 
Björn Höhrmann · mailto:bjoernhoehrmann.de · http://bjoern.hoehrmann.de

Weinh. Str. 22 · Telefon: +49(0)621/4309674 · http://www.bjoernsworld.de

68309 Mannheim · PGP Pub. KeyID: 0xA4357E78 · http://www.websitedev.de/ 
encoding fails
user name
2006-05-25 17:18:54
On 5/25/06, Bjoern Hoehrmann <derhoermigmx.net> wrote:
> * Aron Stansvik wrote:
> >On 5/25/06, bossgregerhaga.net <bossgregerhaga.net> wrote:
> >> <?xml version="1.0" encoding="ISO-8859-1"?>
> >> <rss version="2.0">
> >>    <channel>
> >>       <title>Aftonbladet &#246;jesliv</title>
> >>    </channel>
> >> </rss>
> >>
> >> I try to extract the title element from the above. But the encoding is not
> >> recognised. What i get is this:
> >> Aftonbladet öjesliv
> >
> >What do you mean the encoding is not recognized? That looks like a
> >perfectly valid result. &#246; is U+00F6 LATIN SMALL LETTER O WITH
> >DIAERESIS.
>
> This appears to be a defect in your mail user agent: the message you
> responded to was declared as ISO-8859-1 but had the o-umlaut encoded as
> two octets (C3 B6, which is the proper UTF-8 sequence). The original
> problem appears to be the usual "API gives UTF-8 but I expect
> something else".

Ah. Right. Using Gmail so it showed it just fine.

Aron