Issue status update for
http://smalltalk.gnu.org/project/issue/113
Post a follow up:
http://smalltalk.gnu.org/project/comments/add/113
Project: GNU Smalltalk
Version: <none>
Component: Base classes
Category: bug reports
Priority: normal
Assigned to: Unassigned
Reported by: elmex
Updated by: bonzinip
Status: active
Attachment: http://smalltalk.gnu.org/files/issues/gst-encoding-lazy.patch (594 bytes)
EF-BF-BE is U+FFFE encoded in UTF-8, that is, the Unicode "byte order
mark" (BOM, U+FEFF) with its two bytes swapped. The BOM was born as a
way to distinguish big- and little-endian UTF-16. Since it's not
really a character, Iconv tries to strip it when converting to a
UnicodeString, but it is failing to do so in this case.
Now, under Mac OS X I get the expected result, while under Linux I get
yours. The reason is that my Mac is big-endian, so Iconv produces
big-endian UTF-16, while on my little-endian Linux machine it produces
little-endian UTF-16. Since the default encoding of UTF-16 is
big-endian, the Mac happens to get the right thing, while Linux messes
up the encoding. So later on, the "pipe peekFor: $<16rFEFF>" statement
that should strip the BOM does not match, and the BOM is left in place.
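
To make the byte-level effect concrete, here is a small sketch in
Python rather than Smalltalk (the codec names are Python's, not
Iconv's, and strip_bom is a made-up helper standing in for the
peekFor: check). It shows how a little-endian BOM read as big-endian
becomes U+FFFE, which then encodes to EF-BF-BE in UTF-8:

```python
BOM = "\ufeff"

# The real BOM encoded in UTF-8 is EF BB BF.
bom_utf8 = BOM.encode("utf-8")

# A little-endian converter writes the BOM as the bytes FF FE.
bom_le = BOM.encode("utf-16-le")

# Interpreting those two bytes as big-endian yields U+FFFE, not U+FEFF.
misread = bom_le.decode("utf-16-be")

# U+FFFE encoded in UTF-8 is EF BF BE, the sequence seen in the report.
misread_utf8 = misread.encode("utf-8")


def strip_bom(text):
    """Hypothetical helper: drop a leading U+FEFF, as peekFor: would."""
    return text[1:] if text.startswith(BOM) else text


# The check only matches U+FEFF, so the misread character survives.
kept = strip_bom(misread + "abc")
stripped = strip_bom(BOM + "abc")
```

With the byte order guessed wrong, kept still begins with U+FFFE,
while stripped is plain "abc".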
The attached patch fixes this by making EncodedString look for a BOM
when retrieving the encoding, rather than when setting it.
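
As a rough illustration of the "look when retrieving" idea, and not of
the patch itself, here is a Python sketch. LazyEncodedBytes and its
members are hypothetical names, not the GNU Smalltalk EncodedString
API. The encoding is stored as given, and the BOM is only examined
when the encoding is asked for:

```python
import codecs


class LazyEncodedBytes:
    """Hypothetical stand-in for an encoded string (not the gst class)."""

    def __init__(self, data, encoding="UTF-16"):
        self.data = data
        self._encoding = encoding  # stored as given; no BOM check here

    @property
    def encoding(self):
        # The BOM is inspected only now, when the encoding is retrieved.
        if self._encoding == "UTF-16":
            if self.data.startswith(codecs.BOM_UTF16_LE):
                return "UTF-16LE"
            # Explicit big-endian BOM, or no BOM at all: big-endian.
            return "UTF-16BE"
        return self._encoding

    def text(self):
        decoded = self.data.decode(self.encoding)
        # With the right byte order the BOM decodes to U+FEFF and is dropped.
        return decoded[1:] if decoded.startswith("\ufeff") else decoded


le = LazyEncodedBytes(codecs.BOM_UTF16_LE + "hi".encode("utf-16-le"))
be = LazyEncodedBytes(codecs.BOM_UTF16_BE + "hi".encode("utf-16-be"))
plain = LazyEncodedBytes("hi".encode("utf-16-be"))
```

Because the check happens on retrieval, it sees whatever bytes the
object holds at that point, regardless of the producer's endianness.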
_______________________________________________
help-smalltalk mailing list
help-smalltalk gnu.org
http://lists.gnu.org/mailman/listinfo/help-smalltalk