Tels wrote 2007-03-31 12:23 (+0000):
> #!/usr/bin/perl -w
> use Encode qw/decode/;
> my $random = "\xc3\xc3"; # some random bytes
> my $ascii = "a"; # some 7bit data
>
> # Somebody "helpful" decodes the ascii string:
> # The encoding doesn't actually matter, since it is 7bit anyway.
> # This step happens out of my control (e.g. in third party code)
> $string = decode('ISO-8859-1', $ascii);

$string is a text string, now. Remember, decoding is going from byte
string to text string.

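To make the direction concrete, here is a minimal sketch. The UTF-8 input
bytes and the variable names are my own choice for illustration; they are
not taken from Tels' script.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Assumption for illustration: these two bytes are the UTF-8 encoding
# of U+00E9 (e with acute accent).
my $bytes = "\xc3\xa9";

# decode: byte string -> text string (one character)
my $text = decode('UTF-8', $bytes);

# encode: text string -> byte string (two bytes again)
my $again = encode('UTF-8', $text);

printf "%d byte(s) in, %d character(s), %d byte(s) out\n",
    length($bytes), length($text), length($again);
```

Character operations such as length and ord belong on $text; byte
operations belong on $bytes or $again.
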
Using unpack "C" on a text string makes no sense if you consider that
this "C" doesn't stand for "character" in the sense in which the
documentation for chr, ord, length, split, etcetera uses that word. It
stands for "char", which is a C datatype that contains one byte.

As such, unpack "C" is a byte operation and makes sense on byte strings
only. $string is a text string, and you can tell by looking at the
decode() step.

> # now take our random binary data and a 7bit ascii string and do:
> print join (" ", unpack("CCC", "$random$string")), "\n";

Dangerous, and that's why I suggested adding a "wide character in..."
warning earlier in this thread.

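A sketch of how the same step could be written without mixing the two
kinds of string: encode the text string back to bytes first, and only then
concatenate and unpack. $random and $string are from the quoted script;
$bytes and @values are names I made up, and ISO-8859-1 is assumed here
only because that is what the quoted decode() used.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

my $random = "\xc3\xc3";                  # byte string (binary data)
my $string = decode('ISO-8859-1', "a");   # text string, as in the quoted code

# Turn the text string into a byte string explicitly before combining it
# with binary data. The result is bytes only, so a byte operation such as
# unpack "C" is appropriate on it.
my $bytes = $random . encode('ISO-8859-1', $string);

my @values = unpack("CCC", $bytes);
print join(" ", @values), "\n";           # prints "195 195 97"
```

Because the encoding step is explicit, the result does not depend on
whether third-party code decoded $ascii behind your back.
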
> Now explain to me why this prints different things even tho $random is the
> same string in both cases, and $string and $ascii should be the same,
> too. Bonus points if you manage to not mention the uhh -- ut - utf --
> uhm -- er The Flag[tm].

I get the bonus points! Hurrah!

The only explanation that I used is the separation between text strings
and binary strings. It's also the only thing you need to know. You'll
benefit from knowing more, certainly, but I see red flags in your code.

> So far, I can see the ways to handle this are:
> (..)
> * never mix fire and water er dogs and cats er I mean text and bytes, and
>   pray that every piece of code out there adheres to this, too.

Exactly.

> I think the Pray and Hope[tm] strategy doesn't really work, tho.

It doesn't always work, because people can't be trusted to do the right
thing, but it can always be fixed.
--
Kind regards,
juerd waalboer: perl hacker <juerd juerd.nl> <http://juerd.nl/sig>
convolution: ict solutions and consultancy
<sales convolution.nl>
I don't trust voting computers.
See <http://www.wijvertrouwenstemcomputersniet.nl/>.