Thread: Re: the utf8 flag (was Re: decode_utf8 sets utf8 flag on plain ascii strings)

2007-03-31 01:11:37
On Sat, Mar 31, 2007 at 03:15:50AM +0200, Juerd Waalboer
<juerdconvolution.nl> wrote:
> Marc Lehmann skribis 2007-03-31  2:48 (+0200):
> > > A koi8r string is a byte string. If you keep it separated from text
> > Your definition is completely useless in the real world. Obviously, a
> > KOI8-R string is a text string. It contains text characters. End of
> > story.
> 
> This is a logical thing to say, but unfortunately not very useful.

Thanks, I'll take logical over subjective opinions any day.

> The distinction between a text string, and a byte string representing
> text, is actually useful.

It is useful, but making it mandatory is stupid, because you lose the
ability to handle real-world situations, for example JSON, which simply
does not make the distinction. The same is true for Perl, which also does
not make the distinction.

> > You also have very weird ideas of what programmers should and should
> > not do that defy reality.
> 
> Weird ideas, maybe, but at least weird ideas that help dozens of people
> write working and maintainable code.

Likely, but it's still your personal opinion, your personal coding style.
Forcing that on everybody else by calling everything that doesn't fit
(such as JSON) "broken" does not convince _me_ that it is a good coding
style.

> You don't believe in my weird ideas, fine. But I find it very
> interesting that you run into all these problems with Perl's unicode
> support, while the people who stick to my weird ideas write lots of code
> without that.

Goddamnit, I have told you more than once that I am not running into those
problems, because I know most Perl bugs regarding Unicode inside and out.
I have been doing Unicode programming for far longer than Perl has easily
supported it, and I would be grateful if you would stop bullshitting me
and spreading lies.

I *explicitly* said that it is other users who hit problems, and that I
can cope with them quite well.

> > I find all that contradictory, but as you ignore the evidence I
> > presented and the question I asked you (JSON::XS example), I see no
> > point in continuing talking to you.
> 
> Unfortunately, I understand very little of the JSON example. I don't
> know JSON and would have to learn about it first.

Well, it's one of those reality things where your coding style flatly
breaks down: JSON makes no distinction between binary and text, except
that binary only uses character indices 0..255. You do not know whether a
JSON string is binary or text. Usage decides.

One such usage is unpack, and I find it weird that I have to use "U" to
get binary semantics in unpack. Or you have to downgrade explicitly.
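
As an illustration of the explicit-downgrade route just mentioned, here is
a minimal sketch. It assumes a binary string that got upgraded somewhere
along the way; the variable names and the sample payload are made up for
the example. utf8::upgrade and utf8::downgrade are core functions and need
no "use utf8".

```perl
use strict;
use warnings;

# Hypothetical binary payload; every character is in the range 0..255.
my $bin = "\xE9\x01\xFF";

# Simulate the string having been upgraded (stored as UTF-X internally),
# for example after passing through code that handled wider characters.
utf8::upgrade($bin);

# Explicitly downgrade a copy before handing it to byte-oriented code.
# utf8::downgrade dies if the string holds a character above 255.
my $bytes = $bin;
utf8::downgrade($bytes);

my @octets = unpack "C*", $bytes;
print join(",", @octets), "\n";   # 233,1,255
```

After the downgrade the result does not depend on how a given Perl version
treats upgraded strings in unpack.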

Anyways, that clashes with your notion that the programmer made a bug when
binary data happens to be UTF-X encoded internally. Reality hits, you
lose, simply because calling usage of JSON broken according to your coding
standards will not have any effect on JSON.

And the way JSON handles binary is extremely common in the real world. And
it is exactly how perl handles it, modulo bugs and, well, unpack (and the
unfortunate decision to give old XS code sometimes bytes encoded in UTF-X,
sometimes not).

Perl simply does _not_ work like you want it to. Instead, it is much
simpler, because in the majority of cases it just works without having to
track whether my binary string came in contact with something that
upgraded it. I simply do not have to care in Perl, except for the cases
above.
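
To make the point about upgrading concrete, the following is a small
sketch (sample data and variable names are illustrative assumptions): the
same binary string, once stored as plain bytes and once upgraded, behaves
identically at the Perl level. Only the internal flag differs.

```perl
use strict;
use warnings;

my $plain    = "\xE9\x01\xFF";   # binary data, characters 0..255
my $upgraded = $plain;
utf8::upgrade($upgraded);         # same characters, different internal storage

# The internal flag differs between the two copies...
my $flag_plain    = utf8::is_utf8($plain)    ? 1 : 0;
my $flag_upgraded = utf8::is_utf8($upgraded) ? 1 : 0;
print "flags: $flag_plain $flag_upgraded\n";

# ...but comparison, length and character values are the same.
print "equal\n" if $plain eq $upgraded;
print length($upgraded), "\n";             # counts characters, not storage bytes
print ord(substr($upgraded, 0, 1)), "\n";  # first character's value
```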

And that's the good thing. Teaching people to avoid upgrading by your text
vs. binary string technique is confusing. It is backwards. People should
not need to be concerned about upgrading, because it is an internal thing.

And yes, I said I would not answer you, but what prompted it was your
continuous abusive behaviour of putting words into my mouth that I have
*explicitly* said I did not say, and explained in detail.

-- 
                The choice of a
      -----==-     _GNU_
      ----==-- _       generation     Marc Lehmann
      ---==---(_)__  __ ____  __      pcggoof.com
      --==---/ / _ / // / / /      http://schmorp.de/
      -=====/_/_//_/_,_/ /_/_      XX11-RIPE
