Marvin Humphrey wrote 2007-03-30 14:00 (-0700):
> >Perl does not have strong typing.
> If it is so deadly to collide byte-oriented data with character data,
> it should not be so easy to do so accidentally.
I agree. But Perl chose to have the same single data type for all
strings, and to maintain compatibility with older Perls by assuming that
your byte string is a latin1 string if you start using it as a text
string. After all, in a strictly 8 bit world, there's no need for a
distinction, so people were never careful about it.
(Well, there was a need, but ignorance being bliss, ignoring that was
better for everyone's sanity.)
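To make the latin1 assumption concrete, here is a minimal sketch. The
variable names are made up for illustration. It joins UTF-8 encoded
bytes with a character string, first as-is and then after decoding:

```perl
use strict;
use warnings;
use Encode qw(decode);

# Illustrative values only: the UTF-8 bytes for "café" (5 bytes)
# and a one-character text string (U+263A).
my $bytes = "caf\xC3\xA9";
my $text  = "\x{263A}";

# Mixing them: Perl treats each byte as a latin1 character,
# so 0xC3 and 0xA9 become U+00C3 and U+00A9.
my $mixed = $bytes . $text;

# Decoding first turns the two bytes into the single character U+00E9.
my $right = decode('UTF-8', $bytes) . $text;

printf "mixed: %d characters, 4th is U+%04X\n",
    length($mixed), ord(substr($mixed, 3, 1));
printf "right: %d characters, 4th is U+%04X\n",
    length($right), ord(substr($right, 3, 1));
```

No warning is issued for the first concatenation, which is why the
mistake is so easy to make by accident.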
It kind of bothers me that people constantly whine about this decision
years after it was made. The time to influence the decision has passed.
It just seems so counter-productive to keep bringing it up, while there
are bugs to be discovered and fixed.
I wasn't active in p5p back then, and if I had been, I would probably
not have foreseen the consequences, just like the porters then didn't.
But wonderfully, a rather consistent, usable, and useful model was
invented, with better and easier support for Unicode and encodings than
any other programming language. Of course it's never good enough, but
let's first focus on finding and fixing bugs.
> That so many users, including those as expert as Marc, possess a
> "broken" understanding of Perl's Unicode model suggests a flawed
> design.
I think the design is solid, but the implementation (see regex) is
slightly broken and the documentation wildly misleading.
The documentation thing I'm trying to fix with perlunitut, perlunifaq,
and a lot of changes to existing documentation, all of which are now
part of bleadperl and will probably be part of the next Perl release.
In addition, I'm maintaining a concise list of best practices at
http://juerd.nl/perluniadvice, and spending tuits on teaching people
(including module maintainers) about the One Way To Do It, because there
is, in fact, just one way that really works well in this case. You just
have to find it, and stick to it. TIMTOWTDI doesn't always apply.
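As a rough sketch of that one way, following the flow perlunitut
describes (decode what comes in, work on text, encode what goes out):
the subroutine name shout_utf8 and the choice of UTF-8 are assumptions
made for this example, not something any module requires.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Hypothetical helper, for illustration only: takes UTF-8 bytes,
# returns UTF-8 bytes, and only ever handles text in between.
sub shout_utf8 {
    my ($input_bytes) = @_;

    # 1. Receive and decode: bytes become a character string.
    my $text = decode('UTF-8', $input_bytes);

    # 2. Process: string operations now work per character.
    $text = uc $text;

    # 3. Encode and output: characters become bytes again.
    return encode('UTF-8', $text);
}

# "café" as UTF-8 bytes in, "CAFÉ" as UTF-8 bytes out.
my $out = shout_utf8("caf\xC3\xA9");
printf "%vX\n", $out;
```

The point is that decoding and encoding happen only at the edges, so
byte strings and text strings never meet in the middle of the program.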
> We have been set up to fail.
Maybe so, but you haven't given up yet, and I hope you won't. Please
join us in the effort to deal with the problems at hand. It's a hell of
a lot more productive than praying for the opportunity to undo recent
years of Perl.
Surely you must know a way in which Perl's Unicode support can be
improved, or accidents avoided, without trying to change all of Perl,
CPAN, and a gazillion lines of code that we can't even reach. Let's hear
it!
Thanks,
--
kind regards,
juerd waalboer: perl hacker <juerd juerd.nl> <http://juerd.nl/sig>
convolution: ict solutions and consultancy <sales convolution.nl>
I do not trust voting computers.
See <http://www.wijvertrouwenstemcomputersniet.nl/>.