-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
Moin,
On Friday 30 March 2007 23:06:47 Marvin Humphrey wrote:
> On Mar 30, 2007, at 2:25 PM, Juerd Waalboer wrote:
> >> That so many users, including those as expert as Marc, possess a
> >> "broken" understanding of Perl's Unicode model suggests a flawed
> >> design.
> > I think the design is solid, but the implementation (see regex)
> > slightly broken and documentation wildly misleading.
>
> I strongly disagree with this assessment. In particular, I think
> insisting that the user be responsible for manually segregating
> character and byte-oriented data without any help from Perl is
> totally unreasonable.
>
> Look at how easily Marc made the "mistake" of commingling the two
> types of data. It's debatable whether the fact that Perl allowed him
> to do that without complaint is a flaw with the design or the
> implementation, but it's one or the other and it's serious.
>
> Additionally, as Marc points out, there are lots of broken XS modules
> out there -- including one of mine. (KinoSearch 0.15 -- Unicode
> support is fixed as of 0.20_01, which breaks backwards
> compatibility.) Few or none of them would be broken if Perl made it
> more difficult to move between character data and byte-oriented data
> -- errors would be flying right and left and the broken modules would
> get fixed right away.
>
> Of course I understand why that cannot be the case, but it's
> astonishing to me that you see this as a problem which can be solved
> via documentation.
I think just documenting isn't enough. We do have things like "strict",
so if the current Perl model doesn't allow you to even detect when you
mix the wrong kind of data, then we need a module/pragma that catches
these errors.
Of course encoding::warnings exists, but it seems unable to distinguish
between "untagged" data and real ISO-8859-1 strings, as Perl itself
doesn't make this distinction.
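To make the ambiguity concrete, here is a minimal sketch of the silent
mixing being discussed. The variable names are only illustrative and it
assumes nothing beyond core Perl plus the Encode module:

```perl
use strict;
use warnings;
use Encode qw(encode);

# Two octets that happen to be the UTF-8 encoding of U+00E9.
# Perl cannot tell whether this is binary data, Latin-1 text,
# or UTF-8 text that simply has not been decoded yet.
my $bytes = "\xC3\xA9";

# Genuine character data: a single character above 0xFF.
my $chars = "\x{263A}";

# Concatenation upgrades $bytes as if it were ISO-8859-1.
# No warning is issued, even under "use warnings".
my $mixed = $chars . $bytes;

# Encoding the result double-encodes the original two octets:
# E2 98 BA (U+263A), C3 83 (U+00C3), C2 A9 (U+00A9).
my $out = encode('UTF-8', $mixed);

printf "bytes: %d, mixed: %d chars, encoded: %d octets\n",
    length($bytes), length($mixed), length($out);
```

The two octets come out as four, and nothing along the way could have
known whether treating them as ISO-8859-1 was what the author meant.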
> How about encouraging the use of encoding::warnings in perlunitut?
>
> How about adding it to core and having 'use 5.10;' turn it on?
If I understand correctly, that would not be enough due to the "is this
binary or really iso-8859-1 encoded data" problem mentioned above.
all the best,
tels
- --
Signed on Sat Mar 31 01:42:47 2007 with key 0x93B84C15.
View my photo gallery: http://bloodgate.com/photos
PGP key on http://bloodgate.com/tels.asc or per email.
"In 1988, Jack Thompson ran against Janet Reno for DA of Dade County:
Thompson's unique campaign message was that Reno was unfit for the job
because, as a closeted lesbian with a drinking problem, she was great
candidate for blackmail by the criminal element. Jack never explained
why this remained a threat even after he exposed her 'secret'. Reno
cruised at the polls."
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.2 (GNU/Linux)

iQEVAwUBRg29jncLPEOTuEwVAQJALAf/SsSjz5VB4l3Zcggd18SNmdTq8DpBLUtP
pxiPCs0fYrEtDny/HvDCbQss/nEaGmFwPaVpAA+kFp8jss3h3xzklW6MwAm7Aisy
+EiZO0JEcADXRWr9CChJpWfMr0qllmzsUUKHa6wc9iXagD6kPoiL49Ay5bkqPBDT
OKOfcJIRDqk12VKATpdQlBIHR3cEpnUMdh8QKhmAArkXAsV5cZGBC9EGm8l+dgeK
Uc2k7pxvLXdjCZu6YbJfPwwdiLlugL23Bci7sZrCO/JyboBOK3ch5dWYohZ8QoMw
SahL/axgJ1DeFTP2ryL6wvnM1djF+HSbzoaLD1E+d7XJqB700Qxdfg==
=eI9w
-----END PGP SIGNATURE-----