
Thread: RE: Unicode Strings in .NET




RE: Unicode Strings in .NET
United States
2007-05-18 12:42:24

Hi All!

Let's not kill off STRING_8 just yet. I have often used Eiffel's STRING class to represent octet streams. It does this quite well. Dropping it altogether is quite unsatisfactory. At least keep it around with a new name if you must, though STRING_8 works fine for me, and unless you're a dyed-in-the-wool null-termination guy, you shouldn't have an issue with that. We MUST not throw the baby out with the bath water (again). By thinking of STRING as existing only to represent the printable realm, we are missing a major point. Are we being distracted by the fact that STRING is a sequence of CHARACTERs? Then let's blame CHARACTER for the assumption.

The next time you read a stream of bytes from a raw file, what will the result be? It certainly won't be a STRING_32.

By the way, I've had a quite workable (though not terribly efficient) implementation of UTF8_STRING for years now, based on the classic Eiffel STRING. It is because the Eiffel STRING was so print-agnostic that this was possible.

R

==================================================
Roger F. Osmond
----------------------------------------
Amalasoft Corporation
273 Harwood Avenue
Littleton, MA 01460

> -------- Original Message --------
> Subject: Re: [eiffel_software] Unicode Strings in .NET
> From: "Peter Gummer" <p-gummer@bigpond.net.au>
> Date: Fri, May 18, 2007 8:40 am
> To: <eiffel_software@yahoogroups.com>
>
> Emmanuel Stapf wrote:
>
> >> What are the plans for Unicode in Eiffel?
> >
> > 1 - Add support for reading Unicode Eiffel class text
> > 2 - Add support for reading manifest STRING_32
> >
> > Once the above are done, you can select in your configuration file that
> > STRING is actually STRING_32.
> >
> > Regarding the default of STRING which is STRING_8, we will keep it for
> > more releases since a lot of code still relies on it (mostly code based
> > on C/C++ externals).
>
> Thanks for the detailed explanation, Manu. It seems that the distinction
> between STRING_8 and STRING_32 is an expediency. Java and C# have the
> luxury of having been invented after Unicode; some older languages like
> Eiffel aren't so lucky, and so they have to go through a period of
> migration from 8-bit characters to Unicode.
>
> What I hope to see, in the not too distant future, is that STRING_8 would
> become obsolete, and STRING_32 would be renamed to STRING. STRING_GENERAL
> would disappear too. There would simply be STRING, plus possibly some
> supporting classes to provide different encodings.
>
> (By the way, I solved my .NET string performance problems, which I
> mentioned early in this thread, by reworking my implementation of
> SYSTEM_STRING_FACTORY. I've described how I did this at
> http://www.eiffelroom.com/blog/peter_gummer/utf_8_in_net_revisited.)
>
> - Peter Gummer

Re: Unicode Strings in .NET
Australia
2007-05-18 19:46:53

Roger Osmond wrote:

> The next time you read a stream of bytes from a raw file,
> what will the result be?

I would have thought something like ARRAY [NATURAL_8].

- Peter Gummer
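
The distinction argued over in this thread, a sequence of octets versus a sequence of characters, can be shown with a minimal sketch. It is written in Python rather than Eiffel, and it is not taken from either poster's code. The helper names `read_octets` and `utf8_code_points` are hypothetical, chosen only for illustration. Reading a raw file yields octets with no character interpretation (roughly the role of STRING_8 or ARRAY [NATURAL_8] above). A UTF-8 layer on top turns those octets into code points (roughly the role of a UTF8_STRING or STRING_32).

```python
import os
import tempfile


def read_octets(path):
    """Read a file as raw octets, with no character interpretation."""
    with open(path, "rb") as f:
        return f.read()


def utf8_code_points(octets):
    """Decode UTF-8 octets into a list of Unicode code points."""
    return [ord(ch) for ch in octets.decode("utf-8")]


# Write a short UTF-8 sample to a temporary file, then read it back raw.
sample = "na\u00efve"  # 5 characters; U+00EF needs 2 octets in UTF-8
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(sample.encode("utf-8"))
    path = tmp.name
try:
    octets = read_octets(path)
finally:
    os.remove(path)

code_points = utf8_code_points(octets)

# The octet count and the character count differ for non-ASCII text.
print(len(octets), len(code_points))
```

The same octets are valid both as opaque binary data and as encoded text. Which view applies depends on the layer that interprets them, not on the container that holds them.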
