
Thread: Re: Updated: (LUCENE-1029) Illegal character replacements in ISOLatin1AccentFilter




Poland
2007-10-17 01:35:00
This gets even more complicated when you throw Polish in. We do have diacritics
(such as ó, ż, ź or ą)

http://www.fileformat.info/info/unicode/char/0105/index.htm

but we _also_ have things like "ł" (l with a stroke):

http://www.fileformat.info/info/unicode/char/0142/index.htm

I don't think the stroke in "ł" would qualify as a diacritic mark... to me it's
more like a different letter.

Anyway, most Poles are _very_ comfortable with writing e-mails and querying
search engines with stripped diacritics (and the letter ł replaced by l), even
if this often changes the meaning of the original word. I guess that is because
typing diacritics slows you down a bit. Pragmatism.
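
A minimal sketch of that kind of folding, assuming Java 6+ and
java.text.Normalizer. The class and method names (PolishFold, fold) are
hypothetical, and this is not how ISOLatin1AccentFilter is implemented. It
only illustrates that ó, ż, ź and ą fall apart into a base letter plus a
combining mark under Unicode decomposition, while ł does not and needs an
explicit mapping:

```java
import java.text.Normalizer;

public class PolishFold {
    // Hypothetical helper for illustration; not part of Lucene.
    static String fold(String s) {
        // NFD splits e.g. "ą" (U+0105) into "a" + U+0328 (combining ogonek).
        String decomposed = Normalizer.normalize(s, Normalizer.Form.NFD);
        // Drop the combining marks produced by the decomposition.
        String stripped = decomposed.replaceAll("\\p{M}+", "");
        // "ł" (U+0142) and "Ł" (U+0141) have no canonical decomposition,
        // so they survive the step above and must be replaced by hand.
        return stripped.replace('\u0142', 'l').replace('\u0141', 'L');
    }

    public static void main(String[] args) {
        // Unicode escapes keep the source independent of file encoding.
        String zolc = "\u017C\u00F3\u0142\u0107"; // "żółć"
        String lodz = "\u0141\u00F3d\u017A";      // "Łódź"
        System.out.println(fold(zolc)); // folds to "zolc"
        System.out.println(fold(lodz)); // folds to "Lodz"
    }
}
```

Without the last replace call, the stroke letters would pass through
unchanged, which is the asymmetry described above.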

Dawid


---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org

