Thread: Re: RSS and diacritics




2007-11-29 15:29:25
Hi Bob

Bob Rasmussen wrote:
 > On Thu, 29 Nov 2007, Thomas Dowling wrote:
 >
 >> The more adept browsers out there figured this out quite a while ago.
 >> If the font they're using doesn't have a glyph for the character
 >> requested, they pull the correct glyph from a font that does have it.
 >> Awkwardly, there's a less adept browser that fails to do this, that has
 >> about 80% market share...
 >>
 >> CSS2 requires that browsers work their way down the list of specified
 >> fonts to find the right glyph, not just find a matching font name. IIRC,
 >> Gecko-based browsers and Opera go beyond that to find any system font
 >> with the right glyph.
 >
 > As an aside, that is precisely the approach taken by Anzio, our terminal
 > emulation package, and Print Wizard, our printing utility. These programs
 > also take many steps to handle combining diacritics well, including
 > raising the "above" diacritics where necessary to avoid collision with
 > the base character.
 >
 > My perception of the most common issues in regards to library systems
 > displaying (and printing) diacritics and non-Latin characters:
 >
 > 1) Very few fonts have the combining double tilde and combining double
 > ligature marks, used mostly with transliterated Russian.
 >

Try the SIL fonts. Charis SIL and Doulos SIL both display those diacritics
correctly. Hopefully Gentium Book, when it's finally released, will also
support them.
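
To check whether a record actually contains these marks before blaming the
font, something like the following Python sketch could help. It assumes the
marks in question are U+0360 (combining double tilde) and U+0361 (combining
double inverted breve, the usual ligature tie); the function name and the
sample string are only illustrative.

```python
import unicodedata

# Assumed code points for the double diacritics under discussion.
DOUBLE_MARKS = {"\u0360", "\u0361"}

def find_double_marks(text):
    """Return (index, Unicode name) for each double diacritic in text."""
    return [
        (i, unicodedata.name(ch))
        for i, ch in enumerate(text)
        if ch in DOUBLE_MARKS
    ]

# "t" and "s" joined by a ligature tie, as in some romanisations of Russian.
sample = "t\u0361s"
print(find_double_marks(sample))
```

The mark sits between the two base letters in the data, so it is up to the
font and the rendering engine to stretch it across both.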

It also depends on the font rendering technology in use: on Windows, either
a recent version of Uniscribe or Graphite.

My gut reaction, though, is that the core limitation is going to be neither
the fonts nor the font rendering system, but the web pages generated by the
vendors. Well-structured content that follows web internationalization and
accessibility best practice would be a breeze to tweak so that all languages
display fine.

 > 2) Software does not correctly combine combining diacritics.

This is simply poor software internationalization. On the right operating
system there is no excuse for diacritics not displaying properly. If the
operating system's default rendering supports them, there is no excuse for
a well-internationalised application not to support them.

Personally, I think vendors are let off too lightly. Generally, they say
they support Unicode, but never spell out what parts they support and what
parts they don't.

From the perspective of my workplace, all our web interfaces should support
our state government's web standards. I doubt there is a single vendor
solution in use in our state that does meet those standards.

 > 3) Fonts are inconsistent in the way they specify the X-location of
 > combining diacritics.

A font should use the mark and mkmk features in the GPOS table to indicate
the placement of a diacritic relative to a specific base character or
relative to another diacritic.

But currently few do.

And my current complaint about the Vista core fonts is that they position
combining diacritics consistently at a different height from the diacritics
on precomposed glyphs. That makes for ugly text when using a mix of
precomposed and composed forms, which may be necessary in some languages.
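
To see why the mix is unavoidable, here is a small Python sketch using the
standard unicodedata module. The helper name is made up for illustration,
and the open e (U+025B) is just one example of a base letter that, as far
as I know, has no precomposed accented forms in Unicode.

```python
import unicodedata

precomposed = "\u00e9"   # LATIN SMALL LETTER E WITH ACUTE
composed = "e\u0301"     # "e" followed by COMBINING ACUTE ACCENT

# Different code points, but canonically equivalent under normalisation.
assert precomposed != composed
assert unicodedata.normalize("NFC", composed) == precomposed
assert unicodedata.normalize("NFD", precomposed) == composed

def is_fully_precomposed(text):
    """True if NFC normalisation leaves no combining marks in text."""
    nfc = unicodedata.normalize("NFC", text)
    return not any(unicodedata.combining(ch) for ch in nfc)

print(is_fully_precomposed("e\u0301"))       # e + acute has a precomposed form
print(is_fully_precomposed("\u025b\u0301"))  # open e + acute does not
```

Normalising to NFC reduces the mix but cannot remove it, so a font still
has to place the combining marks at the same height as its precomposed
glyphs.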

 > 4) Library software I have worked with does not give the browsers
 > information about the language contained in a particular section of
 > text. Thus the browser can not take advantage of the user's
 > language-specific font preferences. This is especially a problem in
 > rendering Han characters, which could be part of a Japanese, Korean,
 > Simplified Chinese, or Traditional Chinese title, for instance. With IE,
 > this seems to force the user to use one super-font, which inevitably has
 > shortcomings.
 >

Yes, and in this scenario different browsers will respond differently.
Richard Ishida (W3C) put together a test of HTML CJK data that wasn't
language tagged. Some browsers default to displaying CJK data with a
Japanese font, others use a Simplified Chinese font, and in at least one
case an older version of Opera defaulted to a Korean font.
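
What the library software would need to emit is not complicated. A minimal
Python sketch, assuming the system already knows the language of each title
(the function name and the sample titles are hypothetical):

```python
from html import escape

def tag_language(text, lang):
    """Wrap text in a span carrying an HTML lang attribute."""
    return '<span lang="{}">{}</span>'.format(
        escape(lang, quote=True), escape(text)
    )

# The same Han characters, tagged so the browser can choose a font per language.
print(tag_language("東京", "ja"))       # Japanese
print(tag_language("东京", "zh-Hans"))  # Simplified Chinese
print(tag_language("東京", "zh-Hant"))  # Traditional Chinese
```

With the lang attribute in place, the browser can apply the user's
language-specific font preferences instead of guessing.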

 > Finally, Andrew Cunningham mentioned Font Linking. According to MS's
 > documentation, this should make it possible to define a large virtual
 > font by linking together multiple fonts, without physically combining
 > the files. So theoretically I could create a font with the missing
 > ligature marks (see 1 above), and link it to Arial Unicode, for
 > instance. However, I have never succeeded in this in regards to IE. Has
 > anyone succeeded in doing this?

It's not quite that simple.

To support the missing ligature marks, you'd be better off with a whole new
OpenType font.

To properly handle combining diacritics, especially the double diacritics,
you need to treat the Latin script as a complex script. On Windows, that
means dealing with Uniscribe. A lot of the font linking smarts Microsoft
uses in its applications are script dependent and built into Uniscribe.
Often this is a fallback.


If you are on Win XP SP2 or Vista, download the Charis SIL font at
http://scripts.sil.org/cms/scripts/page.php?site_id=nrsi&id=CharisSIL_download
It is released under the OFL, so it can be redistributed or modified under
that license.

Then have a look at http://www.openroad.net.au/test/sample.html

Andrew

-- 
Andrew Cunningham
Research and Development Coordinator (Vicnet)
State Library of Victoria
328 Swanston Street
Melbourne VIC 3000
Australia

Email: andrewc@vicnet.net.au
Alt. email: lang.support@gmail.com

Ph: +613-8664-7430                    Fax:+613-9639-2175
Mob: 0421-450-816

http://www.slv.vic.gov.au/
http://www.vicnet.net.au/
http://www.openroad.net.au/
http://www.mylanguage.gov.au/
http://home.vicnet.net.au/~andrewc/

_______________________________________________
Web4lib mailing list
Web4lib@webjunction.org
http://lists.webjunction.org/web4lib/
