Gary Anderson wrote:
> The YAZ library converts UCS value 0x9234 to the triple 0x21 0x5D 0x58.
> The LOC code tables identify this as a variant of the hangul character
> 0x4B 0x5D 0x58 which is also represented as 0x9234 in UCS. Is there a
> reason that YAZ is selecting the variant instead of the non-variant form?
The short answer is: the XML parser which generates conversion code does
not read XML comments to get the details. For this particular case, the
fragment reads:
<code>
<marc>215D58</marc>
<ucs>9234</ucs>
<utf-8>E988B4</utf-8>
<name>East Asian ideograph (variant of EACC 4B5D58)</name>
</code>
Does anybody have suggestions for better (less dirty) ways of doing this
than reading XML comments with regexps?
Doesn't sound right to me
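
One possibility, if the variant marker can be taken from the <name> text
as in the fragment above, is to prefer the non-variant entry whenever two
entries share a UCS value. What follows is only a rough sketch in Python,
not the actual YAZ generator. The <codes> wrapper element, the second
<code> entry (including its name text) and the function name
build_ucs_to_marc are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: the first <code> is the fragment quoted above, the
# second one and the <codes> wrapper are invented for this illustration.
SAMPLE = """
<codes>
  <code>
    <marc>215D58</marc>
    <ucs>9234</ucs>
    <utf-8>E988B4</utf-8>
    <name>East Asian ideograph (variant of EACC 4B5D58)</name>
  </code>
  <code>
    <marc>4B5D58</marc>
    <ucs>9234</ucs>
    <utf-8>E988B4</utf-8>
    <name>East Asian ideograph</name>
  </code>
</codes>
"""


def build_ucs_to_marc(xml_text):
    """Map each UCS value to one MARC code, preferring non-variant entries."""
    table = {}  # ucs -> (is_variant, marc)
    for code in ET.fromstring(xml_text).iter("code"):
        marc = code.findtext("marc")
        ucs = code.findtext("ucs")
        name = code.findtext("name") or ""
        if not marc or not ucs:
            continue  # skip incomplete entries
        # Assumption: variants are recognisable by "variant of" in <name>.
        is_variant = "variant of" in name.lower()
        current = table.get(ucs)
        # Keep the first entry seen, unless it is a variant and this one is not.
        if current is None or (current[0] and not is_variant):
            table[ucs] = (is_variant, marc)
    return {ucs: marc for ucs, (_, marc) in table.items()}


if __name__ == "__main__":
    print(build_ucs_to_marc(SAMPLE))
```

This only needs the element text, so no comments have to be parsed. It
does depend on the "variant of" wording being used consistently in the
tables, which I have not checked.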
/ Adam
>
> Gary
>
> _______________________________________________
> Yazlist mailing list
> Yazlist lists.indexdata.dk
> http://lists.indexdata.dk/cgi-bin/mailman/listinfo/yazlist