Thread: ANSEL text encoding




ANSEL text encoding
United States
2007-04-08 06:55:23
Hi,

I have some questions about ANSEL encoding and ZOOM.

1. It appears that ANSEL is the same as MARC-8...is that true? What
should "charset" be set to to use this encoding?

2. Googling around, I found statements that MARC-8 is automatically
converted to UTF-8 by YAZ. True? And with ZOOM?

Any info about encodings is appreciated.

Thanks,

Jon

_______________________________________________
Yazlist mailing list
Yazlistlists.indexdata.dk
http://lists.indexdata.dk/cgi-bin/mailman/listinfo/yazlist

Re: ANSEL text encoding
United States
2007-04-09 11:32:03
Actually, MARC-8 uses ANSEL (ANSI Extended Latin) as its default
character set, but it also uses other, non-Roman character sets. In
that sense they are not the same: ANSEL is a separate spec that
MARC-8 includes as one of its character sets.

Dave


Re: ANSEL text encoding
United States
2007-04-09 12:10:32
> Actually, MARC-8 uses ANSEL (ANSI Extended Latin) as its default
> character set, but also uses other non-roman character sets as well.
> In that sense they are not the same: ANSEL is a separate spec that
> is included as a subset of MARC-8 character sets.

Thanks. With ZOOM, should I specify MARC8 or ANSEL, then, if
accessing an ANSEL site (I'm using "marc8" now)?

Another question I have is whether I need to encode the query as
ANSEL. Right now I'm sending queries as ISO-8859-1 to sites that
support ANSEL (because I don't know how to do the ANSEL encoding
myself). I'm getting mixed results with queries that have accented
characters, but I don't know if that's because the query isn't
encoded as ANSEL or because the library just doesn't handle accented
characters correctly.

If the query needs to be ANSEL/MARC8-encoded, does ZOOM handle that?
(I've pored over the docs I could find, but see nothing specific
about ZOOM handling query encoding.)

Thanks again,

Jon


Re: ANSEL text encoding
Denmark
2007-04-09 13:56:57
Sadly, there hasn't been a lot of profiling or standardization in the
use of character sets other than ASCII in Z39.50 servers. If you want
to be sure to do the right thing, you should contact the library or
server vendor and ask. ZOOM does not presently support normalization
of the character set in queries, but you may be able to use
lower-level YAZ tools, or the iconv library itself, to perform the
mapping. We've seen very mixed results with accented characters in
our applications.
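
[Editorial sketch, not part of the original message.] To illustrate
the kind of mapping meant here, the following is a minimal Python
sketch that encodes a query term as ANSEL bytes by hand. It is not
YAZ or ZOOM code: the function name to_ansel and the small diacritic
table are assumptions made for illustration, and the table covers only
a few common marks. The main point it shows is that ANSEL stores a
non-spacing diacritic byte before the base letter, so the bytes differ
from ISO-8859-1, where an accented letter is a single precomposed
byte.

```python
import unicodedata

# Partial, hand-written table (an assumption for this sketch):
# Unicode combining mark -> ANSEL non-spacing diacritic byte.
ANSEL_DIACRITICS = {
    "\u0300": 0xE1,  # grave
    "\u0301": 0xE2,  # acute
    "\u0302": 0xE3,  # circumflex
    "\u0303": 0xE4,  # tilde
    "\u0308": 0xE8,  # umlaut / diaeresis
    "\u030A": 0xEA,  # ring above
    "\u0327": 0xF0,  # cedilla
}


def to_ansel(text: str) -> bytes:
    """Encode ASCII letters plus the diacritics above as ANSEL bytes.

    Hypothetical helper: anything outside the table raises ValueError
    rather than being guessed at.
    """
    out = bytearray()
    base_index = 0  # position in `out` of the current base character
    # NFD splits a precomposed letter into base + combining mark(s).
    for ch in unicodedata.normalize("NFD", text):
        if unicodedata.combining(ch):
            if ch not in ANSEL_DIACRITICS:
                raise ValueError(f"no ANSEL mapping in this sketch: {ch!r}")
            # ANSEL puts the diacritic byte BEFORE its base letter.
            out.insert(base_index, ANSEL_DIACRITICS[ch])
            base_index += 1
        elif ord(ch) < 0x80:
            base_index = len(out)
            out.append(ord(ch))
        else:
            raise ValueError(f"no ANSEL mapping in this sketch: {ch!r}")
    return bytes(out)


if __name__ == "__main__":
    print(to_ansel("M\u00fcller"))          # ANSEL bytes
    print("M\u00fcller".encode("latin-1"))  # ISO-8859-1 bytes, for contrast
```

With this sketch, to_ansel("Müller") yields b'M\xe8uller', while the
ISO-8859-1 encoding of the same string is b'M\xfcller'. Likewise "é"
is the single byte 0xE9 in ISO-8859-1, but 0xE9 is the hacek (caron)
diacritic in ANSEL, which may be one reason a Latin-1 query matches
poorly on a server that expects ANSEL. Whether a given server really
expects ANSEL in queries is still something to confirm with the
library or vendor.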

Hope this helps,

--Sebastian


-- 
Sebastian Hammer, Index Data
quinnindexdata.com   www.indexdata.com
Ph: (603) 209-6853 Fax: (866) 383-4485


