Jon,
I can't answer your ZOOM-related questions, but I can tell you that
you are going to find a variety of character-set support with Z39.50
servers. For example, our server is not currently behaving as might
be expected. Search terms that contain diacritics (or special
characters) must have those special characters encoded in UTF-8 (or
not present in the search term) in order to match entries in our
indexes. (We don't currently support character-set negotiation, and
we are currently configured to return records in MARC-8 only.)
Some sample searches follow, FYI.
Larry
-----------------------------------------------
Z> open z3950.loc.gov:7090/voyager
ID : 34
Name : Voyager LMS - Z39.50 Server (YAZ Proxy)
Version: 2003.1.1/1.2.1.1
Options: search present
Z> f attr 1=1003 "BFohmer, GFunter"    [MARC-8 umlauts]
Number of hits: 0
Z> f attr 1=1003 "Bo&jhmer, Gu&jnter    [UTF-8 umlauts]
Number of hits: 42
Z> s 1
Sent presentRequest (1+1).
Records: 1
[VOYAGER]Record type: USmarc    [Record returned in MARC-8]
00819cam 2200217 a 4500
001 948744
005 20030317112257.0
008 830722s1969 gw a 000 0 ger c
035 $9 (DLC) 83672065
906 $a 7 $b cbc $c orignew $d u $e ncip $f 19 $g y-gencatlg
010 $a 83672065
040 $a MH $c MH $d DLC
050 00 $a Z4.Z9 $b B83 1969
245 00 $a BFucher und Menschen / $c mit BeitrFagen von Peter Suhrkamp ... [et al.] ; und mit einem Nachwort von Georg Kurt Schauer ; Zeichnungen von Gunter BFohmer.
260 $a Frankfurt am Main : $b Mergenthaler-Verlag der Linotype, $c c1969.
300 $a 115 p. : $b col. ill. ; $c 29 cm.
650 0 $a Books $z Germany.
650 0 $a Books and reading $z Germany.
700 1 $a Suhrkamp, Peter, $d 1891-1959.
700 1 $a BFohmer, Gunter, $d 1911-
Z> f attr 1=1003 "Bo&jhmer, Gunter    [One UTF-8 umlaut (surname)]
Number of hits: 42
Z> f attr 1=1003 "Bohmer, Gu&jnter    [One UTF-8 umlaut (first name)]
Number of hits: 42
Z> f attr 1=1003 "bohmer, gunter"    [No umlauts]
Number of hits: 42
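
The searches above suggest two ways a client could prepare a term for
this server: send it as UTF-8, or drop the diacritics entirely. A
minimal Python sketch of both follows. The function names are
hypothetical, and the default of decomposed (NFD) form is an assumption
based on the sample searches, where the umlaut appears to follow the
base letter; another server may expect precomposed (NFC) input.

```python
import unicodedata


def to_utf8_query_bytes(term: str, form: str = "NFD") -> bytes:
    """Normalize a search term and encode it as UTF-8.

    The normalization form is an assumption: NFD puts the combining
    mark after the base letter, NFC uses precomposed characters.
    """
    return unicodedata.normalize(form, term).encode("utf-8")


def strip_diacritics(term: str) -> str:
    """Remove combining marks, so an umlauted 'o' becomes a plain 'o'."""
    decomposed = unicodedata.normalize("NFD", term)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


if __name__ == "__main__":
    name = "B\u00f6hmer, G\u00fcnter"
    print(to_utf8_query_bytes(name))
    print(strip_diacritics(name))
```

Stripping the marks is the more portable fallback, since the plain
"bohmer, gunter" search above returned the same 42 hits as the UTF-8
form.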
On Mon, 9 Apr 2007, jda wrote:
> >Actually, MARC-8 uses ANSEL (ANSI Extended Latin) as its default
> >character set, but also uses other non-roman character sets as well.
> >In that sense they are not the same: ANSEL is a separate spec that
> >is included as a subset of MARC-8 character sets.
> >
>
> Thanks. With ZOOM, should I specify MARC8 or ANSEL, then, if
> accessing an ANSEL site (I'm using "marc8" now)?
>
> Another question I have is if I need to encode the query as ANSEL?
> Right now I'm sending queries as ISO-8859-1 to sites that support
> ANSEL (because I don't know how to do the ANSEL encoding myself). I'm
> getting mixed results with queries that have accented characters, but
> I don't know if that's because the query isn't encoded as ANSEL or
> whether the library just doesn't handle accented characters correctly.
>
> If the query needs to be ANSEL/MARC8-encoded, does ZOOM handle that
> (I've pored over the docs I could find, but see nothing specific
> about ZOOM handling query encoding)?
>
> Thanks again,
>
> Jon
>
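
For a server that does expect MARC-8 (ANSEL) query terms, the key
difference from Unicode is that a combining diacritic is written
before its base letter rather than after it. The following is a
minimal sketch of that reordering in Python, not a full MARC-8
encoder: the function name and the table are hypothetical, the table
covers only a handful of marks, and only ASCII base letters are
handled. Whether ZOOM does this conversion for you is not settled in
this thread.

```python
import unicodedata

# Partial table (illustrative only): Unicode combining mark -> ANSEL byte.
COMBINING_TO_MARC8 = {
    "\u0300": 0xE1,  # grave
    "\u0301": 0xE2,  # acute
    "\u0302": 0xE3,  # circumflex
    "\u0303": 0xE4,  # tilde
    "\u0308": 0xE8,  # umlaut / diaeresis
}


def to_marc8_basic(term: str) -> bytes:
    """Encode ASCII letters plus a few diacritics as MARC-8 bytes.

    Each base letter is emitted after its combining marks, which is
    the reverse of the order used in decomposed Unicode.
    """
    decomposed = unicodedata.normalize("NFD", term)
    out = bytearray()
    i = 0
    while i < len(decomposed):
        base = decomposed[i]
        i += 1
        marks = bytearray()
        # Collect every combining mark that follows this base letter.
        while i < len(decomposed) and unicodedata.combining(decomposed[i]):
            mark = decomposed[i]
            if mark not in COMBINING_TO_MARC8:
                raise ValueError(f"unsupported combining mark U+{ord(mark):04X}")
            marks.append(COMBINING_TO_MARC8[mark])
            i += 1
        if ord(base) > 0x7F:
            raise ValueError(f"unsupported character U+{ord(base):04X}")
        out += marks            # diacritics first ...
        out.append(ord(base))   # ... then the base letter
    return bytes(out)


if __name__ == "__main__":
    print(to_marc8_basic("B\u00f6hmer, G\u00fcnter"))
```

Anything outside the small table raises an error instead of being
silently dropped, so an unsupported term is noticed before it is sent.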
------------------------------------------------------------
Larry E. Dixson                    Internet: ldix loc.gov
Network Development and MARC Standards Office, LA327
Library of Congress                Telephone: (202) 707-5807
Washington, D.C. 20540-4402        Fax: (202) 707-0115
_______________________________________________
Yazlist mailing list
Yazlist lists.indexdata.dk
http://lists.indexdata.dk/cgi-bin/mailman/listinfo/yazlist