Hello Benoit,
Another thing to check out is the work being done on fac-back-opac.
They have a parser that reads native MARC records. I would assume that
if you can extract your records as MARCXML, you can also extract them
as native MARC.
I've used the parser and it works well.
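
If you do stay with MARCXML, the usual way around the 'originalRecord' problem is to escape the record so Solr sees it as plain text rather than as nested elements. Below is a minimal sketch of that idea in Python, not a tested Solr client: the helper name build_solr_doc and the shortened sample record are made up for illustration, and it assumes the field is stored only (indexed="false"), as in your schema.

```python
import xml.etree.ElementTree as ET


def build_solr_doc(title, author, marcxml):
    """Build a Solr <doc> whose originalRecord field holds MARCXML as escaped text.

    build_solr_doc is a hypothetical helper, not part of Solr or any library.
    """
    doc = ET.Element("doc")
    fields = (("title", title), ("author", author), ("originalRecord", marcxml))
    for name, value in fields:
        field = ET.SubElement(doc, "field", name=name)
        # Assigning to .text makes ElementTree escape <, > and & when serializing,
        # so the record is carried as character data, not as child elements.
        field.text = value
    return ET.tostring(doc, encoding="unicode")


# Shortened sample record, for illustration only.
record = (
    '<record xmlns="http://www.loc.gov/MARC21/slim">'
    '<controlfield tag="001">000500000</controlfield>'
    "</record>"
)

solr_xml = build_solr_doc(
    "De positie van vrouwen in de asielprocedure",
    "van Wetten, J. W.",
    record,
)

# What the presentation layer would do with the stored value:
# read the text back and parse it as MARCXML again.
stored = ET.fromstring(solr_xml).find("field[@name='originalRecord']").text
restored = ET.fromstring(stored)
```

The stored value comes back unescaped when the surrounding document is parsed, so the presentation layer can parse it as a normal MARCXML record. Wrapping the record in a CDATA section in the XSLT output should have the same effect.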
al
On Fri, 2007-10-05 at 02:44 -0500, PAUWELS Benoit wrote:
> Hi,
>
> I wish to index well formed xml documents as they are.
>
> I have a database filled with MARCXML records. An example of these
> looks like this:
>
> <record
>     ns0:schemaLocation="http://www.loc.gov/MARC21/slim http://www.loc.gov/standards/marcxml/schema/MARC21slim.xsd"
>     xmlns="http://www.loc.gov/MARC21/slim" xmlns:ns0="http://www.w3.org/2001/XMLSchema-instance">
>   <leader>00000nam 22 a 4500</leader>
>   <controlfield tag="001">000500000</controlfield>
>   <controlfield tag="005">20050826220257.0</controlfield>
>   <controlfield tag="008">000710s1998 xx r 000 0 dut d</controlfield>
>   <datafield ind1=" " ind2=" " tag="040">
>     <subfield code="a">Univ</subfield>
>   </datafield>
>   <datafield ind1="1" ind2=" " tag="100">
>     <subfield code="a">van Wetten, J. W.</subfield>
>   </datafield>
>   <datafield ind1="1" ind2="3" tag="245">
>     <subfield code="a">De positie van vrouwen in de asielprocedure /</subfield>
>     <subfield code="c">J.W. van Wetten, N. Dijkhof, F. Heide.</subfield>
>   </datafield>
> </record>
>
> The idea is to create Lucene indexes on specific MARC fields and
> store the complete MARC record in Lucene 'as is'. In the presentation
> layer of my application I would then have this complete MARC record
> at hand, and as such have full flexibility on which MARC fields to
> display. So I want to create the following record through XSLT and
> feed this to SOLR.
>
> <doc>
>   <field name="title">De positie van vrouwen in de asielprocedure</field>
>   <field name="author">van Wetten, J. W.</field>
>   ...
>   <field name="originalRecord">
>     <record
>         ns0:schemaLocation="http://www.loc.gov/MARC21/slim http://www.loc.gov/standards/marcxml/schema/MARC21slim.xsd"
>         xmlns="http://www.loc.gov/MARC21/slim" xmlns:ns0="http://www.w3.org/2001/XMLSchema-instance">
>       <leader>00000nam 22 a 4500</leader>
>       <controlfield tag="001">000500000</controlfield>
>       <controlfield tag="005">20050826220257.0</controlfield>
>       <controlfield tag="008">000710s1998 xx r 000 0 dut d</controlfield>
>       <datafield ind1=" " ind2=" " tag="040">
>         <subfield code="a">UGent</subfield>
>       </datafield>
>       <datafield ind1="1" ind2=" " tag="100">
>         <subfield code="a">van Wetten, J. W.</subfield>
>       </datafield>
>       <datafield ind1="1" ind2="3" tag="245">
>         <subfield code="a">De positie van vrouwen in de asielprocedure /</subfield>
>         <subfield code="c">J.W. van Wetten, N. Dijkhof, F. Heide.</subfield>
>       </datafield>
>     </record>
>   </field>
> </doc>
>
> I have the following in my schema.xml:
>
> <field name="author" type="text" indexed="true" stored="true" termVectors="true"/>
> <field name="title" type="text" indexed="true" stored="true" termVectors="true"/>
> <field name="originalRecord" type="text" indexed="false" stored="true"/>
>
> SOLR has of course a problem with the XML in the 'originalRecord'
> field.
>
> Is there a solution to this? Has anyone done this before?
>
> Thanks a lot.
>
> Benoit.
>
> =============================
> PAUWELS Benoit
> Université Libre de Bruxelles - Libraries
> Head of Automation
> Av. F.D. Roosevelt 50, CP 180
> 1050 BRUSSELS
> Belgium
> Tel: + 32 2 650 23 91
> Fax: + 32 2 650 23 91
> =============================
>
--
Alan Rykhus
PALS, A Program of the Minnesota State Colleges and Universities
(507)389-1975
alan.rykhus mnsu.edu
-----------------------------------------------------------------------
"You and I as individuals can, by borrowing, live beyond our means, but
only for a limited period of time. Why should we think that
collectively, as a nation, we are not bound by that same limitation?"
-- Ronald Reagan