Thread: Help us test a new service




Help us test a new service
Denmark
2007-02-07 10:15:47
Hi guys,

It's long been on our mind to use our favorite standards to leverage
the quickly growing repositories of open content out there.

The first step of this is the establishment of a new Z39.50 server, at
econtent.indexdata.com, port 210, which provides access to the
following logical databases:

oaister        -- Metadata from the OAIster OAI service provider
(http://oaister.umdl.umich.edu/o/oaister/), about 10 million metadata
records harvested from various open access archives.

wikipedia      -- Wikipedia titles, abstracts, and links, with title
searching. Around 1.5 million records.

oca-americana  -- Full MARC records for books scanned as part of the
Internet Archive's OCA (Open Content Alliance) initiative. There are
around 50,000 books in this collection. Within a day or two, we will
be adding the remaining Text collections from the Internet Archive,
including Gutenberg scans and others. The OCA is producing around a
million new pages of high-quality scans, searchable PDFs and online
books per month at the moment, so this is an exciting collection to
watch.

dmoz           -- Human-cataloged web resources. Several million sites
cataloged.

All databases will return records in XML/DC and MARC, of varying
quality. We aim to keep our copies of these resources updated on
regular schedules. We are actively looking for new, interesting
repositories of open content to make available in this way. If you
have suggestions, please feel free to get in touch with us.

The server supports a fairly basic set of USE attributes, and the
usual combinations, but since this is running on our Zebra server, you
can also add a 2=102 to any term to produce a relevance-ranked result
list. We'll post a website shortly with a more thorough list of
options, as well as ZeeRex-based descriptions of the resources.
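
For anyone who wants to try the relevance attribute, here is a rough
Python sketch that builds PQF (Prefix Query Format) query strings of the
kind a Z39.50 client would send. The helper names are made up for this
example, and USE attribute 4 (title in the Bib-1 set) is only an
assumption: the exact list of supported USE attributes has not been
published yet.

```python
def pqf_term(term, use=None, relevance=False):
    """Build one PQF term with optional Bib-1 attributes.

    use       -- Bib-1 USE attribute (type 1), e.g. 4 for title.
                 Whether a given value is supported is an assumption.
    relevance -- if True, add relation attribute 2=102 (relevance).
    """
    parts = []
    if use is not None:
        parts.append(f"@attr 1={use}")
    if relevance:
        parts.append("@attr 2=102")
    # Quote the term so that multi-word terms stay together;
    # backslash-escape embedded backslashes and double quotes.
    escaped = term.replace("\\", "\\\\").replace('"', '\\"')
    parts.append(f'"{escaped}"')
    return " ".join(parts)


def pqf_and(left, right):
    """Combine two PQF sub-queries with a boolean AND."""
    return f"@and {left} {right}"


if __name__ == "__main__":
    # Title search with relevance ranking requested on the term.
    print(pqf_term("open access", use=4, relevance=True))
    # Two ranked title terms combined with AND.
    print(pqf_and(pqf_term("darwin", use=4, relevance=True),
                  pqf_term("origin", use=4, relevance=True)))
```

The resulting strings can be pasted into any client that accepts PQF,
after selecting one of the databases listed above.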

We will be adding SRU/W support within a week or so, but I figured
folks on this list wouldn't mind doing things the traditional way. I'm
imagining that possible uses for this service might include copying
records for ebooks into your catalog, building metasearch facilities
for free content, etc.

If you have questions, comments, ideas or suggestions, please feel
very welcome to send them to me. I'd love to hear what you think.

All the best,

--Sebastian

-- 
Sebastian Hammer, Index Data
quinnindexdata.com   www.indexdata.com
Ph: (603) 209-6853 Fax: (866) 383-4485


_______________________________________________
Yazlist mailing list
Yazlistlists.indexdata.dk
http://lists.indexdata.dk/cgi-bin/mailman/listinfo/yazlist

Re: Help us test a new service
Russian Federation
2007-02-07 20:39:57
Hi Sebastian,

Thank you for your work on making open content available to everyone.

I have a number of questions about metasearch, SRU/SRW, and ZeeRex.

1. Metaproxy uses a round-robin method to merge (fuse) search results
from different databases. Do you plan to use another algorithm for
metasearch?

2. The Sort facility (Z39.50) and the sortKeys parameter (SRU/SRW) are
not supported on this server, which means we can't sort the result set
of a search. Are there any ideas for solving this problem?

3. Do you plan to make the list of ZeeRex-based descriptions available
as a standalone, openly accessible database?
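
For readers unfamiliar with the term in question 1, round-robin merging
can be sketched roughly as follows. This is only an illustration in
Python, not Metaproxy's actual code; the function name and the sample
record IDs are made up.

```python
from itertools import zip_longest

# Sentinel used to pad shorter result lists; never returned to callers.
_MISSING = object()


def round_robin_merge(*result_lists):
    """Interleave result lists: first hit of each list, then second, etc.

    No scores are compared; the position within each source list is the
    only thing that decides the merged order.
    """
    merged = []
    for tier in zip_longest(*result_lists, fillvalue=_MISSING):
        merged.extend(rec for rec in tier if rec is not _MISSING)
    return merged


if __name__ == "__main__":
    oaister = ["a1", "a2", "a3"]
    wikipedia = ["b1"]
    dmoz = ["c1", "c2"]
    print(round_robin_merge(oaister, wikipedia, dmoz))
```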


--
Oleg
      

  





Re: Help us test a new service
Denmark
2007-02-07 21:38:53
Oleg Kolobov wrote:

>Hi Sebastian,
>
>Thank you for your work on making open content available to everyone.
>
>I have a number of questions about metasearch, SRU/SRW, and ZeeRex.
>
>1. Metaproxy uses a round-robin method to merge (fuse) search results
>from different databases. Do you plan to use another algorithm for
>metasearch?

Not at the moment, no (which isn't to say that it's impossible, or that
we wouldn't consider it). I would imagine that most people will use
different combinations of these databases at different times, and may
want to do their own processing on the client side depending on the
user requirements. However, as a few people here know, we're working on
some new metasearching client code that we are pretty excited about,
and which will be released within a few weeks. It enables merging and
relevance ranking across lots of databases with different
characteristics.

>2. The Sort facility (Z39.50) and the sortKeys parameter (SRU/SRW) are
>not supported on this server, which means we can't sort the result set
>of a search. Are there any ideas for solving this problem?

Right now, only relevance ranking is available (attribute 2=102). We
will add more sort fields shortly, but this has to be addressed on a
database-by-database basis. For instance, the data in OAIster is still
really dirty as far as date formatting and author names are concerned
(which would make sorting very dicey), and if you include the older
collections from the Internet Archive, that's a really mixed bag as
well. So I can see doing title sorting, but I'm afraid that for most of
the databases, that's probably the best you can do (I've heard that
OAIster hopes to eventually do more work on date normalization at
least). Even for the fully bibliographic databases, it's a mixed bag.
The new scans from the Open Content Alliance include proper MARC
records, which makes it a breeze to do all kinds of cool stuff, but
there are lots of older scans, including a lot of Project Gutenberg
stuff, where the metadata is much more sketchy.

>3. Do you plan to make the list of ZeeRex-based descriptions available
>as a standalone, openly accessible database?

Could you clarify what you have in mind? We will shortly be offering a
searchable database of ZeeRex records for databases that we know about,
as a follow-up to our ageing target database. That will be openly
accessible. As for building some kind of user interface to allow you to
search them directly? Well, maybe...

Best regards,

--Sebastian


Re: Help us test a new service
Russian Federation
2007-02-08 00:28:31
On Wed, 07/02/2007 at 22:38 -0500, Sebastian Hammer wrote:
> Oleg Kolobov wrote:
>
> > Hi Sebastian,
> >
> > Thank you for your work on making open content available to
> > everyone.
> >
> > I have a number of questions about metasearch, SRU/SRW, and ZeeRex.
> >
> > 1. Metaproxy uses a round-robin method to merge (fuse) search
> > results from different databases. Do you plan to use another
> > algorithm for metasearch?
>
> Not at the moment, no (which isn't to say that it's impossible, or
> that we wouldn't consider it). I would imagine that most people will
> use different combinations of these databases at different times, and
> may want to do their own processing on the client side depending on
> the user requirements. However, as a few people here know, we're
> working on some new metasearching client code that we are pretty
> excited about, and which will be released within a few weeks. It
> enables merging and relevance ranking across lots of databases with
> different characteristics.

Is the new metasearch client embedded in Metaproxy, or is it a
standalone application?

I'm asking because we are looking for a way to apply a voting model
(the Borda count voting algorithm) to merge relevance-ranked lists of
documents from different search engines (perhaps from
econtent.indexdata.dk too). This algorithm is simple and efficient,
because no relevance scores are required (J.A. Aslam, M. Montague,
"Models for Metasearch"; the paper is a PDF that is easy to find via
Google).
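
A minimal sketch of the Borda count idea, in Python, for anyone who has
not seen it. Each engine "votes" with its ranked list: with c distinct
documents overall, the top document of a list gets c points, the next
c-1, and so on. In this sketch, documents an engine did not return
share that engine's leftover points evenly. The function name, the
sample document IDs, and the tie-break on document ID are assumptions
made for the example, not code from any existing metasearch software.

```python
from collections import defaultdict


def borda_fuse(ranked_lists):
    """Merge ranked lists of document IDs using Borda count points.

    Only rank positions are used; no relevance scores are needed.
    Each list is assumed to contain no duplicate IDs.
    """
    candidates = set().union(*ranked_lists)
    c = len(candidates)
    scores = defaultdict(float)
    for ranking in ranked_lists:
        # Ranked documents: c points for first place, c-1 for second...
        for position, doc in enumerate(ranking):
            scores[doc] += c - position
        # Unranked documents split the points this engine did not award.
        unranked = candidates - set(ranking)
        if unranked:
            leftover = sum(c - p for p in range(len(ranking), c))
            share = leftover / len(unranked)
            for doc in unranked:
                scores[doc] += share
    # Highest total first; ties broken by document ID for stable output.
    return sorted(candidates, key=lambda d: (-scores[d], str(d)))


if __name__ == "__main__":
    engine1 = ["d1", "d2", "d3"]
    engine2 = ["d2", "d3", "d4"]
    engine3 = ["d2", "d1"]
    print(borda_fuse([engine1, engine2, engine3]))
```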

> > 2. The Sort facility (Z39.50) and the sortKeys parameter (SRU/SRW)
> > are not supported on this server, which means we can't sort the
> > result set of a search. Are there any ideas for solving this
> > problem?
>
> Right now, only relevance ranking is available (attribute 2=102). We
> will add more sort fields shortly, but this has to be addressed on a
> database-by-database basis. For instance, the data in OAIster is
> still really dirty as far as date formatting and author names are
> concerned (which would make sorting very dicey), and if you include
> the older collections from the Internet Archive, that's a really
> mixed bag as well. So I can see doing title sorting, but I'm afraid
> that for most of the databases, that's probably the best you can do
> (I've heard that OAIster hopes to eventually do more work on date
> normalization at least). Even for the fully bibliographic databases,
> it's a mixed bag. The new scans from the Open Content Alliance
> include proper MARC records, which makes it a breeze to do all kinds
> of cool stuff, but there are lots of older scans, including a lot of
> Project Gutenberg stuff, where the metadata is much more sketchy.

Thank you very much for the details about the mixed bag. On the other
hand, I'm afraid there is no efficient way to merge sorted result sets
from a number of search engines without fetching all records from each
result set. For example, I have to fetch all records from engine1 to
produce a sorted list like (a1, a2, ..., b1, b2, b3, ...):

engine1:  (a1, a2, ..., b1);
engine2:  (b2, b3, ...);

Perhaps I'm wrong, but this is the problem I meant in my question.
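
As a rough illustration of the trade-off: if every engine could return
its result set already sorted on the same key, a client could merge
lazily and pull only as many records as it needs for the page it
displays. Without server-side sorting, which is the situation on this
server today, it would indeed have to fetch everything first. The
Python sketch below uses the standard heapq.merge; the engine names,
record values, and helper names are hypothetical.

```python
import heapq
from itertools import islice


def merged_page(sorted_sources, size, key=None):
    """Return the first `size` records of the merged, sorted stream.

    Each source must already be sorted on `key`; records are pulled
    from the sources only as the merge needs them.
    """
    return list(islice(heapq.merge(*sorted_sources, key=key), size))


def counting(records, counter, name):
    """Wrap a source and count how many records were actually fetched."""
    for rec in records:
        counter[name] = counter.get(name, 0) + 1
        yield rec


if __name__ == "__main__":
    engine1 = ["a1", "a2", "b1"]   # already sorted by the engine
    engine2 = ["b2", "b3"]         # already sorted by the engine
    fetched = {}
    page = merged_page(
        [counting(engine1, fetched, "engine1"),
         counting(engine2, fetched, "engine2")],
        size=2,
    )
    print(page)      # first page of the merged list
    print(fetched)   # records pulled from each engine so far
```

To build the complete merged list, every record still has to be read
once, so the saving only applies when a few pages are shown.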


> > 3. Do you plan to make the list of ZeeRex-based descriptions
> > available as a standalone, openly accessible database?
>
> Could you clarify what you have in mind? We will shortly be offering
> a searchable database of ZeeRex records for databases that we know
> about, as a follow-up to our ageing target database. That will be
> openly accessible. As for building some kind of user interface to
> allow you to search them directly? Well, maybe...

Exactly: a database of ZeeRex records for other databases, and for
itself too. In that case a client needs only minimal information to
start working with a lot of databases.

Best regards,
Oleg




Re: Help us test a new service
Denmark
2007-02-08 09:00:31
Oleg Kolobov wrote:

>On Wed, 07/02/2007 at 22:38 -0500, Sebastian Hammer wrote:
>
>>Oleg Kolobov wrote:
>>
>>>Hi Sebastian,
>>>
>>>Thank you for your work on making open content available to
>>>everyone.
>>>
>>>I have a number of questions about metasearch, SRU/SRW, and ZeeRex.
>>>
>>>1. Metaproxy uses a round-robin method to merge (fuse) search
>>>results from different databases. Do you plan to use another
>>>algorithm for metasearch?
>>
>>Not at the moment, no (which isn't to say that it's impossible, or
>>that we wouldn't consider it). I would imagine that most people will
>>use different combinations of these databases at different times, and
>>may want to do their own processing on the client side depending on
>>the user requirements. However, as a few people here know, we're
>>working on some new metasearching client code that we are pretty
>>excited about, and which will be released within a few weeks. It
>>enables merging and relevance ranking across lots of databases with
>>different characteristics.
>
>Is the new metasearch client embedded in Metaproxy, or is it a
>standalone application?

Standalone.

>I'm asking because we are looking for a way to apply a voting model
>(the Borda count voting algorithm) to merge relevance-ranked lists of
>documents from different search engines (perhaps from
>econtent.indexdata.dk too). This algorithm is simple and efficient,
>because no relevance scores are required (J.A. Aslam, M. Montague,
>"Models for Metasearch"; the paper is a PDF that is easy to find via
>Google).

We'll be interested in discussing the possibility of alternate ranking
algorithms. That applies both to Zebra and to the new metasearch code
we're developing.

>>>2. The Sort facility (Z39.50) and the sortKeys parameter (SRU/SRW)
>>>are not supported on this server, which means we can't sort the
>>>result set of a search. Are there any ideas for solving this
>>>problem?
>>
>>Right now, only relevance ranking is available (attribute 2=102). We
>>will add more sort fields shortly, but this has to be addressed on a
>>database-by-database basis. For instance, the data in OAIster is
>>still really dirty as far as date formatting and author names are
>>concerned (which would make sorting very dicey), and if you include
>>the older collections from the Internet Archive, that's a really
>>mixed bag as well. So I can see doing title sorting, but I'm afraid
>>that for most of the databases, that's probably the best you can do
>>(I've heard that OAIster hopes to eventually do more work on date
>>normalization at least). Even for the fully bibliographic databases,
>>it's a mixed bag. The new scans from the Open Content Alliance
>>include proper MARC records, which makes it a breeze to do all kinds
>>of cool stuff, but there are lots of older scans, including a lot of
>>Project Gutenberg stuff, where the metadata is much more sketchy.
>
>Thank you very much for the details about the mixed bag. On the other
>hand, I'm afraid there is no efficient way to merge sorted result sets
>from a number of search engines without fetching all records from each
>result set. For example, I have to fetch all records from engine1 to
>produce a sorted list like (a1, a2, ..., b1, b2, b3, ...):
>
>engine1:  (a1, a2, ..., b1);
>engine2:  (b2, b3, ...);

Agreed.

I think in the end the right thing to do may be to look at the types of
data. Does it make sense for us to put a lot of effort into merging
searches between wikipedia and OAIster or bibliographic data on the
server side? This is tricky stuff to do, and anything we can do at our
end is bound to annoy *some* people. I'm not sure it's justified. If
you have specific requirements -- perhaps something we can pursue in
collaboration -- maybe we should take that sub-discussion offline.

>>>3. Do you plan to make the list of ZeeRex-based descriptions
>>>available as a standalone, openly accessible database?
>>
>>Could you clarify what you have in mind? We will shortly be offering
>>a searchable database of ZeeRex records for databases that we know
>>about, as a follow-up to our ageing target database. That will be
>>openly accessible. As for building some kind of user interface to
>>allow you to search them directly? Well, maybe...
>
>Exactly: a database of ZeeRex records for other databases, and for
>itself too. In that case a client needs only minimal information to
>start working with a lot of databases.

Yes. We'll be making an announcement about this separately, and I hope
really soon. ZeeRex is the recommended service description framework of
the NISO metasearch initiative, and we have been doing some development
work together with the Finnish National Library to try to leverage this
standard as well.

Cheers,

--Sebastian


Re: Help us test a new service
Denmark
2007-02-11 22:59:59
An update:

We've added a Z39.50 database for metadata about Project Gutenberg
etexts. Z39.50 address is econtent.indexdata.com:210/gutenberg . More
than 20,000 titles in high-quality clean text and ebook
representations. Since the metadata is cleaner and more up-to-date than
what's on the Internet Archive site, I'm not sure if it makes sense to
include that sub-database from the IA as well.

All the best,

--Sebastian

