Thread: slow record retrieval.

2006-02-10 11:20:13
We are retrieving results from Zebra with the PHP YAZ client extension.
We are retrieving them as XML, and while the Zebra search only takes a
few microseconds, fetching the result records is quite costly.
Here is the retrieval code:
$hits = yaz_hits($this->resource);
$root->setAttribute("hits", $hits);
if($hits > 0)
{
    $results = $this->dom->createElement("results");
    for ($i = 1; $i <= $hits; $i++)
    {
        $yasResultDoc = yaz_record($this->resource, $i, "xml");
        $record = new xDocument;
        $record->loadXML($yasResultDoc);
I replicated the code in C, to some degree, and retrieval still seems
somewhat slow: it takes about a second to fetch 1000 records. I was
hoping the problem was in the PHP YAZ implementation, so that I could
write another PHP extension for it, but it seems to have something to do
with the synchronous mode or something. Not sure. I'll show you the
code; if you can spot anything, give me a shout.
(It's just test code.)
#include <stdio.h>
#include <yaz/zoom.h>

int main()
{
    //make a new connection to our wonderful zebra server
    ZOOM_connection yazLink = ZOOM_connection_create(0);

    //select the database we wanna use
    ZOOM_connection_option_set(yazLink, "implementationName", "SCUZZY");
    //ZOOM_connection_option_set(yazLink, "async", "1");
    ZOOM_connection_option_set(yazLink, "databaseName", "newIndex");

    //tell the connection to actually connect
    ZOOM_connection_connect(yazLink, "10.10.48.110", 9999);

    //make a query "object" I think
    ZOOM_query query = ZOOM_query_create();
    int someVal = ZOOM_query_prefix(query, "@attrset Bib-1 @attr 2=5 @attr 1=5 2005");

    //get our results
    ZOOM_resultset results = ZOOM_connection_search(yazLink, query);

    int hits = ZOOM_resultset_size(results);
    int i;
    ZOOM_record rec;
    const char *data;
    int length;
    for(i = 0; i < hits; i++)
    {
        //one record per call, fetched by zero-based position
        rec = ZOOM_resultset_record(results, i);
        data = ZOOM_record_get(rec, "xml", &length);
        printf("DATA: %s\n", data);
    }

    printf("\n\nHITS: %d\n", hits);

    const char *error;
    const char *extraError;
    if(ZOOM_connection_error(yazLink, &error, &extraError))
    {
        printf("ERROR: \n\t%s\n\t%s\n", error, extraError);
    }

    //drop the link
    ZOOM_connection_destroy(yazLink);
}
Many thanks
Daine.
--
random signature
_______________________________________________
Yazlist mailing list
Yazlist lists.indexdata.dk
http://lists.indexdata.dk/cgi-bin/mailman/listinfo/yazlist

2006-02-10 11:58:32
Daine Mamacos wrote:
> we are retrieving results from zebra with the php yaz client extention. we are
> retrieving them as xml, and while the zebra search only takes a few
> microseconds, the fetching of the result records is quite costly.
> Here is the retrieval code:
>
> $hits = yaz_hits($this->resource);
> $root->setAttribute("hits", $hits);
>
> if($hits > 0)
> {
>
> $results = $this->dom->createElement("results");
>
> for ($i = 1; $i <= $hits; $i++)
> {
> $yasResultDoc = yaz_record($this->resource, $i, "xml");
> $record = new xDocument;
> $record->loadXML($yasResultDoc);
>
> I replicated the code in C, to some degree, and it seems the
> retrieval is still somewhat slow. it takes about a second to fetch 1000
> records. I was hoping that it was the php Yaz implimentation, and I could

How many present requests are fired against Zebra? Inspect the
zebra.log.

I would guess you fire 1000 present requests at it and get 1000
responses in return. How many web servers deliver 1000 HTTP pages per
second?

One way to reduce round-trips from 1000 to 10-20 is to use the yaz_range
facility. That tells yaz_search to return records immediately as part of
the search response:

    yaz_range($this->resource, 1, 1000);
    yaz_search($this->resource, ...

That should speed things up: you will be getting up to 1000 records
immediately. But you still won't get more than a couple of thousand
records per second out of Zebra. And even that is not too bad, IMHO.

Z39.50 expert note: if the target does not return all of the records,
due to message size limits or lack of piggyback, ZOOM C will fire the
necessary present requests.

> write another php extention for it, but it seems that it's something to do
> with the syncronous mode or something. Not sure. I'll show you the code, if
> you can spot anything give me a shout.

Async mode should not be an issue.

/ Adam

2006-02-10 12:25:05
On Fri, 10 Feb 2006 12:58:32 +0100, Adam Dickmeiss wrote
> How many presents requests are fired against Zebra? Inspect the zebra.log.
>
> I would guess you fire 1000 present requests at it and get 1000
> responses in return. How many web servers deliver 1000 HTTP pages
> per second?
>
> One way to reduce round-trips from 1000 to 10-20 is to use the
> yaz_range facility. That tells yaz_search to return records
> immediately as part of a response..
>
> yaz_range($this->resource, 1, 1000);
> yaz_search($this->resource, ...
>
> That should speed things up - you will be getting up to 1000
> immediately. But still you won't get more than a couple of thousands
> out per second out of Zebra. And even that is not too bad, IMHO.
>
> Z39.50 expert note: if the target does not return all due to message
> sizes or lack of piggyback, ZOOM C will fire the necessary present
> requests.
>
> > write another php extention for it, but it seems that it's something to do
> > with the syncronous mode or something. Not sure. I'll show you the code, if
> > you can spot anything give me a shout.
>
> Async mode should not be an issue.
>
> / Adam
>
Hey Adam,
Thanks for the response, I'll give it a try. I love Zebra, don't get me
wrong. I was just testing it against Lucene, and while Lucene is much
slower on the search (by a long shot), its record retrieval is quite
snappy. Anyhow, thanks a lot. I'll give it a shot.
Daine.

2006-02-10 14:51:57
Daine Mamacos wrote:
> Hey Adam,
> Thanks for the response, I'll give it a try. I love zebra, don't get me wrong.
> I was just testing it against lucene, and while lucene is much slower on the
> search, (by a long shot) it's record retrieval is quite snappy. Anyhow, thanks
> a lot. I'll give it a shot.
>
Please let us know how you get on. There's nothing like a little Lucene
comparison to get the blood flowing around our lunch table (we use sharp
knives in Denmark).
--Sebastian
--
Sebastian Hammer, Index Data
quinn indexdata.com www.indexdata.com
Ph: (603) 209-6853

2006-02-10 15:12:24
On Fri, 10 Feb 2006 09:51:57 -0500, Sebastian Hammer wrote
> Daine Mamacos wrote:
>
> > Hey Adam,
> > Thanks for the response, I'll give it a try. I love zebra, don't get me wrong.
> > I was just testing it against lucene, and while lucene is much slower on the
> > search, (by a long shot) it's record retrieval is quite snappy. Anyhow, thanks
> > a lot. I'll give it a shot.
> >
>
> Please let us know how you get on. There's nothing like a little
> Lucene comparison to get the blood flowing around our lunch table
> (we use sharp knives in Denmark).

Hehehe
I'm pretty determined to use Zebra (there is a little index war at
work)...
I still can't seem to get Zebra to return more than one record at a
time, though. I have set all the options I can think of to achieve this:

    ZOOM_resultset_option_set(results, "presentChunk", "1000");
    ZOOM_resultset_option_set(results, "start", "1");
    ZOOM_resultset_option_set(results, "count", "1000");

It seems it still has to iterate through every result and make a
request for each individual result.
Thanks
Daine Mamacos.
--
random signature