Hi Lucene experts,
I'm having a problem with Sort performance during searches. I'm using Lucene 1.9.1.
I need to sort by a date field in the document. When I use the default Sort.RELEVANCE, query response time is ~6ms. However, when I specify a sort, e.g. Searcher.search(query, new Sort("mydatefield")), the query response time increases by a factor of 10 to 20. Also, CPU usage shoots up to nearly 90%. Is this expected behavior?
I thought the default sort and sort by field should perform roughly the same when the values are cached in memory, since they both have to do a top-K ranking over the same number of raw hits. The performance gets disproportionately worse as I increase the number of parallel threads that query the same Searcher object.
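To make the top-K reasoning above concrete, here is a minimal sketch in plain Java. It is not taken from Lucene's source; the class name `TopKSketch`, the method `topK`, and the `keyByDoc` array are hypothetical names used only for illustration. The assumption is that each hit has a key already in memory: the score for a relevance sort, or the cached date value for a sort by field. Under that assumption the selection loop is the same in both cases.

```java
import java.util.Comparator;
import java.util.PriorityQueue;

class TopKSketch {
    /**
     * Returns the ids of the k hits with the largest key, best first.
     * keyByDoc[doc] stands in for either a score or a cached field value
     * (an assumption for this sketch, not Lucene's actual data structure).
     */
    static int[] topK(int[] hits, final long[] keyByDoc, int k) {
        // Min-heap on the key: the root is the weakest hit currently kept.
        Comparator<Integer> weakestFirst =
                (a, b) -> Long.compare(keyByDoc[a], keyByDoc[b]);
        PriorityQueue<Integer> heap =
                new PriorityQueue<>(Math.max(1, k), weakestFirst);

        for (int doc : hits) {
            if (heap.size() < k) {
                heap.add(doc);                       // still filling the first k slots
            } else if (k > 0 && keyByDoc[doc] > keyByDoc[heap.peek()]) {
                heap.poll();                         // drop the weakest kept hit
                heap.add(doc);                       // keep the stronger one
            }
        }

        // Drain weakest-first into the array from the back, so index 0 is the best.
        int[] out = new int[heap.size()];
        for (int i = out.length - 1; i >= 0; i--) {
            out[i] = heap.poll();
        }
        return out;
    }
}
```

Each hit costs one array lookup and at most one heap update of size k, regardless of which key is used. This sketch keeps the largest keys; an ascending sort would flip the comparisons.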
Also, in my previous experience with sorting by a field in Lucene, I seem to remember there being a preload time when you first search with a sort by field, sometimes taking 30 seconds or so to load all of the field's values into the in-memory cache associated with the Searcher object. This initial preload time doesn't seem to be happening in my case. Does that mean that for some reason Lucene is not caching the field values?
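One way to check whether a one-time preload is happening is to time the same sorted search several times in a row and compare the first run with the later ones. The sketch below is plain Java with no Lucene dependency; `WarmupTimer` and `timeRuns` are hypothetical names, and the `Runnable` is assumed to wrap the real call, e.g. Searcher.search(query, new Sort("mydatefield")), with whatever exception handling it needs.

```java
class WarmupTimer {
    /**
     * Runs the given search `runs` times on the calling thread and returns
     * the elapsed time of each run in nanoseconds, in execution order.
     */
    static long[] timeRuns(Runnable search, int runs) {
        long[] elapsed = new long[runs];
        for (int i = 0; i < runs; i++) {
            long start = System.nanoTime();
            search.run();                            // the sorted search under test
            elapsed[i] = System.nanoTime() - start;  // wall-clock cost of this run
        }
        return elapsed;
    }
}
```

A first run that is much slower than the rest would point to a one-time load of the field values. Runs that are all about equally slow would suggest the cost is being paid on every query.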
I have an index of 1 million documents, taking up about 1.7 GB of disk space. I specify -Xmx2000m when running my Java search application.
Any advice or insight would be much appreciated.
Thanks,
~Heng