On 6-Nov-07, at 11:08 AM, Jonathan ter Horst wrote:
> I can't figure out why running the same query twice in a row using facets
> takes the same amount of time:
>
> INFO: /select facet=true&fl=pk_i,score&facet.mincount=1&q=(510)+AND+type_t:Candidate&facet.limit=-1&facet.field=company_facet&qt=standard&wt=ruby 0 6670
> Nov 6, 2007 12:54:59 PM org.apache.solr.core.SolrCore execute
> INFO: /select facet=true&fl=pk_i,score&facet.mincount=1&q=(510)+AND+type_t:Candidate&facet.limit=-1&facet.field=company_facet&qt=standard&wt=ruby 0 6659
>
> If I disable facets, the query is instantaneous. Based on my simplistic
> reading of SimpleFacets.java, the calls to numDocs should be placing things
> in the filter cache, no?
It does. Are you sure that the filter cache wasn't already warmed, and if
not, that it is large enough to house the number of unique values you are
faceting on? Check the cache statistics on the admin GUI. Are there large
numbers of evictions?

Alternatively, is company_facet multi- or single-valued? If the latter,
the filter cache is not used at all.
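
To make the eviction point concrete, here is a rough sketch in plain Python
(not Solr code). It assumes the cache evicts least-recently-used entries and
that faceting walks every unique value once per request. The function name
and the sizes (1000 values, 512 and 1024 entries) are made up for
illustration only.

```python
from collections import OrderedDict


def simulate_facet_passes(unique_values, cache_size, passes=2):
    """Enumerate every facet value once per pass through an LRU cache.

    Returns one (hits, misses, evictions) tuple per pass.
    """
    cache = OrderedDict()
    stats = []
    for _ in range(passes):
        hits = misses = evictions = 0
        for value in range(unique_values):
            if value in cache:
                cache.move_to_end(value)  # mark as most recently used
                hits += 1
            else:
                misses += 1
                cache[value] = True  # stand-in for the cached doc set
                if len(cache) > cache_size:
                    cache.popitem(last=False)  # evict least recently used
                    evictions += 1
        stats.append((hits, misses, evictions))
    return stats


# Hypothetical sizes: 1000 unique facet values in both cases.
too_small = simulate_facet_passes(unique_values=1000, cache_size=512)
big_enough = simulate_facet_passes(unique_values=1000, cache_size=1024)

print(too_small)   # [(0, 1000, 488), (0, 1000, 1000)]
print(big_enough)  # [(0, 1000, 0), (1000, 0, 0)]
```

With the undersized cache the second pass gets no hits at all, because each
entry is evicted before it is asked for again. That would be consistent with
two identical requests taking the same time, and it would show up as a high
eviction count in the cache statistics.
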
-Mike
> More generally, does anyone have any pointers about further avenues for
> optimization? Our index is fairly large (~6 million records) and I'm trying
> to gin up ways to make the facets portion run faster. Most of the
> suggestions I have seen revolve around computing the intersections of
> cached BitSets. But, again, it seems like that functionality is now in Solr
> core, since all calls from SimpleFacets to SolrIndexSearcher are cache-aware.
>
> One nice feature of our database is that it changes very infrequently, so
> there is a lot of opportunity for precomputing; I just need to think of the
> best way to do it. If anyone has any experience here, I'd also be grateful.
> Thanks for any help!
>
> --
> Jonathan
> terhorst gmail.com