Thread: Default maxBufferedDocs




Default maxBufferedDocs
2007-08-28 20:25:09
example/solr/conf/solrconfig.xml:

<maxBufferedDocs>1000</maxBufferedDocs>

Anyone else think that this might be a tad high?  Lucene ships with
MBD==10.

-Mike

Re: Default maxBufferedDocs
2007-08-30 16:33:04
> example/solr/conf/solrconfig.xml:
>
> <maxBufferedDocs>1000</maxBufferedDocs>
>
> Anyone else think that this might be a tad high?  Lucene ships with
> MBD==10.

A lot of the settings in the original example config/schema came from
one particular index we had at CNET ... I think it would make sense to
change almost any settings that have a hardcoded default in code to
match the hardcoded default.


-Hoss

Re: Default maxBufferedDocs
2007-08-31 09:13:01
On 8/30/07, Chris Hostetter <hossman_lucenefucit.org> wrote:
> > example/solr/conf/solrconfig.xml:
> >
> > <maxBufferedDocs>1000</maxBufferedDocs>
> >
> > Anyone else think that this might be a tad high?  Lucene ships
> > with MBD==10.
>
> A lot of the settings in the original example config/schema came
> from one particular index we had at CNET ... I think it would make
> sense to change almost any settings that have a hardcoded default
> in code to match the hardcoded default.

I don't think Solr should necessarily use the same defaults as Lucene.
An MBD of 10 performed much worse for the average Solr collection.
In the next release, I think the default should be to flush by memory
(probably at the 32MB level), since it will give good performance at
reasonable memory usage regardless of document size.

-Yonik
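
For reference, flushing by memory as suggested above might look like the
following in solrconfig.xml. This is only a sketch, not a shipped default
at the time of this thread: it assumes a Solr build on Lucene 2.3 or later
that supports a <ramBufferSizeMB> element, and 32 is simply the figure
floated in the message.

```xml
<!-- Sketch only: assumes a Solr build on Lucene 2.3+ that supports
     ramBufferSizeMB; placement under indexDefaults follows the
     example config. -->
<indexDefaults>
  <!-- Flush buffered documents once they occupy about 32 MB of RAM,
       however many documents that turns out to be. -->
  <ramBufferSizeMB>32</ramBufferSizeMB>
  <!-- Count based flushing, left out so memory is the only trigger:
  <maxBufferedDocs>1000</maxBufferedDocs>
  -->
</indexDefaults>
```

With a memory threshold, small documents are batched in large numbers and
large documents in small numbers, which is the "regardless of document
size" property described above.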

Re: Default maxBufferedDocs
2007-08-31 14:15:49
On 31-Aug-07, at 7:13 AM, Yonik Seeley wrote:

> On 8/30/07, Chris Hostetter <hossman_lucenefucit.org> wrote:
>>> example/solr/conf/solrconfig.xml:
>>>
>>> <maxBufferedDocs>1000</maxBufferedDocs>
>>>
>>> Anyone else think that this might be a tad high?  Lucene ships
>>> with MBD==10.
>>
>> A lot of the settings in the original example config/schema came
>> from one particular index we had at CNET ... I think it would make
>> sense to change almost any settings that have a hardcoded default
>> in code to match the hardcoded default.
>
> I don't think Solr should necessarily use the same defaults as Lucene.
> An MBD of 10 performed much worse for the average Solr collection.
> In the next release, I think the default should be to flush by memory
> (probably at the 32MB level), since it will give good performance at
> reasonable memory usage regardless of document size.

I agree that flush by mem is the best option, but I wasn't sure if
we'd end up doing a release before Lucene 2.3 or not.

In general I am okay with the Solr defaults being different from
Lucene defaults (I'd expect library code to be more conservative).
1000, though, is really big for decent-sized docs (like web pages
with lots of metadata fields), especially since things like token
filters for the buffered docs get kept around until they are flushed
(WhitespaceTokenizer, for instance, allocates 1.2kB of buffers per
instance).  100, perhaps (assuming it isn't moot)?

-Mike
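
If count-based flushing stays in place for now, the more conservative
value suggested above would be a one-line change to the example config.
A sketch, assuming the setting sits under <indexDefaults> as in the
example solrconfig.xml; 100 is the proposal from this message, not a
tested recommendation.

```xml
<indexDefaults>
  <!-- Flush after 100 buffered documents instead of 1000, to bound the
       memory held by large documents and their per-document analysis
       buffers. -->
  <maxBufferedDocs>100</maxBufferedDocs>
</indexDefaults>
```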
