Thread: Re: lucene indexing and merge process




Re: lucene indexing and merge process
2007-10-18 11:57:34
robert engels wrote:
> seek (segment doc no * keylength), read (byte[keylength])
>
> This would be very efficient when using external document storage.

A seek per document in hits is to be avoided.  This is similar to the
way field data is stored, which is, as mentioned in the first message,
"very slow if recall set is large".
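
For illustration, the lookup quoted above (seek to segment doc no * keylength, then read keylength bytes) could be sketched as below. This is a minimal sketch under assumptions, not Lucene code: the class name FixedLengthKeyFile, the readKey method, and the one-file-per-segment layout are hypothetical. It makes the cost under discussion visible: every call performs one seek and one read.

```java
import java.io.IOException;
import java.io.RandomAccessFile;

/** Hypothetical reader for a per-segment file of fixed-length keys (not a Lucene class). */
final class FixedLengthKeyFile implements AutoCloseable {
    private final RandomAccessFile file;
    private final int keyLength;

    FixedLengthKeyFile(String path, int keyLength) throws IOException {
        this.file = new RandomAccessFile(path, "r");
        this.keyLength = keyLength;
    }

    /** One seek plus one read per document, which is the per-hit cost being questioned. */
    byte[] readKey(int segmentDocNo) throws IOException {
        byte[] key = new byte[keyLength];
        // Cast to long before multiplying so large files do not overflow int arithmetic.
        file.seek((long) segmentDocNo * keyLength);
        file.readFully(key);
        return key;
    }

    @Override
    public void close() throws IOException {
        file.close();
    }
}
```

With a large hit set, readKey would be called once per hit, so the number of seeks grows with the number of hits.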

See the "Stored Fields" section of:

http://lucene.apache.org/java/docs/fileformats.html#Fields

Doug



---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org


Re: lucene indexing and merge process
2007-10-18 12:01:09
True, but what is the other option except loading all of them in memory?
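
As a point of comparison, the "load all of them in memory" option could look roughly like the sketch below. It is only an illustration under assumptions: the class InMemoryKeys and its methods are hypothetical, and the key file is assumed to fit in a single Java byte array (under 2 GB). The memory cost is about (number of documents * keylength) bytes per segment, in exchange for lookups that never touch the disk.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;

/** Hypothetical in-memory copy of a fixed-length key file (not a Lucene class). */
final class InMemoryKeys {
    private final byte[] data;
    private final int keyLength;

    InMemoryKeys(Path path, int keyLength) throws IOException {
        // Reads the whole file up front; assumes it fits in one byte array.
        this.data = Files.readAllBytes(path);
        this.keyLength = keyLength;
    }

    /** Returns a copy of the key for the given segment-local document number. */
    byte[] key(int segmentDocNo) {
        int from = segmentDocNo * keyLength;
        return Arrays.copyOfRange(data, from, from + keyLength);
    }

    /** Number of complete keys held in memory. */
    int size() {
        return data.length / keyLength;
    }
}
```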

On Oct 18, 2007, at 11:57 AM, Doug Cutting wrote:

> robert engels wrote:
>> seek (segment doc no * keylength), read (byte[keylength])
>> This would be very efficient when using external document storage.
>
> A seek per document in hits is to be avoided.  This is similar to
> the way field data is stored, which is, as mentioned in the first
> message "very slow if recall set is large".
>
> See the "Stored Fields" section of:
>
> http://lucene.apache.org/java/docs/fileformats.html#Fields
>
> Doug




Re: lucene indexing and merge process
2007-10-18 12:21:51
As a follow-up, it seemed that in the past much of Lucene relied on
the OS disk cache for performance.  The FieldCache seems to go
against this, probably because of the parsing involved.

The 'fixed-length' key file would not need extensive parsing, and
thus seems more suitable for OS level caching.
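
One way to lean on the OS cache for such a file, sketched here purely as an assumption and not something proposed in this thread, is to memory-map it so that reads are served from the OS page cache once the pages are resident. The class MappedKeyFile and its readKey method are hypothetical, and the sketch assumes the file is smaller than 2 GB, the limit for a single mapping in Java.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

/** Hypothetical memory-mapped view of a fixed-length key file (not a Lucene class). */
final class MappedKeyFile {
    private final MappedByteBuffer buffer;
    private final int keyLength;

    MappedKeyFile(Path path, int keyLength) throws IOException {
        try (FileChannel channel = FileChannel.open(path, StandardOpenOption.READ)) {
            // The mapping stays valid after the channel is closed.
            this.buffer = channel.map(FileChannel.MapMode.READ_ONLY, 0, channel.size());
        }
        this.keyLength = keyLength;
    }

    /** Copies one key out of the mapping; no parsing, only offset arithmetic. */
    byte[] readKey(int segmentDocNo) {
        byte[] key = new byte[keyLength];
        // duplicate() gives an independent position, so the shared buffer is not mutated.
        ByteBuffer view = buffer.duplicate();
        view.position(segmentDocNo * keyLength);
        view.get(key);
        return key;
    }
}
```

Pages that are not yet cached still cost a disk read on first access, so this does not remove the seek-per-document concern for a cold file.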

Robert

On Oct 18, 2007, at 11:57 AM, Doug Cutting wrote:

> robert engels wrote:
>> seek (segment doc no * keylength), read (byte[keylength])
>> This would be very efficient when using external document storage.
>
> A seek per document in hits is to be avoided.  This is similar to
> the way field data is stored, which is, as mentioned in the first
> message "very slow if recall set is large".
>
> See the "Stored Fields" section of:
>
> http://lucene.apache.org/java/docs/fileformats.html#Fields
>
> Doug



