Thread: Re: Per-document Payloads (was: Re: lucene indexing and merge process)




Re: Per-document Payloads (was: Re: lucene indexing and merge process)
2007-10-20 09:51:16
On 10/20/07, Grant Ingersoll <gsingersapache.org> wrote:
> I think one of the questions that will come up from users is when
> should I use addMetadata and when should I use addField?  Why make
> the distinction to the user?  Fields have always represented
> metadata, all you're doing is optimizing the internal storage of them.
> So from an interface side of things, I would just make it a new Field
> type.

Same thing occurred to me...
Fieldable.isStoredSeparately()?

I wouldn't mind this byte[] access to any type of field stored
separately (non binary fields too).  What about switching from char
counts to byte counts for indexed (String) fields that are stored
separately?

I guess fields that were stored separately would not be returned
unless asked for by name?

-Yonik

---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org


Re: Per-document Payloads (was: Re: lucene indexing and merge process)
2007-10-20 10:09:07
On 10/20/07, Yonik Seeley <yonikapache.org> wrote:
> What about switching from char
> counts to byte counts for indexed (String) fields that are stored
> separately?

In fact, what about switching to byte counts for all stored fields?
It should be much easier than the full-blown byte-counts for the term
index since it only involves stored fields.  It should make skipping
fields (lazy field loading) much faster too.
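
To illustrate why a byte count makes skipping cheaper, here is a minimal
sketch. It assumes a simplified layout (a 4-byte length prefix followed by
standard UTF-8 bytes), which is not Lucene's actual stored-fields format;
the class and method names are hypothetical. With a char count, the reader
has to look at every lead byte to find where the value ends; with a byte
count, it can jump straight past the value.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Illustrative only: a simplified layout, not Lucene's real file format.
final class StoredFieldSkipDemo {

    // Layout A: [int char count][UTF-8 bytes]; count is in UTF-16 code units.
    static void writeCharCounted(ByteBuffer out, String value) {
        out.putInt(value.length());
        out.put(value.getBytes(StandardCharsets.UTF_8));
    }

    // Layout B: [int byte count][UTF-8 bytes].
    static void writeByteCounted(ByteBuffer out, String value) {
        byte[] utf8 = value.getBytes(StandardCharsets.UTF_8);
        out.putInt(utf8.length);
        out.put(utf8);
    }

    // Skipping layout A: the byte length is unknown, so every lead byte
    // must be inspected to learn how wide each character is.
    static void skipCharCounted(ByteBuffer in) {
        int charsLeft = in.getInt();
        while (charsLeft > 0) {
            int lead = in.get() & 0xFF;
            int continuationBytes;
            if (lead < 0x80) {            // 1-byte sequence, 1 char
                continuationBytes = 0;
                charsLeft -= 1;
            } else if (lead < 0xE0) {     // 2-byte sequence, 1 char
                continuationBytes = 1;
                charsLeft -= 1;
            } else if (lead < 0xF0) {     // 3-byte sequence, 1 char
                continuationBytes = 2;
                charsLeft -= 1;
            } else {                      // 4-byte sequence, surrogate pair
                continuationBytes = 3;
                charsLeft -= 2;
            }
            in.position(in.position() + continuationBytes);
        }
    }

    // Skipping layout B: a single position update, independent of content.
    static void skipByteCounted(ByteBuffer in) {
        int byteCount = in.getInt();
        in.position(in.position() + byteCount);
    }

    public static void main(String[] args) {
        String value = "h\u00e9llo w\u00f6rld \u20ac";

        ByteBuffer a = ByteBuffer.allocate(64);
        writeCharCounted(a, value);
        a.flip();
        skipCharCounted(a);

        ByteBuffer b = ByteBuffer.allocate(64);
        writeByteCounted(b, value);
        b.flip();
        skipByteCounted(b);

        // Both readers should end up just past the value.
        System.out.println("same end position: " + (a.position() == b.position()));
    }
}
```

The char-counted skip does work proportional to the length of the value,
while the byte-counted skip is a constant-time seek, which is what would
help lazy field loading.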

-Yonik



Re: Per-document Payloads (was: Re: lucene indexing and merge process)
2007-10-20 11:42:50
On Oct 20, 2007, at 10:51 AM, Yonik Seeley wrote:

> On 10/20/07, Grant Ingersoll <gsingersapache.org> wrote:
>> I think one of the questions that will come up from users is when
>> should I use addMetadata and when should I use addField?  Why make
>> the distinction to the user?  Fields have always represented
>> metadata, all you're doing is optimizing the internal storage of them.
>> So from an interface side of things, I would just make it a new Field
>> type.
>
> Same thing occurred to me...
> Fieldable.isStoredSeparately()?
>
> I wouldn't mind this byte[] access to any type of field stored
> separately (non binary fields too).  What about switching from char
> counts to byte counts for indexed (String) fields that are stored
> separately?
>
> I guess fields that were stored separately would not be returned
> unless asked for by name?

Right, I would think the typical use case would be that you want all
the "small" fields to be returned w/ the document and the large fields
to be lazily loaded.  I think it should be seamless to the user.
Perhaps we could have a threshold value upon indexing, such that all
fields below it are treated as small and all above it as large; then
at retrieval time we just compare the byte count to the threshold and
lazy load the large fields.
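
A rough sketch of the threshold idea, under stated assumptions: none of
these classes exist in Lucene, the names are hypothetical, and treating a
field whose size equals the threshold as "small" is an arbitrary choice.
Each stored entry carries its byte count, so the retrieval side can decide
whether to read the bytes right away or defer the read until first access.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Supplier;

// Illustrative only: hypothetical types, not part of the Lucene API.
final class ThresholdLazyLoadDemo {

    // One stored field: its name, its size in bytes, and a way to read it.
    static final class StoredEntry {
        final String name;
        final int byteCount;
        final Supplier<byte[]> reader;

        StoredEntry(String name, int byteCount, Supplier<byte[]> reader) {
            this.name = name;
            this.byteCount = byteCount;
            this.reader = reader;
        }
    }

    // A field handed back to the caller; large fields load on first access.
    static final class LoadedField {
        private final Supplier<byte[]> reader;
        private byte[] value; // null until the bytes have been read

        LoadedField(Supplier<byte[]> reader, boolean eager) {
            this.reader = reader;
            if (eager) {
                this.value = reader.get();
            }
        }

        boolean isLoaded() {
            return value != null;
        }

        byte[] value() {
            if (value == null) {
                value = reader.get(); // deferred read happens here, once
            }
            return value;
        }
    }

    // Compare each field's byte count to the threshold: small fields are
    // read immediately, large ones are left for lazy loading.
    static Map<String, LoadedField> loadDocument(List<StoredEntry> entries,
                                                 int thresholdBytes) {
        Map<String, LoadedField> doc = new LinkedHashMap<>();
        for (StoredEntry e : entries) {
            boolean small = e.byteCount <= thresholdBytes;
            doc.put(e.name, new LoadedField(e.reader, small));
        }
        return doc;
    }

    public static void main(String[] args) {
        List<StoredEntry> entries = Arrays.asList(
            new StoredEntry("title", 12, () -> new byte[12]),
            new StoredEntry("body", 50_000, () -> new byte[50_000]));

        Map<String, LoadedField> doc = loadDocument(entries, 1024);
        for (Map.Entry<String, LoadedField> f : doc.entrySet()) {
            System.out.println(f.getKey() + " loaded eagerly: " + f.getValue().isLoaded());
        }
    }
}
```

The caller sees the same `value()` accessor for every field, which is one
way the small/large distinction could stay invisible to the user.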

Just a thought.  There are probably several ways this could be handled.

-Grant


