List Info

Thread: Re: Field methods and usage




Re: Field methods and usage
user name
2007-01-31 18:48:55
31 jan 2007 kl. 12.25 skrev Christoph Pächter:
>
> I was wondering, if there is anywhere a table (similar
to Table 1.2  
> An overview
> of different field types, their characteristics, and
their usage in  
> Lucene in
> Action), listing the possible methods and their usage.

Implementations will differ, for example:

>
> Store   |TermVector              |Index         
|reasonable |Usage
> YES     |NO                      |NO             |1    
     |URLs
>                                                        
     | 
> telephone number

You never have to store anything in the index, perhaps that 

information is persistent somewhere else?

If you use a term vector or not depends very little on what
kind of  
information you store in there, it is up to what analysis
you plan to  
include the documents in. Highlighting? More like this?
Neural networks?

Some are more than happy with one large token. Other people
might  
want to tokenize the exact same information.

An URL in [protocol://host:port/path], a phone number in
country-,  
area, and district parts.

It really up to each and every implementer to decide what
settings is  
best for them.

Also, a Lucene index is not made up of static rows and
columns the  
way a relational database is. The spoon does not exists. You
can bend  
it any way you want. Documents in a corpus can share field
that share  
names but not settings. Perhaps you only want to index phone
numbers  
in a specific area code.

-- 
karl
------------------------------------------------------------
---------
To unsubscribe, e-mail: java-user-unsubscribelucene.apache.org
For additional commands, e-mail: java-user-helplucene.apache.org


[1]

about | contact  Other archives ( Real Estate discussion Medical topics )