
Thread: Norm - please lit it up for me




Norm - please lit it up for me
United Kingdom
2007-10-19 11:39:45
Hi,
 
Could someone help me understand normalization factors for a field?
Also, please tell me in which situations I should omit normalization
factors when adding a document.
 
Many thanks.
 
Dino Korah
 
 
Re: Norm - please lit it up for me
Sweden
2007-10-19 12:07:18
On 19 Oct 2007 at 18:39, Dino Korah wrote:

> Could someone help me understand normalization factors for a field.

doc[0] <text: hello hello hello>
doc[1] <text: hello>

With normalization doc[0] and doc[1] are equally important. Omitting
normalization makes doc[0] (usually) three times as important as doc[1].

> Also please tell me what are the situations where I should omit
> normalization factors when adding a document.

Formula 1A is to omit normalization on fields that always contain a
single term, such as a primary key, timestamp, etc.


-- 
karl


---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Re: Norm - please lit it up for me
Germany
2007-10-19 14:40:32
On Friday 19 October 2007 19:07, Karl Wettin wrote:

> doc[0] <text: hello hello hello>
> doc[1] <text: hello>
>
> With normalization doc[0] and doc[1] are equally important. Omitting
> normalization makes doc[0] (usually) three times as important as doc[1].

Not quite, as the normalization only refers to the length of the
document. But the fact that "hello hello hello" contains the searched
term three times makes it have a larger score no matter if
normalization is set.

However, for this example, both docs have the same score with
f.setOmitNorms(true) when searching for "hello":

doc[0] <text: hello foo foo>
doc[1] <text: hello>

Regards
 Daniel

-- 
http://www.danielnaber.de
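
The example documents from this thread can be worked through numerically.
What follows is only a rough sketch, not Lucene code: it assumes the
formulas of Lucene's classic DefaultSimilarity (term-frequency factor
sqrt(freq), length norm 1/sqrt(number of terms)) and leaves out idf, the
query norm, boosts and the lossy one-byte encoding of stored norms, so
real scores will differ. The class NormSketch and its methods are
hypothetical names used for illustration.

```java
import java.util.Locale;

// Hypothetical helper, not part of Lucene: a simplified single-term score.
final class NormSketch {

    // Term-frequency factor, assumed to follow DefaultSimilarity: sqrt(freq).
    static double tf(int freq) {
        return Math.sqrt(freq);
    }

    // Length norm, assumed to follow DefaultSimilarity: 1 / sqrt(numTerms).
    static double lengthNorm(int numTerms) {
        return 1.0 / Math.sqrt(numTerms);
    }

    // Simplified score of one whitespace-tokenized field for one query term.
    // With omitNorms set, the length norm is replaced by 1.0.
    static double score(String fieldText, String term, boolean omitNorms) {
        String[] tokens = fieldText.trim().split("\\s+");
        int freq = 0;
        for (String token : tokens) {
            if (token.equals(term)) {
                freq++;
            }
        }
        double norm = omitNorms ? 1.0 : lengthNorm(tokens.length);
        return tf(freq) * norm;
    }

    public static void main(String[] args) {
        String[] docs = {"hello hello hello", "hello", "hello foo foo"};
        for (String doc : docs) {
            System.out.printf(Locale.ROOT, "%-18s with norms=%.3f  omitNorms=%.3f%n",
                    doc, score(doc, "hello", false), score(doc, "hello", true));
        }
    }
}
```

Under these assumptions, "hello foo foo" and "hello" get the same value
when norms are omitted, and the shorter document wins when norms are
used. For "hello hello hello" versus "hello", the sqrt(3) term-frequency
factor and the 1/sqrt(3) length norm cancel in this simplified model, so
the two only differ once norms are omitted.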


