ON FRIDAY 19 OCTOBER 2007 19:07, KARL WETTIN WROTE:
> DOC[0] <TEXT: HELLO HELLO HELLO>
> DOC[1] <TEXT: HELLO>
>
> WITH NORMALIZATION DOC[0] AND DOC[1] ARE EQUALLY IMPORTANT. OMITTING
> NORMALIZATION MAKES DOC[0] (USUALLY) THREE TIMES AS IMPORTANT AS DOC[1].

NOT QUITE, AS THE NORMALIZATION ONLY REFERS TO THE LENGTH OF THE DOCUMENT.
BUT THE FACT THAT "HELLO HELLO HELLO" CONTAINS THE SEARCHED TERM THREE TIMES
CONTRIBUTES A LARGER TERM-FREQUENCY FACTOR TO ITS SCORE NO MATTER IF
NORMALIZATION IS SET.

HOWEVER, FOR THIS EXAMPLE, BOTH DOCS HAVE THE SAME SCORE WITH
f.setOmitNorms(true) WHEN SEARCHING FOR "HELLO":
DOC[0] <TEXT: HELLO FOO FOO>
DOC[1] <TEXT: HELLO>
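
A ROUGH SKETCH OF WHY THESE TWO TIE, ASSUMING THE CLASSIC DEFAULT FACTORS
(TF = SQRT(FREQ), LENGTH NORM = 1/SQRT(NUMBER OF TERMS)) AND IGNORING IDF,
BOOSTS, QUERY NORMALIZATION AND THE LOSSY ONE-BYTE NORM ENCODING. THIS IS A
TOY MODEL IN PLAIN JAVA, NOT LUCENE CODE; THE CLASS AND METHOD NAMES BELOW
ARE MADE UP FOR ILLUSTRATION:

```java
import java.util.Arrays;

// Toy model only: it does not call Lucene. All names here are hypothetical.
class OmitNormsSketch {

    // Assumed tf factor: square root of how often the term occurs in the field.
    static double tf(String term, String[] tokens) {
        long freq = Arrays.stream(tokens).filter(term::equals).count();
        return Math.sqrt(freq);
    }

    // Assumed length norm: 1 / sqrt(number of tokens in the field).
    static double lengthNorm(String[] tokens) {
        return 1.0 / Math.sqrt(tokens.length);
    }

    // Simplified single-term score: tf times the length norm,
    // with the norm fixed at 1.0 when norms are omitted.
    static double score(String term, String text, boolean omitNorms) {
        String[] tokens = text.trim().split("\\s+");
        double norm = omitNorms ? 1.0 : lengthNorm(tokens);
        return tf(term, tokens) * norm;
    }

    public static void main(String[] args) {
        String[] docs = {"hello foo foo", "hello"};
        for (String doc : docs) {
            System.out.printf("%-15s withNorms=%.3f omitNorms=%.3f%n",
                    doc, score("hello", doc, false), score("hello", doc, true));
        }
    }
}
```

IN THIS MODEL THE TWO DOCUMENTS TIE ONLY WHEN NORMS ARE OMITTED. WITH NORMS,
THE SHORTER DOCUMENT SCORES HIGHER, BECAUSE THE SCORE OF "HELLO FOO FOO" IS
SCALED DOWN BY 1/SQRT(3).
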
REGARDS
DANIEL
--
HTTP://WWW.DANIELNABER.DE