Thread: Lucene ID and scoring.




Lucene ID and scoring.
user name
2007-10-11 08:05:24
Hi All,

I have two questions.

1- Lucene generates an ID for every document it indexes, and it does
so in order: 0, 1, 2, 3, ..., n. Can I make Lucene use my own ID (say,
a randomly generated unique ID for each document) as the Lucene ID?

The reason I am asking is that a Lucene filter uses the Lucene ID (to
create a BitSet), and I don't want to keep multiple IDs.

2- Is there any way to change Lucene's scoring formula without
changing the Lucene core?

I know one way is to extend Similarity, and that's really simple, but
it only helps with formulae in which tf and idf are multiplied.

For example, a formula like tf^n * idf^m can easily be implemented
using Similarity.

But what about the formula tf^n + idf^m?

Is there a simple way to implement the above without changing the
Lucene core? (I know one way is to change the TermScorer and hard-code
the + sign, but that's too naive.)
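
To make the distinction concrete, here is a minimal, self-contained
Java sketch. It does not use the Lucene API: tfFactor and idfFactor are
hypothetical stand-ins for the two hooks a Similarity subclass can
reshape, and productScore / sumScore are hypothetical stand-ins for the
combining step that lives in the scorer. The idf shape used here is an
assumption chosen only for illustration.

```java
class ScoringSketch {

    // Hypothetical stand-in for a tf hook: raises the term frequency to the power n.
    static double tfFactor(double freq, double n) {
        return Math.pow(freq, n);
    }

    // Hypothetical stand-in for an idf hook: raises an assumed idf shape,
    // log(numDocs / (docFreq + 1)) + 1, to the power m.
    static double idfFactor(int docFreq, int numDocs, double m) {
        double idf = Math.log((double) numDocs / (docFreq + 1)) + 1.0;
        return Math.pow(idf, m);
    }

    // Combining step as a product: tf^n * idf^m.
    // Changing n or m only touches the two factor methods above.
    static double productScore(double tf, double idf) {
        return tf * idf;
    }

    // Combining step as a sum: tf^n + idf^m.
    // This changes how the factors are combined, not the factors themselves.
    static double sumScore(double tf, double idf) {
        return tf + idf;
    }

    public static void main(String[] args) {
        double tf = tfFactor(4.0, 0.5);     // term occurs 4 times, n = 0.5
        double idf = idfFactor(9, 10, 2.0); // 9 of 10 docs contain the term, m = 2
        System.out.println("product: " + productScore(tf, idf));
        System.out.println("sum:     " + sumScore(tf, idf));
    }
}
```

The exponents can be changed inside the two factor methods alone, but
going from productScore to sumScore changes the combining step, which
is why reshaping the factors is not enough for the additive formula.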

Thanks a lot ...
Sandeep

-- 
SANDEEP CHAWLA
House No- 23
10th main
BTM 1st  Stage
Bangalore Mobile: 91-9986150603
Re: Lucene ID and scoring.
user name
2007-10-16 08:35:18
Can't answer the second question, but the answer to the first is "no".
Not only are Lucene IDs internally generated, but they change when
you delete/optimize.

Would caching filters help? Especially if you pre-computed them at,
say, warm-up?
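
As a rough illustration of that idea, here is a self-contained Java
sketch that pre-computes a BitSet from application-level IDs and caches
it by name. It does not use the Lucene API: ExternalIdFilterCache, its
externalIdByDocNum array (a stand-in for reading your own ID field for
each internal document number), and bitsFor are all hypothetical names.
Since internal IDs change on delete/optimize, the sketch assumes the
whole cache is thrown away and rebuilt whenever the index is reopened.

```java
import java.util.BitSet;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

class ExternalIdFilterCache {

    // externalIdByDocNum[docNum] holds the application's own ID for the
    // document currently at that internal number (hypothetical stand-in
    // for a stored field read from the index).
    private final String[] externalIdByDocNum;

    // One pre-computed bit set per named filter.
    private final Map<String, BitSet> cache = new HashMap<>();

    ExternalIdFilterCache(String[] externalIdByDocNum) {
        this.externalIdByDocNum = externalIdByDocNum;
    }

    // Returns the cached bit set for filterName, computing it on first use.
    // Bit docNum is set when that document's own ID is in allowedIds.
    BitSet bitsFor(String filterName, Set<String> allowedIds) {
        return cache.computeIfAbsent(filterName, name -> {
            BitSet bits = new BitSet(externalIdByDocNum.length);
            for (int docNum = 0; docNum < externalIdByDocNum.length; docNum++) {
                if (allowedIds.contains(externalIdByDocNum[docNum])) {
                    bits.set(docNum);
                }
            }
            return bits;
        });
    }
}
```

With this shape the application only ever works with its own IDs; the
internal numbers appear only inside the cached bit sets, which are
rebuilt (for example at warm-up) rather than stored.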

What problem are you trying to solve anyway? A statement of the
problem might get you better answers.

Best
Erick

