Thread: Re: The ranking is wrong

2007-06-26 13:52:10
Hi Ronny,

Have you looked at your explanation page to see where the
document score is coming from? Often this is very helpful,
especially when the rankings are not what you would expect.

Luke doesn't show you the boosts you set, from my
experience. Don't be concerned if Luke always says 1.
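
One possible reason a tool shows a boost of 1 is sketched below. It is an assumption here, not a confirmed diagnosis of this index: classic Lucene scoring is documented to fold the field boost into a single per-field norm at index time, together with the length normalization, rather than keeping the boost as a separate readable value. The function names are hypothetical and only illustrate the arithmetic.

```python
import math


def length_norm(num_terms):
    """Length normalization: shorter fields get a larger factor."""
    return 1.0 / math.sqrt(num_terms)


def field_norm(field_boost, num_terms, doc_boost=1.0):
    """Assumed index-time norm: the boosts are multiplied into the length norm.

    Only this product would be written to the index, so the original
    boost cannot be read back as its own value afterwards.
    """
    return doc_boost * field_boost * length_norm(num_terms)


# A four-term title with the default boost and with a boost of 2.0.
print(field_norm(1.0, 4))  # 0.5
print(field_norm(2.0, 4))  # 1.0
```

If this assumption holds, a changed boost would show up as a different norm (and a different score on the explanation page) rather than as a boost value in Luke.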

You say that the actual parking document has parking as part
of a combined word. What analyzer are you using? Are you
stemming? If you're only matching exactly, parkingxxxx won't
match parking.  That's just something to keep in mind.
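
To illustrate the exact-match point, here is a minimal sketch in Python. It assumes a simple analyzer that lowercases and splits on non-word characters, with no stemming and no splitting of compound words. The helper names and the sample sentence are made up for illustration; this is not how Nutch itself tokenizes.

```python
import re


def tokenize(text):
    """Lowercase and split on non-word characters (no stemming)."""
    return [t for t in re.split(r"\W+", text.lower()) if t]


def exact_matches(term, text):
    """Count tokens that are exactly equal to the query term."""
    term = term.lower()
    return sum(1 for t in tokenize(text) if t == term)


def substring_matches(term, text):
    """Count tokens that merely contain the query term."""
    term = term.lower()
    return sum(1 for t in tokenize(text) if term in t)


sample = "Parking at the airport: langtidsparking, parkinghouse and parking."
print(exact_matches("parking", sample))      # 2
print(substring_matches("parking", sample))  # 4
```

With exact matching, only the two standalone occurrences count toward the term frequency; the compound words contribute nothing.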

First step I'd suggest: check your explanation page. That
will tell you how many times it's matching each field in
each document.
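
The kind of per-field match count the explanation page reports can be approximated by hand. The sketch below is a rough stand-in, not the Nutch explanation output: it counts whole-token occurrences of a term in each field of a document. The title is taken from this thread; the url and content values are placeholders invented for the example.

```python
import re
from collections import Counter


def field_term_counts(doc, term):
    """Return {field name: number of whole-token occurrences of term}."""
    term = term.lower()
    counts = {}
    for field, value in doc.items():
        tokens = re.findall(r"\w+", value.lower())
        counts[field] = Counter(tokens)[term]
    return counts


# Hypothetical document; only the title comes from the original message.
parking_page = {
    "title": "Parking - Stavanger Airport, Sola",
    "url": "http://example.org/stavanger/parking",
    "content": "Parking near the terminal. Langtidsparking and "
               "parkinghouse are available. Parking is free.",
}

print(field_term_counts(parking_page, "parking"))
# {'title': 1, 'url': 1, 'content': 2}
```

Comparing these counts across the three hits gives a first idea of which fields are driving each score before looking at boosts.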

Good luck, and have a good day,
Ann

----- Original Message ----
From: "Naess, Ronny" <Ronny.Naessavinor.no>
To: nutch-userlucene.apache.org
Sent: Tuesday, June 26, 2007 8:36:58 AM
Subject: The ranking is wrong

I have indexed our intranet with Nutch-0.9.

I do a query 'parking location:stavanger language:no' and I
receive some hits. (two extra fields added)

The Nutch client ranks the hits not quite as expected. 
1. Transport and parking - Stavanger Airport, Sola
2. Frontpage - Stavanger Airport, Sola
3. Parking - Stavanger Airport, Sola

How it should have been
1. Parking - Stavanger Airport, Sola
2. Transport and parking - Stavanger Airport, Sola
3. Frontpage - Stavanger Airport, Sola (should not have been there at
all if possible, but I reckon it is not easy to avoid indexing
navigation menus, since they are part of the page)

The page "Parking - Stavanger Airport, Sola" has parking in the title,
parking in the content (20+ times in some way, mostly combined words
like xxxparking or parkingxxx, but also about 5 times as only parking),
and even parking in the url.

I guess I have to alter the boosting for some fields. I tried to raise
the boost in the index-basic plugin (hardcoding it), but I can't see
any changes in the index. Luke tells me that the field boost is still
1.0 even after I changed it. Am I doing it wrong?

Even if I search only for 'parking', without filtering on location, I
receive a lot of hits, but all of them are the different front pages.
All of these pages seem to have a high boost, outranking the real
parking page(s) big time.

Any help is appreciated.


Best regards, 

Ronny N.