Hi guys
My company is looking into using Nutch as a search engine framework to
build upon. Everything seems to be in place except for one obstacle:
there seems to be very little information on modifying, extending, or
replacing the ranking algorithm.
From searching around the web, I think Nutch only allows you to apply
filters to Lucene's results. Is this the case? How easy would it be to
replace the ranking algorithm from the index upwards? Is, for example,
PageRank/HITS-like analysis or Bayesian spam filtering possible? Would
the current implementation be working against us while adding these
features?
We would be willing to release our general work on making this stuff
easier back to the community, if possible. Of course we would also
comply with any parts of the Apache license that apply.
Many thanks
-Robin Haswell