On 5-Oct-07, at 2:06 PM, Kyle Banerjee wrote:
> Howdy all,
>
> We are attempting to provide access to about 8 million records of
> highly variable quality and length. In a nutshell, we are trying to
> find a way to deprioritize "suspect" records without discriminating
> against useful records that happen to be short. We do not wish to
> eliminate suspect records from the results -- just deprioritize them a
> bit.
>
> We have been indexing a field that marks a record as likely to be good
> or bad, and I'm trying to figure out the most efficient way to use it
> (should I be trying this at all?). As a newbie, my first inclination
> was to OR the search terms with the same terms combined with a "good
> record marker" with a modest boost.
>
> However, this method seems really clunky, and I'm wondering if there's
> a better way to accomplish what we're trying to do.
>
> Thanks,

If you know at index time that the document is shady, the easiest way
to de-emphasize it globally is to set the document boost to a value
below one (the default is 1.0):
<doc boost="0.5">...
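
For context, here is roughly what a full update message could look like.
This is only a sketch: the field names (id, title, quality) and their
values are made up for illustration, so substitute whatever your schema
defines.

```xml
<!-- Sketch of an update message; all field names and values are hypothetical. -->
<add>
  <!-- Suspect record: index-time boost below 1.0 lowers its score. -->
  <doc boost="0.5">
    <field name="id">rec-0001</field>
    <field name="title">Short, possibly unreliable record</field>
    <field name="quality">suspect</field>
  </doc>
  <!-- Good record: no boost attribute, so the default of 1.0 applies. -->
  <doc>
    <field name="id">rec-0002</field>
    <field name="title">Short but useful record</field>
    <field name="quality">good</field>
  </doc>
</add>
```

The boost is folded into the field norms when the document is indexed, so
it only takes effect on fields that keep norms, and changing it means
reindexing the affected documents.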
cheers,
-Mike