On Apr 29, 2006, at 1:59 AM, Marios Skounakis wrote:
> Suppose the user enters the following query using a textbox
> interface: "rate based optimization" (as a phrase query, including
> the quotes). The query is parsed using QueryParser, then it is
> rewritten, and given to the highlighter. Then, method
> getBestTextFragments is called.
>
> The method returns some fragments which contain only one of the
> words in the search phrase. Isn't this wrong? Since this is a
> phrase query, shouldn't the highlighter look for fragments which
> contain all three words, and even more, only for fragments in which
> the three words are adjacent (based on the token stream returned
> by the analyzer)?

"Wrong" is subjective in this case. I personally prefer exact
highlighting based on what matched, not just individual term
extraction. In one project, I converted all queries to a SpanQuery
and used getSpans() to do highlighting in an accurate way. That
particular code is not easily generalizable and was written under
contract, so I cannot share it, but it actually was not very
complex to do.
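
To illustrate the idea only (this is not the contract code mentioned above, and it does not call Lucene), here is a minimal plain-Java sketch of position-based highlighting. It assumes the analyzed text is already available as a list of tokens; in Lucene, the start and end positions would instead come from the spans returned by getSpans(). The class name PhraseSpanHighlighter, its methods, and the <B> markers are hypothetical choices made for this example.

```java
import java.util.ArrayList;
import java.util.List;

public class PhraseSpanHighlighter {

    /**
     * Returns [start, end) token positions at which the phrase terms
     * occur adjacently and in order. Matches do not overlap.
     */
    public static List<int[]> findPhraseSpans(List<String> tokens, List<String> phrase) {
        List<int[]> spans = new ArrayList<>();
        if (phrase.isEmpty()) {
            return spans;
        }
        int i = 0;
        while (i + phrase.size() <= tokens.size()) {
            if (tokens.subList(i, i + phrase.size()).equals(phrase)) {
                spans.add(new int[] { i, i + phrase.size() });
                i += phrase.size(); // continue after this match
            } else {
                i++;
            }
        }
        return spans;
    }

    /**
     * Joins the tokens with single spaces and wraps only complete
     * phrase matches in <B>...</B>; isolated phrase terms stay plain.
     */
    public static String highlight(List<String> tokens, List<String> phrase) {
        List<int[]> spans = findPhraseSpans(tokens, phrase);
        StringBuilder sb = new StringBuilder();
        int next = 0; // index of the next span to open or close
        for (int i = 0; i < tokens.size(); i++) {
            if (i > 0) {
                sb.append(' ');
            }
            if (next < spans.size() && spans.get(next)[0] == i) {
                sb.append("<B>");
            }
            sb.append(tokens.get(i));
            if (next < spans.size() && spans.get(next)[1] == i + 1) {
                sb.append("</B>");
                next++;
            }
        }
        return sb.toString();
    }
}
```

With the tokens "a rate based optimization of rate limits" and the phrase "rate based optimization", only the three adjacent words are marked; the later, isolated "rate" is left alone, which is the behavior the original question asks for.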
Erik