Thread: Highlighter and complex queries




Highlighter and complex queries
user name
2006-04-29 05:59:55
Hi all,

Suppose the user enters the following query using a textbox interface:
"rate based optimization" (as a phrase query, including the quotes).
The query is parsed using QueryParser, then rewritten, and given to the
highlighter. Then the method getBestTextFragments is called.

The method returns some fragments which contain only one of the words
in the search phrase. Isn't this wrong? Since this is a phrase query,
shouldn't the highlighter look for fragments which contain all three
words, and moreover only for fragments in which the three words are
adjacent (based on the token stream returned by the analyzer)?

Thanks in advance,
Marios
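
A minimal sketch, in plain Java without Lucene, of why term-by-term
highlighting produces the fragments described above. The class and
method names are hypothetical and this is not the Highlighter's actual
code; it only assumes that each word of the phrase is matched
independently, with tokens split on whitespace.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical illustration; not part of Lucene.
class TermHighlightSketch {

    // Wraps every token that equals any single query term in <B>..</B>,
    // ignoring whether the terms occur next to each other.
    static String highlightTerms(String text, Set<String> queryTerms) {
        List<String> out = new ArrayList<>();
        for (String token : text.split("\\s+")) {
            if (queryTerms.contains(token.toLowerCase())) {
                out.add("<B>" + token + "</B>");
            } else {
                out.add(token);
            }
        }
        return String.join(" ", out);
    }

    public static void main(String[] args) {
        // The three words of the phrase, treated as independent terms.
        Set<String> terms =
            new HashSet<>(Arrays.asList("rate", "based", "optimization"));
        // This text does not contain the phrase, yet "rate" is still marked.
        System.out.println(highlightTerms("the interest rate went up", terms));
    }
}
```

Because the phrase is reduced to a set of words, a fragment containing
only "rate" still counts as a hit.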

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org

Highlighter and complex queries
user name
2006-04-29 07:06:51
On Apr 29, 2006, at 1:59 AM, Marios Skounakis wrote:
> Suppose the user enters the following query using a textbox
> interface: "rate based optimization" (as a phrase query, including
> the quotes). The query is parsed using QueryParser, then it is
> rewritten, and given to the highlighter. Then, method
> getBestTextFragments is called.
>
> The method returns some fragments which contain only one of the
> words in the search phrase. Isn't this wrong? Since this is a
> phrase query, shouldn't the highlighter look for fragments which
> contain all three words, and even more, only for fragments in which
> the three words are adjacent (based on the token stream returned
> by the analyzer)?

"Wrong" is subjective in this case. I personally prefer exact
highlighting based on what matched, not just individual term
extraction. I have, in one project, converted all queries to a
SpanQuery and used getSpans() to do highlighting in an accurate way.
This particular code is not easily generalizable and was written
under contract, so I cannot share it, but it was actually not very
complex to do.

	Erik
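
Since the code mentioned above was not shared, the following is only a
sketch of the general idea of position-based highlighting, written in
plain Java without Lucene. All names are hypothetical. It assumes the
text has already been tokenized, and it stands in for what spans
report in Lucene: start and end token positions of an actual match.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration; not part of Lucene and not Erik's code.
class PhraseSpanSketch {

    // Returns [start, end) token positions where the phrase terms occur
    // consecutively and in order.
    static List<int[]> findPhraseSpans(List<String> tokens, List<String> phrase) {
        List<int[]> spans = new ArrayList<>();
        if (phrase.isEmpty()) {
            return spans;
        }
        for (int start = 0; start + phrase.size() <= tokens.size(); start++) {
            if (tokens.subList(start, start + phrase.size()).equals(phrase)) {
                spans.add(new int[] {start, start + phrase.size()});
            }
        }
        return spans;
    }

    // Highlights only the tokens that fall inside a matched span.
    static String highlightSpans(List<String> tokens, List<int[]> spans) {
        boolean[] hit = new boolean[tokens.size()];
        for (int[] span : spans) {
            for (int i = span[0]; i < span[1]; i++) {
                hit[i] = true;
            }
        }
        List<String> out = new ArrayList<>();
        for (int i = 0; i < tokens.size(); i++) {
            out.add(hit[i] ? "<B>" + tokens.get(i) + "</B>" : tokens.get(i));
        }
        return String.join(" ", out);
    }
}
```

With the tokens "rate limits and rate based optimization" and the
phrase "rate based optimization", only positions 3 to 5 are marked;
the isolated "rate" at position 0 is left alone.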



Highlighter and complex queries
user name
2006-04-29 11:56:20
Hi Marios.

>> Isn't this wrong?

Yes, but this is an itch that no one has been sufficiently bothered by
to fix yet. I still haven't had the time or a desperate need to
implement this, so it will probably remain that way until someone feels
strongly enough about the problem to fix it. Highlighting is not a
straightforward problem if your goal is to exactly reflect the query
logic, especially if you also try to summarise large texts AND you are
dealing with complex queries containing Spans, "NOT" clauses, nested
Boolean logic, etc. Some compromises have to be made.

My suggestion as to how this might best be approached, with links to
some related code, is here:

http://marc.theaimsgroup.com/?l=lucene-user&m=112496111224218&w=2

This post highlights some of the intricacies involved:

http://www.gossamer-threads.com/lists/lucene/java-dev/23592#23592

Cheers
Mark



>

