Thread: Re: Wildcard vs Term query
Re: Wildcard vs Term query | United Kingdom | 2007-09-26 04:21:53
Are you using the out of the box Lucene QueryParser? It will
automatically detect wildcard queries by the presence of * or ? chars.
If the user input does not contain these characters a plain TermQuery is
used.
BooleanQuery.setMaxClauseCount can be used to control the upper limit on
terms produced by Wildcard/Fuzzy Queries. If this limit is exceeded
(e.g. when searching for something like "a*") then an exception is thrown.
Cheers
Mark
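A toy model of the clause limit described above may make the "a*" case concrete. This is a sketch under stated assumptions, not Lucene's implementation: the class WildcardExpansionSketch, its methods, and the exception TooManyClausesSketch are hypothetical names, and the "index" is just a list of strings. It only shows the general idea that a wildcard pattern is expanded into one clause per matching term, and that a broad pattern can exceed a configured maximum.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Toy model only: this is NOT Lucene code. It mimics how a wildcard
// pattern is expanded into one clause per matching index term, and how a
// clause limit turns an over-broad pattern such as "a*" into an error.
class WildcardExpansionSketch {

    // Hypothetical stand-in for the exception raised when the limit is hit.
    static class TooManyClausesSketch extends RuntimeException {
        TooManyClausesSketch(String message) {
            super(message);
        }
    }

    // The check described for QueryParser: any '*' or '?' in the input.
    static boolean containsWildcard(String text) {
        return text.indexOf('*') >= 0 || text.indexOf('?') >= 0;
    }

    // Translate a wildcard pattern into a regex: '*' matches any run of
    // characters, '?' matches exactly one, everything else is literal.
    static Pattern toRegex(String wildcard) {
        StringBuilder regex = new StringBuilder();
        for (char c : wildcard.toCharArray()) {
            if (c == '*') {
                regex.append(".*");
            } else if (c == '?') {
                regex.append('.');
            } else {
                regex.append(Pattern.quote(String.valueOf(c)));
            }
        }
        return Pattern.compile(regex.toString());
    }

    // Collect every index term matching the pattern, failing as soon as
    // the number of matches would exceed maxClauseCount.
    static List<String> expand(String wildcard, List<String> indexTerms, int maxClauseCount) {
        Pattern pattern = toRegex(wildcard);
        List<String> matches = new ArrayList<>();
        for (String term : indexTerms) {
            if (pattern.matcher(term).matches()) {
                if (matches.size() == maxClauseCount) {
                    throw new TooManyClausesSketch(
                        "more than " + maxClauseCount + " terms match " + wildcard);
                }
                matches.add(term);
            }
        }
        return matches;
    }
}
```

With a five-term list and a limit of 3, a narrow pattern such as "appl?" expands without trouble, while "a*" fails once a fourth matching term turns up. Raising the limit trades that failure for a larger expanded query.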
----- Original Message ----
From: John Byrne <john.byrne propylon.com>
To: java-user lucene.apache.org
Sent: Wednesday, 26 September, 2007 9:48:17 AM
Subject: Wildcard vs Term query
Hi,
I'm working my way through the Lucene In Action book, and there is one
thing I need explained that I didn't find there:
While wildcard queries are potentially slower than ordinary term
queries, are they slower even if they don't contain a wildcard?
Significantly slower?
The reason I ask is that if we assume we are going to allow wildcards in
a search engine, but we want to optimize, to take advantage of when
they are NOT used, do we have to check for the presence of "*" or "?" in
the term, and create the most appropriate query, or can I assume that
when a wildcard is not present, the WildcardQuery will be as fast (or
almost as fast) as a plain term query?
Thanks in advance!
John B.
---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe lucene.apache.org
For additional commands, e-mail: java-user-help lucene.apache.org
Re: Wildcard vs Term query | Ireland | 2007-09-26 04:45:16
I'm not using the QueryParser at all. I need to do a little more with
the terms, so I'm explicitly creating a Query from a single term. What I
was hoping was to avoid something like this:
...
if (term.contains("*") || term.contains("?")) {
    return new WildcardQuery(...
}
else {
    return new TermQuery(...
...
and instead just go like this:
...
return new WildcardQuery(...
...
on the basis that the WildcardQuery would only be slower if it does
contain a wildcard character. But as you pointed out, the QueryParser
makes this optimization, so I suppose I should too.
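The branch being discussed can be isolated into a small helper, sketched here without any Lucene dependency so it runs on its own. QueryChoiceSketch, QueryKind and chooseQueryKind are hypothetical names. In real code the two branches would construct a WildcardQuery or a TermQuery, with whatever arguments the elided calls above take.

```java
// Sketch only: QueryKind and chooseQueryKind are hypothetical names used
// to show the branching. Real code would build Lucene's WildcardQuery or
// TermQuery in the two branches instead of returning an enum value.
class QueryChoiceSketch {

    enum QueryKind { WILDCARD, TERM }

    // Pick the cheapest query type that can represent the term text.
    static QueryKind chooseQueryKind(String termText) {
        boolean hasWildcard = termText.indexOf('*') >= 0 || termText.indexOf('?') >= 0;
        return hasWildcard ? QueryKind.WILDCARD : QueryKind.TERM;
    }
}
```

Keeping the decision in one method means the wildcard characters are named in a single place if the set of special characters ever changes.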
Re: Wildcard vs Term query | United States | 2007-09-26 05:02:25
WildcardQuery won't be slower than TermQuery if there are no wildcard
characters. Beyond what QueryParser does, WildcardQuery itself
reverts to a TermQuery:
    public Query rewrite(IndexReader reader) throws IOException {
        if (this.termContainsWildcard) {
            return super.rewrite(reader);
        }
        return new TermQuery(getTerm());
    }
I personally would optimize which query gets created, but
performance-wise you won't pay a penalty for just using WildcardQuery.
Erik
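The rewrite fallback quoted above can be modelled in a few lines that run without Lucene. This is a toy sketch, not the library source: the nested Query, TermQuery and WildcardQuery types are hypothetical stand-ins that share only their names with the real classes, and the wildcard branch simply returns the query unchanged where the real class would expand it against the index.

```java
// Toy model only, not Lucene source: the nested types are hypothetical
// stand-ins for Lucene's Query, TermQuery and WildcardQuery.
class RewriteSketch {

    interface Query {
        // Returns a simpler equivalent query, or this query if none exists.
        default Query rewrite() {
            return this;
        }
    }

    static final class TermQuery implements Query {
        final String text;

        TermQuery(String text) {
            this.text = text;
        }
    }

    static final class WildcardQuery implements Query {
        final String text;
        final boolean termContainsWildcard;

        WildcardQuery(String text) {
            this.text = text;
            // Decided once at construction, so rewrite() is a cheap flag check.
            this.termContainsWildcard = text.indexOf('*') >= 0 || text.indexOf('?') >= 0;
        }

        @Override
        public Query rewrite() {
            if (termContainsWildcard) {
                // The real class would expand against the index here;
                // this model just keeps the query as it is.
                return this;
            }
            // No wildcard characters: fall back to an exact term lookup.
            return new TermQuery(text);
        }
    }
}
```

In this model, new RewriteSketch.WildcardQuery("lucene").rewrite() yields a TermQuery, which is the behaviour the reply relies on when it says no penalty is paid.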