Thread: Search Particulars




Search Particulars
user name
2006-02-23 22:17:35
Doug,

	I'm actually implementing a QueryFilter directly instead of
extending one of the others.  I'm setting the boost to 2.0.  Here's the
code:

public class MetaQueryFilter implements QueryFilter {

  private static final Logger LOG = LogFormatter
    .getLogger(MetaQueryFilter.class.getName());

  /**
   * Need to pull out the list of meta tags from the configuration
   */
  private static String[] META_TAGS =
    NutchConf.get().getStrings("meta.names");

  /**
   * We're going to go through and create search filters for each of the
   * meta-tags we were asked to index.
   */
  public BooleanQuery filter(Query input, BooleanQuery output) {
    // If no meta-tags were specified in the conf file, then don't
    // bother wasting cycles.  Compare with ==, since calling
    // equals(null) on a null array would throw.
    if (META_TAGS == null) {
      return output;
    }

    addTerms(input, output);
    return output;
  }

  private static void addTerms(Query input, BooleanQuery output) {
    Clause[] clauses = input.getClauses();
    for (int x = 0; x < clauses.length; x++) {
      Clause c = clauses[x];
      if (!c.getField().equals(Clause.DEFAULT_FIELD))
        continue;  // skip non-default fields

      // These are the fields we're interested in indexing
      String[] tagsToIndex = META_TAGS;

      for (int i = 0; i < tagsToIndex.length; ++i) {
        LOG.info("Meta Query Filter: Adding a search for " + tagsToIndex[i]);

        Term term = new Term(tagsToIndex[i], c.getTerm().toString());

        // add a lucene PhraseQuery for this tag
        PhraseQuery metaQuery = new PhraseQuery();
        metaQuery.setSlop(0);
        metaQuery.add(term);

        // set boost
        metaQuery.setBoost(2.0f);

        // add it as an optional clause (required=false, prohibited=false)
        output.add(metaQuery, false, false);
      }
    }
  }

}
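
A side note on the null guard at the top of filter(): if the configuration lookup hands back null when the property is unset (an assumption here, not checked against NutchConf), only an == null comparison is safe, because calling equals(null) on a null reference throws. The sketch below is standalone Java, not Nutch code; lookup() is a hypothetical stand-in for the real configuration call.

```java
final class NullGuardSketch {

  // Hypothetical stand-in for a configuration lookup: returns null
  // when the property is unset, otherwise the comma-separated names.
  static String[] lookup(String value) {
    return value == null ? null : value.split(",");
  }

  // Safe guard: true when there are no tags to search against.
  static boolean nothingConfigured(String[] tags) {
    return tags == null || tags.length == 0;
  }

  // Unsafe guard: throws NullPointerException when tags is null,
  // and returns false for every non-null array.
  static boolean brokenGuard(String[] tags) {
    return tags.equals(null);
  }

  public static void main(String[] args) {
    System.out.println(nothingConfigured(lookup(null)));
    System.out.println(nothingConfigured(lookup("keywords,description")));
    try {
      brokenGuard(lookup(null));
    } catch (NullPointerException e) {
      System.out.println("equals(null) on a null array throws");
    }
  }
}
```

Checking the length as well covers the case where the property is present but yields no names.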

-----Original Message-----
From: Doug Cutting [mailto:cuttingapache.org] 
Sent: Thursday, February 23, 2006 5:09 PM
To: nutch-userlucene.apache.org
Subject: Re: Search Particulars

Vanderdray, Jacob wrote:
> 	I'm not sure I understand what you're getting at.  In this case
> I've added a comma separated list of names of meta tags that I want to
> index and search against.  I've written a parse filter, an index filter
> and this query filter that all read in that list of meta tags from the
> nutch-site.xml file.
>
> 	That much seems to work.  In the explain link I can see that the
> fields are in the index and the ranking of pages are affected by them,
> but if I search for a term which is in one of the meta tags, but not in
> any other fields I get 0 results.

Are you using RawFieldQueryFilter?  If so, are you specifying a non-zero
boost to the constructor?  RawFieldQueryFilter defaults to a zero boost.
Query terms with a zero boost are automatically converted into
filters.  And filters cannot select documents, only remove them.

Doug
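
To make the distinction between a scoring clause and a filter concrete, here is a toy model in plain Java. It is not Nutch or Lucene code, and the class name, the string-encoded documents and the predicates are all made up for illustration. It only mirrors the rule stated above: a document has to be selected by some scoring clause, and filters can then only take documents away.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

final class FilterVersusQuerySketch {

  // A document is a hit only if at least one scoring clause selects it
  // AND every filter lets it through. Filters never add documents.
  static List<String> search(List<String> docs,
                             List<Predicate<String>> scoringClauses,
                             List<Predicate<String>> filters) {
    List<String> hits = new ArrayList<>();
    for (String doc : docs) {
      boolean selected = scoringClauses.stream().anyMatch(c -> c.test(doc));
      boolean allowed = filters.stream().allMatch(f -> f.test(doc));
      if (selected && allowed) {
        hits.add(doc);
      }
    }
    return hits;
  }

  public static void main(String[] args) {
    // Toy documents: "field:term" pairs packed into one string.
    List<String> docs =
        List.of("meta:nutch body:crawler", "body:nutch", "body:other");
    Predicate<String> metaHasNutch = d -> d.contains("meta:nutch");
    Predicate<String> bodyHasNutch = d -> d.contains("body:nutch");

    // Meta clause kept as a scoring clause: it can select documents.
    System.out.println(search(docs, List.of(metaHasNutch, bodyHasNutch), List.of()));

    // Meta clause demoted to a filter with no scoring clause left:
    // nothing is selected, so the result is empty.
    System.out.println(search(docs, List.of(), List.of(metaHasNutch)));
  }
}
```

In this model a term that appears only in a meta field returns nothing once its clause has been turned into a filter, which matches the symptom described in the thread.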
Search Particulars
user name
2006-02-24 07:47:49
Hi,

The code you sent is only for the query filter.

In the parse filter, and especially in the index filter, do you add
the data to a new field which you define?

What I do is store any data I want to have in a new field (created
by me).

I guess the index filter must be storing it in such a field.

So you have to use FieldQueryFilter with this new field type (like
query-url or something like that).

Rgds
Prabhu
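
The point about field names can be sketched with a toy inverted index in plain Java. This is not the Nutch or Lucene API; the class and the field names "keywords" and "meta-keywords" are hypothetical. It only shows that a term query finds a document when the field name used at query time is exactly the one the indexing side wrote to.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

final class FieldNameSketch {

  // field name -> term -> ids of documents holding that term in that field
  private final Map<String, Map<String, Set<Integer>>> postings = new HashMap<>();

  // What an index filter would do: store a term under a named field.
  void index(int docId, String field, String term) {
    postings.computeIfAbsent(field, f -> new HashMap<>())
            .computeIfAbsent(term, t -> new HashSet<>())
            .add(docId);
  }

  // What a query filter relies on: look the term up under a named field.
  Set<Integer> search(String field, String term) {
    return postings.getOrDefault(field, Map.of())
                   .getOrDefault(term, Set.of());
  }

  public static void main(String[] args) {
    FieldNameSketch idx = new FieldNameSketch();
    idx.index(1, "keywords", "nutch");

    // Same field name on both sides: document 1 is found.
    System.out.println(idx.search("keywords", "nutch"));

    // Different field name at query time: nothing is found.
    System.out.println(idx.search("meta-keywords", "nutch"));
  }
}
```

The same reasoning applies to the term text itself: if the indexing side lowercases or tokenizes the value and the query side does not, the lookup misses in the same way.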


On 2/24/06, Vanderdray, Jacob <JVanderdrayaarp.org> wrote:
>
> Doug,
>
>        I'm actually implementing a QueryFilter directly instead of
> extending one of the others.  I'm setting the boost to 2.0.  Here's the
> code: