On Nov 14, 2006, at 8:33 AM, Peter Sinnott wrote:
> The KinoSearch::Searcher docs say
>
> "analyzer - An object which subclasses KinoSearch::Analysis::Analyzer,
> such as a PolyAnalyzer. This must be identical to the Analyzer used at
> index-time, or the results won't match up."
>
> Does this mean if I use different analyzers for different fields
> when creating the index then I can not search it properly?

Depends. You definitely *can* search it properly, but you may need
to get sophisticated about how you build your queries.

I'll give an example that uses an unanalyzed field rather than a
second Analyzer, but illustrates the principle.

    use KinoSearch::InvIndexer;
    use KinoSearch::Analysis::PolyAnalyzer;

    my $polyanalyzer
        = KinoSearch::Analysis::PolyAnalyzer->new( language => 'en' );
    my $invindexer = KinoSearch::InvIndexer->new(
        analyzer => $polyanalyzer,
        invindex => '/path/to/invindex',
    );
    $invindexer->spec_field( name => 'body' );
    $invindexer->spec_field(
        name     => 'category',
        analyzed => 0,    # index the raw string; skip the PolyAnalyzer
    );

Now, say we add a document with the category of 'books'. Because the
category field doesn't get analyzed, the string 'books' makes it
intact into the index. However, if the word 'books' ever appears
anywhere in the body, it will get stemmed down to 'book' by the
PolyAnalyzer.

Because the following search will make use of the English
PolyAnalyzer, it will return only matches on 'book' -- NOT 'books'...

    use KinoSearch::Searcher;

    my $searcher = KinoSearch::Searcher->new(
        analyzer => $polyanalyzer,
        invindex => '/path/to/invindex',
    );
    my $hits = $searcher->search('books');

... so it will never match a document where the category is 'books'.
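
To make the mismatch concrete, here is a small self-contained sketch
that does not use KinoSearch at all. The toy_stem() and no_analysis()
subs and the %index hash are hypothetical stand-ins invented for
illustration (the real PolyAnalyzer uses a proper stemmer, not a
trailing-'s' chop), but the principle is the same: a term is only
found if the query goes through the same analysis the field got at
index time.

```perl
use strict;
use warnings;

# Toy stand-in for a stemming analyzer (an assumption for illustration):
# lowercase each word and strip one trailing 's'.
sub toy_stem {
    my ($text) = @_;
    return map { my $t = lc $_; $t =~ s/s\z//; $t } split /\W+/, $text;
}

# No-op analysis: the whole field value is indexed as a single term.
sub no_analysis {
    my ($text) = @_;
    return ($text);
}

# Per-field analysis chosen at index time.
my %analyzer_for = ( body => \&toy_stem, category => \&no_analysis );

# Toy inverted index: field => term => 1.
my %index;
my %doc = ( body => 'Reviews of books', category => 'books' );
for my $field ( keys %doc ) {
    $index{$field}{$_} = 1 for $analyzer_for{$field}->( $doc{$field} );
}

# Count query terms found in a field after analyzing the query string.
sub matches {
    my ( $field, $analyzer, $query ) = @_;
    return scalar grep { $index{$field}{$_} } $analyzer->($query);
}

print "body,     stemmed query:    ", matches( 'body',     \&toy_stem,    'books' ), "\n";
print "category, stemmed query:    ", matches( 'category', \&toy_stem,    'books' ), "\n";
print "category, unanalyzed query: ", matches( 'category', \&no_analysis, 'books' ), "\n";
```

The stemmed query finds 'book' in the body but misses the category,
while the unanalyzed query finds 'books' in the category.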

However, there are a number of ways to construct your query so that
you match the category 'books'. Here's one:

    use KinoSearch::Analysis::Analyzer;
    use KinoSearch::QueryParser::QueryParser;
    use KinoSearch::Search::BooleanQuery;

    my $category_query_parser = KinoSearch::QueryParser::QueryParser->new(
        analyzer => KinoSearch::Analysis::Analyzer->new,    # no-op
        fields   => ['category'],
    );
    my $main_query_parser = KinoSearch::QueryParser::QueryParser->new(
        analyzer => $polyanalyzer,    # same analyzer used at index time
        fields   => ['body'],
    );
    my $bool_query = KinoSearch::Search::BooleanQuery->new;

    # search category field for the unstemmed 'books'
    my $cat_query = $category_query_parser->parse('books');
    $bool_query->add_clause( query => $cat_query, occur => 'SHOULD' );

    # search body field for the stemmed 'book'
    my $main_query = $main_query_parser->parse('books');
    $bool_query->add_clause( query => $main_query, occur => 'SHOULD' );

    my $hits = $searcher->search( query => $bool_query );

Snoop the _prepare_simple_search() method in KinoSearch::Searcher to
see what KS is doing behind the scenes to build a query when you
supply only a query string.

HTH,

Marvin Humphrey
Rectangular Research
http://www.rectangular.com/
_______________________________________________
KinoSearch mailing list
KinoSearch rectangular.com
http://www.rectangular.com/mailman/listinfo/kinosearch