Thread: How do I make an accent insensitive search




How do I make an accent insensitive search
2007-10-19 08:54:25
I've already searched all over the forum, but I haven't managed to get this to work.

I want to make the search accent-insensitive. For example, I have a file containing the word "resolução", and I want someone who types "resolucao" to find that file.

I tried adding |analysis-(pt)| in nutch-site.xml, and search.jsp already has the right charset, but nothing changed.

I'd appreciate any help, especially a very detailed answer.
-- 
View this message in context: http://www.nabble.com/How-do-I-make-an-accent-insensitive-search-tf4653314.html#a13294989
Sent from the Nutch - User mailing list archive at Nabble.com.


RE: How do I make an accent insensitive search
2007-10-19 09:29:43
Don't know if there have been more recent changes to this issue.
I did this for Nutch 0.7:

http://mail-archives.apache.org/mod_mbox/lucene-nutch-user/200602.mbox/%3cBAY102-F167F537EFF3E57703D0644F3FF0phx.gbl%3e

Then I changed my indexer plugin to call this class before indexing.

Howie



RE: How do I make an accent insensitive search
2007-10-19 12:52:24
Hi, thanks for the reply. How exactly do I do that? And don't I need to change the webapp as well?



Howie Wang wrote:
> 
> Don't know if there have been more recent changes to this issue.
> I did this for Nutch 0.7:
> 
> http://mail-archives.apache.org/mod_mbox/lucene-nutch-user/200602.mbox/%3cBAY102-F167F537EFF3E57703D0644F3FF0phx.gbl%3e
> 
> Then I changed my indexer plugin to call this class before indexing.
> 
> Howie


RE: How do I make an accent insensitive search
2007-10-19 13:07:04
You can write your own indexing plugin that calls the AccentReplacer. Just copy the index-basic plugin code. You'll call the AccentReplacer on any string values before calling doc.add on them. See the Nutch wiki for more info on plugins.
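
For readers who cannot reach the linked class, here is a minimal sketch of accent folding in Java. It is not the AccentReplacer from the 2006 post: the class name AccentFolder and the method fold are made up for illustration, and it assumes Java 6 or later for java.text.Normalizer. The Nutch plugin wiring is not shown; the idea is only that each string value would pass through something like fold() before doc.add is called on it.

```java
import java.text.Normalizer;
import java.util.regex.Pattern;

// Hypothetical helper, NOT the AccentReplacer class referenced in this thread.
final class AccentFolder {
    // Unicode combining marks (category M). NFD decomposition splits an accented
    // letter into its base letter followed by one or more of these marks.
    private static final Pattern COMBINING_MARKS = Pattern.compile("\\p{M}+");

    private AccentFolder() {}

    /** Returns the text with diacritics removed; null stays null. */
    static String fold(String text) {
        if (text == null) {
            return null;
        }
        String decomposed = Normalizer.normalize(text, Normalizer.Form.NFD);
        return COMBINING_MARKS.matcher(decomposed).replaceAll("");
    }

    public static void main(String[] args) {
        // "resolu\u00e7\u00e3o" is "resolução" written with Unicode escapes.
        System.out.println(fold("resolu\u00e7\u00e3o")); // prints "resolucao"
    }
}
```

Letters that have no canonical decomposition (for example "ø" or "ß") pass through unchanged with this approach, so a lookup table may still be needed for those.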

You could change the webapp to remove accents from queries either by hacking search.jsp or by creating a query-filter that removes accents from user queries, but I never bothered. If your users are largely US/UK based, they almost never enter those accents when querying.
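
If the index holds accent-folded terms, a query typed with accents (likely for Portuguese users) will only match when the query text is folded the same way. A rough sketch of that query-side step follows, again assuming Java 6 or later for java.text.Normalizer. QueryAccentNormalizer and normalizeQuery are hypothetical names, and where exactly the call goes in search.jsp or in a query filter depends on the Nutch version and is not shown.

```java
import java.text.Normalizer;

// Hypothetical query-side helper; the names are illustrative only.
final class QueryAccentNormalizer {
    private QueryAccentNormalizer() {}

    /** Trims the raw query text and strips diacritics; null becomes "". */
    static String normalizeQuery(String rawQuery) {
        if (rawQuery == null) {
            return "";
        }
        // NFD separates base letters from combining marks; the regex drops the marks.
        String decomposed = Normalizer.normalize(rawQuery.trim(), Normalizer.Form.NFD);
        return decomposed.replaceAll("\\p{M}+", "");
    }
}
```

The folding rule has to be identical on the indexing side and the query side; otherwise the two sides can disagree on some characters and searches will silently miss documents.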

Howie


