List Info

Thread: Hi Experts




Hi Experts
user name
2006-03-29 06:42:34
"
-----Original Message-----
From: Ranjan K. Baisak [mailto:ranjanbaisakyahoo.com]
Sent: Wednesday, March 29, 2006 12:06 PM
To: java-userlucene.apache.org
Subject: Re: Hi Experts


For internet searching Nutch is the best tool. But
however as you dont want to use cygwin then you need
to use Lucene in following way.
You need to download whole page and create an index
out of that page. Then use lucene to search offline
content than online.
I have used lucene in this way and I have really got a
good result to search my intranet.
I am not sure whether this would meet your
requirement.

BTW have you tried google api.

- R"


Dear Rajan, thanks for your mail.  Tats really a good idea.
I can use Lucene after downloading the webpage in the
machine. Which tool (open source) you suggest me to download
the webpage to the local machine? Or Which (open source)
tool you are using to do that?.

Regards
Kamesh

--- "Babu, KameshNarayana (GE, Research,
consultant)"
<kameshnarayana.babuge.com> wrote:

> Hi Experts,
> 
> Iam  a new bie. Iam suppose to select a search
> engine for my project, which should search from the
> given URL and display the result.
> 
> I should use the search engine in windows OS only. I
> should not use any other external tool like CGYWIN
> which is used in NUTCH. 
> 
> Any expet guidance would be much appreciated .
> 
> Thanks 
> 
> Kamesh
> 
>
------------------------------------------------------------
---------
> To unsubscribe, e-mail:
> java-user-unsubscribelucene.apache.org
> For additional commands, e-mail:
> java-user-helplucene.apache.org
> 
> 


------------------------------------------------------------
---------
To unsubscribe, e-mail: java-user-unsubscribelucene.apache.org
For additional commands, e-mail: java-user-helplucene.apache.org


------------------------------------------------------------
---------
To unsubscribe, e-mail: java-user-unsubscribelucene.apache.org
For additional commands, e-mail: java-user-helplucene.apache.org

Hi Experts
user name
2006-03-29 07:02:15
I am using HTMLparser to parse all html pages and to
get required information out of that.
Let me tell you my crawler.
I have to search for all pages of group website(e.g.
www.group.com and it contians links to
news.groups.com, forum.news.com etc...).So I will get
all links of group website using Parser and download
all html pages.
Once download is completed, I will use html parser to
parse and get required information. This is based on
business requirement.
Then I would create Lucene index using Lucene.
Insearch I am using lucene api as well as Lucene
highlighter utility to highlight search tokens.

I used to refresh my index on everyday. This is based
on crone job in Spring.

Hope it would help you for your requirement.
Incase you need any help please IM me
ranjanbaisakyahoo.com

-R

--- "Babu, KameshNarayana (GE, Research,
consultant)"
<kameshnarayana.babuge.com> wrote:

> 
> "
> -----Original Message-----
> From: Ranjan K. Baisak
> [mailto:ranjanbaisakyahoo.com]
> Sent: Wednesday, March 29, 2006 12:06 PM
> To: java-userlucene.apache.org
> Subject: Re: Hi Experts
> 
> 
> For internet searching Nutch is the best tool. But
> however as you dont want to use cygwin then you need
> to use Lucene in following way.
> You need to download whole page and create an index
> out of that page. Then use lucene to search offline
> content than online.
> I have used lucene in this way and I have really got
> a
> good result to search my intranet.
> I am not sure whether this would meet your
> requirement.
> 
> BTW have you tried google api.
> 
> - R"
> 
> 
> Dear Rajan, thanks for your mail.  Tats really a
> good idea. I can use Lucene after downloading the
> webpage in the machine. Which tool (open source) you
> suggest me to download the webpage to the local
> machine? Or Which (open source) tool you are using
> to do that?.
> 
> Regards
> Kamesh
> 
> --- "Babu, KameshNarayana (GE, Research,
> consultant)"
> <kameshnarayana.babuge.com> wrote:
> 
> > Hi Experts,
> > 
> > Iam  a new bie. Iam suppose to select a search
> > engine for my project, which should search from
> the
> > given URL and display the result.
> > 
> > I should use the search engine in windows OS only.
> I
> > should not use any other external tool like CGYWIN
> > which is used in NUTCH. 
> > 
> > Any expet guidance would be much appreciated .
> > 
> > Thanks 
> > 
> > Kamesh
> > 
> >
>
------------------------------------------------------------
---------
> > To unsubscribe, e-mail:
> > java-user-unsubscribelucene.apache.org
> > For additional commands, e-mail:
> > java-user-helplucene.apache.org
> > 
> > 
> 
> 
>
------------------------------------------------------------
---------
> To unsubscribe, e-mail:
> java-user-unsubscribelucene.apache.org
> For additional commands, e-mail:
> java-user-helplucene.apache.org
> 
> 
>
------------------------------------------------------------
---------
> To unsubscribe, e-mail:
> java-user-unsubscribelucene.apache.org
> For additional commands, e-mail:
> java-user-helplucene.apache.org
> 
> 


------------------------------------------------------------
---------
To unsubscribe, e-mail: java-user-unsubscribelucene.apache.org
For additional commands, e-mail: java-user-helplucene.apache.org

[1-2]

about | contact  Other archives ( Real Estate discussion Medical topics )