Thread: Integration of Lucene




Integration of Lucene
Pakistan
2007-10-24 02:07:00
Hi All,

I am developing a search engine for the Urdu language, and I want to
use Lucene for that purpose. My situation is as follows:

--- I have a corpus of 2000 Urdu (Variant of Persian and Arabic)
documents in XML form. How do I build an index of them using Lucene?

--- Some stemming technique will be needed while indexing, and there
is no stemmer available for the Urdu language.

--- I have developed a GUI using HTML and have Java Servlets for
searching. How do I integrate Lucene with my own servlets?

Thanks...

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Re: Integration of Lucene
United States
2007-10-24 07:59:02
On Oct 24, 2007, at 3:07 AM, Liaqat Ali wrote:

> Hi All,
>
> I m developing a search engine for Urdu language. I want to use
> lucene for that purpose. Now the situation is that
>
> ---I have a corpus of 2000 Urdu(Variant of Persian and Arabic)
> documents in XML form, how i will make index of them using Lucene.

You will have to use some sort of XML parser (SAX or a pull parser)
to extract the content you want and create Lucene Documents. Have a
look at the tutorial on the Lucene home page for examples.
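
As a rough sketch of the extraction step only, the class below uses
the JDK's SAX parser to pull the text of chosen elements out of one
XML document. The element names "title" and "body" and the class name
XmlFieldExtractor are assumptions for illustration, since the corpus
schema is not given in this thread, and the handler assumes the wanted
elements are flat (no nested child elements). No Lucene calls are
made here; each returned entry would still have to be added as a
field to a Lucene Document and passed to IndexWriter.addDocument.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

/** Collects the text of selected XML elements into a name -> text map. */
class XmlFieldExtractor extends DefaultHandler {
    private final Map<String, StringBuilder> fields = new LinkedHashMap<>();
    private StringBuilder current; // non-null only while inside a wanted element

    XmlFieldExtractor(String... wantedElements) {
        for (String name : wantedElements) {
            fields.put(name, new StringBuilder());
        }
    }

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        current = fields.get(qName); // null for elements we do not index
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        current = null; // assumes wanted elements have no child elements
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        if (current != null) {
            current.append(ch, start, length); // SAX may deliver text in several chunks
        }
    }

    /** Parses one XML document and returns the trimmed text of each wanted element. */
    static Map<String, String> extract(byte[] xml, String... wantedElements) {
        XmlFieldExtractor handler = new XmlFieldExtractor(wantedElements);
        try {
            SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
            parser.parse(new ByteArrayInputStream(xml), handler);
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
        Map<String, String> result = new LinkedHashMap<>();
        for (Map.Entry<String, StringBuilder> e : handler.fields.entrySet()) {
            result.put(e.getKey(), e.getValue().toString().trim());
        }
        return result;
    }

    public static void main(String[] args) {
        // Hypothetical sample document; a real run would read each corpus file instead.
        byte[] xml = "<doc><title>Sample</title><body>Some text</body></doc>"
                .getBytes(StandardCharsets.UTF_8);
        System.out.println(extract(xml, "title", "body"));
    }
}
```

Reading the files as bytes and letting the parser honour the XML
encoding declaration avoids decoding the Urdu text twice.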

> ---Well there will be need some stemming techniques while indexing,
> because there is no stemmer available for Urdu language.

You will have to write your own, more than likely. There are some
Arabic analyzers out there; perhaps you could use them as a starting
point.
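
To give an idea of the simplest form a hand-written stemmer could
take, here is a minimal longest-match suffix stripper in plain Java.
It is only a sketch: the class name SuffixStemmer is hypothetical, the
two suffixes used in main are illustrative plural endings and not a
validated Urdu suffix list, and real stemming for Urdu would likely
need more than suffix removal. To use it with Lucene, the same logic
would sit inside a TokenFilter used by a custom Analyzer, which is
not shown here.

```java
import java.util.Arrays;
import java.util.List;

/** Naive longest-match suffix stripper; the suffix list is supplied by the caller. */
class SuffixStemmer {
    private final List<String> suffixes;
    private final int minStemLength;

    SuffixStemmer(int minStemLength, String... suffixes) {
        this.minStemLength = minStemLength;
        String[] sorted = suffixes.clone();
        // Longest suffixes first, so the longest matching suffix wins.
        Arrays.sort(sorted, (a, b) -> Integer.compare(b.length(), a.length()));
        this.suffixes = Arrays.asList(sorted);
    }

    /** Removes the first matching suffix unless the remaining stem would be too short. */
    String stem(String token) {
        for (String suffix : suffixes) {
            int stemLength = token.length() - suffix.length();
            if (token.endsWith(suffix) && stemLength >= minStemLength) {
                return token.substring(0, stemLength);
            }
        }
        return token; // no suffix matched: leave the token unchanged
    }

    public static void main(String[] args) {
        // Illustrative suffixes only, written as Unicode escapes:
        // yeh + noon ghunna ("-en") and waw + noon ghunna ("-on").
        SuffixStemmer stemmer = new SuffixStemmer(2, "\u06CC\u06BA", "\u0648\u06BA");
        // "kitaben" (books) with the "-en" ending attached to "kitab" (book).
        String plural = "\u06A9\u062A\u0627\u0628\u06CC\u06BA";
        System.out.println(stemmer.stem(plural).length());
    }
}
```

The same stemmer has to be applied at index time and at query time,
otherwise stemmed index terms will not match unstemmed query terms.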

>
> ---I have developed a GUI using HTML and have a Java Servlets for
> searching, so how i will integrate Lucene with my own servlets.

This really is up to you, but essentially you need to set up an
IndexSearcher and create queries to do searches. Again, have a look
at the tutorial as a way of getting started.
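
One possible way to structure the servlet side is sketched below in
plain Java, without the Servlet or Lucene APIs so that it stands on
its own. SearchBackend and SearchPage are hypothetical names: the
backend stands in for code that would wrap a single IndexSearcher
(which is thread-safe and can be shared across requests), and
render() is what a servlet's doGet could call with the query
parameter before writing the result to the response.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical seam: a real implementation would wrap one shared IndexSearcher. */
interface SearchBackend {
    List<String> search(String queryText, int maxHits);
}

/** Request-handling logic that a servlet's doGet could delegate to. */
class SearchPage {
    private final SearchBackend backend; // created once, reused for every request

    SearchPage(SearchBackend backend) {
        this.backend = backend;
    }

    /** Turns the raw query parameter into an HTML fragment listing the hits. */
    String render(String queryParam) {
        if (queryParam == null || queryParam.trim().isEmpty()) {
            return "<p>Please enter a query.</p>";
        }
        List<String> hits = backend.search(queryParam.trim(), 10);
        StringBuilder html = new StringBuilder("<ol>");
        for (String hit : hits) {
            html.append("<li>").append(escape(hit)).append("</li>");
        }
        return html.append("</ol>").toString();
    }

    /** Minimal HTML escaping so hit text cannot inject markup. */
    static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }

    public static void main(String[] args) {
        // Stand-in backend that returns fixed titles instead of querying an index.
        SearchPage page = new SearchPage((q, n) -> Arrays.asList("First hit", "Second hit"));
        System.out.println(page.render("urdu"));
    }
}
```

In the servlet itself, the backend would typically be created in
init() and released in destroy(), and request.setCharacterEncoding
("UTF-8") would be called before reading the query parameter so that
Urdu text arrives intact.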



--------------------------
Grant Ingersoll
http://lucene.grantingersoll.com

Lucene Boot Camp Training:
ApacheCon Atlanta, Nov. 12, 2007.  Sign up now!
http://www.apachecon.com

Lucene Helpful Hints:
http://wiki.apache.org/lucene-java/BasicsOfPerformance

http://wiki.apache.org/lucene-java/LuceneFAQ




