
Thread: retrieve data from index file




retrieve data from index file
user name
2006-02-23 01:37:44
hi,
Is there any example Java code I can use to read the data from the
index files in the segments? I have tried segmentReader,
ArrayfileReader, and SequenceReader, but I am still confused. Thanks.

Wong

On 1/24/06, Stefan Groschupf <sgmedia-style.com> wrote:
>
> You can calculate these statistics from the segment data, e.g. the
> parsed text.
> Reading the Nutch file format is easily done using the Nutch
> Readers, e.g. SequenceFile Reader.
> Just take a look at the io package.
>
> HTH
> Stefan
>
>
> On 24.01.2006 at 08:18, Wong Ting Kiong wrote:
>
> > hi all,
> >
> > I'm now using Nutch 0.7.1, and I wish to retrieve content from the
> > index file. How can I retrieve it? The information I want to
> > retrieve is:
> > - the list of words from each link
> > - the occurrences of words in each link
> > Can I retrieve this information in raw data format?
> >
> > Thanks for your attention.
> >
> > Kiong
>
>
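
A minimal sketch of the word statistics asked about in the quoted message, assuming the parsed text of one page has already been read out of the segment into a String (reading it with the Nutch readers is not shown here). The class name WordCounts, the method count, and the tokenization rule are illustrative assumptions, not part of Nutch or Lucene.

```java
import java.util.Locale;
import java.util.Map;
import java.util.TreeMap;

public class WordCounts {

    /**
     * Counts how often each word occurs in the given text.
     * Words are lower-cased and split on anything that is not a
     * letter or a digit (a simplifying assumption for this sketch).
     */
    public static Map<String, Integer> count(String text) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String token : text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{Nd}]+")) {
            if (token.isEmpty()) {
                continue; // split() yields an empty first token if the text starts with a separator
            }
            counts.merge(token, 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        // Hypothetical parsed text standing in for one page's content.
        String parsedText = "Nutch crawls pages; Lucene indexes pages.";
        count(parsedText).forEach((word, n) -> System.out.println(word + "\t" + n));
    }
}
```

The keys of the returned map give the list of words for a page, and the values give their occurrence counts; running it once per page would yield the per-link statistics.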
retrieve data from index file
user name
2006-02-23 10:38:05
Hi Wong,
take a look at:
http://lucene.apache.org/java/docs/api/org/apache/lucene/index/IndexReader.html
There are many code snippets on the net that show how you can use it.
In general, I found the book Lucene in Action a useful guide when
working with Lucene.

Stefan


---------------------------------------------------------------
company:        http://www.media-style.com

forum:        http://www.text-mining.org

blog:            http://www.find23.net


retrieve data from index file
user name
2006-02-24 07:40:22
hi,
I tried some Java code that calls the Lucene library
lucene-1.9-rc1-dev.jar, but got an error. My Java code is below:

import org.apache.lucene.index.IndexReader;
import org.apache.lucene.store.*;
import java.io.*;
import java.util.*;

public class hello
{

   public static void main(String[] args) throws Exception
   {
     System.out.println("Hello world!");
     File indexDir = new File("./publication/index");
     Boolean bb = true;
     Directory fsDir = FSDirectory.getDirectory(indexDir, false);


   }

}
Below is the session from the Ubuntu terminal:

luceneJava$ export CLASSPATH=/opt/nutch-0.7.1/luceneJava/lucene-1.9-rc1-dev.jar
luceneJava$ export JAVA_HOME=/opt/jdk1.5.0_06
luceneJava$ /opt/jdk1.5.0_06/bin/javac hello.java
luceneJava$ java -classpath . hello
Hello world!
Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/lucene/store/FSDirectory
         at hello.main(hello.java:14)

I'm using Ubuntu as my OS. This code actually runs fine on
Windows XP, but not on Ubuntu. Can anyone check the error for me?
Thanks for your attention.

regards,
Wong
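
One thing that may be worth checking (it is not confirmed anywhere in this thread): the -classpath option on the java command line takes precedence over the CLASSPATH environment variable, so with "-classpath ." the Lucene jar exported above would not be visible at run time, even though javac, which was run without that option, could see it. The sketch below is a small diagnostic under that assumption; the class name ClasspathCheck and the method isLoadable are hypothetical and not part of Lucene or Nutch.

```java
public class ClasspathCheck {

    /** Returns true if the named class can be found on the current classpath. */
    public static boolean isLoadable(String className) {
        try {
            // initialize=false: only locate the class, do not run its static initializers
            Class.forName(className, false, ClasspathCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Show the classpath the JVM is actually using.
        System.out.println("java.class.path = " + System.getProperty("java.class.path"));

        // Default to the class named in the error message above.
        String name = args.length > 0 ? args[0] : "org.apache.lucene.store.FSDirectory";
        System.out.println(name + " loadable: " + isLoadable(name));
    }
}
```

Running it once as "java -classpath . ClasspathCheck" and once as "java -classpath .:/opt/nutch-0.7.1/luceneJava/lucene-1.9-rc1-dev.jar ClasspathCheck" should show whether the jar is being picked up; if the second form reports the class as loadable, the same classpath should work for hello.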

