Thread: hi all:




hi all:
2006-12-09 13:38:41
吴志敏 wrote:
> I want to dump the stored segments to an XML file, but when I read
> SegmentReader.java I found that it is not a simple thing.
>
> It uses a Hadoop job to dump a text file. I just want to dump the
> parts of the segments' content that I am interested in to XML.
>
> Can someone tell me how to do this? Any reply will be appreciated!

Segment data is basically just a bunch of files containing key->value
pairs, so there's always the possibility of reading the data directly
with the help of:

http://lucene.apache.org/hadoop/docs/api/org/apache/hadoop/io/SequenceFile.Reader.html


To see what kind of objects to expect, you can examine the beginning of
the file, where some metadata is stored, such as the class used for the
key and the class used for the value (that metadata is also available
from methods of the SequenceFile.Reader class).
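
As a rough illustration of "examining the beginning of the file", the sketch below peeks at the header using only the JDK. It rests on assumptions that are not stated in this thread: that the file starts with the three bytes SEQ and a version byte, and that for version 4 and later the key and value class names follow as length-prefixed UTF-8 strings. The class name SeqHeaderPeek and its methods are made up for this example. Check the result against the methods of SequenceFile.Reader before relying on it.

```java
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

final class SeqHeaderPeek {

    /** Returns {keyClassName, valueClassName} read from the start of the stream. */
    static String[] peek(InputStream raw) throws IOException {
        DataInputStream in = new DataInputStream(raw);
        byte[] magic = new byte[3];
        in.readFully(magic);
        if (magic[0] != 'S' || magic[1] != 'E' || magic[2] != 'Q') {
            throw new IOException("not a SequenceFile: missing SEQ magic");
        }
        int version = in.readByte();
        if (version < 4) {
            // Assumption: older files store the names differently; not handled here.
            throw new IOException("unsupported SequenceFile version " + version);
        }
        return new String[] { readShortString(in), readShortString(in) };
    }

    // Assumption: each name is shorter than 128 bytes, so its variable-length
    // size prefix fits in a single non-negative byte.
    private static String readShortString(DataInputStream in) throws IOException {
        int length = in.readByte();
        if (length < 0) {
            throw new IOException("multi-byte length prefix not handled in this sketch");
        }
        byte[] bytes = new byte[length];
        in.readFully(bytes);
        return new String(bytes, StandardCharsets.UTF_8);
    }
}
```

Calling SeqHeaderPeek.peek(new FileInputStream(file)) on a local copy of a segment data file should then give the two class names to pass to reader.next(), provided the assumptions above hold for that file.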

For example, to read the Content data from a segment, one could use
something like:

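import org.apache.hadoop.io.SequenceFile;
import org.apache.hadoop.io.Text;
import org.apache.nutch.protocol.Content;

// fs, path and conf are assumed to be set up already; path is the file to read
SequenceFile.Reader reader = new SequenceFile.Reader(fs, path, conf);

Text url = new Text();            // key
Content content = new Content();  // value
while (reader.next(url, content)) {
  // now just use url and content the way you like
}
reader.close();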
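
Since the original question was about getting the data into XML, here is a minimal sketch of the writing side, using only the JDK's XMLStreamWriter. It is one possible approach, not something taken from Nutch: the class name SegmentXmlDump and the segment/page/url element and attribute names are invented for the example, and the Map stands in for the url/content pairs that the loop above produces.

```java
import java.io.StringWriter;
import java.io.Writer;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamWriter;

final class SegmentXmlDump {

    /** Writes each url -> text pair as a page element under a segment root. */
    static void write(Map<String, String> pages, Writer out) throws XMLStreamException {
        XMLStreamWriter xml = XMLOutputFactory.newInstance().createXMLStreamWriter(out);
        xml.writeStartDocument();
        xml.writeStartElement("segment");       // hypothetical root element name
        for (Map.Entry<String, String> page : pages.entrySet()) {
            xml.writeStartElement("page");      // hypothetical record element name
            xml.writeAttribute("url", page.getKey());
            xml.writeCharacters(page.getValue());
            xml.writeEndElement();
        }
        xml.writeEndElement();
        xml.writeEndDocument();
        xml.close();
    }

    public static void main(String[] args) throws XMLStreamException {
        // Stand-in data; in practice these would come from the reader loop.
        Map<String, String> pages = new LinkedHashMap<>();
        pages.put("http://example.com/?a=1&b=2", "1 < 2");
        StringWriter out = new StringWriter();
        write(pages, out);
        System.out.println(out);
    }
}
```

The writer takes care of escaping characters such as & and < in attribute values and text. It does not make arbitrary binary data safe for XML, so raw page bytes would need to be decoded or encoded (for example as Base64) before being passed in.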

--
 Sami Siren

hi all:
2006-12-10 05:28:44
Thanks very much, I'll try it.



-- 
www.babatu.com