The information I want to add is dozens of small "page feature" flags:
not much data, but many separate pieces. For example:
"flag_content=xhtml; flag_page-size=xx; flag_page-depth=xx;
list_picture_files=...".
I saw that the fields of the parse object look quite rigid, and I
would have to make a lot of modifications to add my data to it:
( public ParseData(ParseStatus status, String title, Outlink[] outlinks,
Metadata contentMeta) )
So I think it is better to put this data in the index using Metadata
objects.
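
A minimal sketch of the idea, assuming the flags are held as plain String
pairs before being handed to Nutch. PageFlags is a hypothetical helper and
not a Nutch class, and the numeric values are only illustrative. It shows
that every value, whatever its type, ends up as a String, which is the
restriction Metadata imposes:

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

// Hypothetical helper, not part of Nutch: holds page flags as String pairs,
// mirroring the String-only values that Metadata accepts.
class PageFlags {
    private final Map<String, String> flags = new LinkedHashMap<>();

    // Any value type is converted to its String form before storage.
    PageFlags set(String name, Object value) {
        flags.put(name, String.valueOf(value));
        return this;
    }

    String get(String name) {
        return flags.get(name);
    }

    // Renders the flags as "name=value; name=value" in insertion order.
    String toLine() {
        StringJoiner joiner = new StringJoiner("; ");
        for (Map.Entry<String, String> e : flags.entrySet()) {
            joiner.add(e.getKey() + "=" + e.getValue());
        }
        return joiner.toString();
    }

    public static void main(String[] args) {
        PageFlags flags = new PageFlags()
                .set("flag_content", "xhtml")
                .set("flag_page-size", 20480)   // illustrative value
                .set("flag_page-depth", 3);     // illustrative value
        System.out.println(flags.toLine());
    }
}
```

In a plugin, each pair would then be copied into the Metadata object; I
believe its add and set methods take a String name and a String value, but
that should be checked against the Nutch version in use.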
But if I add 20 to 40 small flags as Metadata in the index:
- does it hurt the performance of the Nutch index building process?
(I will not use these flags for specific queries, only for a global
export of all of them.)
- is it easier to export Metadata from the index to an SQL database
(to link the two) than to export segment data?
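
To make the export question concrete, here is a rough sketch of one possible
target layout, assuming a key/value table with one row per flag. The table
name page_flags, its columns, the FlagExport class and the example URL are
all hypothetical; nothing here comes from Nutch itself. With this layout,
adding a new flag does not require a schema change:

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical export helper: turns the flags of one page into rows
// for an assumed table page_flags(url, name, value).
class FlagExport {
    // Parameterized statement that each row would be bound to via JDBC.
    static final String INSERT_SQL =
            "INSERT INTO page_flags (url, name, value) VALUES (?, ?, ?)";

    // One row per flag, in the iteration order of the map.
    static List<String[]> toRows(String url, Map<String, String> flags) {
        List<String[]> rows = new ArrayList<>();
        for (Map.Entry<String, String> e : flags.entrySet()) {
            rows.add(new String[] { url, e.getKey(), e.getValue() });
        }
        return rows;
    }

    public static void main(String[] args) {
        Map<String, String> flags = new LinkedHashMap<>();
        flags.put("flag_content", "xhtml");
        flags.put("flag_page-depth", "3"); // illustrative value
        for (String[] row : toRows("http://example.com/", flags)) {
            System.out.println(String.join(" | ", row));
        }
    }
}
```

Whether the flags are read from the index or from the segments, the export
step itself would look the same; only the reading side differs.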
Jeremy Huylebroeck wrote:
>
> As far as I know, you can do this.
>
> You can either add things in the Metadata objects, but it is limited to
> String values.
>
> Or you can extend the Parse object, have a different OutputFormat for it
> that would read/write your information from the segments.
>
> The fetcher/parser would have to be modified slightly, but nothing hard
> to do.
> We did something around those lines, and it works perfectly in Nutch
> 0.8.
>
> Any other way?
>
--
View this message in context: http://www.nabble.com/How-to-add-data-into-segment-with-my-own-plugin---tf3279715.html#a9162123
Sent from the Nutch - Dev mailing list archive at Nabble.com.