Thread: Input and Output Value Class Types




Input and Output Value Class Types
user name
2006-06-29 21:41:44
All,

Is there a way around the requirement that the input value class and
the output value class be the same?  I have an ObjectWritable that I
am trying to unwrap.

Dennis
Input and Output Value Class Types
user name
2006-06-30 02:01:09
Hi,
Maybe have a look at the Nutch indexer; it uses a kind of wrapper,
which may help you.
Also, please browse the hadoop developer list archive, since there was
some related discussion there.
HTH
Stefan
On 29.06.2006 at 14:41, Dennis Kubes wrote:

> All,
>
> Is there a way to get around having to have the input value class
> and output value class be the same?  I have an object writable that
> I am trying to unwrap.
>
> Dennis

Input and Output Value Class Types
user name
2006-06-30 04:09:28
The indexer uses an ObjectWritable and I am using that trick.  The
problem is that I need to input an ObjectWritable but output a
different object.  I will take a look at the hadoop list.

Dennis

Stefan Groschupf wrote:
> Hi,
> Maybe have a look at the Nutch indexer; it uses a kind of wrapper,
> which may help you.
> Also, please browse the hadoop developer list archive, since there
> was some related discussion there.
> HTH
> Stefan
Input and Output Value Class Types
user name
2006-06-30 05:37:59
In the worst case (I do this sometimes), you have to split your task
into several different jobs.
Ugly, but it works.
In general the problem is known; however, if you put it on the table
again on the hadoop developer list, it may get some more priority.
Stefan

On 29.06.2006, at 21:09, Dennis Kubes wrote:

> The indexer uses an ObjectWritable and I am using that trick.
> Problem is I need to input and ObjectWritable but output a
> different object.  I will take a look at the hadoop list.
>
> Dennis

robots.txt
user name
2006-06-30 08:29:43
Hi,

I use Nutch 0.7.1 to crawl a few intranet servers.
Yesterday I tried to exclude some directories with robots.txt,
but nothing changed.
I copied this robots.txt to the server:

User-agent: NutchCVS
Disallow: /cgi-bin/
Disallow: /manuals/

The User-agent "NutchCVS" and the robots agent name in nutch-default
are the same.

Can anyone help me with this problem?

I'm crawling with this command:

bin/nutch crawl urls -dir crawl060621 -depth 15 &> crawl060621.log &

Greetings,
David

==========================================================

David Wojciechowski
Universitätsklinikum Freiburg
Klinikrechenzentrum
Agnesenstrasse 6-8
D-79106 Freiburg

Telefon :  0761 / 270 - 1842
Fax: 0761 / 270 - 2276
E-Mail   :  david.wojciechowskiuniklinik-freiburg.de

==========================================================

Input and Output Value Class Types
user name
2006-06-30 08:25:56
Dennis Kubes wrote:
> The indexer uses an ObjectWritable and I am using that trick.
> Problem is I need to input and ObjectWritable but output a
> different object.  I will take a look at the hadoop list.

You can view ObjectWritable as an opaque container for any, well,
Object ;). This means that you can produce Objects of whatever class
(so long as they implement Writable), stuff them into ObjectWritables,
and then write your own OutputFormat where you unpack them.

Check e.g. SegmentMerger to see how to do this. It is an extreme case,
because it not only produces different class types on output, but also
produces many output files.
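
A minimal sketch of the wrap-and-unpack idea, assuming nothing about
the real Hadoop signatures: Value, Wrapper and unpackAll are
hypothetical stand-ins for the roles of Writable, ObjectWritable and a
custom OutputFormat's writer, so that the example runs without Hadoop
on the classpath. The job only ever declares the wrapper class, and the
output step decides what to do with whatever is inside.

```java
import java.util.ArrayList;
import java.util.List;

// Stand-in types only: these are NOT the Hadoop classes. They mimic the
// roles of Writable, ObjectWritable and an OutputFormat's record writer.
final class UnwrapSketch {

    /** Hypothetical marker for writable values (role of Writable). */
    interface Value {
        String render();
    }

    /** Two unrelated value classes that one job might want to emit. */
    static final class UrlValue implements Value {
        final String url;
        UrlValue(String url) { this.url = url; }
        public String render() { return "url=" + url; }
    }

    static final class ScoreValue implements Value {
        final double score;
        ScoreValue(double score) { this.score = score; }
        public String render() { return "score=" + score; }
    }

    /** Opaque container (role of ObjectWritable): the one declared value class. */
    static final class Wrapper {
        private final Object instance;
        Wrapper(Object instance) { this.instance = instance; }
        Object get() { return instance; }
    }

    /** Role of the custom output step: unpack each wrapper before writing. */
    static List<String> unpackAll(List<Wrapper> records) {
        List<String> out = new ArrayList<>();
        for (Wrapper w : records) {
            Object inner = w.get();
            if (!(inner instanceof Value)) {
                throw new IllegalArgumentException("unexpected payload: " + inner);
            }
            // Record the concrete class next to the rendered value.
            out.add(inner.getClass().getSimpleName() + "\t" + ((Value) inner).render());
        }
        return out;
    }

    public static void main(String[] args) {
        List<Wrapper> records = new ArrayList<>();
        records.add(new Wrapper(new UrlValue("http://example.com/")));
        records.add(new Wrapper(new ScoreValue(0.5)));
        for (String line : unpackAll(records)) {
            System.out.println(line);
        }
    }
}
```

In a real job the unpacking would live in the OutputFormat's record
writer rather than in a plain method, and the concrete class of each
unwrapped object could be used to pick the output file.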

-- 
Best regards,
Andrzej Bialecki     <><
 ___. ___ ___ ___ _ _   __________________________________
[__ || __|__/|__||\/|  Information Retrieval, Semantic Web
___|||__||  \|  ||  |  Embedded Unix, System Integration
http://www.sigram.com 
Contact: info at sigram dot com

