Thread: NullPointerException fetching some sites with temp redirects




NullPointerException fetching some sites with temp redirects
user name
2007-07-24 18:52:32
Hi,

I'm using Nutch 0.9, although I get the same result with a more recent
nightly build.

I'm getting an NPE when fetching these two pages:

http://www.absoluteit.co.nz
and
http://defence.allmedia.co.nz

I've tracked it down by putting a t.printStackTrace() in the
catch (Throwable t) of run() in Fetcher.java:

java.lang.NullPointerException
         at org.apache.hadoop.io.Text.encode(Text.java:375)
         at org.apache.hadoop.io.Text.encode(Text.java:356)
         at org.apache.hadoop.io.Text.writeString(Text.java:396)
         at org.apache.nutch.protocol.Content.writeCompressed(Content.java:146)
         at org.apache.hadoop.io.CompressedWritable.write(CompressedWritable.java:74)
         at org.apache.nutch.fetcher.FetcherOutput.write(FetcherOutput.java:56)
         at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.collect(MapTask.java:315)
         at org.apache.nutch.fetcher.Fetcher$FetcherThread.output(Fetcher.java:343)
         at org.apache.nutch.fetcher.Fetcher$FetcherThread.run(Fetcher.java:191)

I'm not sure where to go from here. Any suggestions?

Cheers,
Carl.

Re: NullPointerException fetching some sites with temp redirects
user name
2007-07-25 01:08:04
Hi,

On 7/25/07, Carl Cerecke <carlnzs.com> wrote:
> Hi,
>
> Using nutch 0.9, although I get the same with a more recent nightly build.
>
> I'm getting NPE fetching these two pages:
>
> http://www.absoluteit.co.nz
> and
> http://defence.allmedia.co.nz
>
> I've tracked it down by putting a t.printStackTrace() in the catch
> (Throwable t) of the run() in Fetcher.java:
> java.lang.NullPointerException
>          at org.apache.hadoop.io.Text.encode(Text.java:375)
>          at org.apache.hadoop.io.Text.encode(Text.java:356)
>          at org.apache.hadoop.io.Text.writeString(Text.java:396)
>          at org.apache.nutch.protocol.Content.writeCompressed(Content.java:146)
>          at org.apache.hadoop.io.CompressedWritable.write(CompressedWritable.java:74)
>          at org.apache.nutch.fetcher.FetcherOutput.write(FetcherOutput.java:56)
>          at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.collect(MapTask.java:315)
>          at org.apache.nutch.fetcher.Fetcher$FetcherThread.output(Fetcher.java:343)
>          at org.apache.nutch.fetcher.Fetcher$FetcherThread.run(Fetcher.java:191)
>
> I'm not sure where to go from here. Any suggestions?

Can you retry with the latest trunk? Not that I think it will solve
your problem, but Content.java has changed recently, so I am not sure
what was on line 146. If the problem recurs with the latest trunk, I
can check exactly which line is failing. Alternatively, you can send
that part of Content.java's code.



-- 
Doğacan Güney
Re: NullPointerException fetching some sites with temp redirects
user name
2007-07-25 15:48:08
Hi,

I've included the relevant part of Content.java below. I will retry
with the latest trunk shortly.

Content.java:137-149

137  protected final void writeCompressed(DataOutput out) throws IOException {
138    out.writeByte(VERSION);
139
140    Text.writeString(out, url); // write url
141    Text.writeString(out, base); // write base
142
143    out.writeInt(content.length); // write content
144    out.write(content);
145
146    Text.writeString(out, contentType); // write contentType
147
148    metadata.write(out); // write metadata
149  }
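
Reading the listing against the stack trace: the frame at Content.java:146 is the Text.writeString(out, contentType) call, and the exception is thrown inside Text.encode, which suggests that contentType is null for these pages. That is an inference from the trace and the listing, not something confirmed in the thread. The sketch below reproduces that failure mode in plain Java and shows one possible guard. Here writeString is a hypothetical stand-in that only mimics how Text.writeString behaves on null, and writeStringOrEmpty is an illustrative helper, not part of Nutch or Hadoop.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;

class NullSafeWrite {

    // Hypothetical stand-in for Text.writeString: length-prefixed UTF-8.
    // Like the call in the trace, it throws NullPointerException on null.
    static void writeString(DataOutput out, String s) throws IOException {
        byte[] bytes = s.getBytes(StandardCharsets.UTF_8); // NPE if s == null
        out.writeInt(bytes.length);
        out.write(bytes);
    }

    // Illustrative guard (an assumption, not the Nutch fix): write "" for null.
    static void writeStringOrEmpty(DataOutput out, String s) throws IOException {
        writeString(out, s == null ? "" : s);
    }

    public static void main(String[] args) throws IOException {
        // Assumed scenario: no content type was ever set for the fetched page.
        String contentType = null;

        DataOutputStream out = new DataOutputStream(new ByteArrayOutputStream());
        try {
            writeString(out, contentType);
        } catch (NullPointerException e) {
            System.out.println("unguarded write failed: NullPointerException");
        }

        writeStringOrEmpty(out, contentType);
        System.out.println("guarded write succeeded, bytes written: " + out.size());
    }
}
```

Whether an empty string is an acceptable substitute, or whether the null should instead be fixed where the Content object is built, is a decision for the Nutch code itself.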


I also noticed that the output.collect call in Fetcher.java creates a
new FetcherOutput with the third argument (ParseImpl) set to null, even
though the Content argument is not null (it holds the contents of the
page that is redirected to).

Cheers,
Carl.



Re: NullPointerException fetching some sites with temp redirects
user name
2007-07-25 17:40:42
Hi Doğacan,

Yes, I get the NullPointerException with the latest trunk, too.

Cheers,
Carl.



Re: NullPointerException fetching some sites with temp redirects
user name
2007-07-26 18:21:07
Is anybody else getting NullPointerExceptions when fetching either of
these two sites (with 0.9 or the latest trunk)?

http://www.absoluteit.co.nz
http://defence.allmedia.co.nz

I am, and I would be grateful if someone else could test whether they
work, so that I can rule out Nutch configuration issues.

Cheers,
Carl.


