This is definitely a Hadoop problem. It is similar to the classpath
issues that we were encountering before with Hadoop and the
ReduceTaskRunner. When I include the nutch-*.jar in the Hadoop
classpath, the errors go away. That is not a fix, but it proves the
point that this is an issue with Hadoop class loading.
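
A rough way to check this kind of visibility problem is to ask each
class loader directly whether it can resolve the class. The trace below
shows Class.forName going through sun.misc.Launcher$AppClassLoader, so
comparing the system loader with the thread context loader seems like a
reasonable first test. This is only a sketch: ClassVisibilityCheck and
isVisible are names made up for illustration and are not part of Hadoop
or Nutch.

```java
// Hypothetical diagnostic, not Hadoop or Nutch code.
class ClassVisibilityCheck {

    /** Returns true if the named class can be resolved through the given loader. */
    public static boolean isVisible(String className, ClassLoader loader) {
        try {
            // initialize=false: only locate the class, do not run static initializers.
            Class.forName(className, false, loader);
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Default to the class named in the stack trace; pass another name as args[0].
        String name = args.length > 0 ? args[0] : "org.apache.nutch.metadata.MetaWrapper";

        ClassLoader system = ClassLoader.getSystemClassLoader();
        ClassLoader context = Thread.currentThread().getContextClassLoader();

        System.out.println("system loader sees " + name + ":  " + isVisible(name, system));
        System.out.println("context loader sees " + name + ": " + isVisible(name, context));
    }
}
```

If the class is visible only once the nutch-*.jar is on the classpath,
that would match what I am seeing.
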
Dennis Kubes
Dennis Kubes wrote:
> I spoke too soon. Below is the output of errors on mergesegs. This
> looks more like a Hadoop issue to me, but I will need to dig into it.
> It also may be something that I am doing on my end. This was a merge
> of three different crawls of 50K each. I don't know if we want to
> delay or go ahead.
>
> Dennis Kubes
>
> java.lang.RuntimeException: java.lang.RuntimeException: java.lang.ClassNotFoundException: org.apache.nutch.metadata.MetaWrapper
>     at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:344)
>     at org.apache.hadoop.mapred.JobConf.getOutputValueClass(JobConf.java:451)
>     at org.apache.hadoop.mapred.JobConf.getMapOutputValueClass(JobConf.java:414)
>     at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.<init>(MapTask.java:270)
>     at org.apache.hadoop.mapred.MapTask.run(MapTask.java:115)
>     at org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:1445)
> Caused by: java.lang.RuntimeException: java.lang.ClassNotFoundException: org.apache.nutch.metadata.MetaWrapper
>     at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:328)
>     at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:339)
>     ... 5 more
> Caused by: java.lang.ClassNotFoundException: org.apache.nutch.metadata.MetaWrapper
>     at java.net.URLClassLoader$1.run(URLClassLoader.java:200)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at java.net.URLClassLoader.findClass(URLClassLoader.java:188)
>     at java.lang.ClassLoader.loadClass(ClassLoader.java:306)
>     at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:268)
>     at java.lang.ClassLoader.loadClass(ClassLoader.java:251)
>     at java.lang.ClassLoader.loadClassInternal(ClassLoader.java:319)
>     at java.lang.Class.forName0(Native Method)
>     at java.lang.Class.forName(Class.java:242)
>     at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:315)
>     at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:326)
>     ... 6 more
>
>
>
> Dennis Kubes wrote:
>> [X] +1 Release the packages as Apache Nutch 0.9
>> [ ] -1 Do not release the packages because...
>>
>> I have been running some bigger crawls with the release this morning.
>> Everything looks good.
>>
>> Dennis Kubes
>>
>> Chris Mattmann wrote:
>>> Hi Folks,
>>>
>>> I have posted a candidate for the Apache Nutch 0.9 release at
>>>
>>> http://people.apache.org/~mattmann/nutch_0.9/
>>>
>>> See the included CHANGES-0.9.txt file for details on release
>>> contents and latest changes. The release was made from the 0.9-dev
>>> trunk.
>>>
>>> Please vote on releasing these packages as Apache Nutch 0.9.
>>> The vote is open for the next 72 hours. Only votes from Nutch
>>> committers are binding, but everyone is welcome to check the release
>>> candidate and voice their approval or disapproval. The vote passes if
>>> at least three binding +1 votes are cast.
>>>
>>> [ ] +1 Release the packages as Apache Nutch 0.9
>>> [ ] -1 Do not release the packages because...
>>>
>>> Thanks!
>>>
>>> Cheers,
>>> Chris
>>>
>>>