I'm trying to merge 2 segments into 1. I have a cluster of 4
machines using Hadoop 13.1 and the latest trunk of Nutch.
Every time I start my merge, I get the following error:
2007-10-03 22:06:28,272 INFO mapred.TaskInProgress - Error from task_0001_m_000011_0: java.lang.OutOfMemoryError: Java heap space
    at java.util.Arrays.copyOf(Arrays.java:2786)
    at java.io.ByteArrayOutputStream.write(ByteArrayOutputStream.java:94)
    at java.io.DataOutputStream.write(DataOutputStream.java:90)
    at java.io.FilterOutputStream.write(FilterOutputStream.java:80)
    at org.apache.nutch.protocol.Content.write(Content.java:164)
    at org.apache.hadoop.io.GenericWritable.write(GenericWritable.java:100)
    at org.apache.nutch.metadata.MetaWrapper.write(MetaWrapper.java:107)
    at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.collect(MapTask.java:365)
    at org.apache.nutch.segment.SegmentMerger.map(SegmentMerger.java:331)
    at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:48)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:186)
    at org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:1707)
I've increased the memory, but it doesn't seem to change
anything about my problem.
The error appears immediately at the beginning of the
process.
Has anyone experienced the same issue?
Does anybody use mergesegs?
Thanks in advance for your help.