Thread: Error when i index by nutchwax-0.10.0




Error when i index by nutchwax-0.10.0
United States
2007-11-23 02:10:48
When I run the command "$/bin/hadoop jar $/nutchwax.jar all /tmp/inputs /tmp/outputs test", I get this error:

- LinkDb: done
-  indexing [Lorg.apache.hadoop.fs.Path;66e64686
- Indexer: starting
- Indexer: linkdb: outputs/linkdb
- parsing file:/nutch/search/conf/hadoop-default.xml
- parsing file:/nutch/search/conf/nutch-default.xml
- parsing file:/tmp/hadoop-unjar50228/wax-default.xml
- parsing file:/nutch/search/conf/mapred-default.xml
- parsing file:/nutch/search/conf/mapred-default.xml
- parsing file:/nutch/search/conf/mapred-default.xml
- adding segment: /user/nutch/outputs/segments/25501123143257-test
- Running job: job_0004
-  map 0% reduce 0%
-  map 5% reduce 0%
-  map 15% reduce 0%
-  map 27% reduce 0%
-  map 37% reduce 0%
-  map 47% reduce 0%
-  map 57% reduce 0%
-  map 72% reduce 0%
-  map 82% reduce 0%
-  map 94% reduce 0%
-  map 97% reduce 0%
-  map 100% reduce 0%
-  map 100% reduce 37%
-  map 100% reduce 79%
-  map 100% reduce 87%
-  map 100% reduce 100%
- Job complete: job_0004
- Indexer: done
- dedup outputs/index
- Dedup: starting
- parsing file:/nutch/search/conf/hadoop-default.xml
- parsing file:/nutch/search/conf/nutch-default.xml
- parsing file:/tmp/hadoop-unjar50228/wax-default.xml
- parsing file:/nutch/search/conf/mapred-default.xml
- parsing file:/nutch/search/conf/mapred-default.xml
- parsing file:/nutch/search/conf/mapred-default.xml
- Dedup: adding indexes in: outputs/indexes
- Running job: job_0005
-  map 0% reduce 0%
-  map 100% reduce 100%
Exception in thread "main" java.io.IOException: Job failed!
        at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:399)
        at org.apache.nutch.indexer.DeleteDuplicates.dedup(DeleteDuplicates.java:433)
        at org.archive.access.nutch.Nutchwax.doDedup(Nutchwax.java:257)
        at org.archive.access.nutch.Nutchwax.doAll(Nutchwax.java:156)
        at org.archive.access.nutch.Nutchwax.doJob(Nutchwax.java:389)
        at org.archive.access.nutch.Nutchwax.main(Nutchwax.java:674)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.apache.hadoop.util.RunJar.main(RunJar.java:149)

Why does this happen, and how can I solve it?
-- 
View this message in context: http://www.nabble.com/Error-when-i-index-by-nutchwax-0.10.0-tf4860227.html#a13908351

Sent from the Hadoop Users mailing list archive at Nabble.com.


Re: Error when i index by nutchwax-0.10.0
India
2007-11-23 17:01:55
Dedup tries to acquire a lock on the file system from Lucene's
IndexReader.java. Recent versions of Hadoop don't support this, and I think
that is the cause of this error. You can try commenting out the
lock-acquisition code in Lucene's IndexReader, recompiling Lucene, and
replacing the old Lucene jar with the new one. It might work (it worked for
me), but it may cause problems when you run concurrent jobs.
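
To make the failure mode concrete, here is a minimal, self-contained sketch of the idea. It is not Lucene, Nutch, or Hadoop code: the names `FsLock`, `UnsupportedLock`, `NoOpLock`, and `dedup` are all hypothetical stand-ins invented for illustration. It only models the pattern described above, where a step that insists on a lock fails when the file system refuses to lock, and "succeeds" once locking is stubbed out.

```java
import java.io.IOException;

public class DedupLockSketch {

    /** Hypothetical stand-in for a file-system lock; not a real Lucene or Hadoop type. */
    interface FsLock {
        void acquire() throws IOException;
    }

    /** Models a file system that does not support locking. */
    static final class UnsupportedLock implements FsLock {
        public void acquire() throws IOException {
            throw new IOException("locks not supported by this file system");
        }
    }

    /** Models the workaround: acquiring does nothing, so nothing is protected. */
    static final class NoOpLock implements FsLock {
        public void acquire() {
            // Intentionally empty: two concurrent jobs would both "hold" this lock.
        }
    }

    /** Models a dedup step that requires a lock before it touches the index. */
    static String dedup(FsLock lock) {
        try {
            lock.acquire();
            return "dedup ran";
        } catch (IOException e) {
            return "Job failed: " + e.getMessage();
        }
    }

    public static void main(String[] args) {
        System.out.println(dedup(new UnsupportedLock()));
        System.out.println(dedup(new NoOpLock()));
    }
}
```

The no-op variant gets past the failure only because it no longer checks anything, which is why the workaround can be unsafe when several jobs touch the same index at once.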

- Prasad.

On Friday 23 November 2007 03:10, jibjoice wrote:
> why and how i solve it?



Re: Error when i index by nutchwax-0.10.0
United States
2007-11-25 20:08:37
I am now using hadoop-0.9.2 and lucene-core-2.1.0.jar, but it still does not work.


pvvpr wrote:
> 
> Dedup tries to acquire lock to the file system from IndexReader.java of
> lucene. Latest versions of hadoop dont support this. I think this error is
> because of that. What you can try is comment out acquire lock functionality
> in lucene IndexReader, compile lucene and replace the old lucene jar with new
> one. It might work (worked for me), but this may be a problem when you run
> concurrent jobs.
>
> - Prasad.
> 


