Thread: Lock obtain timed out when running on Hadoop




Lock obtain timed out when running on Hadoop
2007-10-18 02:32:59
Hi,

I'm sometimes getting the following error in the dedup 3 job when
running Nutch 0.9 on top of Hadoop 0.14.2:

java.io.IOException: Lock obtain timed out: Lockhdfs://r37:54310/user/matei/crawl4/indexes/part-00000/write.lock
	at org.apache.lucene.store.Lock.obtain(Lock.java:69)
	at org.apache.lucene.index.IndexReader.aquireWriteLock(IndexReader.java:526)
	at org.apache.lucene.index.IndexReader.deleteDocument(IndexReader.java:551)
	at org.apache.nutch.indexer.DeleteDuplicates.reduce(DeleteDuplicates.java:378)
	at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:322)
	at org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:1782)

Other times, it works just fine. Do you know why this is happening?

Thanks,

Matei Zaharia
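
The trace above shows the exception surfacing from Lock.obtain. As a rough illustration of how a timed lock acquisition can end in a message of this kind, here is a minimal Java sketch. The class LockTimeoutSketch, the TryLock interface, and the timeout and poll values are all hypothetical; none of them is taken from Lucene, Nutch, or Hadoop, and the sketch does not claim to reproduce the cause of the error in this thread.

```java
import java.io.IOException;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical sketch of a polling lock with a timeout.
// This is not the Lucene, Nutch, or Hadoop source.
class LockTimeoutSketch {

    /** A lock that can be tried once without blocking. */
    interface TryLock {
        boolean tryObtain() throws IOException;
    }

    /**
     * Retries tryObtain() every pollMillis until it succeeds, or fails with
     * an IOException once timeoutMillis has elapsed.
     */
    static void obtain(TryLock lock, String name, long timeoutMillis, long pollMillis)
            throws IOException {
        long deadline = System.currentTimeMillis() + timeoutMillis;
        while (!lock.tryObtain()) {
            if (System.currentTimeMillis() >= deadline) {
                throw new IOException("Lock obtain timed out: " + name);
            }
            try {
                Thread.sleep(pollMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IOException("Interrupted while waiting for " + name);
            }
        }
    }

    public static void main(String[] args) throws IOException {
        // Stands in for a write.lock file: true means something holds it.
        AtomicBoolean held = new AtomicBoolean(false);
        TryLock writeLock = () -> held.compareAndSet(false, true);

        obtain(writeLock, "write.lock", 200, 20);     // free, so this succeeds
        try {
            obtain(writeLock, "write.lock", 200, 20); // still held, so this times out
        } catch (IOException e) {
            System.out.println(e.getMessage());       // Lock obtain timed out: write.lock
        }
    }
}
```

In this sketch the second call fails only because the first holder never releases the lock. Any situation in which every attempt to take the lock fails until the deadline would end the same way.
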
Re: Lock obtain timed out when running on Hadoop
2007-10-18 02:58:47
You should use Hadoop 0.12.3, for example, for dedup. The current version,
0.14.x, doesn't support the lock operation.

2007/10/18, Matei Zaharia <mateieecs.berkeley.edu>:
>
> Hi,
>
> I'm sometimes getting the following error in the dedup 3 job when
> running Nutch 0.9 on top of Hadoop 0.14.2:
>
> java.io.IOException: Lock obtain timed out: Lockhdfs://r37:54310/user/matei/crawl4/indexes/part-00000/write.lock
>         at org.apache.lucene.store.Lock.obtain(Lock.java:69)
>         at org.apache.lucene.index.IndexReader.aquireWriteLock(IndexReader.java:526)
>         at org.apache.lucene.index.IndexReader.deleteDocument(IndexReader.java:551)
>         at org.apache.nutch.indexer.DeleteDuplicates.reduce(DeleteDuplicates.java:378)
>         at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:322)
>         at org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:1782)
>
> Other times, it works just fine. Do you know why this is happening?
>
> Thanks,
>
> Matei Zaharia
Re: Lock obtain timed out when running on Hadoop
2007-10-18 03:05:13
Thanks for your reply. I'm wondering, is it possible to skip this dedup
phase then, or to not acquire a lock? The reason I'd like to use 0.14 code
is that I've instrumented it to add some tracing and I'd like to collect
traces of how Nutch uses Hadoop. It may be possible to port the changes
back to 0.12 but I'd prefer not to because I may have other apps that use
things in 0.14 and because I want to trace the best-performing Hadoop
version possible.

Matei

On Oct 18, 2007, at 12:58 AM, Nguyen Manh Tien wrote:

> You should use Hadoop 0.12.3, for example, for dedup. The current version,
> 0.14.x, doesn't support the lock operation.

