Thread: Created: (HADOOP-2087) Errors for subsequent requests for file creation after original DFSCli




Created: (HADOOP-2087) Errors for subsequent requests for file creation after original DFSCli
2007-10-22 10:04:51
Errors for subsequent requests for file creation after original DFSClient goes down..
-------------------------------------------------------------------------------------

                 Key: HADOOP-2087
                 URL: https://issues.apache.org/jira/browse/HADOOP-2087
             Project: Hadoop
          Issue Type: Bug
          Components: dfs
            Reporter: Gautam Kowshik
             Fix For: 0.15.0



Task task_200710200555_0005_m_000725_0 started writing a
file and then its node went down, so all subsequent attempts
to create that file failed with AlreadyBeingCreatedException.
I think the DFS should handle the case where a DFSClient
goes down in the middle of file creation, so that subsequent
creates of the same file are allowed.

2007-10-20 06:23:51,189 INFO org.apache.hadoop.mapred.TaskInProgress: Error from task_200710200555_0005_m_000725_0: Task task_200710200555_0005_m_000725_0 failed to report status for 606 seconds. Killing!
2007-10-20 06:23:51,189 INFO org.apache.hadoop.mapred.JobTracker: Removed completed task 'task_200710200555_0005_m_000725_0' from '[tracker_address]:/127.0.0.1:44198'
2007-10-20 06:23:51,209 INFO org.apache.hadoop.mapred.JobInProgress: Choosing normal task tip_200710200555_0005_m_000725
2007-10-20 06:23:51,209 INFO org.apache.hadoop.mapred.JobTracker: Adding task 'task_200710200555_0005_m_000725_1' to tip tip_200710200555_0005_m_000725, for tracker '[tracker_address]:/127.0.0.1:50914'
2007-10-20 06:28:54,991 INFO org.apache.hadoop.mapred.TaskInProgress: Error from task_200710200555_0005_m_000725_1: org.apache.hadoop.ipc.RemoteException: org.apache.hadoop.dfs.AlreadyBeingCreatedException: failed to create file /benchmarks/TestDFSIO/io_data/test_io_825 for DFSClient_task_200710200555_0005_m_000725_1 on client 72.30.50.198, because this file is already being created by DFSClient_task_200710200555_0005_m_000725_0 on 72.30.53.224
        at org.apache.hadoop.dfs.FSNamesystem.startFileInternal(FSNamesystem.java:881)
        at org.apache.hadoop.dfs.FSNamesystem.startFile(FSNamesystem.java:806)
        at org.apache.hadoop.dfs.NameNode.create(NameNode.java:276)
        at sun.reflect.GeneratedMethodAccessor10.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:379)
        at org.apache.hadoop.ipc.Server$Handler.run(Server.java:596)
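
The behaviour asked for in this report can be pictured with a small toy model. The sketch below is not Hadoop code: the class, its method names, and the time-based expiry rule are all hypothetical, and the report itself does not say how the DFS should decide that the original client is gone. It only shows the difference between rejecting a second creator outright and letting it take over once the first creator has been silent for some period.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model only: not Hadoop code. All names and the expiry rule are hypothetical.
class LeaseTableSketch {
    // Thrown when a different client still holds an unexpired claim on the path.
    static class AlreadyBeingCreated extends RuntimeException {
        AlreadyBeingCreated(String msg) { super(msg); }
    }

    private static final class Lease {
        final String holder;
        final long lastRenewedMillis;
        Lease(String holder, long lastRenewedMillis) {
            this.holder = holder;
            this.lastRenewedMillis = lastRenewedMillis;
        }
    }

    private final Map<String, Lease> pending = new HashMap<>();
    private final long expiryMillis;

    LeaseTableSketch(long expiryMillis) { this.expiryMillis = expiryMillis; }

    // Registers `client` as the creator of `path` at time `nowMillis`.
    // A different client is rejected unless the existing claim has expired.
    void startFile(String path, String client, long nowMillis) {
        Lease current = pending.get(path);
        boolean expired = current != null
                && nowMillis - current.lastRenewedMillis >= expiryMillis;
        if (current != null && !expired && !current.holder.equals(client)) {
            throw new AlreadyBeingCreated("failed to create file " + path + " for " + client
                    + ", because this file is already being created by " + current.holder);
        }
        pending.put(path, new Lease(client, nowMillis));
    }

    // Returns the client currently recorded as creating `path`, or null if none.
    String holderOf(String path) {
        Lease lease = pending.get(path);
        return lease == null ? null : lease.holder;
    }

    public static void main(String[] args) {
        String path = "/benchmarks/TestDFSIO/io_data/test_io_825";
        LeaseTableSketch table = new LeaseTableSketch(60_000L); // assumed 60 s expiry
        table.startFile(path, "attempt_0", 0L);
        try {
            table.startFile(path, "attempt_1", 1_000L); // too early: attempt_0 still holds it
        } catch (AlreadyBeingCreated e) {
            System.out.println("rejected: " + e.getMessage());
        }
        table.startFile(path, "attempt_1", 61_000L); // claim expired, takeover allowed
        System.out.println("holder now: " + table.holderOf(path));
    }
}
```

In this model the retry at 1 second is rejected, as in the log above, while the retry at 61 seconds succeeds because the first claim has aged past the assumed expiry.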



-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue
online.


Commented: (HADOOP-2087) Errors for subsequent requests for file creation after original DFSC
2007-10-22 10:27:50
    [ https://issues.apache.org/jira/browse/HADOOP-2087?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12536706 ]

Arun C Murthy commented on HADOOP-2087:
---------------------------------------

We should just rewrite TestDFSIO (and similar tests) to use
*reducer NONE* and create/write the files in
${mapred.output.dir} from the map tasks:
http://wiki.apache.org/lucene-hadoop/FAQ#9.




Commented: (HADOOP-2087) Errors for subsequent requests for file creation after original DFSC
2007-10-22 12:16:51
    [ https://issues.apache.org/jira/browse/HADOOP-2087?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12536746 ]

dhruba borthakur commented on HADOOP-2087:
------------------------------------------

I guess if we follow the convention described in
http://wiki.apache.org/lucene-hadoop/FAQ#9, then the files are
written to a temporary location and then moved to the final
location. Does that mean DFSIO will then have to incur the
cost of the rename operation for each file?
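
For illustration, the write-to-a-temporary-location-then-move convention mentioned above can be sketched on a local filesystem with java.nio.file. This is only an analogy, not the Hadoop FileSystem API or the TestDFSIO code: the class name, the "_tmp_" directory naming, and the attempt ids are hypothetical. It shows why a retry does not collide with a dead attempt (each attempt writes under its own path) and where the extra rename step comes in.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Local-filesystem illustration only; not the Hadoop FileSystem API.
class AttemptOutputSketch {
    // Creates a scratch output directory for the demo.
    static Path newScratchDir() {
        try {
            return Files.createTempDirectory("dfsio-sketch");
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Writes `data` under a directory private to `attemptId`, then moves it into place.
    static Path writeThenPromote(Path outputDir, String attemptId, String fileName, byte[] data) {
        try {
            Path attemptDir = outputDir.resolve("_tmp_" + attemptId); // hypothetical naming
            Files.createDirectories(attemptDir);
            Path tmp = attemptDir.resolve(fileName);
            Files.write(tmp, data);
            Path target = outputDir.resolve(fileName);
            // This move is the rename cost the comment above asks about.
            return Files.move(tmp, target, StandardCopyOption.REPLACE_EXISTING);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) throws IOException {
        Path out = newScratchDir();
        // A first attempt dies after writing part of its private file ...
        Path dead = out.resolve("_tmp_attempt_0");
        Files.createDirectories(dead);
        Files.write(dead.resolve("test_io_825"), "partial".getBytes(StandardCharsets.UTF_8));
        // ... and the retry is unaffected because it writes under a different path.
        Path done = writeThenPromote(out, "attempt_1", "test_io_825",
                "complete".getBytes(StandardCharsets.UTF_8));
        System.out.println(new String(Files.readAllBytes(done), StandardCharsets.UTF_8));
    }
}
```

Whether the rename is cheap enough for a benchmark such as DFSIO is exactly the open question in this comment; the sketch does not answer it.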



Commented: (HADOOP-2087) Errors for subsequent requests for file creation after original DFSC
2007-10-23 13:38:50
    [ https://issues.apache.org/jira/browse/HADOOP-2087?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12537109 ]

Raghu Angadi commented on HADOOP-2087:
--------------------------------------

> So the first question to answer is whether the retry
> framework does what it is intended to do, that is, retry.

Retry after one minute applies only when there is a timeout.
Otherwise it retries after a few milliseconds. Comment from
http://issues.apache.org/jira/browse/HADOOP-1263#action_12504135 :


I am planning to use this framework for some new RPCs I am
adding. I just want to confirm if my understanding is
correct: This patch adds a random exponential back off
timeout starting with 400 milliseconds for 5 times. In all 5
retries, this adds a max of 12 seconds. Since the client RPC
timeout is 60sec, the time it takes for such an RPC to fail is
between 300-312 seconds over 6 attempts. Is this expected?
It is not exponential back off but essentially a
constant timeout of around 60sec for each retry.
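
The arithmetic behind the quoted figures can be checked with a short sketch. It assumes the delay doubles on every retry and ignores the random component, which is an assumption about the patch rather than something stated here; the class and method names are hypothetical.

```java
// Arithmetic sketch only; the doubling rule is an assumption, not taken from the patch.
class BackoffBoundSketch {
    // Upper bound on total sleep if the delay doubles on each of `retries` retries.
    static long maxBackoffMillis(long initialMillis, int retries) {
        long total = 0;
        long delay = initialMillis;
        for (int i = 0; i < retries; i++) {
            total += delay;
            delay *= 2;
        }
        return total;
    }

    // Time spent waiting on timeouts alone when each attempt waits the full RPC timeout.
    static long timeoutMillis(int timedOutAttempts, long rpcTimeoutMillis) {
        return timedOutAttempts * rpcTimeoutMillis;
    }

    public static void main(String[] args) {
        long backoff = maxBackoffMillis(400L, 5);   // 400 + 800 + 1600 + 3200 + 6400
        long timeouts = timeoutMillis(5, 60_000L);  // five 60-second timeouts
        System.out.println("backoff ms:  " + backoff);   // prints 12400
        System.out.println("timeouts ms: " + timeouts);  // prints 300000
    }
}
```

Under these assumptions the backoff contributes about 12 seconds against 300 seconds of timeouts, which is the point of the quoted comment: when every attempt times out, the 60-second RPC timeout dominates and the backoff is barely noticeable.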




Commented: (HADOOP-2087) Errors for subsequent requests for file creation after original DFSC
2007-10-23 20:06:50
    [ https://issues.apache.org/jira/browse/HADOOP-2087?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12537184 ]

Mukund Madhugiri commented on HADOOP-2087:
------------------------------------------

I don't see this on every run of the TestDFSIO benchmark.
The latest run, which is just wrapping up, does not exhibit
this problem.



Commented: (HADOOP-2087) Errors for subsequent requests for file creation after original DFSC
2007-10-24 04:19:50
    [ https://issues.apache.org/jira/browse/HADOOP-2087?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12537249 ]

Gautam Kowshik commented on HADOOP-2087:
----------------------------------------

Mukund: This bug shows up when a DFSClient goes down during
a file creation. The problem appears only when that scenario
is hit.


