
Thread: Created: (HADOOP-2094) DFS should not use round robin policy in determining on which volume (fi




2007-10-23 12:53:51
DFS should not use round robin policy in determining which
volume (file system partition) to allocate for the next
block
------------------------------------------------------------

                 Key: HADOOP-2094
                 URL: https://issues.apache.org/jira/browse/HADOOP-2094
             Project: Hadoop
          Issue Type: Improvement
          Components: dfs
            Reporter: Runping Qi
            Assignee: Runping Qi



When multiple file system partitions are configured for the
data storage of a data node, the data node uses a strict
round robin policy to decide which partition to use for
writing the next block. This may result in anomalous cases
in which the blocks of a file are not evenly distributed
across the partitions. For example, when we use distcp to
copy files with each node running 4 mappers concurrently,
those 4 mappers write to DFS at about the same rate, so it
is possible that they write out their blocks in an
interleaved order. If there are 4 file system partitions
configured for the local data node, it is then possible that
each mapper will continue to write all of its blocks to the
same file system partition.
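
To make the scenario above concrete, here is a minimal simulation
sketch. The class and function names are hypothetical stand-ins, not
Hadoop's actual code, and it assumes the idealized case in which the
4 mappers take turns in a fixed order, one block per turn.

```python
class RoundRobinVolumeChooser:
    """Hypothetical stand-in for the data node's volume selection."""

    def __init__(self, num_volumes):
        self.num_volumes = num_volumes
        self.next_volume = 0

    def choose(self):
        # Strict round robin: hand out volumes 0, 1, 2, ... and wrap around.
        volume = self.next_volume
        self.next_volume = (self.next_volume + 1) % self.num_volumes
        return volume


def simulate_round_robin(num_volumes, num_writers, blocks_per_writer):
    """Writers take turns in a fixed order, each writing one block per turn."""
    chooser = RoundRobinVolumeChooser(num_volumes)
    placement = {writer: [] for writer in range(num_writers)}
    for _ in range(blocks_per_writer):
        for writer in range(num_writers):
            placement[writer].append(chooser.choose())
    return placement


# 4 mappers, 4 partitions, perfectly interleaved writes (assumed).
rr_placement = simulate_round_robin(num_volumes=4, num_writers=4, blocks_per_writer=5)
for writer, volumes in rr_placement.items():
    print(f"mapper {writer}: volumes {volumes}")
```

Under these assumptions every mapper lands on a single partition for
all of its blocks, because the number of concurrent writers equals the
number of partitions.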

A simple random placement policy would avoid such anomalous
cases and does not have any obvious drawbacks.
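
As a rough illustration of the proposed alternative, the sketch below
picks a partition uniformly at random for each block. The names are
hypothetical and this is not the patch itself; the fixed seed is only
there to make the demo repeatable, and the writers are again assumed to
interleave in a fixed order.

```python
import random
from collections import Counter


class RandomVolumeChooser:
    """Hypothetical sketch of random placement; not Hadoop's actual code."""

    def __init__(self, num_volumes, seed=None):
        self.num_volumes = num_volumes
        self.rng = random.Random(seed)  # seed only makes the demo repeatable

    def choose(self):
        # Pick any volume uniformly, independent of earlier choices.
        return self.rng.randrange(self.num_volumes)


def simulate_random(num_volumes, num_writers, blocks_per_writer, seed=None):
    """Same interleaved write pattern, but volumes are chosen at random."""
    chooser = RandomVolumeChooser(num_volumes, seed)
    placement = {writer: [] for writer in range(num_writers)}
    for _ in range(blocks_per_writer):
        for writer in range(num_writers):
            placement[writer].append(chooser.choose())
    return placement


random_placement = simulate_random(
    num_volumes=4, num_writers=4, blocks_per_writer=1000, seed=2094
)
for writer, volumes in random_placement.items():
    counts = Counter(volumes)
    print(f"mapper {writer}: blocks per volume {[counts[v] for v in range(4)]}")
```

With random selection, the interleaving of the writers no longer
determines where a given file's blocks go, so each mapper's blocks end
up spread over all of the partitions.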

 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue
online.

