
Thread: is a monolithic reduce task the right model?




is a monolithic reduce task the right model?
2008-01-10 11:45:52
in thinking about Aaron's use case and our own problems with
fair sharing of a hadoop cluster, one of the things that was
obvious was that reduces are a stumbling block for fair
sharing. it's easy to imagine a fair scheduling algorithm
doing a good job of scheduling small map tasks. but the
reduces are a problem: they are too big and, once scheduled,
last forever.

another obvious thing is that reduce failures are expensive:
all the map outputs need to be refetched and merged again,
even though in many cases the failure is in the reduction
logic. putting two and two together:

- what if current reduce tasks were broken into separate
copy, sort and reduce tasks?

we would get much smaller units of recovery and scheduling.

thoughts?

Joydeep
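
To make the recovery argument concrete, here is a minimal sketch of a reduce split into copy, sort and reduce steps whose results are kept between attempts. It is not Hadoop code: run_phased_reduce, the checkpoints dict and the counters dict are hypothetical names invented for this illustration, and the map outputs are plain in-memory lists.

```python
from typing import Callable, Dict, List, Tuple

Pair = Tuple[str, int]


def run_phased_reduce(
    map_outputs: List[List[Pair]],
    reduce_fn: Callable[[str, List[int]], int],
    checkpoints: Dict[str, List[Pair]],
    counters: Dict[str, int],
) -> Dict[str, int]:
    """Run copy, sort and reduce as separate steps, reusing finished ones.

    `checkpoints` stands in for phase output that survives a failed attempt;
    `counters` records how many times each phase actually ran.
    """
    if "copy" not in checkpoints:
        counters["copy"] = counters.get("copy", 0) + 1
        # Copy phase: gather this partition's pairs from every map output.
        checkpoints["copy"] = [pair for output in map_outputs for pair in output]

    if "sort" not in checkpoints:
        counters["sort"] = counters.get("sort", 0) + 1
        # Sort phase: order the copied pairs by key.
        checkpoints["sort"] = sorted(checkpoints["copy"])

    # Reduce phase: the only step that runs user logic, so the only step
    # that has to be repeated when that logic fails.
    counters["reduce"] = counters.get("reduce", 0) + 1
    grouped: Dict[str, List[int]] = {}
    for key, value in checkpoints["sort"]:
        grouped.setdefault(key, []).append(value)
    return {key: reduce_fn(key, values) for key, values in grouped.items()}
```

If the first attempt fails inside reduce_fn and a second attempt is made with the same checkpoints, only the reduce step runs again; the copy and sort counters stay at 1. In a monolithic reduce task all three steps would be repeated.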
Re: is a monolithic reduce task the right model?
2008-01-10 11:55:53
Joydeep Sen Sarma wrote:
> - what if current reduce tasks were broken into separate
> copy, sort and reduce tasks?
>
> we would get much smaller units of recovery and scheduling.
>
> thoughts?

If copy, sort and reduce are not scheduled together then it
would be very hard to ensure they run on the same node, and
if they do not all run on the same node then we'd have to
move their data around, which would substantially affect
throughput, not to mention adding another copy phase...

Please see https://issues.apache.org/jira/browse/HADOOP-2573
for another proposed solution to this.

Doug
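
The locality concern can be sketched with a small counting model. This is only an illustration under stated assumptions: count_node_crossings and the placement dict are hypothetical, and the model counts nothing but how often a partition's intermediate data would have to move between nodes after the initial fetch from the maps.

```python
from typing import Dict

PHASES = ("copy", "sort", "reduce")


def count_node_crossings(placement: Dict[str, str]) -> int:
    """Count hand-offs between consecutive phases that land on different nodes.

    `placement` maps each phase name to the node it was scheduled on.
    Every crossing means the phase output must be shipped over the network
    before the next phase can start.
    """
    nodes = [placement[phase] for phase in PHASES]
    return sum(1 for current, following in zip(nodes, nodes[1:]) if current != following)
```

With all three phases on one node the count is 0, which matches a monolithic reduce task. If the scheduler places each phase on a different node the count is 2, that is, two extra transfers of the partition's data.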

RE: is a monolithic reduce task the right model?
2008-01-13 07:10:21
By the way, I created
https://issues.apache.org/jira/browse/HADOOP-2568 some time
back. The proposal is basically to have one shuffle task per
job per node and to assign reduces with consecutive taskIDs
to a particular node. The shuffle task would fetch multiple
consecutive outputs in one go from any map task node. This
will reduce the number of seeks into the map output files by
a factor #maps * #consecutive-reduces for any
mapnode-reducenode pair, and should generally improve the
usage of system resources (e.g., fewer socket connections
for transferring files, and improved disk usage).
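
As a rough illustration of the batching idea, the sketch below compares the number of fetch requests when every reduce fetches from every map on its own with the number when one shuffle task per node fetches its whole range of consecutive reduce outputs in one request. It is a counting model only, not the HADOOP-2568 implementation: assign_consecutive and fetch_counts are hypothetical helpers, and it assumes each map output file stores its partitions in reduce-ID order, so that a consecutive range is a single contiguous read.

```python
from typing import List, Tuple


def assign_consecutive(num_reduces: int, num_nodes: int) -> List[range]:
    """Give each node a contiguous range of reduce task IDs."""
    base, extra = divmod(num_reduces, num_nodes)
    ranges = []
    start = 0
    for node in range(num_nodes):
        # The first `extra` nodes take one additional reduce each.
        size = base + (1 if node < extra else 0)
        if size:
            ranges.append(range(start, start + size))
        start += size
    return ranges


def fetch_counts(num_maps: int, num_reduces: int, num_nodes: int) -> Tuple[int, int]:
    """Return (per-reduce fetches, per-node batched fetches) for one job."""
    ranges = assign_consecutive(num_reduces, num_nodes)
    # Every reduce contacts every map separately.
    per_reduce = num_maps * num_reduces
    # One shuffle task per node contacts every map once for its whole range.
    per_node = num_maps * len(ranges)
    return per_reduce, per_node
```

For example, with 100 maps, 20 reduces and 5 nodes, each node holds 4 consecutive reduces, so the model gives 2000 separate fetches against 500 batched ones.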

> -----Original Message-----
> From: Joydeep Sen Sarma [mailto:jssarmafacebook.com] 
> Sent: Thursday, January 10, 2008 11:16 PM
> To: hadoop-userlucene.apache.org
> Subject: is a monolithic reduce task the right model?
> 
> in thinking about Aaron's use case and our own problems with
> fair sharing of hadoop cluster, one of the things that was
> obvious was that reduces are a stumbling block for fair
> sharing. It's easy to imagine a fair scheduling algorithm
> doing good job of scheduling small map tasks. but the reduces
> are a problem. they are too big and once scheduled last forever.
>
> another obvious thing is that reduce failures are expensive.
> all the map outputs need to be refetched and merged again.
> whereas, in many cases, the failure is in the reduction
> logic. tying two and two together:
>
> - what if current reduce tasks were broken into separate
> copy, sort and reduce tasks?
>
> we would get much smaller units of recovery and scheduling.
>
> thoughts?
> 
> Joydeep
> 

