A Community for Hadoop Users
Tags: hadoop
A reducer cannot start until all of the data that will be its input is fully ordered.
The ordering cannot complete until all of the map tasks have finished, because any map may produce data that will go to any reducer (reduce task).
The reduce task often starts at job start, but the first call to the user's reduce method will only happen after all of the map tasks have completed.
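To make the dependency concrete, here is a minimal plain-Java sketch of the idea, not Hadoop's actual implementation. The class ShuffleSketch and its methods kv, partition, shuffle, reduce and run are hypothetical names made up for illustration, and a TreeMap stands in for the sort and merge that a real reduce task performs. The point it shows is that every map output must be consumed before the first reduce call, since the last map may still emit a key that sorts ahead of everything seen so far.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration only: none of these names come from the Hadoop API.
class ShuffleSketch {

    // Convenience constructor for one (key, value) pair of map output.
    static Map.Entry<String, Integer> kv(String key, int value) {
        return new AbstractMap.SimpleEntry<>(key, value);
    }

    // Picks the reducer for a key by hashing, so any map can feed any reducer.
    static int partition(String key, int numReducers) {
        return (key.hashCode() & Integer.MAX_VALUE) % numReducers;
    }

    // Consumes the output of EVERY map task and builds one sorted, grouped
    // input per reducer. Nothing is handed to reduce() until this returns.
    static List<TreeMap<String, List<Integer>>> shuffle(
            List<List<Map.Entry<String, Integer>>> mapOutputs, int numReducers) {
        List<TreeMap<String, List<Integer>>> reducerInputs = new ArrayList<>();
        for (int r = 0; r < numReducers; r++) {
            reducerInputs.add(new TreeMap<>());
        }
        for (List<Map.Entry<String, Integer>> oneMapOutput : mapOutputs) {
            for (Map.Entry<String, Integer> pair : oneMapOutput) {
                int r = partition(pair.getKey(), numReducers);
                reducerInputs.get(r)
                        .computeIfAbsent(pair.getKey(), k -> new ArrayList<>())
                        .add(pair.getValue());
            }
        }
        return reducerInputs;
    }

    // Stand-in for the user's reduce method: sums the values of one key.
    static int reduce(String key, List<Integer> values) {
        int sum = 0;
        for (int v : values) {
            sum += v;
        }
        return sum;
    }

    // Runs the shuffle to completion, then calls reduce once per key in key order.
    static Map<String, Integer> run(
            List<List<Map.Entry<String, Integer>>> mapOutputs, int numReducers) {
        Map<String, Integer> result = new TreeMap<>();
        for (TreeMap<String, List<Integer>> input : shuffle(mapOutputs, numReducers)) {
            for (Map.Entry<String, List<Integer>> group : input.entrySet()) {
                result.put(group.getKey(), reduce(group.getKey(), group.getValue()));
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Two map tasks; the second one emits "apple", which sorts before
        // everything the first map produced.
        List<List<Map.Entry<String, Integer>>> mapOutputs = Arrays.asList(
                Arrays.asList(kv("pear", 1), kv("fig", 1)),
                Arrays.asList(kv("apple", 1), kv("pear", 1)));
        System.out.println(run(mapOutputs, 2));
    }
}
```

If reduce were called for "fig" as soon as the first map finished, the "apple" record from the second map would arrive too late to be processed in order, which is why the first reduce call has to wait for the last map.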
For all intents and purposes, your reduce does not start until the reduce percentage reaches roughly 66%;
the parts that run before that (copying and sorting the map output) prepare the data for your reduce tasks.
The job output presents this information in a confusing way.
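One way to read the reduce percentage is sketched below. It assumes the stock progress accounting, in which the copy, sort and reduce phases each contribute one third of the reported figure, so user reduce code only runs in the last third. The class ReduceProgress and the method phaseFor are hypothetical names for illustration and are not part of Hadoop.

```java
// Hypothetical helper: decodes a reported reduce percentage into a phase name,
// assuming copy, sort and reduce each account for one third of the progress.
class ReduceProgress {

    static String phaseFor(double percent) {
        if (percent < 0.0 || percent > 100.0) {
            throw new IllegalArgumentException("percent must be between 0 and 100");
        }
        if (percent < 100.0 / 3.0) {
            return "copy";   // fetching map output; maps may still be running
        }
        if (percent < 200.0 / 3.0) {
            return "sort";   // merging the fetched map output into key order
        }
        return "reduce";     // the user's reduce method is being called
    }

    public static void main(String[] args) {
        for (double p : new double[] {10.0, 50.0, 70.0}) {
            System.out.println(p + "% is in phase: " + phaseFor(p));
        }
    }
}
```

Under that assumption, a job showing reduce progress while maps are still running is only copying data, not yet executing your reduce method.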
© 2012 Created by Jason Venner.