My application has a very heavy computation requirement: the data being processed is not large, but the computation is complicated. What I would like to ask is whether a machine will have more processing capability if I run it as one reducer or as several reducers. Which one will be faster? Another question is how many resources one reducer will use. For instance, if a machine has four cores and 3GB of memory, can the reducer use all of these resources if I run only one reducer on it? Is there anywhere we can set how many resources one reducer can use?
If there is, what is the difference between having one reducer use all the resources and having several reducers share the resources of one node? Thanks!
Tuning the reduce phase on a cluster is not a trivial problem.
The common case is that the reduce phase is primarily disk I/O bound, so you run roughly one reduce per seek arm (that is, per physical disk) on a machine.
If disk I/O is not the bottleneck, you have a couple of choices: you can run more than one reduce per machine, varying the number of reduces until you hit maximum throughput, or you can modify your reduce to be multi-threaded. In the multi-threaded case, it becomes difficult to guarantee output ordering, so if the ordering of the output is important, multi-threaded reduces are not recommended.
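To make the multi-threaded option a bit more concrete, here is a rough sketch in plain Java. It does not use any Hadoop classes; the class ThreadedReduceSketch, the method reduceAll and the stand-in expensiveReduce are all made up for illustration, and the "groups" are just lists of integers standing in for the values of each key. It shows one possible way to keep the output in input order, by collecting the results in the order the work was submitted rather than the order it finished. That costs some buffering, which is part of why ordering gets awkward in a real reduce.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class ThreadedReduceSketch {

    // Hypothetical stand-in for the expensive per-key computation.
    static long expensiveReduce(List<Integer> values) {
        long sum = 0;
        for (int v : values) {
            sum += v;
        }
        return sum;
    }

    // Runs expensiveReduce on each group using a fixed pool of worker threads.
    // Results are returned in the same order as the input groups, because the
    // futures are read back in submission order, not completion order.
    static List<Long> reduceAll(List<List<Integer>> groups, int threads)
            throws InterruptedException, ExecutionException {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<Long>> pending = new ArrayList<>();
            for (List<Integer> group : groups) {
                Callable<Long> task = () -> expensiveReduce(group);
                pending.add(pool.submit(task));
            }
            List<Long> results = new ArrayList<>();
            for (Future<Long> f : pending) {
                results.add(f.get()); // blocks until this group's result is ready
            }
            return results;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        List<List<Integer>> groups = Arrays.asList(
                Arrays.asList(1, 2, 3),
                Arrays.asList(4, 5),
                Arrays.asList(6));
        System.out.println(reduceAll(groups, 4)); // prints [6, 9, 6]
    }
}
```

If each worker thread wrote its result out as soon as it finished instead, the output order would depend on thread scheduling, which is the ordering problem mentioned above.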