Hadoop Professionals

A Community for Hadoop Users

Jason Venner

Ensuring that your reduces are distributed across all of your machines

A user asked me a question today: he has a cluster with 16 reduce slots spread over a number of machines, and when he runs a job with 12 reduces, several reduces end up on the same machine while other machines sit idle.

At present, the only workaround I am aware of is to set the cluster-level parameter mapred.tasktracker.reduce.tasks.maximum to 1 and restart the cluster. With that setting each TaskTracker offers a single reduce slot, so no machine can run two reduces at the same time.
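As a rough sketch, the setting could look like the fragment below. The file name is an assumption: on releases that split the configuration it would normally go in conf/mapred-site.xml on every TaskTracker node, while older releases keep it in hadoop-site.xml. Only the property name and the value of 1 come from the workaround described above.

```xml
<!-- Assumed location: conf/mapred-site.xml on each TaskTracker node
     (hadoop-site.xml on older releases). -->
<configuration>
  <property>
    <!-- Maximum number of reduce tasks one TaskTracker runs simultaneously. -->
    <name>mapred.tasktracker.reduce.tasks.maximum</name>
    <value>1</value>
  </property>
</configuration>
```

The TaskTrackers read this value at startup, which is why the cluster has to be restarted for it to take effect. It also caps the number of reduces running at once at the number of TaskTracker machines, so the cluster's total reduce capacity may drop.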


© 2012   Created by Jason Venner.
