Hadoop Professionals

A Community for Hadoop Users

There are two requirements I want to implement on top of Hadoop, but as far as I can tell Hadoop does not support them at present. I would appreciate any suggestions on how to implement them.

First, can a reducer fetch more than one partition of the map output? At present, reducer 1 fetches partition 1 from every mapper. How could I make reducer 1 fetch both partition 1 and partition 2? If this is possible, how would I implement it?

Second, in the map phase each record can be written to only one partition file, as decided by the partitioner function. For example, with M reducers the map phase produces M partition files, and a record is output to exactly one of them. Is there any way to write a single record to several partition files?

Looking forward to your ideas on this.

Replies to This Discussion

One thing you can do is open additional files in your mapper, beyond its input, and write output to them wherever you like.

As an alternative, you could write all of your map output through a MultipleFileOutput format in the map task and pass only the filenames to the reduce tasks; the reduce tasks would then open and process the files written by the maps.

This is not straightforward to implement, and there are probably simpler ways.

To write into multiple partitions, take a look at the partitioner in Pig's skewed join implementation; I believe it does something quite similar. However, from 0.20 onwards the reducers will have to be set, so it might break your implementation.
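The general idea of getting one record into several partitions can be sketched without any Pig code: the map side emits one copy of the record per target partition, with the target carried in the key, and the partitioner simply reads the target back out. The sketch below is plain Java with no Hadoop dependency, so it runs on its own. The class name `ReplicatingEmit`, the helpers, and the "home partition plus the next one" routing rule are all hypothetical; in a real job the last method would correspond to overriding `Partitioner.getPartition(key, value, numPartitions)`. This is an illustration under those assumptions, not how Pig actually does it.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

final class ReplicatingEmit {

    // Hypothetical routing rule: send the record to its hash-based "home"
    // partition and also to the next partition. Replace with your own logic.
    static List<Integer> targetPartitions(String record, int numReducers) {
        int home = (record.hashCode() & Integer.MAX_VALUE) % numReducers;
        int next = (home + 1) % numReducers;
        List<Integer> targets = new ArrayList<Integer>();
        targets.add(home);
        if (next != home) {
            targets.add(next);
        }
        return targets;
    }

    // Stands in for map(): emit one (targetPartition, record) pair per target,
    // so the same record is written once for every partition that needs it.
    static List<Map.Entry<Integer, String>> emit(String record, int numReducers) {
        List<Map.Entry<Integer, String>> out = new ArrayList<Map.Entry<Integer, String>>();
        for (int p : targetPartitions(record, numReducers)) {
            out.add(new SimpleEntry<Integer, String>(p, record));
        }
        return out;
    }

    // Stands in for a custom partitioner: the target partition is already
    // encoded in the key, so it is returned as-is (modulo the reducer count).
    static int getPartition(int targetInKey, int numReducers) {
        return targetInKey % numReducers;
    }

    public static void main(String[] args) {
        int numReducers = 4;
        for (Map.Entry<Integer, String> e : emit("record-a", numReducers)) {
            System.out.println("partition " + getPartition(e.getKey(), numReducers)
                    + " <- " + e.getValue());
        }
    }
}
```

The cost of this approach is that every replicated record is shuffled once per target, so map output grows with the replication factor.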

Coming back to fetching more than one partition: can you explain the use case? People here might be able to help come up with an alternative. The MapReduce paradigm itself is strictly bound to the idea of one reducer per partition, so this might not be the best possible approach.
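In the meantime, if the goal is only that one reducer ends up with the data of partitions 1 and 2, one alternative that stays within the one-reducer-per-partition model is to do the merging in the partitioner: compute the fine-grained (logical) partition as before, then map several logical partitions onto the same reducer. The sketch below is plain Java with no Hadoop dependency. The class name `GroupingPartitioner` and the mapping table are hypothetical, and the job would need its number of reduce tasks to match the table (three here). Treat it as a rough illustration, not a tested Hadoop partitioner.

```java
final class GroupingPartitioner {

    // Hypothetical mapping: index is the logical partition, value is the
    // reducer that receives it. Logical partitions 1 and 2 share reducer 1.
    static final int[] PARTITION_TO_REDUCER = {0, 1, 1, 2};

    // Fine-grained partition, computed the way a hash partitioner would.
    static int logicalPartition(String key) {
        return (key.hashCode() & Integer.MAX_VALUE) % PARTITION_TO_REDUCER.length;
    }

    // Collapse a logical partition onto the reducer that should process it.
    static int reducerForPartition(int logicalPartition) {
        return PARTITION_TO_REDUCER[logicalPartition];
    }

    // What a custom partitioner would return for a given key.
    static int getPartition(String key) {
        return reducerForPartition(logicalPartition(key));
    }

    public static void main(String[] args) {
        for (int p = 0; p < PARTITION_TO_REDUCER.length; p++) {
            System.out.println("logical partition " + p
                    + " -> reducer " + reducerForPartition(p));
        }
    }
}
```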


© 2010 Created by Jason Venner.
