A Community for Hadoop Users
Hi,
I'm trying to move my legacy data processing code to Hadoop. My issue is that the legacy code relies on the local file system: it both reads and writes metadata. When the code accesses local data, it typically uses a relative path, like this: "meta-dir/group/my-meta.xml". Based on the O'Reilly book (Tom White), I'm thinking of using the distributed cache to copy the local files to the task nodes. For example, I could zip the entire metadata directory tree and use
-archives mymeta.zip
My question is: how do I make Hadoop keep the path info, so that when the legacy code accesses a local file such as
a/relative/path/to/my/file.xml
Hadoop can still find the file in (I assume) HDFS?
Many thanks in advance,
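
One possible direction, offered as a sketch rather than a confirmed answer: the `-archives` option accepts a URI fragment, as in `-archives mymeta.zip#meta-dir`. Hadoop unpacks the archive onto the task node's local disk (not HDFS) and uses the fragment as the name of a symlink in the task's working directory, so a relative path that starts with `meta-dir/` can resolve there. Depending on the Hadoop version, symlink creation may need to be enabled explicitly. The plain-JDK helper below only illustrates how the fragment and the legacy relative path have to line up. The class name `MetaArchiveLink`, the path `/user/me/mymeta.zip`, and the task directory `/tmp/task` are hypothetical, and it assumes the zip was built from inside `meta-dir` so that `group/my-meta.xml` sits at the archive root.

```java
import java.net.URI;
import java.nio.file.Path;
import java.nio.file.Paths;

public class MetaArchiveLink {

    // Build the cache URI "archivePath#linkName". The fragment is the symlink
    // name that the distributed cache is expected to create in the task's
    // working directory (assumption: symlinks are enabled for the job).
    static URI cacheUri(String archivePath, String linkName) {
        return URI.create(archivePath + "#" + linkName);
    }

    // True if the first element of the legacy relative path matches the
    // symlink name, i.e. the legacy code would enter the unpacked archive.
    static boolean resolvesThroughLink(URI cacheUri, String legacyRelativePath) {
        Path relative = Paths.get(legacyRelativePath);
        return relative.getNameCount() > 0
                && relative.getName(0).toString().equals(cacheUri.getFragment());
    }

    // Where the legacy relative path ends up, given the task working directory.
    static Path resolveInTaskDir(Path taskWorkDir, String legacyRelativePath) {
        return taskWorkDir.resolve(legacyRelativePath).normalize();
    }

    public static void main(String[] args) {
        // Hypothetical archive location and link name.
        URI uri = cacheUri("/user/me/mymeta.zip", "meta-dir");
        String legacyPath = "meta-dir/group/my-meta.xml";

        System.out.println("symlink name: " + uri.getFragment());
        System.out.println("goes through link: " + resolvesThroughLink(uri, legacyPath));
        System.out.println("resolved: " + resolveInTaskDir(Paths.get("/tmp/task"), legacyPath));
    }
}
```

One caveat with this approach: the unpacked archive is a local copy on each task node, so anything the legacy code writes there is most likely not propagated back to HDFS or to other nodes. Metadata that must survive the task would need to be copied out explicitly.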
Tags: cache, distributed
Reply by Jason Venner on March 7, 2010 at 3:37pm
Reply by Yigang Chen on March 8, 2010 at 8:26am
Reply by Jason Venner on March 9, 2010 at 7:35pm
© 2012 Created by Jason Venner.