Hadoop working directory

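I am trying to save a file from the main class of my Hadoop application so that the mappers can read it later. The file holds the encryption key that will be used to encrypt the data. My question is: if I write the file to the working directory, where does the data end up?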

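import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.security.NoSuchAlgorithmException;
import java.security.SecureRandom;
import java.util.logging.Level;
import java.util.logging.Logger;

import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;
import javax.crypto.spec.IvParameterSpec;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.util.GenericOptionsParser;

public class HadoopIndexProject {

    // Generates a random symmetric key of the given size (in bits).
    private static SecretKey generateKey(int size, String algorithm) throws NoSuchAlgorithmException {
        KeyGenerator keyGen = KeyGenerator.getInstance(algorithm);
        keyGen.init(size);
        return keyGen.generateKey();
    }

    // Generates a random 16-byte IV (one AES block).
    private static IvParameterSpec generateIV() {
        byte[] b = new byte[16];
        // SecureRandom rather than java.util.Random: an IV should not be predictable.
        new SecureRandom().nextBytes(b);
        return new IvParameterSpec(b);
    }

    // Writes the raw key bytes followed by the IV bytes to the given path.
    public static void saveKey(SecretKey key, IvParameterSpec iv, String path) throws IOException {
        // FileOutputStream writes to the local file system of the machine running this class.
        //FSDataOutputStream stream = fs.create(new Path(path));
        try (FileOutputStream stream = new FileOutputStream(path)) {
            stream.write(key.getEncoded());
            stream.write(iv.getIV());
        }
    }

    /**
    * @param args the command line arguments
    * @throws java.lang.Exception
    */
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        //FileSystem fs = FileSystem.getLocal(conf);
        String[] otherArgs = new GenericOptionsParser(conf, args).getRemainingArgs();
        if (otherArgs.length != 2) {
            System.err.println("Usage: Index <in> <out>");
            System.exit(2);
        }
        try {
            // Create the key file only once; later runs reuse the existing one.
            if (!new File("key.dat").exists()) {
                SecretKey key = generateKey(128, "AES");
                IvParameterSpec iv = generateIV();
                saveKey(key, iv, "key.dat");
            }
        } catch (NoSuchAlgorithmException ex) {
            Logger.getLogger(HadoopIndexProject.class.getName()).log(Level.SEVERE, null, ex);
        }
        conf.set("mapred.textoutputformat.separator", ":");

        Job job = Job.getInstance(conf);
        job.setJobName("Index creator");
        job.setJarByClass(HadoopIndexProject.class);
        job.setMapperClass(HadoopIndexMapper.class);
        job.setReducerClass(HadoopIndexReducer.class);

        job.setMapOutputKeyClass(Text.class);
        job.setMapOutputValueClass(IntWritable.class);

        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntArrayWritable.class);

        FileInputFormat.addInputPath(job, new Path(otherArgs[0]));
        FileOutputFormat.setOutputPath(job, new Path(otherArgs[1]));

        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }

}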

2017-04-16 Marty Aman

HDFS has no concept of a working directory. All relative paths are resolved against /user/<username>, so your file will be at /user/<username>/key.dat.
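A quick way to check where a relative path lands is to ask the FileSystem object itself. The sketch below assumes the Hadoop client libraries are on the classpath and that fs.defaultFS points at the cluster; the class name ShowWorkingDirectory is made up for illustration. This only applies to files written through the Hadoop FileSystem API: a plain java.io.FileOutputStream writes to the local disk of the machine running the driver.

```java
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class ShowWorkingDirectory {
    public static void main(String[] args) throws IOException {
        Configuration conf = new Configuration();
        // The default file system named by fs.defaultFS (HDFS on a cluster).
        FileSystem fs = FileSystem.get(conf);

        // The directory that relative paths are resolved against.
        System.out.println("working directory: " + fs.getWorkingDirectory());

        // Fully qualify a relative path to see where the file would be created.
        Path key = new Path("key.dat");
        System.out.println("key.dat resolves to: " + fs.makeQualified(key));
    }
}
```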

In YARN, however, there is the concept of the distributed cache: additional files for your YARN application can be added with job.addCacheFile.
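As a rough sketch of how that could look for the key file: the driver writes key.dat to HDFS and registers it with job.addCacheFile, and each task then reads its local copy. The names CacheFileSketch, shipKey and KeyAwareMapper are hypothetical, the mapper's input types (LongWritable, Text) are an assumption about the job, and the code assumes the job runs on a cluster where cache files are linked into the task's working directory (the local job runner may behave differently).

```java
import java.io.IOException;
import java.net.URI;
import java.nio.file.Files;
import java.nio.file.Paths;

import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;

public class CacheFileSketch {

    /** Driver side: store the key file on HDFS and register it with the job. */
    static void shipKey(Job job, byte[] keyFileBytes) throws IOException {
        FileSystem fs = FileSystem.get(job.getConfiguration());
        // Relative path, so it is resolved against the user's HDFS home directory.
        Path keyPath = fs.makeQualified(new Path("key.dat"));
        try (FSDataOutputStream out = fs.create(keyPath, true)) {
            out.write(keyFileBytes);
        }
        // The "#key.dat" fragment names the link created in each task's working directory.
        job.addCacheFile(URI.create(keyPath.toUri() + "#key.dat"));
    }

    /** Task side: read the localized copy once, before any map() call. */
    public static class KeyAwareMapper extends Mapper<LongWritable, Text, Text, IntWritable> {
        private byte[] keyFileBytes;

        @Override
        protected void setup(Context context) throws IOException {
            // Plain local I/O: the framework has already copied the file to this node.
            keyFileBytes = Files.readAllBytes(Paths.get("key.dat"));
        }
    }
}
```

shipKey would have to be called in the driver before job.waitForCompletion, since the cache entries are read when the job is submitted.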

2017-04-17 13:29:03 fi11er

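I found my mistake: I was not using the FileSystem instance I had created. Now the directory you mentioned gets created and I can find my key there. The problem now is how to get my mapper and reducer to read that key. Also, could you explain the concept of the distributed cache?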
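Whichever way the file reaches the tasks, its bytes have to be split back into a key and an IV. The following is a minimal sketch that assumes the layout written by saveKey in the question (raw AES key bytes followed by a 16-byte IV); the class name KeyFileDemo and the method parse are made up for illustration.

```java
import javax.crypto.SecretKey;
import javax.crypto.spec.IvParameterSpec;
import javax.crypto.spec.SecretKeySpec;

/** Parses a key file laid out as: raw AES key bytes, then a 16-byte IV. */
final class KeyFileDemo {
    static final int IV_BYTES = 16; // generateIV() in the question always writes 16 bytes

    final SecretKey key;
    final IvParameterSpec iv;

    private KeyFileDemo(SecretKey key, IvParameterSpec iv) {
        this.key = key;
        this.iv = iv;
    }

    /** Splits the file contents; the key is everything before the last 16 bytes. */
    static KeyFileDemo parse(byte[] contents) {
        int keyBytes = contents.length - IV_BYTES;
        // AES keys are 16, 24 or 32 bytes long; anything else means a damaged file.
        if (keyBytes != 16 && keyBytes != 24 && keyBytes != 32) {
            throw new IllegalArgumentException("unexpected key file length: " + contents.length);
        }
        SecretKey key = new SecretKeySpec(contents, 0, keyBytes, "AES");
        IvParameterSpec iv = new IvParameterSpec(contents, keyBytes, IV_BYTES);
        return new KeyFileDemo(key, iv);
    }

    public static void main(String[] args) {
        // Stand-in for the bytes of key.dat: 16 key bytes followed by 16 IV bytes.
        byte[] contents = new byte[32];
        for (int i = 0; i < contents.length; i++) {
            contents[i] = (byte) i;
        }
        KeyFileDemo parsed = parse(contents);
        System.out.println("key bytes: " + parsed.key.getEncoded().length);
        System.out.println("iv bytes:  " + parsed.iv.getIV().length);
    }
}
```

In a mapper or reducer, the byte array passed to parse would come from reading the key file in setup, so the key is loaded once per task rather than once per record.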
