2014-09-22 62 views
2

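Hadoop HPROF profiling not writing CPU samples

I want to profile my Hadoop job using HPROF. The problem is that I get TRACES but no CPU SAMPLES in the profile.out file. The code I am using in my run method is: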

/** Get configuration */ 
    Configuration conf = getConf(); 
    conf.set("textinputformat.record.delimiter","\n\n"); 
    conf.setStrings("args", args); 

    /** JVM PROFILING */ 
    conf.setBoolean("mapreduce.task.profile", true); 
    conf.set("mapreduce.task.profile.params", "-agentlib:hprof=cpu=samples," + 
     "heap=sites,depth=6,force=n,thread=y,verbose=n,file=%s"); 
    conf.set("mapreduce.task.profile.maps", "0-2"); 
    conf.set("mapreduce.task.profile.reduces", ""); 

    /** Job configuration */ 
    Job job = new Job(conf, "HadoopSearch"); 
    job.setJarByClass(Search.class); 
    job.setOutputKeyClass(Text.class); 
    job.setOutputValueClass(NullWritable.class); 

    /** Set Mapper and Reducer, use identity reducer*/ 
    job.setMapperClass(Map.class); 
    job.setReducerClass(Reducer.class); 

    /** Set input and output formats */ 
    job.setInputFormatClass(TextInputFormat.class); 
    job.setOutputFormatClass(TextOutputFormat.class); 

    /** Set input and output path */ 
    FileInputFormat.addInputPath(job, new Path("/user/niko/16M")); 
    FileOutputFormat.setOutputPath(job, new Path(cmd.getOptionValue("output"))); 

    job.waitForCompletion(true); 

    return 0; 

How do I get the CPU SAMPLES written to the output?

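I also get a strange error message on stderr, but I think it is unrelated, because it also appears when profiling is set to false or when the code that enables profiling is commented out. The error is: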

log4j:WARN No appenders could be found for logger (org.apache.hadoop.metrics2.impl.MetricsSystemImpl). 
log4j:WARN Please initialize the log4j system properly. 
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info. 

Answer

2

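YARN (or MRv1) kills the container right after your job finishes, so HPROF never gets the chance to write the CPU samples to your profiling file; by default it dumps them only when the JVM exits. In fact, your traces should be truncated as well.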
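If you want to confirm that a profile was cut short, something like the following might help. It is only a rough sketch: the class and method names are made up for illustration, and it assumes the usual HPROF text layout, where stack traces start with `TRACE ` and the CPU section is wrapped in `CPU SAMPLES BEGIN` / `CPU SAMPLES END` markers.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;

/** Hypothetical helper (not part of Hadoop): inspects an HPROF text dump. */
class HprofProfileCheck {

    /** True if a CPU SAMPLES section was both opened and closed in the dump. */
    static boolean hasCompleteCpuSamples(String profileText) {
        int begin = profileText.indexOf("CPU SAMPLES BEGIN");
        return begin >= 0 && profileText.indexOf("CPU SAMPLES END", begin) > begin;
    }

    /** Counts stack trace headers, i.e. lines that start with "TRACE ". */
    static int countTraces(String profileText) {
        int count = 0;
        for (String line : profileText.split("\n")) {
            if (line.startsWith("TRACE ")) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java HprofProfileCheck <profile.out>");
            return;
        }
        // HPROF text output is plain ASCII, so a single-byte charset is safe here.
        String text = new String(Files.readAllBytes(Paths.get(args[0])),
                StandardCharsets.ISO_8859_1);
        System.out.println("traces found: " + countTraces(text));
        System.out.println("complete CPU SAMPLES section: " + hasCompleteCpuSamples(text));
    }
}
```

Run it against the profile.out of a profiled task attempt. If it reports traces but no complete CPU SAMPLES section, the JVM was most likely killed before HPROF finished its dump.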

You have to add the following options (or the equivalents for your Hadoop version):

yarn.nodemanager.sleep-delay-before-sigkill.ms = 30000 
# No. of ms to wait between sending a SIGTERM and SIGKILL to a container 

yarn.nodemanager.process-kill-wait.ms = 30000 
# Max time to wait for a process to come up when trying to cleanup a container 

mapreduce.tasktracker.tasks.sleeptimebeforesigkill = 30000 
# Same in MRv1?

(30 seconds seems to be enough)

+0

Works like a charm. – 2016-02-03 16:14:53