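MapReduce unit test fails to mock DistributedCache.getLocalCacheFiles
With Apache MRUnit I can unit test my MapReduce program locally before running it on a cluster.
My program needs to read from the DistributedCache, so I wrapped DistributedCache.getLocalCacheFiles in a class that I mock in my unit test. I set up a stub so that the real method would not be called and a local path would be returned instead. But it turns out that the real method is called and FileNotFoundException is thrown.
Here is what my MapReduce program looks like: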
public class TopicByTime implements Tool {
    private static Map<String, String> topicList = null;

    public static void main(String[] args) throws Exception {
        System.exit(ToolRunner.run(new TopicByTime(), args));
    }

    @Override
    public int run(String[] args) throws Exception {
        Job job = new Job();
        /* Job setup */
        DistributedCache.addCacheFile(new URI(/* path on hdfs */), conf);
        job.waitForCompletion(true);
        return 0;
    }

    protected static class TimeMapper extends Mapper<LongWritable, Text, Text, Text> {
        @Override
        public void setup(Context context) throws IOException, InterruptedException {
            DistributedCacheClass cache = new DistributedCacheClass();
            Path[] localPaths = cache.getLocalCacheFiles(context.getConfiguration());
            if (null == localPaths || 0 == localPaths.length) {
                throw new FileNotFoundException("Distributed cached file not found");
            }
            topicList = Utils.loadTopics(localPaths[0].toString());
        }

        @Override
        public void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            /* do map */
        }
    }

    /* Reducer and overriding methods */
}
And my test program:
public class TestTopicByTime {
    @Before
    public void setUp() throws IOException {
        Path[] localPaths = { new Path("resource/test/topic_by_time.txt") };
        Configuration conf = Mockito.mock(Configuration.class);
        DistributedCacheClass cache = Mockito.mock(DistributedCacheClass.class);
        when(cache.getLocalCacheFiles(conf)).thenReturn(localPaths);
    }

    @Test
    public void testMapper() {
    }

    @Test
    public void testReducer() {
    }

    @Test
    public void testMapReduce() {
    }
}
DistributedCacheClass is a simple wrapper:
public class DistributedCacheClass {
    public Path[] getLocalCacheFiles(Configuration conf) throws IOException {
        return DistributedCache.getLocalCacheFiles(conf);
    }
}
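I could add a flag to the mapper's setup method so that a local path is read when testing, but I really want to keep the test code separate from my MapReduce program.
I'm new to mock testing and MRUnit, so there may be newbie bugs in my program. Please point out the errors; I'll fix them and post my updates below.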
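One likely reason the stub never takes effect: the mock built in setUp() is a different object from the `new DistributedCacheClass()` created inside TimeMapper.setup, and the mocked Configuration is not the one the mapper receives from its context. A stub only applies to the instance it was set up on, so the collaborator has to be handed to the mapper. The following is a minimal sketch of that idea in plain Java, with no Hadoop or Mockito on the classpath. CacheFileLocator, RealCacheFileLocator, TimeMapperSketch and InjectionSketch are hypothetical stand-ins (not part of the original program), paths are plain strings rather than Path objects, and the hand-written lambda stub plays the role that a Mockito mock would play in the real test.

```java
import java.io.FileNotFoundException;
import java.io.IOException;

// Hypothetical stand-in for DistributedCacheClass: looks up local cache paths.
interface CacheFileLocator {
    String[] getLocalCacheFiles() throws IOException;
}

// Stand-in for the production wrapper. In this sketch it finds nothing,
// which mimics running outside a cluster (an assumption for illustration).
class RealCacheFileLocator implements CacheFileLocator {
    @Override
    public String[] getLocalCacheFiles() {
        return new String[0];
    }
}

// Stand-in for TimeMapper: the locator is a field instead of a local
// variable created with "new" inside setup().
class TimeMapperSketch {
    private final CacheFileLocator cache;
    private String topicFile;

    // No-arg constructor for production use; it falls back to the real locator.
    TimeMapperSketch() {
        this(new RealCacheFileLocator());
    }

    // Tests pass a stub or mock through this constructor.
    TimeMapperSketch(CacheFileLocator cache) {
        this.cache = cache;
    }

    void setup() throws IOException {
        String[] localPaths = cache.getLocalCacheFiles();
        if (localPaths == null || localPaths.length == 0) {
            throw new FileNotFoundException("Distributed cached file not found");
        }
        topicFile = localPaths[0];
    }

    String getTopicFile() {
        return topicFile;
    }
}

class InjectionSketch {
    public static void main(String[] args) throws IOException {
        // The stub is the very instance the mapper uses, so setup() sees its paths.
        CacheFileLocator stub = () -> new String[] {"resource/test/topic_by_time.txt"};
        TimeMapperSketch mapper = new TimeMapperSketch(stub);
        mapper.setup();
        System.out.println(mapper.getTopicFile());
    }
}
```

The production path keeps a no-arg constructor and carries no test flag; only the test chooses a different collaborator. Under the same assumption, the real test would construct the mapper with the mocked DistributedCacheClass and stub it for whatever Configuration the mapper actually receives.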
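It throws the same 'FileNotFoundException' – manuzhang 2013-03-28 05:06:40
Please see my edit – Gopi 2013-03-28 05:12:06
That would mix test code into my mapreduce program. Then why use mocking in the first place? I could pass a flag through the constructor and read from the local path in 'setup' if the flag is true. – manuzhang 2013-03-28 16:01:21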