2013-03-13

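Running Hive locally with LZO

I want to run Hive locally on OS X Mountain Lion, and I am trying to follow the instructions here to include the native library: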

https://github.com/twitter/hadoop-lzo

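I have compiled the native OS X libraries and the jar, but I am not sure how I should launch Hive locally so that Hive/Hadoop picks up the native libraries.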

I tried including it through the JAVA_LIBRARY_PATH environment variable, but I think that only applies to Hadoop.

export JAVA_LIBRARY_PATH="${SCRIPTS_DIR}/jars/native/Mac_OS_X-x86_64-64" 
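
Before launching Hive, it may help to confirm that the directory on the path actually contains the compiled library. The following is a minimal sketch, assuming the hadoop-lzo build produces files whose names start with `libgplcompression` (the exact extension on OS X is an assumption); `has_native_lzo` is a hypothetical helper, not part of Hive or hadoop-lzo.

```shell
# Hypothetical helper: succeed if DIR contains a file whose name starts
# with libgplcompression (assumed name of the native LZO library).
has_native_lzo() {
    dir="$1"
    # A missing directory cannot contain the library.
    [ -d "$dir" ] || return 1
    for f in "$dir"/libgplcompression*; do
        # If the glob matched nothing, $f is the literal pattern and -e fails.
        [ -e "$f" ] && return 0
    done
    return 1
}
```

For example, `has_native_lzo "$JAVA_LIBRARY_PATH" || echo "native LZO library not found"` would flag a wrong or empty directory. It only checks that the file exists, not that the JVM is able to load it.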

When I run Hive with the LzopCodec, for example:

SET mapred.output.compression.codec = com.hadoop.compression.lzo.LzopCodec; 

I get the following error when I run a query that launches a map/reduce job:

SELECT COUNT(*) from test_table; 


Job running in-process (local Hadoop) 
org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.RuntimeException: native-lzo library not available 
     at org.apache.hadoop.hive.ql.io.HiveFileFormatUtils.getHiveRecordWriter(HiveFileFormatUtils.java:237) 
     at org.apache.hadoop.hive.ql.exec.FileSinkOperator.createBucketFiles(FileSinkOperator.java:477) 
     at org.apache.hadoop.hive.ql.exec.FileSinkOperator.processOp(FileSinkOperator.java:525) 
     at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:471) 
     at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:762) 
     at org.apache.hadoop.hive.ql.exec.SelectOperator.processOp(SelectOperator.java:84) 
     at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:471) 
     at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:762) 
     at org.apache.hadoop.hive.ql.exec.GroupByOperator.forward(GroupByOperator.java:959) 
     at org.apache.hadoop.hive.ql.exec.GroupByOperator.closeOp(GroupByOperator.java:995) 
     at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:557) 
     at org.apache.hadoop.hive.ql.exec.ExecReducer.close(ExecReducer.java:303) 
     at org.apache.hadoop.mapred.ReduceTask.runOldReducer(ReduceTask.java:530) 
     at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:421) 
     at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:262) 
Caused by: java.lang.RuntimeException: native-lzo library not available 
     at com.hadoop.compression.lzo.LzoCodec.getCompressorType(LzoCodec.java:155) 
     at org.apache.hadoop.io.compress.CodecPool.getCompressor(CodecPool.java:100) 
     at com.hadoop.compression.lzo.LzopCodec.getCompressor(LzopCodec.java:135) 
     at com.hadoop.compression.lzo.LzopCodec.createOutputStream(LzopCodec.java:70) 
     at org.apache.hadoop.hive.ql.exec.Utilities.createCompressedStream(Utilities.java:868) 
     at org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat.getHiveRecordWriter(HiveIgnoreKeyTextOutputFormat.java:80) 
     at org.apache.hadoop.hive.ql.io.HiveFileFormatUtils.getRecordWriter(HiveFileFormatUtils.java:246) 
     at org.apache.hadoop.hive.ql.io.HiveFileFormatUtils.getHiveRecordWriter(HiveFileFormatUtils.java:234) 
     ... 14 more 

I also tried setting LD_LIBRARY_PATH through mapred.child.env in the Hive script (no luck):

SET mapred.child.env="LD_LIBRARY_PATH=../../scripts/jars/native/Mac_OS_X-x86_64-64"; 

Answer

Reading the instructions again cleared it up:

How do I configure Hadoop to use these classes?

# Copy the native library 
tar -cBf - -C build/hadoop-gpl-compression-0.1.0-dev/lib/native . | tar -xBvf - -C /path/to/hadoop/dist/lib/native 

Basically, I just needed to copy the built native libraries into my Hadoop installation:

ant compile-native tar 
cp -r build/hadoop-lzo-0.4.17-SNAPSHOT/lib/native/Mac_OS_X-x86_64-64 /usr/local/Cellar/hadoop/1.1.2/libexec/lib/native/
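
The copy step above can be wrapped with a quick check that the library really landed in place. This is only a sketch under stated assumptions: `install_native_lzo` is a hypothetical helper, and the `libgplcompression` file name prefix is assumed from the hadoop-lzo build output.

```shell
# Hypothetical helper: copy a built platform directory (for example
# Mac_OS_X-x86_64-64) into Hadoop's lib/native directory, then confirm
# that a libgplcompression* file (assumed library name) is present there.
install_native_lzo() {
    src="${1%/}"     # built platform directory, trailing slash stripped
    dest="${2%/}"    # Hadoop's lib/native directory
    [ -d "$src" ] || { echo "missing source: $src" >&2; return 1; }
    mkdir -p "$dest" || return 1
    # Copies the directory itself, so it ends up as $dest/<platform>.
    cp -R "$src" "$dest/" || return 1
    platform=$(basename "$src")
    # ls exits non-zero when the glob matches nothing.
    ls "$dest/$platform"/libgplcompression* >/dev/null 2>&1
}
```

With the paths from this answer, the call would be `install_native_lzo build/hadoop-lzo-0.4.17-SNAPSHOT/lib/native/Mac_OS_X-x86_64-64 /usr/local/Cellar/hadoop/1.1.2/libexec/lib/native`. A non-zero exit status means either the copy failed or no matching library file was found in the destination.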