Installing snappy on CDH 4.5.0

Building from source: http://www.cnblogs.com/chengxin1982/p/3862289.html

Compression library benchmark: http://blog.jeoygin.org/2012/03/java-compression-library-test.html
1 snappy
References:
 http://sstudent.blog.51cto.com/7252708/1405485 (primary)
 http://wzxwzx2011.blog.51cto.com/2997448/1111619

Snappy library:
wget http://pkgs.fedoraproject.org/repo/pkgs/snappy/snappy-1.1.1.tar.gz/8887e3b7253b22a31f5486bca3cbc1c2/snappy-1.1.1.tar.gz
tar -zxvf snappy-1.1.1.tar.gz
cd snappy-1.1.1
./configure
make
sudo make install
Or, on CentOS: sudo yum install snappy snappy-devel
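After make install the library lands under /usr/local/lib by default (the prefix is an assumption; adjust if ./configure was given a different one). A quick check that the dynamic loader can see it:

```shell
ls -l /usr/local/lib/libsnappy*
sudo ldconfig               # refresh the loader cache
ldconfig -p | grep snappy   # should list libsnappy.so
```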


Install the hadoop-snappy package: https://github.com/electrum/hadoop-snappy
    sudo apt-get install automake libtool
    cd hadoop-snappy
    mvn package


Edit core-site.xml:
 <property>
    <name>io.compression.codecs</name>
        <value>
                org.apache.hadoop.io.compress.GzipCodec,
                org.apache.hadoop.io.compress.DefaultCodec,
                org.apache.hadoop.io.compress.BZip2Codec,
                org.apache.hadoop.io.compress.SnappyCodec
        </value>
</property>
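Later Hadoop 2.x releases ship a built-in check that the native codecs actually bind (the subcommand may be absent on CDH4 itself, so treat this as optional):

```shell
hadoop checknative -a
# if snappy loaded, the report contains a line roughly like:
#   snappy:  true /usr/local/lib/libsnappy.so.1
```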


Error 1: Cannot run program "autoreconf"
Fix: sudo apt-get install autoconf automake libtool (see http://www.cnblogs.com/shitouer/archive/2013/01/05/2845954.html)

Error 2
java.lang.UnsatisfiedLinkError: org.apache.hadoop.util.NativeCodeLoader.buildSupportsSnappy()Z
    at org.apache.hadoop.util.NativeCodeLoader.buildSupportsSnappy(Native Method)
    at org.apache.hadoop.io.compress.SnappyCodec.checkNativeCodeLoaded(SnappyCodec.java:62)
    at org.apache.hadoop.io.compress.SnappyCodec.createCompressor(SnappyCodec.java:138)
    at org.apache.hadoop.io.compress.SnappyCodec.createOutputStream(SnappyCodec.java:93)
    at org.apache.hadoop.mapreduce.lib.output.TextOutputFormat.getRecordWriter(TextOutputFormat.java:136)
    at org.apache.hadoop.mapred.ReduceTask$NewTrackingRecordWriter.<init>(ReduceTask.java:562)
    at org.apache.hadoop.mapred.ReduceTask.runNewReducer(ReduceTask.java:636)
    at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:404)
    at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:443)

Fix:
mkdir -p $HADOOP_HOME/lib/native/Linux-amd64-64

cp -r hadoop-snappy-0.0.1-SNAPSHOT/lib/native/Linux-amd64-64/* $HADOOP_HOME/lib/native/Linux-amd64-64
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:$HADOOP_HOME/lib/native/Linux-amd64-64/:/usr/local/lib/
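An export in the shell only lasts for the current session; to persist it across daemon restarts, the same line can go into hadoop-env.sh (the file's location varies with the CDH packaging, e.g. /etc/hadoop/conf/hadoop-env.sh — an assumption here):

```shell
# appended to hadoop-env.sh; path must match where the .so files were copied
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:$HADOOP_HOME/lib/native/Linux-amd64-64/:/usr/local/lib/
```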

Error 3: DefaultCodec/GZipCodec not found
Edit core-site.xml so the whole codec list sits on a single line, with no whitespace or line breaks inside <value>:
  <property>
        <name>io.compression.codecs</name>
        <value>org.apache.hadoop.io.compress.GzipCodec,org.apache.hadoop.io.compress.DefaultCodec,org.apache.hadoop.io.compress.BZip2Codec,org.apache.hadoop.io.compress.SnappyCodec</value>
  </property>



Configure HBase
mkdir -p $HBASE_HOME/lib/native/Linux-amd64-64/
cp -r $HADOOP_HOME/lib/native/Linux-amd64-64/* $HBASE_HOME/lib/native/Linux-amd64-64/
cp hadoop-snappy-0.0.1-SNAPSHOT.jar $HBASE_HOME/lib/
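HBase ships a small utility for verifying a codec end to end from its own classpath before any table relies on it; a table can then request Snappy per column family (the table and family names below are illustrative):

```shell
# write and re-read a test file through the snappy codec
hbase org.apache.hadoop.hbase.util.CompressionTest file:///tmp/snappy-check snappy

# then, in the hbase shell:
#   create 'testtable', { NAME => 'cf', COMPRESSION => 'SNAPPY' }
```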

Usage
conf.setBoolean("mapreduce.map.output.compress", true);
conf.setClass("mapreduce.map.output.compression.codec", SnappyCodec.class, CompressionCodec.class);
conf.setBoolean("mapreduce.output.fileoutputformat.compress", true);  // compress the job output
conf.setClass("mapreduce.output.fileoutputformat.compress.codec", SnappyCodec.class, CompressionCodec.class);
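These settings only take effect once the native codec loads, but the stream pattern SnappyCodec exposes (createOutputStream / createInputStream) is the same one java.util.zip offers, so the round trip can be sketched without a Hadoop cluster. Deflater stands in for Snappy here; the class and method names are illustrative, not part of Hadoop:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.DeflaterOutputStream;
import java.util.zip.InflaterInputStream;

// Illustrative stand-in for a Hadoop CompressionCodec round trip:
// write through a compressing stream, read back through a
// decompressing stream (Deflater here instead of Snappy).
public class CodecRoundTrip {

    // Mirrors codec.createOutputStream(rawOut): plain bytes in,
    // compressed bytes out.
    public static byte[] compress(byte[] raw) {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (DeflaterOutputStream out = new DeflaterOutputStream(buf)) {
            out.write(raw);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buf.toByteArray();
    }

    // Mirrors codec.createInputStream(compressedIn).
    public static byte[] decompress(byte[] packed) {
        try (InflaterInputStream in =
                 new InflaterInputStream(new ByteArrayInputStream(packed))) {
            return in.readAllBytes();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] data = "map output, compressed between map and reduce"
                .getBytes(StandardCharsets.UTF_8);
        byte[] restored = decompress(compress(data));
        System.out.println(new String(restored, StandardCharsets.UTF_8));
    }
}
```

The same write-then-read shape is what a MapReduce job exercises: the map side writes through the codec's output stream, the reduce side reads through its input stream.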



posted @ 2014-07-23 09:56  谭志宇