ChavinKing - cnblogs (博客园)

July 2017

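Abstract: I. Optimization approaches for databases with millions of rows. 1. When optimizing queries, avoid full table scans wherever possible; first consider creating indexes on the columns used in where and order by. 2. Avoid testing a column for null in the where clause wherever possible, otherwise the engine may abandon the index and perform a full table scan, e.g.: select id from t where num is null. It is best not to leave NULLs in the database; populate columns as NOT NULL wherever possible. Remarks, descriptions, comments, and the like...

posted @ 2017-07-19 11:20 ChavinKing Views (13438) Comments (0) Recommendations (0)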

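Archiving Hive partitions

Abstract: Archiving historical Hive partitions does not reduce HDFS storage usage, but it can effectively relieve pressure on the Hadoop NameNode, especially when there are many small files. $ mkdir $HIVE_HOME/auxlib $ cp /opt/cdh-5.3.6/hadoop-2.5.0/share/hadoop/tools/lib/had...

posted @ 2017-07-16 20:42 ChavinKing Views (1475) Comments (0) Recommendations (0)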

Hive variables and properties

Abstract: First, look at the options the hive CLI tool provides for defining variables: $ bin/hive -h usage: hive -d,--define <key=value> Variable subsitution to apply to hive commands. e.g. -d A=B or --defi...

posted @ 2017-07-12 17:56 ChavinKing Views (2189) Comments (0) Recommendations (0)

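A complete guide to Oracle analytic functions

Abstract: Analytic functions are also called window functions, OLAP functions, and so on. Someone asked me whether I had ever used window functions. I had never heard the term and wondered whether it meant analytic functions, and it turned out that it did. You should know the other names of the tools you have used, so, prompted by this, I took a look at the official Oracle documentation, borrowed examples from other users, and put together the following document for reference. I. List of analytic functions. SUM: this function...

posted @ 2017-07-11 16:30 ChavinKing Views (1974) Comments (0) Recommendations (0)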

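Docker Containers and Container Cloud: a single-host Docker cluster deployment example

Abstract: Preparation: install docker on CentOS 7: #yum -y install docker 1. Pull the images the nodes need --run on the host #docker pull django #docker pull haproxy #docker pull redis # docker images REPOSITOR...

posted @ 2017-07-06 22:26 ChavinKing Views (2886) Comments (0) Recommendations (1)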

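Several Hive parameters: metastore configuration, warehouse location, and printing table column names

Abstract: The Hive warehouse location is determined by the following parameter; the default location is /user/hive/warehouse: <property> <name>hive.metastore.warehouse.dir</name> <value>/user/hive/warehouse</value> </property> In Hive, the metadata...

posted @ 2017-07-04 18:13 ChavinKing Views (520) Comments (0) Recommendations (0)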

How to enable Hive HWI

Abstract: Starting HWI in Hive: ./hive --service hwi ls: cannot access /opt/cdh-5.3.6/hive-0.13.1/lib/hive-hwi-*.war: No such file or directory 17/05/12 09:29:47 INFO hwi. ...

posted @ 2017-07-04 17:49 ChavinKing Views (867) Comments (0) Recommendations (0)

Word frequency counting with Hive

Abstract: Contents of the file to be counted: $ /opt/cdh-5.3.6/hadoop-2.5.0/bin/hdfs dfs -text /user/hadoop/wordcount/input/wc.input hadoop spark spark hadoop oracle mysql postgresql postg...

posted @ 2017-07-04 12:39 ChavinKing Views (3642) Comments (0) Recommendations (0)

June 2017

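Installing Docker on CentOS

Abstract: I. Installing Docker on CentOS 6. Docker supports only 64-bit systems. On CentOS 6, Docker can be installed from the EPEL repository with the following command: #yum -y install http://mirrors.yun-idc.com/epel/6/i386/epel-release-6-8.noarc...

posted @ 2017-06-30 17:18 ChavinKing Views (885) Comments (0) Recommendations (0)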

Hive Logging -- translation of hive GettingStarted

Abstract: Hive uses log4j for logging. By default logs are not emitted to the console by the CLI. The default logging level is WARN for Hive releases prior to 0...

posted @ 2017-06-29 16:11 ChavinKing Views (7395) Comments (0) Recommendations (0)

[RMAN] RMAN-05001: auxiliary filename conflicts with the target database

Abstract: On Oracle 11.2.0.4, running the following script to create a Data Guard standby database via active database duplication fails with RMAN-05001: run{ duplicate target database for standby from active database spfile set db_unique_nam...

posted @ 2017-06-28 18:24 ChavinKing Views (1202) Comments (0) Recommendations (0)

Simple Example Use Cases -- translation of the hive GettingStarted examples

Abstract: 1. MovieLens User Ratings. First, create a table with tab-delimited text file format: CREATE TABLE u_data ( userid INT, movieid INT, r...

posted @ 2017-06-26 23:13 ChavinKing Views (320) Comments (0) Recommendations (0)

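Analyzing nginx logs with Hive: cleaning data with a UDF

Abstract: Analyzing nginx logs with Hive, part 1: http://www.cnblogs.com/wcwen1990/p/7066230.html Analyzing nginx logs with Hive, part 2: http://www.cnblogs.com/wcwen1990/p/7074298.html Continuing from there: 1. First, write the UDF, as follows: --using...

posted @ 2017-06-26 14:09 ChavinKing Views (1970) Comments (0) Recommendations (0)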

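Fixing an MR initialization error when creating a sub-table and inserting data in Hive

Abstract: This post continues the previous one on analyzing nginx logs with Hive; for details, see the link below: http://www.cnblogs.com/wcwen1990/p/7066230.html Continuing from there, create the business sub-table: drop table if exists chavin.nginx_access_log_comm; cre...

posted @ 2017-06-24 19:54 ChavinKing Views (3515) Comments (0) Recommendations (0)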

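How to analyze nginx access logs with Hive

Abstract: The following example uses Hive to analyze nginx access logs, with fields split by regular-expression matching. The steps are as follows. Log format: 192.168.5.139 - - [08/Jun/2017:17:09:12 +0800] "GET //oportal/static/ui/layer/skin/default/ico...

posted @ 2017-06-22 17:32 ChavinKing Views (1377) Comments (0) Recommendations (0)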
