Hive数据分析

描述:
样表(sales_sample_20170310)字段说明:
day_id 日期编号;
sale_nbr 卖出方代码;
buy_nbr 买入方代码;
cnt 数量
round 金额
卖出方和买入方又分为 3 种类型:
以'C'开头的表示类型为 C,代表“航空公司”,只可以卖出,不可以买入;
以'O'开头的表示类型为 O,代表“代理人”,既可以卖出,也可以买入,并且允许自
己卖给自己(简单来讲:每个“代理人”代码可能对应多个售票点,售票点之前有交换票的
情况,所以体现为自己卖给了自己);
'PAX'表示类型为 PAX,代表“旅客”,只可以买入,不可以卖出。
举例
day_id,sale_nbr,buy_nbr,cnt,round
1,C1,O1001,1,360
卖出方为 C1,类型为 C;买入方为 O1001,类型为 O
day_id,sale_nbr,buy_nbr,cnt,round
1,O100,O100,4,2000
卖出方为 O100,类型为 O;买入方为 O100,类型为 O(即自己卖给自己是允许的)
day_id,sale_nbr,buy_nbr,cnt,round
1,O100,PAX,4,2000
卖出方为 O100,类型为 O;买入方为 PAX,类型为 PAX
问题:
1、数据导入:
要求将样表文件中的(sales_sample_20170310)数据导入 HIVE 数据仓库中。
2、数据清洗:
要求将 day_id 一列中的数值清洗为真实的日期格式,可用字符串表示。
数据 1 对应日期 2021-09-01,依次类推,15 对应日期 2021-09-15
3、数据分析处理:
(1)统计每天各个机场的销售数量和销售金额。
要求的输出字段
day_id,sale_nbr,,cnt,round
日期编号,卖出方代码,数量,金额
 
(2)统计每天各个代理商的销售数量和销售金额。
要求的输出字段
day_id,sale_nbr,,cnt,round
日期编号,卖出方代码,数量,金额
(3)统计每天各个代理商的销售活跃度。
要求的输出字段
day_id,sale_nbr, sale_number
日期编号,卖出方代码,交易次数(买入或者卖出均算交易次数)
(4)汇总统计 9 月 1 日到 9 月 15 日之间各个代理商的销售利润。
编号,卖出方代码,买入数量,买入金额,卖出数量,卖出金额,销售利润(卖出
金额-买入金额)
 
(5)设计分析代理商的市场地位根据市场交易次数、交易对象个数、销售机票数量、
销售利润等。(选做题)
4、处理结果入库:
将上述统计分析的结果数据保存到 mySQL 数据库中。
5、数据可视化展示:
利用 Echarts 将上述统计结果以图形化展示的方式展现出来:饼图、柱状图、地图、折
线图等。
解决:
部分sql语句
导入到hive
bin/sqoop import \
--connect jdbc:mysql://Hadoop102:3306/company \
--username root \
--password 1229 \
--table staff \
--target-dir /user/company \
--delete-target-dir \
--num-mappers 1 \
--fields-terminated-by "\t"

导出
bin/sqoop export \
--connect jdbc:mysql://Hadoop102:3306/company \
--username root \
--password 1229 \
--table test \
--num-mappers 1 \
--export-dir /user/hive/warehouse/data \
--input-fields-terminated-by "\t"

bin/sqoop export \
--connect jdbc:mysql://Hadoop102:3306/company \
--username root \
--password 1229 \
--table test \
--num-mappers 1 \
--export-dir /user/hive/warehouse/try \
--input-fields-terminated-by ","

bin/sqoop export \
--connect jdbc:mysql://Hadoop102:3306/company \
--username root \
--password 1229 \
--table sale \
--num-mappers 1 \
--export-dir /user/hive/warehouse/sales2 \
--input-fields-terminated-by ","

bin/sqoop export \
--connect jdbc:mysql://Hadoop102:3306/company \
--username root \
--password 1229 \
--table jieguo01 \
--num-mappers 1 \
--export-dir /user/hive/warehouse/jieguo01 \
--input-fields-terminated-by ","

create table sales(day_id String,
sale_nbr String,
buy_nbr String,
cnt String,
round String)
ROW format delimited fields terminated by '/t' STORED AS TEXTFILE;

create table sales2(day_id String,
sale_nbr String,
buy_nbr String,
cnt String,
round String)
ROW format delimited fields terminated by ',' STORED AS TEXTFILE;

create table sales3(day_id String,
sale_nbr String,
buy_nbr String,
cnt String,
round String)
ROW format delimited fields terminated by ',' STORED AS TEXTFILE;

load data local inpath '/opt/software/sample.txt' into table sales;

select * from sales limit 10;


修改时间:

insert overwrite table sales3 select date_add('2021-09-00',cast(day_id as int)) as day_id,sale_nbr as sale_nbr,buy_nbr as buy_nbr,cnt as cnt,round as round from sales2;


create table jieguo1(day_id string,
sale_nbr string,
cnt string,
round string)
ROW format delimited fields terminated by ',' STORED AS TEXTFILE;

create table jieguo01(day_id string,
sale_nbr string,
cnt string,
round string)
ROW format delimited fields terminated by ',' STORED AS TEXTFILE;

2.1

insert overwrite table jieguo1  select day_id as day_id ,sale_nbr as sale_nbr,sum(cnt) as cnt ,sum(round) as round from sales2 where sale_nbr like 'C%'  group by day_id,sale_nbr;
2.2

create table jieguo02(day_id string,
sale_nbr string,
cnt string,
round string)
ROW format delimited fields terminated by ',' STORED AS TEXTFILE;

建表:
create table jieguo2(day_id string,
sale_nbr string,
cnt string,
round string)
ROW format delimited fields terminated by ',' STORED AS TEXTFILE;

查询:
insert overwrite table jieguo2  select day_id as day_id ,sale_nbr as sale_nbr,sum(cnt) as cnt ,sum(round) as round from sales3 where sale_nbr like 'O%'  group by day_id,sale_nbr;

bin/sqoop export \
--connect jdbc:mysql://Hadoop102:3306/company \
--username root \
--password 1229 \
--table jieguo02 \
--num-mappers 1 \
--export-dir /user/hive/warehouse/jieguo02 \
--input-fields-terminated-by ","
2.3

bin/sqoop export \
--connect jdbc:mysql://Hadoop102:3306/company \
--username root \
--password 1229 \
--table jieguo03 \
--num-mappers 1 \
--export-dir /user/hive/warehouse/jieguo03 \
--input-fields-terminated-by ","

create table jieguo03(day_id String,sale_nbr String,sale_number String) ROW format delimited fields terminated by ',';


insert overwrite table jieguo03 select day_id,sale_nbr,COUNT(*) as sale_number from sales3 group by day_id,sale_nbr having sale_nbr like 'O%';

2.4

bin/sqoop export \
--connect jdbc:mysql://Hadoop102:3306/company \
--username root \
--password 1229 \
--table jieguo04 \
--num-mappers 1 \
--export-dir /user/hive/warehouse/daili \
--input-fields-terminated-by ","

create table daiout as select day_id as day_id,sale_nbr as sale_nbr,sum(cnt)as cnt,sum(round)as round from sales3 where sale_nbr like 'O%' group by day_id,sale_nbr;

create table daiin as select day_id as day_id,buy_nbr as buy_nbr,sum(cnt)as cnt,sum(round)as round from sales3 where buy_nbr like 'O%' group by day_id,buy_nbr;

create table daili as select daiin.day_id as day_id,daiout.sale_nbr as sale_nbr,daiin.cnt as incnt,daiin.round as inround,daiout.cnt as outcnt,daiout.round as outround,daiout.round-daiin.round as li from daiin join daiout on (daiin.buy_nbr=daiout.sale_nbr);

 echarts实现:

部分代码:

<!DOCTYPE html>
<html lang="en">
<head>
    <meta charset="UTF-8">
    <title>Title</title>
    <link rel="stylesheet" href="./layui/css/layui.css">
</head>
<body>
<script src="echarts.js"></script>
<script src="js/jquery.min.js"></script>
</body>
<button class="layui-btn" onclick="getFirstData()">第一个表</button>
<button class="layui-btn" onclick="getSecoundData()">第二个表</button>
<button class="layui-btn" onclick="getThirdData()">第三个表</button>
<div id="main" style="height:600%;" ></div>
<script>
    var hzb=new Array();
    var zzb=new Array();
    function getFirstData(){
        var url = "servlet?method=getFirstData&randnum=" + Math.random();
        $.ajax({
            type: "get",
            url: url,
            data: [],
            dataType: "json",
            success: function(result){
                fenli(result)
                setChatr(1)

            },
            error: function(){
                alert("错误");
            }
        });
    }
    function getSecoundData(){
        var url = "servlet?method=getSecoundData&randnum=" + Math.random();
        $.ajax({
            type: "get",
            url: url,
            data: [],
            dataType: "json",
            success: function(result){
                fenli(result)
                setChatr(2)

            },
            error: function(){
                alert("错误");
            }
        });
    }
    function getThirdData(){
        var url = "servlet?method=getThirdData&randnum=" + Math.random();
        $.ajax({
            type: "get",
            url: url,
            data: [],
            dataType: "json",
            success: function(result){
                fenli(result)
                setChatr(3)

            },
            error: function(){
                alert("错误");
            }
        });
    }
    function fenli(data){
        for(i in data){
            hzb.push(data[i].hzb)
            zzb.push(data[i].zzb)
        }
    }
    function setChatr(num){
        var chartDom = document.getElementById('main');
        var myChart = echarts.init(chartDom);
        var Title;
        if(num==1)Title="金额";
        if(num==2)Title="销售金额";
        if(num==3)Title="销售活跃度";
        var option;

        option = {
            title: { text: Title },
            xAxis: {
                type: 'category',
                data: hzb,
                axisLabel: {
                    interval: 0,
                    formatter:function(value)
                    {
                        return value.split("").join("\n");
                    }
                },
            },
            yAxis: {
                type: 'value'
            },

            series: [
                {
                    data: zzb,
                    type: 'line',
                    barWidth : 20,
                    showBackground: true,
                    backgroundStyle: {
                        color: 'rgba(180, 180, 180, 0.2)'
                    }
                }
            ],
            dataZoom : [
                {
                    type: 'slider',
                    show: true,
                    start: 94,
                    end: 100,
                    handleSize: 8
                },
                {
                    type: 'inside',
                    start: 94,
                    end: 100
                },
                {
                    type: 'slider',
                    show: true,
                    yAxisIndex: 0,
                    filterMode: 'empty',
                    width: 12,
                    height: '70%',
                    handleSize: 8,
                    showDataShadow: false,
                    left: '93%'
                }
            ]

        };

        option && myChart.setOption(option);
    }
</script>
</html>

  servlet:

package classes;

import javax.servlet.*;
import javax.servlet.http.*;
import javax.servlet.annotation.*;
import java.io.IOException;
import java.io.PrintWriter;
import java.sql.SQLException;
import java.util.List;

import net.sf.json.JSONArray;
import net.sf.json.JSONObject;

@WebServlet(name = "servlet", value = "/servlet")
public class servlet extends HttpServlet {
    Dao dao=new Dao();
    public void setdata(HttpServletRequest request, HttpServletResponse response) throws ServletException, IOException {
        request.setCharacterEncoding("utf-8");
        response.setContentType("text/html;charset=UTF-8");
        System.out.println("调用了方法");
        JSONArray json=new JSONArray();
        for(int i=0;i<5;i++){
            JSONObject ob=new JSONObject();
            ob.put("name","水果");
            ob.put("num",i*10);
            json.add(ob);
        }
        System.out.println(json.toString());
        PrintWriter out = response.getWriter();
        out.write(json.toString());
    }
    public void getAlldata(HttpServletRequest request, HttpServletResponse response) throws ServletException, IOException {
        request.setCharacterEncoding("utf-8");
        response.setContentType("text/html;charset=UTF-8");
        List<FourBean> list=dao.FourAllData();
        JSONArray json=new JSONArray();
        for(int i=0;i<list.size();i++){
            JSONObject ob=new JSONObject();
            ob.put("day_id",list.get(i).getDay_id());
            ob.put("sale_nbr",list.get(i).getSale_nbr());
            ob.put("incnt",list.get(i).getIncnt());
            ob.put("inround",list.get(i).getInround());
            ob.put("outcnt",list.get(i).getOutcnt());
            ob.put("outround",list.get(i).getOutround());
            ob.put("li",list.get(i).getLi());
            json.add(ob);
        }
        PrintWriter out = response.getWriter();
        out.write(json.toString());
        System.out.println("发送完成");
    }
    public void getFirstData(HttpServletRequest request, HttpServletResponse response) throws ServletException, IOException {
        request.setCharacterEncoding("utf-8");
        response.setContentType("text/html;charset=UTF-8");
        List<FirstBean> list=dao.getFirstData();
        JSONArray json=new JSONArray();
        for(int i=0;i<list.size();i++){
            JSONObject ob=new JSONObject();
            ob.put("hzb",list.get(i).getSale_nbr());
            ob.put("zzb",list.get(i).getRound());
            json.add(ob);
        }
        PrintWriter out = response.getWriter();
        out.write(json.toString());
    }
    public void getSecoundData(HttpServletRequest request, HttpServletResponse response) throws ServletException, IOException {
        request.setCharacterEncoding("utf-8");
        response.setContentType("text/html;charset=UTF-8");
        List<SecoundBean> list=dao.getSecoundData();
        JSONArray json=new JSONArray();
        for(int i=0;i<list.size();i++){
            JSONObject ob=new JSONObject();
            ob.put("hzb",list.get(i).getSale_nbr());
            ob.put("zzb",list.get(i).getRound());
            json.add(ob);
        }
        PrintWriter out = response.getWriter();
        out.write(json.toString());
    }
    public void getThirdData(HttpServletRequest request, HttpServletResponse response) throws ServletException, IOException {
        request.setCharacterEncoding("utf-8");
        response.setContentType("text/html;charset=UTF-8");
        List<ThirdBean> list=dao.getThirdData();
        JSONArray json=new JSONArray();
        for(int i=0;i<list.size();i++){
            JSONObject ob=new JSONObject();
            ob.put("hzb",list.get(i).getSale_nbr());
            ob.put("zzb",list.get(i).getSale_number());
            json.add(ob);
        }
        PrintWriter out = response.getWriter();
        out.write(json.toString());
    }
    @Override
    protected void doGet(HttpServletRequest request, HttpServletResponse response) throws ServletException, IOException {
        request.setCharacterEncoding("utf-8");
        String method=request.getParameter("method");
        if(method.equals("setdata")){
            try {
                setdata(request,response);
            } catch (ServletException e) {
                e.printStackTrace();
            } catch (IOException e) {
                e.printStackTrace();
            }
        }
        else if(method.equals("getAlldata")){
            try {
                getAlldata(request,response);
            } catch (ServletException e) {
                e.printStackTrace();
            } catch (IOException e) {
                e.printStackTrace();
            }
        }
        else if(method.equals("getFirstData")){
            try {
                getFirstData(request,response);
            } catch (ServletException e) {
                e.printStackTrace();
            } catch (IOException e) {
                e.printStackTrace();
            }
        }
        else if(method.equals("getSecoundData")){
            try {
                getSecoundData(request,response);
            } catch (ServletException e) {
                e.printStackTrace();
            } catch (IOException e) {
                e.printStackTrace();
            }
        }
        else if(method.equals("getThirdData")){
            try {
                getThirdData(request,response);
            } catch (ServletException e) {
                e.printStackTrace();
            } catch (IOException e) {
                e.printStackTrace();
            }
        }
    }

    @Override
    protected void doPost(HttpServletRequest request, HttpServletResponse response) throws ServletException, IOException {
        doGet(request,response);
    }
}

总结:还不够好,需要继续努力

posted @ 2021-10-04 21:26  哦心有  阅读(421)  评论(0)    收藏  举报