7. Data Migration Notes (Project 1)
Initial data export
[root@node1 dataExport]# cat export.sh
#!/bin/bash
echo "====================Starting export of age_pvs table data (overwrite)=========================="
sqoop export --connect "jdbc:mysql://node1:3306/project?serverTimezone=UTC&useUnicode=true&characterEncoding=utf-8" --username root --password Jsq123456... --table age_pvs --num-mappers 1 --export-dir "/user/hive/warehouse/project.db/age_pvs" --input-fields-terminated-by ","
echo "===================age_pvs metrics table exported successfully========================="
echo "====================Starting export of day_pvs table data (overwrite)=========================="
sqoop export --connect "jdbc:mysql://node1:3306/project?serverTimezone=UTC&useUnicode=true&characterEncoding=utf-8" --username root --password Jsq123456... --table day_pvs --num-mappers 1 --export-dir "/user/hive/warehouse/project.db/day_pvs" --input-fields-terminated-by ","
echo "====================day_pvs metrics table exported successfully========================"
echo "====================Starting export of hour_pvs table data (overwrite)=========================="
sqoop export --connect "jdbc:mysql://node1:3306/project?serverTimezone=UTC&useUnicode=true&characterEncoding=utf-8" --username root --password Jsq123456... --table hour_pvs --num-mappers 1 --export-dir "/user/hive/warehouse/project.db/hour_pvs" --input-fields-terminated-by ","
echo "====================hour_pvs metrics table exported successfully========================"
echo "====================Starting export of month_pvs table data (overwrite)=========================="
sqoop export --connect "jdbc:mysql://node1:3306/project?serverTimezone=UTC&useUnicode=true&characterEncoding=utf-8" --username root --password Jsq123456... --table month_pvs --num-mappers 1 --export-dir "/user/hive/warehouse/project.db/month_pvs" --input-fields-terminated-by ","
echo "====================month_pvs metrics table exported successfully========================"
echo "====================Starting export of area_pvs table data (append)=========================="
sqoop export --connect "jdbc:mysql://node1:3306/project?serverTimezone=UTC&useUnicode=true&characterEncoding=utf-8" --username root --password Jsq123456... --table area_pvs --num-mappers 1 --export-dir "/user/hive/warehouse/project.db/area_pvs" --input-fields-terminated-by ","
echo "====================area_pvs metrics table exported successfully========================"
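The five commands in this script differ only in the table name, and each success message is printed whether or not `sqoop` actually succeeded. A minimal sketch of a loop-based variant follows. It is not the author's script: `export_table`, `DRY_RUN`, and `SQOOP_PASSWORD_FILE` (including its default path) are hypothetical names introduced here, and it assumes the password has been placed in a file that Sqoop can read through its `--password-file` option rather than being passed on the command line.

```shell
#!/bin/bash
# Sketch: export several Hive tables to MySQL through one helper function.
# Assumptions (not in the original script): export_table, DRY_RUN and
# SQOOP_PASSWORD_FILE are illustrative names; the password file path is made up.

JDBC_URL="jdbc:mysql://node1:3306/project?serverTimezone=UTC&useUnicode=true&characterEncoding=utf-8"
WAREHOUSE="/user/hive/warehouse/project.db"
SQOOP_PASSWORD_FILE="${SQOOP_PASSWORD_FILE:-/user/root/mysql.pwd}"  # hypothetical path
DRY_RUN="${DRY_RUN:-1}"   # 1 = only print the command, 0 = really run sqoop

export_table() {
    local table="$1"
    local cmd=(sqoop export
        --connect "$JDBC_URL"
        --username root
        --password-file "$SQOOP_PASSWORD_FILE"
        --table "$table"
        --num-mappers 1
        --export-dir "$WAREHOUSE/$table"
        --input-fields-terminated-by ",")

    if [ "$DRY_RUN" = "1" ]; then
        # Printed for inspection only; shell quoting is not preserved here.
        echo "${cmd[*]}"
        return 0
    fi
    "${cmd[@]}"
}

for t in age_pvs day_pvs hour_pvs month_pvs area_pvs; do
    echo "==== exporting $t ===="
    # Report success only when the command returned exit status 0.
    if export_table "$t"; then
        echo "==== $t exported successfully ===="
    else
        echo "==== $t export FAILED ====" >&2
        exit 1
    fi
done
```

By default the sketch only prints the commands; running it with `DRY_RUN=0` would execute them. Deriving the export directory from the table name also avoids copy-paste mismatches between `--table` and `--export-dir`.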
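Subsequent data
Notes on exporting subsequent data
| Option | Purpose |
|---|---|
| --update-key | Column(s) used to match existing rows; separate multiple columns with commas |
| --update-mode | Export mode: updateonly (default) or allowinsert |
| --columns (Hive table columns, listed in order) | Maps the exported fields to the target table's columns in the listed order |

| Write mode | Settings |
|---|---|
| Overwrite (update existing rows only, never insert) | Set --update-mode to updateonly and --update-key to the matching column(s) |
| Append (insert new rows) | Set --update-mode to allowinsert together with --update-key, or omit both options so that every record is inserted |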
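The mapping from write mode to options can be sketched as a small helper. This is an illustration only: `update_flags` and the mode names `overwrite`, `upsert`, and `append` are hypothetical and not part of Sqoop; only `--update-key`, `--update-mode`, `updateonly`, and `allowinsert` are real Sqoop options and values.

```shell
#!/bin/bash
# Sketch: translate a write mode into the matching sqoop export flags.
# update_flags is a hypothetical helper, not a Sqoop command.

update_flags() {
    local mode="$1" key="$2"
    case "$mode" in
        overwrite)  # update rows that match the key; unmatched records are not inserted
            echo "--update-key $key --update-mode updateonly" ;;
        upsert)     # update rows that match, insert the rest
            echo "--update-key $key --update-mode allowinsert" ;;
        append)     # no update flags: every record becomes an INSERT
            echo "" ;;
        *)
            echo "unknown mode: $mode" >&2
            return 1 ;;
    esac
}

update_flags overwrite "visit_year,visit_month"
update_flags upsert "age_range"
```

The output of `update_flags` would be appended to a `sqoop export` command line. Note that with plain appends, re-running the same export inserts the same rows again unless the MySQL table has a key that rejects duplicates.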
[root@node1 dataExport]# cat a.sh
#!/bin/bash
echo "====================Starting export of age_pvs table data (overwrite)=========================="
sqoop export --connect "jdbc:mysql://node1:3306/project?serverTimezone=UTC&useUnicode=true&characterEncoding=utf-8" --username root --password Jsq123456... --table age_pvs --update-key age_range --update-mode updateonly --num-mappers 1 --export-dir "/user/hive/warehouse/project.db/age_pvs" --input-fields-terminated-by ","
echo "===================age_pvs metrics table exported successfully========================="
echo "====================Starting export of day_pvs table data (overwrite)=========================="
sqoop export --connect "jdbc:mysql://node1:3306/project?serverTimezone=UTC&useUnicode=true&characterEncoding=utf-8" --username root --password Jsq123456... --table day_pvs --update-key visit_year,visit_month,visit_day --update-mode updateonly --num-mappers 1 --export-dir "/user/hive/warehouse/project.db/day_pvs" --input-fields-terminated-by ","
echo "====================day_pvs metrics table exported successfully========================"
echo "====================Starting export of hour_pvs table data (overwrite)=========================="
sqoop export --connect "jdbc:mysql://node1:3306/project?serverTimezone=UTC&useUnicode=true&characterEncoding=utf-8" --username root --password Jsq123456... --table hour_pvs --update-key visit_year,visit_month,visit_day,visit_hour --update-mode updateonly --num-mappers 1 --export-dir "/user/hive/warehouse/project.db/hour_pvs" --input-fields-terminated-by ","
echo "====================hour_pvs metrics table exported successfully========================"
echo "====================Starting export of month_pvs table data (overwrite)=========================="
sqoop export --connect "jdbc:mysql://node1:3306/project?serverTimezone=UTC&useUnicode=true&characterEncoding=utf-8" --username root --password Jsq123456... --table month_pvs --update-key visit_year,visit_month --update-mode updateonly --num-mappers 1 --export-dir "/user/hive/warehouse/project.db/month_pvs" --input-fields-terminated-by ","
echo "====================month_pvs metrics table exported successfully========================"
echo "====================Starting export of area_pvs table data (append)=========================="
sqoop export --connect "jdbc:mysql://node1:3306/project?serverTimezone=UTC&useUnicode=true&characterEncoding=utf-8" --username root --password Jsq123456... --table area_pvs --num-mappers 1 --export-dir "/user/hive/warehouse/project.db/area_pvs" --input-fields-terminated-by ","
echo "====================area_pvs metrics table exported successfully========================"
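Appendix
- Problem: for the month_pvs table, data for a new month cannot be added later, because updateonly only updates rows that already exist and no record for that month exists yet.
- Solution: for the monthly page-view metric, append a new record on the 2nd of each month, and update that record from the 3rd through the 31st.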
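The rule in the solution above can be sketched as a day-of-month check. This is a sketch under assumptions: `month_pvs_flags` is a hypothetical helper, and since the note does not say what happens on the 1st, the sketch treats that day as an update.

```shell
#!/bin/bash
# Sketch: pick the sqoop export flags for month_pvs from the day of the month.
# month_pvs_flags is a hypothetical helper; treating day 1 as an update
# is an assumption, because the note above does not cover that day.

month_pvs_flags() {
    local day="$1"
    # 10# forces base 10 so that "08" and "09" are not parsed as octal.
    if (( 10#$day == 2 )); then
        # On the 2nd: no update flags, so the new month's row is inserted.
        echo ""
    else
        # Other days: update the row that already exists for the month.
        echo "--update-key visit_year,visit_month --update-mode updateonly"
    fi
}

flags="$(month_pvs_flags "$(date +%d)")"
echo "extra sqoop flags for month_pvs today: ${flags:-<none, plain insert>}"
```

A caveat with this approach is that re-running the export on the 2nd would insert the same row twice. An alternative that may be worth testing is `--update-mode allowinsert`; with MySQL this is understood to rely on the table's own unique key rather than on `--update-key`, so it would presumably need a unique key on (visit_year, visit_month).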
This article is from cnblogs (博客园), author: jsqup. When reposting, please cite the original link: https://www.cnblogs.com/jsqup/p/16574588.html
