Syncing MySQL Table Data to HDFS with Sqoop (Part 2): Setting the Storage Format

System environment

OS:       CentOS 7
Hostname: centos02
IP:       192.168.122.1
Java:     1.8
Hadoop:   2.8.5
Sqoop:    1.4.7
MySQL:    8.0.12

 

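1. Storing as Avro

sqoop import \
  --connect jdbc:mysql://centos02:3306/OfficialCashMid \
  --driver com.mysql.cj.jdbc.Driver \
  --username root \
  --password sa123_ADMIN. \
  --table tadminoperationlog \
  -m 2 \
  --target-dir /jdbcHDFS/TAdminLog_avro \
  --as-avrodatafile

Note: the format option is --as-avrodatafile, with no space after the dashes. The run recorded below typed it as "-- as-avrodatafile". Sqoop treats a bare "--" as the start of extra tool arguments, so the option was probably ignored and the data written in the default text format. The identical "Bytes Written=3753700" reported by both runs on this page is consistent with that.
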
[root@centos02 bin]# sqoop import --connect jdbc:mysql://centos02:3306/OfficialCashMid --driver com.mysql.cj.jdbc.Driver --username root --password sa123_ADMIN. --table tadminoperationlog --m 2 --target-dir /jdbcHDFS/TAdminLog_avro -- as-avrodatafile  
Warning: /opt/bigdata/sqoop/sqoop-1.4.7/../hbase does not exist! HBase imports will fail.
Please set $HBASE_HOME to the root of your HBase installation.
Warning: /opt/bigdata/sqoop/sqoop-1.4.7/../hcatalog does not exist! HCatalog jobs will fail.
Please set $HCAT_HOME to the root of your HCatalog installation.
Warning: /opt/bigdata/sqoop/sqoop-1.4.7/../accumulo does not exist! Accumulo imports will fail.
Please set $ACCUMULO_HOME to the root of your Accumulo installation.
Warning: /opt/bigdata/sqoop/sqoop-1.4.7/../zookeeper does not exist! Accumulo imports will fail.
Please set $ZOOKEEPER_HOME to the root of your Zookeeper installation.
19/09/04 01:11:25 INFO sqoop.Sqoop: Running Sqoop version: 1.4.7
19/09/04 01:11:25 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead.
19/09/04 01:11:25 WARN sqoop.ConnFactory: Parameter --driver is set to an explicit driver however appropriate connection manager is not being set (via --connection-manager). Sqoop is going to fall back to org.apache.sqoop.manager.GenericJdbcManager. Please specify explicitly which connection manager should be used next time.
19/09/04 01:11:25 INFO manager.SqlManager: Using default fetchSize of 1000
19/09/04 01:11:25 INFO tool.CodeGenTool: Beginning code generation
19/09/04 01:11:26 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM tadminoperationlog AS t WHERE 1=0
19/09/04 01:11:27 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM tadminoperationlog AS t WHERE 1=0
19/09/04 01:11:27 INFO orm.CompilationManager: HADOOP_MAPRED_HOME is /opt/bigdata/hadoop/hadoop-2.8.5
Note: /tmp/sqoop-root/compile/64a3b4f66eb537e9b11f0416dbc6d58d/tadminoperationlog.java uses or overrides a deprecated API.
Note: Recompile with -Xlint:deprecation for details.
19/09/04 01:11:30 INFO orm.CompilationManager: Writing jar file: /tmp/sqoop-root/compile/64a3b4f66eb537e9b11f0416dbc6d58d/tadminoperationlog.jar
19/09/04 01:11:30 INFO mapreduce.ImportJobBase: Beginning import of tadminoperationlog
19/09/04 01:11:31 INFO Configuration.deprecation: mapred.jar is deprecated. Instead, use mapreduce.job.jar
19/09/04 01:11:31 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM tadminoperationlog AS t WHERE 1=0
19/09/04 01:11:32 INFO Configuration.deprecation: mapred.map.tasks is deprecated. Instead, use mapreduce.job.maps
19/09/04 01:11:32 INFO client.RMProxy: Connecting to ResourceManager at centos02/192.168.122.1:8032
19/09/04 01:11:37 INFO db.DBInputFormat: Using read commited transaction isolation
19/09/04 01:11:37 INFO db.DataDrivenDBInputFormat: BoundingValsQuery: SELECT MIN(FID), MAX(FID) FROM tadminoperationlog
19/09/04 01:11:37 INFO db.IntegerSplitter: Split size: 16058; Num splits: 2 from: 21 to: 32138
19/09/04 01:11:37 INFO mapreduce.JobSubmitter: number of splits:2
19/09/04 01:11:38 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1567503661837_0005
19/09/04 01:11:39 INFO impl.YarnClientImpl: Submitted application application_1567503661837_0005
19/09/04 01:11:39 INFO mapreduce.Job: The url to track the job: http://centos02:8088/proxy/application_1567503661837_0005/
19/09/04 01:11:39 INFO mapreduce.Job: Running job: job_1567503661837_0005
19/09/04 01:11:51 INFO mapreduce.Job: Job job_1567503661837_0005 running in uber mode : false
19/09/04 01:11:51 INFO mapreduce.Job:  map 0% reduce 0%
19/09/04 01:12:07 INFO mapreduce.Job:  map 100% reduce 0%
19/09/04 01:12:09 INFO mapreduce.Job: Job job_1567503661837_0005 completed successfully
19/09/04 01:12:10 INFO mapreduce.Job: Counters: 30
    File System Counters
        FILE: Number of bytes read=0
        FILE: Number of bytes written=357730
        FILE: Number of read operations=0
        FILE: Number of large read operations=0
        FILE: Number of write operations=0
        HDFS: Number of bytes read=206
        HDFS: Number of bytes written=3753700
        HDFS: Number of read operations=8
        HDFS: Number of large read operations=0
        HDFS: Number of write operations=4
    Job Counters 
        Launched map tasks=2
        Other local map tasks=2
        Total time spent by all maps in occupied slots (ms)=25703
        Total time spent by all reduces in occupied slots (ms)=0
        Total time spent by all map tasks (ms)=25703
        Total vcore-milliseconds taken by all map tasks=25703
        Total megabyte-milliseconds taken by all map tasks=26319872
    Map-Reduce Framework
        Map input records=12122
        Map output records=12122
        Input split bytes=206
        Spilled Records=0
        Failed Shuffles=0
        Merged Map outputs=0
        GC time elapsed (ms)=443
        CPU time spent (ms)=11370
        Physical memory (bytes) snapshot=399974400
        Virtual memory (bytes) snapshot=4243742720
        Total committed heap usage (bytes)=198180864
    File Input Format Counters 
        Bytes Read=0
    File Output Format Counters 
        Bytes Written=3753700
19/09/04 01:12:10 INFO mapreduce.ImportJobBase: Transferred 3.5798 MB in 37.8235 seconds (96.9166 KB/sec)
19/09/04 01:12:10 INFO mapreduce.ImportJobBase: Retrieved 12122 records.
[root@centos02 bin]# 
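
A successful job does not by itself show which file format was written. One way to check is to look at the first bytes of an output file: Avro container files begin with "Obj" and Hadoop SequenceFiles begin with "SEQ". The helper below is a minimal sketch of that check. The function name detect_format is made up for this example, and the part-file name in the usage comment is a placeholder to replace with a real name from hdfs dfs -ls.

```shell
#!/usr/bin/env bash
# Guess a file's container format from its leading magic bytes (read from stdin).
# Avro object container files start with "Obj" followed by byte 0x01;
# Hadoop SequenceFiles start with "SEQ" followed by a version byte.
# detect_format is a hypothetical helper name, not part of Sqoop or Hadoop.
detect_format() {
  local magic
  # Only the first three bytes are needed to tell the formats apart.
  magic=$(head -c 3)
  case "$magic" in
    Obj) echo "avro" ;;
    SEQ) echo "sequencefile" ;;
    *)   echo "text-or-unknown" ;;
  esac
}

# Usage (placeholder file name, list the directory first to find the real one):
#   hdfs dfs -ls /jdbcHDFS/TAdminLog_avro
#   hdfs dfs -cat /jdbcHDFS/TAdminLog_avro/<part-file> | detect_format
```

If the import really produced Avro, the check should report "avro"; a result of "text-or-unknown" suggests the format option did not take effect.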

 

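2. Storing as SequenceFile

sqoop import \
  --connect jdbc:mysql://centos02:3306/OfficialCashMid \
  --driver com.mysql.cj.jdbc.Driver \
  --username root \
  --password sa123_ADMIN. \
  --table tadminoperationlog \
  -m 2 \
  --target-dir /jdbcHDFS/TAdminLog_sequence \
  --as-sequencefile

Note: the option is --as-sequencefile, with no space after the dashes. The run recorded below typed it as "-- as-sequencefile", so Sqoop probably treated it as an extra argument and wrote the default text format instead.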
  
[root@centos02 bin]# sqoop import --connect jdbc:mysql://centos02:3306/OfficialCashMid --driver com.mysql.cj.jdbc.Driver --username root --password sa123_ADMIN. --table tadminoperationlog --m 2 --target-dir /jdbcHDFS/TAdminLog_sequence -- as-sequencefile 
Warning: /opt/bigdata/sqoop/sqoop-1.4.7/../hbase does not exist! HBase imports will fail.
Please set $HBASE_HOME to the root of your HBase installation.
Warning: /opt/bigdata/sqoop/sqoop-1.4.7/../hcatalog does not exist! HCatalog jobs will fail.
Please set $HCAT_HOME to the root of your HCatalog installation.
Warning: /opt/bigdata/sqoop/sqoop-1.4.7/../accumulo does not exist! Accumulo imports will fail.
Please set $ACCUMULO_HOME to the root of your Accumulo installation.
Warning: /opt/bigdata/sqoop/sqoop-1.4.7/../zookeeper does not exist! Accumulo imports will fail.
Please set $ZOOKEEPER_HOME to the root of your Zookeeper installation.
19/09/04 01:17:11 INFO sqoop.Sqoop: Running Sqoop version: 1.4.7
19/09/04 01:17:11 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead.
19/09/04 01:17:11 WARN sqoop.ConnFactory: Parameter --driver is set to an explicit driver however appropriate connection manager is not being set (via --connection-manager). Sqoop is going to fall back to org.apache.sqoop.manager.GenericJdbcManager. Please specify explicitly which connection manager should be used next time.
19/09/04 01:17:11 INFO manager.SqlManager: Using default fetchSize of 1000
19/09/04 01:17:11 INFO tool.CodeGenTool: Beginning code generation
19/09/04 01:17:13 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM tadminoperationlog AS t WHERE 1=0
19/09/04 01:17:13 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM tadminoperationlog AS t WHERE 1=0
19/09/04 01:17:13 INFO orm.CompilationManager: HADOOP_MAPRED_HOME is /opt/bigdata/hadoop/hadoop-2.8.5
Note: /tmp/sqoop-root/compile/80fedf2c36bf7f4118ebc9731bc7479f/tadminoperationlog.java uses or overrides a deprecated API.
Note: Recompile with -Xlint:deprecation for details.
19/09/04 01:17:16 INFO orm.CompilationManager: Writing jar file: /tmp/sqoop-root/compile/80fedf2c36bf7f4118ebc9731bc7479f/tadminoperationlog.jar
19/09/04 01:17:17 INFO mapreduce.ImportJobBase: Beginning import of tadminoperationlog
19/09/04 01:17:17 INFO Configuration.deprecation: mapred.jar is deprecated. Instead, use mapreduce.job.jar
19/09/04 01:17:17 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM tadminoperationlog AS t WHERE 1=0
19/09/04 01:17:18 INFO Configuration.deprecation: mapred.map.tasks is deprecated. Instead, use mapreduce.job.maps
19/09/04 01:17:18 INFO client.RMProxy: Connecting to ResourceManager at centos02/192.168.122.1:8032
19/09/04 01:17:24 INFO db.DBInputFormat: Using read commited transaction isolation
19/09/04 01:17:24 INFO db.DataDrivenDBInputFormat: BoundingValsQuery: SELECT MIN(FID), MAX(FID) FROM tadminoperationlog
19/09/04 01:17:24 INFO db.IntegerSplitter: Split size: 16058; Num splits: 2 from: 21 to: 32138
19/09/04 01:17:25 INFO mapreduce.JobSubmitter: number of splits:2
19/09/04 01:17:25 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1567503661837_0006
19/09/04 01:17:26 INFO impl.YarnClientImpl: Submitted application application_1567503661837_0006
19/09/04 01:17:26 INFO mapreduce.Job: The url to track the job: http://centos02:8088/proxy/application_1567503661837_0006/
19/09/04 01:17:26 INFO mapreduce.Job: Running job: job_1567503661837_0006
19/09/04 01:17:37 INFO mapreduce.Job: Job job_1567503661837_0006 running in uber mode : false
19/09/04 01:17:37 INFO mapreduce.Job:  map 0% reduce 0%
19/09/04 01:17:48 INFO mapreduce.Job:  map 50% reduce 0%
19/09/04 01:17:49 INFO mapreduce.Job:  map 100% reduce 0%
19/09/04 01:17:50 INFO mapreduce.Job: Job job_1567503661837_0006 completed successfully
19/09/04 01:17:50 INFO mapreduce.Job: Counters: 30
    File System Counters
        FILE: Number of bytes read=0
        FILE: Number of bytes written=357738
        FILE: Number of read operations=0
        FILE: Number of large read operations=0
        FILE: Number of write operations=0
        HDFS: Number of bytes read=206
        HDFS: Number of bytes written=3753700
        HDFS: Number of read operations=8
        HDFS: Number of large read operations=0
        HDFS: Number of write operations=4
    Job Counters 
        Launched map tasks=2
        Other local map tasks=2
        Total time spent by all maps in occupied slots (ms)=16481
        Total time spent by all reduces in occupied slots (ms)=0
        Total time spent by all map tasks (ms)=16481
        Total vcore-milliseconds taken by all map tasks=16481
        Total megabyte-milliseconds taken by all map tasks=16876544
    Map-Reduce Framework
        Map input records=12122
        Map output records=12122
        Input split bytes=206
        Spilled Records=0
        Failed Shuffles=0
        Merged Map outputs=0
        GC time elapsed (ms)=269
        CPU time spent (ms)=7730
        Physical memory (bytes) snapshot=360050688
        Virtual memory (bytes) snapshot=4240666624
        Total committed heap usage (bytes)=185073664
    File Input Format Counters 
        Bytes Read=0
    File Output Format Counters 
        Bytes Written=3753700
19/09/04 01:17:50 INFO mapreduce.ImportJobBase: Transferred 3.5798 MB in 32.0909 seconds (114.2295 KB/sec)
19/09/04 01:17:50 INFO mapreduce.ImportJobBase: Retrieved 12122 records.
[root@centos02 bin]# 
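
Both logs report "Split size: 16058; Num splits: 2 from: 21 to: 32138". That figure can be reproduced from the MIN(FID) and MAX(FID) bounds and the mapper count with plain integer division. The snippet below is only a sketch of the arithmetic with illustrative variable names. It does not reproduce how Sqoop places the exact split boundaries or handles the remainder.

```shell
#!/usr/bin/env bash
# Values copied from the log: MIN(FID), MAX(FID), and the requested mapper count.
min_fid=21
max_fid=32138
num_mappers=2

# Integer division of the key range by the number of mappers.
split_size=$(( (max_fid - min_fid) / num_mappers ))
echo "split size: $split_size"   # prints "split size: 16058"
```

The split is computed over the key range, not the row count. The table holds 12122 rows across a FID range of 32117, so the keys have gaps and the two mappers do not necessarily receive the same number of rows.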

 

posted @ 2019-09-05 00:54 by 茗::流