DataX - [03] Use Cases


001 || mysql2hdfs

(1) Inspect the data in MySQL that is to be migrated

(2) Based on the requirements, use mysqlreader as the reader and hdfswriter as the writer

To view the reader and writer config templates (-r prints the reader template; -w prints the writer template):

python bin/datax.py -r mysqlreader -w hdfswriter

(3) Write the JSON job file for the sync

(4) Confirm whether the target path exists on HDFS

(5) Run the sync by passing the JSON job file to datax.py

(6) Verify the data: check that HDFS now contains all rows from the corresponding MySQL table

{
	"job": {
		"content": [
			{
				"reader": {
					"name": "mysqlreader",
					"parameter": {
						"column": ["id","name"],
						"connection": [
							{
								"jdbcUrl": ["jdbc:mysql://xxxxx:3306/dbName"],
								"table": ["test"]
							}
						],
						"password": "twgdhbtzhy",
						"username": "root",
						"splitPk": ""
					}
				},
				"writer": {
					"name": "hdfswriter",
					"parameter": {
						"column": [
							{"name": "id", "type": "bigint"},
							{"name": "name", "type": "string"}
						],
						"compress": "gzip",
						"defaultFS": "hdfs://xxxxx:8020",
						"fieldDelimiter": "\t",
						"fileName": "test",
						"fileType": "text",
						"path": "/test",
						"writeMode": "append"
					}
				}
			}
		],
		"setting": {
			"speed": {
				"channel": 1
			}
		}
	}
}
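When several tables need to be synced, hand-editing a JSON file per table gets error-prone. The job file above can be generated programmatically; the sketch below is an illustration, not part of DataX itself, and the host, database, and credential values are placeholders copied from the example:

```python
import json

def build_mysql2hdfs_job(table, columns, jdbc_url, hdfs_fs, hdfs_path,
                         username, password):
    """Build a DataX mysqlreader -> hdfswriter job config as a dict,
    mirroring the structure of the example job file above."""
    return {
        "job": {
            "content": [{
                "reader": {
                    "name": "mysqlreader",
                    "parameter": {
                        # mysqlreader takes plain column names
                        "column": [c["name"] for c in columns],
                        "connection": [{"jdbcUrl": [jdbc_url],
                                        "table": [table]}],
                        "username": username,
                        "password": password,
                        "splitPk": "",
                    },
                },
                "writer": {
                    "name": "hdfswriter",
                    "parameter": {
                        # hdfswriter needs name + type per column
                        "column": columns,
                        "compress": "gzip",
                        "defaultFS": hdfs_fs,
                        "fieldDelimiter": "\t",
                        "fileName": table,
                        "fileType": "text",
                        "path": hdfs_path,
                        "writeMode": "append",
                    },
                },
            }],
            "setting": {"speed": {"channel": 1}},
        }
    }

if __name__ == "__main__":
    job = build_mysql2hdfs_job(
        table="test",
        columns=[{"name": "id", "type": "bigint"},
                 {"name": "name", "type": "string"}],
        jdbc_url="jdbc:mysql://xxxxx:3306/dbName",  # placeholder host
        hdfs_fs="hdfs://xxxxx:8020",                # placeholder host
        hdfs_path="/test",
        username="root",
        password="******",                          # placeholder credential
    )
    print(json.dumps(job, indent=4))
```

Dumping the dict with json.dumps and writing it to job/mysql2hdfs.json yields a file equivalent to the one above.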

(7) Run the job

hdfs dfs -mkdir /test
python bin/datax.py job/mysql2hdfs.json
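For step (6), a quick sanity check is to compare row counts on both sides. Because the writer above sets compress to gzip, the HDFS files must be decompressed before counting lines. A minimal sketch, assuming the hdfs CLI and zcat are on PATH (the path follows the example job; the helper names are hypothetical):

```python
import subprocess

def count_hdfs_rows(path):
    """Count data rows under an HDFS path written by hdfswriter with
    compress=gzip: stream the files, decompress with zcat, count lines.
    Requires a working `hdfs` CLI; run this on the cluster edge node."""
    out = subprocess.run(
        f"hdfs dfs -cat {path}/* | zcat | wc -l",
        shell=True, capture_output=True, text=True, check=True,
    ).stdout
    return int(out.strip())

def counts_match(mysql_rows, hdfs_rows):
    """Step (6) passes when the MySQL and HDFS row counts agree."""
    return mysql_rows == hdfs_rows
```

count_hdfs_rows("/test") can then be compared against SELECT COUNT(*) FROM test on the MySQL side.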

posted @ 2024-12-17 17:56  HOUHUILIN