Kafka Connector Usage Guide

Basic Concepts

Broker: a single instance in the cluster.

Worker: the process that runs Connectors and Tasks.

  1. Task: the concrete implementation that writes data into Kafka or reads data out of it.

  2. Connector: a high-level abstraction that coordinates data flow by managing Tasks.

    1. Source: imports data into Kafka.

    2. Sink: exports data out of Kafka.

    Both support distributed deployment.

  3. Converter: converts data between the Connector and the external storage system it sends data to or receives data from.

  4. Transform: a lightweight tool for adjusting record values (the Value of a key-value pair).

 

Deployment Environment

Zookeeper

10.110.5.83:2181

Elasticsearch

10.110.5.84:9200

Kafka

10.110.5.81

 

Kafka Connect Sink Elasticsearch

I. Elasticsearch preparation

  1. Create the index by calling the application's interface:

    http://localhost:7002/search/initialization/
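    A minimal sketch with curl (assuming the initialization endpoint accepts a plain GET and needs no authentication; adjust the HTTP method to whatever the service actually expects):

    # create the index via the application's initialization endpoint
    curl http://localhost:7002/search/initialization/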

  2. Kafka topic names and Oracle table names are case-sensitive; uppercase names are recommended.

II. Kafka preparation

  1. Download confluent-5.2.1-2.12.tar, upload it to the server (under /home/), extract it, and enter the directory

    cd /home/confluent-5.2.1
  2. Zookeeper(2181)

    1. If the environment already has a ZooKeeper, connect to it directly and do not start the bundled one; only use the bundled ZooKeeper if none exists (see the check below)

      ./bin/zookeeper-server-start -daemon etc/kafka/zookeeper.properties
    2. Note: start each program in the foreground first; once it starts cleanly without errors, switch to starting it as a daemon (-daemon)
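    To decide between the two options above, a quick reachability check against the environment's ZooKeeper (a sketch using the bundled zookeeper-shell wrapper):

      # list the root znodes; a response means 10.110.5.83:2181 is reachable
      ./bin/zookeeper-shell 10.110.5.83:2181 ls /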

  3. Start the Broker (9092)

    1. Edit the configuration file

      1. Open the configuration file

        vi ./etc/kafka/server.properties
      2. Confirm the ZooKeeper address

        zookeeper.connect=10.110.5.83:2181
      3. Add the following settings

        host.name=10.110.5.81
        listeners=PLAINTEXT://10.110.5.81:9092
        advertised.listeners=PLAINTEXT://10.110.5.81:9092
      4. Save and exit

    2. Start

      ./bin/kafka-server-start -daemon etc/kafka/server.properties
  4. Start the Schema Registry (8081)

    ./bin/schema-registry-start -daemon etc/schema-registry/schema-registry.properties
  5. Create the Kafka test topic "trace" (this step is redundant: once a Producer/Consumer starts, Kafka auto-creates the topic, provided auto.create.topics.enable keeps its default of true)

    ./bin/kafka-topics \
    --topic trace \
    --create \
    --bootstrap-server 10.110.5.81:9092 \
    --replication-factor 1 \
    --partitions 1

III. Sending data

  1. Start the Connector in standalone mode

    1. Edit two configuration files

      1. connect-standalone.properties

        1. Open the file

          vi etc/kafka/connect-standalone.properties
        2. Set the following two options to false

          key.converter.schemas.enable=false
          value.converter.schemas.enable=false
        3. Change the bootstrap address (defaults to the local machine)

          bootstrap.servers=10.110.5.81:9092
        4. Save and close

      2. elasticsearch-sink.properties

        1. Copy the connector properties file

          cp etc/kafka-connect-elasticsearch/quickstart-elasticsearch.properties etc/kafka-connect-elasticsearch/elasticsearch-sink.properties
        2. Open the file

          vi etc/kafka-connect-elasticsearch/elasticsearch-sink.properties
        3. Edit the settings

          topics=trace
          connection.url=http://10.110.5.84:9200
          # Use "info" to stay consistent with HBase:
          type.name=info
          # Ignore the schema to avoid exceptions (possibly unnecessary)
          schema.ignore=true
        4. Save and close

      3. Start the Connector in standalone mode

        ./bin/connect-standalone -daemon etc/kafka/connect-standalone.properties etc/kafka-connect-elasticsearch/elasticsearch-sink.properties
  2. Open a new XShell tab and start a Consumer to watch whether messages arrive (mostly for peace of mind; not strictly necessary)

    ./bin/kafka-console-consumer \
    --topic trace \
    --bootstrap-server 10.110.5.81:9092 \
    --from-beginning \
    --property print.key=true
  3. Open a new XShell tab and start a Producer to produce data

    1. Specify the schema

      ./bin/kafka-avro-console-producer \
      --topic trace \
      --broker-list 10.110.5.81:9092 \
      --property parse.key=true \
      --property key.separator=: \
      --property schema.registry.url=http://10.110.5.81:8081 \
      --property key.schema='{"name":"key","type":"string"}' \
      --property value.schema='{"type":"record","name":"myrecord","fields":[{"name":"id","type":"string"},{"name":"xw","type":"string"},{"name":"ddbh","type":"string"},{"name":"fssj","type":"string"},{"name":"location","type":"string"},{"name":"jd","type":"string"},{"name":"wd","type":"string"},{"name":"zdh","type":"string"},{"name":"sfd","type":"string"},{"name":"mdd","type":"string"},{"name":"zj","type":"string"}]}'
    2. Enter test data in JSON format

      "R210124198701172625_20201208144517":{"id":"210124198701172625","xw":"TL","ddbh":"D1024","fssj":"20201208144517","location":"39.915119,116.403963","jd":"116.403963","wd":"39.915119","zdh":"01","sfd":"JNK","mdd":"WFK","zj":"test"}
  4. Verify the result

    1. The data just produced should appear in the window running the Consumer

    2. Query Elasticsearch: open the following URL in a browser to confirm that the data was inserted

      http://10.110.5.84:9200/trace/_search?pretty
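      The same check from the command line (a sketch; the id value is the one from the test record produced above):

      # query the "trace" index for the document that was just produced
      curl "http://10.110.5.84:9200/trace/_search?q=id:210124198701172625&pretty"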

IV. Miscellaneous

  1. Use jps to check the related background processes

  2. Use the following commands to inspect listening ports and check whether the related programs are running

    netstat -natpl
    netstat -tupln|grep 9092
  3. Delete the ES index

    curl -XDELETE http://localhost:9200/trace
  4. Confluent Control Center(9021)CLI

  5. Kafka Connector REST API(8083)
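    For example, listing the registered connectors (a sketch; assumes a Connect worker is running on 10.110.5.81:8083):

    # returns a JSON array of connector names
    curl http://10.110.5.81:8083/connectors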

  6. Kafka Rest Proxy(8082)

    ./bin/kafka-rest-start -daemon etc/kafka-rest/kafka-rest.properties
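    Once the proxy is up, a quick sanity check is to list topics through it (a sketch; the exact response format depends on the REST Proxy API version):

    # returns the topics visible to the REST Proxy
    curl http://10.110.5.81:8082/topics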
  7. List all topics

    ./bin/kafka-topics \
    --list \
    --zookeeper 10.110.5.83:2181
  8. Delete a topic:

    1. Mark the topic for deletion

      ./bin/kafka-topics \
      --topic trace \
      --delete \
      --zookeeper 10.110.5.83:2181
    2. Edit the configuration file to enable topic deletion

      vi etc/kafka/server.properties
      delete.topic.enable=true
    3. In the configuration file opened above, find the data directory "log.dirs=/tmp/kafka-logs", then go there and remove the topic's data

      cd /tmp/kafka-logs
      rm -rf trace
    4. Stop Kafka

      1. Recommended

        ./bin/kafka-server-stop
      2. Forced

        jps
        kill -9 *****
    5. Restart Kafka

      ./bin/kafka-server-start etc/kafka/server.properties
    6. Confirm the topic has been deleted

      ./bin/kafka-topics \
      --list \
      --zookeeper 10.110.5.83:2181
    7. ZooKeeper may need to be restarted

    8. Later deletions need neither a restart nor manual file removal; just run step 8.1. Referenced from:

      "kafka如何彻底删除topic及数据", "kafka全部数据清空与某一topic数据清空" (Chinese blog posts)

    9. Exceptions

      1. Producer exception: Schema being registered is incompatible with an earlier schema

      2. Fix:

        curl -X PUT -H "Content-Type: application/vnd.schemaregistry.v1+json" \
        --data '{"compatibility": "NONE"}' http://10.110.5.81:8081/config
    10. Describe a topic

      ./bin/kafka-topics \
      --topic trace \
      --describe \
      --bootstrap-server 10.110.5.81:9092
    11. Database tables

      1. When creating a database table, do not wrap column names in double quotes, otherwise Oracle reports: ORA-00904: invalid identifier

 

Kafka Connect Sink Oracle

  1. Preparation

    1. This chapter builds on "Kafka Connect Sink Elasticsearch"

    2. Enter the directory for JDBC dependencies

      cd /home/confluent-5.2.1/share/java/kafka-connect-jdbc
    3. Upload ojdbc6.jar into the directory from the previous step

    4. Return to the confluent directory

      cd /home/confluent-5.2.1
  2. Start

    1. Configure the Connector

      1. Edit the configuration file

        1. Copy the connector properties file

          cp etc/kafka-connect-jdbc/sink-quickstart-sqlite.properties etc/kafka-connect-jdbc/oracle-sink.properties
        2. Open oracle-sink.properties

          vi etc/kafka-connect-jdbc/oracle-sink.properties
        3. Edit the settings

          name=oracle-sink
          connector.class=io.confluent.connect.jdbc.JdbcSinkConnector
          tasks.max=1
          topics=trace
          connection.url=jdbc:oracle:thin:@<host>:1521:IDR
          connection.user=<username>
          connection.password=<password>
          auto.create=true
        4. Save and close

    2. Start the Connector

      1. ./bin/connect-standalone etc/schema-registry/connect-avro-standalone.properties etc/kafka-connect-jdbc/oracle-sink.properties
      2. Because auto.create=true is set in oracle-sink.properties, the Connector automatically creates a table in Oracle with the same name as the topic (which is why uppercase topic names were recommended)

  3. Start the Producer

    ./bin/kafka-avro-console-producer \
    --topic trace \
    --broker-list 10.110.5.81:9092 \
    --property schema.registry.url=http://10.110.5.81:8081 \
    --property value.schema='{"type":"record","name":"myrecord","fields":[{"name":"id","type":"string"},{"name":"xw","type":"string"},{"name":"ddbh","type":"string"},{"name":"fssj","type":"string"},{"name":"location","type":"string"},{"name":"jd","type":"string"},{"name":"wd","type":"string"},{"name":"zdh","type":"string"},{"name":"sfd","type":"string"},{"name":"mdd","type":"string"},{"name":"zj","type":"string"}]}'
  4. Enter data

    {"id":"210124198701172625","xw":"TL","ddbh":"D1024","fssj":"20201208144517","location":"39.915119,116.403963","jd":"116.403963","wd":"39.915119","zdh":"01","sfd":"JNK","mdd":"WFK","zj":"test"}
  5. Verify the result: if the data can be queried in the corresponding database, the sink works

    SELECT * FROM trace;

 

Kafka Connect Sink HBase

  1. Preparation

    1. This chapter builds on "Kafka Connect Sink Elasticsearch"

    2. Enter the Confluent directory

      cd /home/confluent-5.2.1/
    3. HBase support is not included in Confluent's jars; it must be downloaded, extracted, and its folders renamed separately

      1. Download confluentinc-kafka-connect-hbase-1.0.5.zip, upload it to the server, and unzip it

      2. Rename

        1. cd confluentinc-kafka-connect-hbase-1.0.5
        2. Rename etc to etc/kafka-connect-hbase/

        3. Rename lib to share/java/kafka-connect-hbase/

        4. Then merge these two folders into the confluent directory

  2. HBase preparation

    1. Add DNS entries

      1. Open the hosts file

        vim /etc/hosts
      2. Paste in the DNS entries of the big-data environment

    2. Create the table (TM_TRACE) and column family (info), for example:
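      A sketch using the HBase shell (assumes an HBase client is available on one of the cluster nodes):

      # create table TM_TRACE with column family "info"
      echo "create 'TM_TRACE', 'info'" | hbase shell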

    3. Add the HBase configuration file

      1. Upload hbase-site.xml to /home/confluent-5.2.1/

      2. Package the configuration file into the connector jar

        jar -uvf share/java/kafka-connect-hbase/kafka-connect-hbase-1.0.5.jar hbase-site.xml
  3. Start

    1. Start the Connector

      1. Edit the configuration file

        1. Copy the connector properties file

          cp etc/kafka-connect-hbase/hbase-sink-quickstart.properties etc/kafka-connect-hbase/hbase-sink.properties
        2. Open the configuration file

          vi etc/kafka-connect-hbase/hbase-sink.properties
        3. Edit the settings

          name=hbase-sink-connector
          connector.class=io.confluent.connect.hbase.HBaseSinkConnector
          hbase.zookeeper.quorum=master2.bds.inspur,manager.bds.inspur,master1.bds.inspur
          hbase.zookeeper.property.clientPort=2181
          gcp.bigtable.instance.id=HBASE-INSTANCE
          auto.create.tables=true
          auto.create.column.families=true
          # auto.offset.reset=latest
          # Table name; the column family is specified in the data
          table.name.format=TM_TRACE
          error.mode=ignore
          topics=tm_trace
          confluent.topic=tm_trace
          confluent.topic.replication.factor=1
          confluent.topic.bootstrap.servers=10.110.5.81:9092
          insert.mode=UPSERT
          max.batch.size=1000
          # row.key.definition = object type, ID number, delimiter, occurrence time, 3-digit anti-duplication code
          row.key.definition=type,id,delimiter,fssj,unrepeat
          # Mask out the redundant field (it is not redundant for Elasticsearch)
          transforms=DropLocation
          transforms.DropLocation.type=org.apache.kafka.connect.transforms.MaskField$Value
          transforms.DropLocation.fields=location
        4. Save and exit

      2. Start the Connector

        ./bin/connect-standalone etc/schema-registry/connect-avro-standalone.properties etc/kafka-connect-hbase/hbase-sink.properties
      3. Start a console Consumer to help verify

        ./bin/kafka-avro-console-consumer \
        --topic tm_trace \
        --bootstrap-server 10.110.5.81:9092 \
        --from-beginning \
        --property schema.registry.url=http://10.110.5.81:8081 \
        --property print.key=true
    2. Start the Producer

      1. Start

        ./bin/kafka-avro-console-producer \
        --topic tm_trace \
        --broker-list 10.110.5.81:9092 \
        --property parse.key=true \
        --property key.separator=@ \
        --property schema.registry.url=http://10.110.5.81:8081 \
        --property key.schema='{"namespace": "com.inspur.trace.entity.schema","name": "TraceKey","type":"string"}' \
        --property value.schema='{"namespace": "com.inspur.trace.entity.schema","name": "TraceHBaseValue","type":"record","fields":[{"name":"info","type":{"name":"columns","type":"record","fields":[{"name":"type","type":"string"},{"name":"id","type":"string"},{"name":"xw","type":"string"},{"name":"ddbh","type":"string"},{"name":"fssj","type":"string"},{"name":"location","type":"string"},{"name":"jd","type":"string"},{"name":"wd","type":"string"},{"name":"zdh","type":"string"},{"name":"sfd","type":"string"},{"name":"mdd","type":"string"},{"name":"zj","type":"string"}]}}]}'
        1. Note:

          1. The hbase-connector works with structured data (org.apache.kafka.connect.data.Struct), so a Producer that can send structured data is required, i.e. the Avro console producer.

          2. Also, the column family "info" must be specified on the Producer side (in line with HBase's design), so an Avro schema is needed to parse the structured data.

          3. Moreover, to use the composite (concatenated) row-key feature, a schema must also be configured for the key and a structured row key must be sent.

          4. Hence the command above.

      2. Notes on the parameters in the Producer start command

        # Enable key parsing
        parse.key=true
        # Use "@" as the separator between key and value
        key.separator=@
        # Schema Registry address
        schema.registry.url=……
        # Schema used to parse the key
        key.schema=……
        # Schema used to parse the value
        value.schema=……
      3. About the row key

        1. Reading the RowKeyExtractor source shows that HBaseSinkConnector accepts row keys in two ways:

          1. The received key is structured data, and HBaseSinkConnector concatenates its fields into the rowkey according to the definition (the field names must be listed in hbase-sink.properties, and the Producer must enable and send a structured key);

          2. No row-key fields are specified, and the record key is used directly as the rowkey.

        2. Any other combination is not handled and raises an exception.

        3. The structured composite key looks fancy but only adds complexity; I see no real need for it. It is used here purely as a technical proof of concept.

        4. For real applications, using the record key directly as the rowkey is recommended, i.e. comment out "row.key.definition".

    3. Enter test data

      "R210124198701172625_20201210104500est"@{"info":{"type":"R","id":"210124198701172625","xw":"TL","ddbh":"D1024","fssj":"20201210104502","location":"39.915119,116.403963","jd":"116.403963","wd":"39.915119","zdh":"01","sfd":"JNK","mdd":"WFK","zj":"test"}}
    4. Verify the result: check directly whether TM_TRACE in HBase contains the row just inserted, for example:
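      A sketch from the HBase shell (assumes an HBase client on a cluster node):

      # show a few rows of TM_TRACE to confirm the sink wrote data
      echo "scan 'TM_TRACE', {LIMIT => 5}" | hbase shell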

 

Kafka Connect with Multiple Sink Targets

  1. Overview

    1. The Elasticsearch and HBase table structures are incompatible:

      1. Without modifying the code in the original jar, the HBase data has two levels, the outer one being the column family, e.g. data for column family "info" looks like {"info":{"xw":"LD","type":"R",……}}, whereas Elasticsearch data has only one level; if the es-sink extracted data into the index using the hbase-sink's two-level structure, it would fail.

      2. The trajectory application relies on Elasticsearch's geo search, which requires an extra geo_point field for latitude/longitude queries (call it "location" for now), but HBase does not need this field; there it would just be redundant.

    2. KSQL can solve both problems.

  2. KSQL

    1. Overview

      1. KSQL is a tool for working with data streams in Kafka.

      2. Concretely, KSQL lets you manipulate Kafka data streams by writing SQL: it can parse structured data, output selected fields of that data directly, and add new fields.

    2. Start

      1. Server

        1. Open the configuration file

          vim etc/ksqldb/ksql-server.properties
        2. Edit the configuration file

          listeners=http://10.110.5.81:8088
          ksql.logging.processing.topic.auto.create=true
          ksql.logging.processing.stream.auto.create=true
          bootstrap.servers=10.110.5.81:9092
          ksql.schema.registry.url=http://10.110.5.81:8081
        3. Start the server

          ./bin/ksql-server-start-6.0.1 etc/ksqldb/ksql-server.properties
      2. Client

        1. After the server is up, start the client and add a series of definitions.

        2. Start

          ./bin/ksql http://10.110.5.81:8088
        3. Create a new stream to pull the raw data from "tm_trace"

          1. create stream trace_hbase (info struct<type varchar, id varchar, xw varchar, ddbh varchar, fssj varchar, location varchar, jd varchar, wd varchar, zdh varchar, sfd varchar, mdd varchar, zj varchar>) with (kafka_topic='tm_trace', value_format='avro');
          2. Note: identifiers are case-insensitive unless wrapped in double quotes, similar to Oracle

        4. Parse "tm_trace", strip the outer wrapper, add the geo field "location", and write the resulting stream to the topic that feeds Elasticsearch.

          1. create stream t1 with (kafka_topic='trace') as select info->type "type", info->id "id", info->xw "xw", info->ddbh "ddbh", info->fssj "fssj", info->jd "jd", info->wd "wd", struct("lat" := info->wd, "lon" := info->jd) AS "location", info->zdh "zdh", info->sfd "sfd", info->mdd "mdd", info->zj "zj" from trace_hbase;
          2. Note: the topic has changed; update the topic setting in ./etc/kafka-connect-hbase/hbase-sink.properties accordingly

        5. Modifying the SQL

          Unfortunately, KSQL does not support altering an existing stream; drop the old one and recreate it:

          1. Check whether any queries are running

            show queries;
          2. If so, terminate them

            terminate QueryId;   (the QueryId is case-sensitive)
          3. List the stream names

            show streams;
          4. Drop the old streams

            drop stream trace_hbase;
            drop stream trace_elasticsearch;
          5. Other useful SQL for reference

            1. set 'auto.offset.reset' = 'earliest';
              select "jd", "wd" from t1 EMIT CHANGES;
      3. Start the Connectors

        ./bin/connect-standalone etc/schema-registry/connect-avro-standalone.properties etc/kafka-connect-elasticsearch/elasticsearch-sink.properties etc/kafka-connect-hbase/hbase-sink.properties

 

Cluster Mode

  1. Perform the following on every machine in the cluster

    1. Some IPs must be the local machine's IP, e.g. the one marked ①

    2. "localhost" and "0.0.0.0" are not resolved correctly; use the actual IP, e.g. 10.110.5.81

  2. Kafka Broker

    1. Open the file

      vi etc/kafka/server.properties
    2. Modify (or add) the following settings

      broker.id=2
      host.name=10.110.5.81①
      listeners=PLAINTEXT://10.110.5.81:9092
      advertised.listeners=PLAINTEXT://10.110.5.81:9092
      zookeeper.connect=10.110.5.81:2181,10.110.5.82:2181,10.110.5.83:2181
      delete.topic.enable=true
    3. Save and exit

    4. Start the Broker

      ./bin/kafka-server-start etc/kafka/server.properties
    5. Afterwards

      1. Stop the Broker with Ctrl+C

      2. Start the Broker as a background process

        ./bin/kafka-server-start -daemon etc/kafka/server.properties
      3. Confirm the Broker is running

        1. Run jps; if a process named SupportedKafka (the Broker) appears, the start succeeded

  3. Schema Registry

    1. Open the file

      vi etc/schema-registry/schema-registry.properties
    2. Edit the settings

      listeners=http://10.110.5.81:8081
      kafkastore.connection.url=10.110.5.81:2181,10.110.5.82:2181,10.110.5.83:2181
    3. Save and exit

    4. Start

      ./bin/schema-registry-start ./etc/schema-registry/schema-registry.properties
    5. Afterwards

      1. Similar to step 5 under Kafka Broker

  4. Kafka REST Proxy

    1. Open the file

      vi etc/kafka-rest/kafka-rest.properties
    2. Edit the settings

      schema.registry.url=http://10.110.5.81:8081
      zookeeper.connect=10.110.5.81:2181,10.110.5.82:2181,10.110.5.83:2181
    3. Save and exit

    4. Start

      ./bin/kafka-rest-start etc/kafka-rest/kafka-rest.properties
    5. Afterwards

      1. Similar to step 5 under Kafka Broker

      2. Unlike step 5 under Kafka Broker, it must be backgrounded like this

        nohup ./bin/kafka-rest-start etc/kafka-rest/kafka-rest.properties > /dev/null 2>&1 &
  5. Create the topic (3 partitions, 2 replicas)

    ./bin/kafka-topics \
    --topic trace \
    --create \
    --zookeeper 10.110.5.81:2181,10.110.5.82:2181,10.110.5.83:2181 \
    --partitions 3 \
    --replication-factor 2
  6. Kafka Connect

    1. Open the file

      vi etc/schema-registry/connect-avro-distributed.properties
    2. Edit the settings

      bootstrap.servers=10.110.5.81:9092,10.110.5.82:9092,10.110.5.83:9092
      group.id=acp_cluster
    3. Save and exit

    4. Start

      ./bin/connect-distributed etc/schema-registry/connect-avro-distributed.properties
    5. Afterwards

      1. Similar to step 5 under Kafka Broker

  7. Create Connectors

    1. Start Connectors via the Kafka Connect REST API (8083); the parameters mirror the etc/kafka-connect-***/***.properties files, for example:

      curl 'http://10.110.5.81:8083/connectors' -X POST -i -H "Content-Type:application/json" -d '
      {
      "name":"elasticsearch-sink",
      "config":{
      "connector.class":"io.confluent.connect.elasticsearch.ElasticsearchSinkConnector",
      "tasks.max":10,
      "topics":"ACP_KAFKA",
      "key.ignore":true,
      "schema.ignore":true,
      "connection.url":"http://10.110.5.81:9200",
      "type.name":"kafka-connect",
      "schema.registry.url":"http://10.110.5.81:8081",
      "key.converter":"io.confluent.connect.avro.AvroConverter",
      "key.converter.schema.registry.url":"http://10.110.5.81:8081",
      "value.converter":"io.confluent.connect.avro.AvroConverter",
      "value.converter.schema.registry.url":"http://10.110.5.81:8081",
      "topic.index.map":"ACP_KAFKA:acp_kafka_message"
      }
      }
      '
      curl 'http://10.110.5.81:8083/connectors' -X POST -i -H "Content-Type:application/json" -d '
      {
      "name":"oracle-sink",
      "config":{
      "connector.class":"io.confluent.connect.jdbc.JdbcSinkConnector",
      "tasks.max":"1",
      "topics":"ACP_KAFKA",
      "connection.url":"jdbc:oracle:thin:@10.110.1.132:1521:IDR",
      "connection.user":"ppw",
      "connection.password":"ppw",
      "auto.create":"true",
      "schema.registry.url":"http://10.110.5.81:8081",
      "key.converter":"io.confluent.connect.avro.AvroConverter",
      "key.converter.schema.registry.url":"http://10.110.5.81:8081",
      "value.converter":"io.confluent.connect.avro.AvroConverter",
      "value.converter.schema.registry.url":"http://10.110.5.81:8081",
      "table.name.format":"ACP_KAFKA_MESSAGE",
      "transforms":"Rename",
      "transforms.Rename.type":"org.apache.kafka.connect.transforms.ReplaceField$Value",
      "transforms.Rename.renames":"id:ID,type:TYPE,msg:MESSAGE,time:TIME"
      }
      }
      '
      curl 10.110.5.81:8083/connectors/oracle-sink/config -X PUT -i -H "Content-Type:application/json" -d '{}'
    2. The following approach does not work

      ./bin/confluent load elasticsearch-sink -d etc/kafka-connect-elasticsearch/elasticsearch-sink.properties
  8. Additional notes on the Kafka Connect REST API

    Method | Path | Description
    GET | /connectors | Returns the list of active connectors
    POST | /connectors | Creates a new connector; the request body should be a JSON object with a string "name" field and a "config" object containing the connector's configuration parameters
    GET | /connectors/{name} | Gets information about a specific connector
    GET | /connectors/{name}/config | Gets the configuration parameters of a specific connector
    PUT | /connectors/{name}/config | Updates the configuration parameters of a specific connector
    GET | /connectors/{name}/status | Gets the connector's current status, including whether it is running, failed, or paused, which worker it is assigned to, the error message on failure, and the state of all its tasks
    GET | /connectors/{name}/tasks | Gets the list of tasks currently running for the connector
    GET | /connectors/{name}/tasks/{taskid}/status | Gets a task's current status, including whether it is running, failed, or paused, which worker it is assigned to, and the error message on failure
    PUT | /connectors/{name}/pause | Pauses the connector and its tasks; message processing stops until the connector is resumed
    PUT | /connectors/{name}/resume | Resumes a paused connector (does nothing if the connector is not paused)
    POST | /connectors/{name}/restart | Restarts the connector (typically after a failure)
    POST | /connectors/{name}/tasks/{taskId}/restart | Restarts an individual task (typically after a failure)
    DELETE | /connectors/{name} | Deletes the connector, stopping all tasks and removing its configuration
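    For example, checking and pausing the oracle-sink connector created above, using the endpoints listed here (a sketch; assumes the worker runs on 10.110.5.81:8083):

      # current state of the connector and its tasks
      curl http://10.110.5.81:8083/connectors/oracle-sink/status
      # pause message processing until a resume call
      curl -X PUT http://10.110.5.81:8083/connectors/oracle-sink/pause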

 

Other

curl -i -X POST -H "Content-Type: application/vnd.kafka.json.v1+json" \
--data '{"records": [{"key": "somekey","value": {"field": "bar"}},{"value": [ "field", "bar" ],"partition": 0}]}' localhost:8082/topics/trace

 

AVRO

./bin/connect-standalone -daemon  etc/schema-registry/connect-avro-standalone.properties etc/kafka-connect-elasticsearch/elasticsearch-sink.properties
./bin/kafka-avro-console-producer \
--broker-list localhost:9092 \
--topic test \
--property value.schema='{"type":"record","name":"myrecord","fields":[{"name":"f1","type":"string"}]}'
{"f1": "value1"}
./bin/kafka-avro-console-consumer \
--topic trace \
--bootstrap-server 10.110.5.81:9092 \
--from-beginning \
--property schema.registry.url=http://10.110.5.81:8081
./bin/kafka-console-producer \
--broker-list localhost:9092 \
--topic acp-test \
--property value.schema='{"type":"record","name":"myrecord","fields":[{"name":"f1","type":"string"}]}'

 

Schema Registry

# Register a new version of a schema under the subject "trace-key"

$ curl -X POST -H "Content-Type: application/vnd.schemaregistry.v1+json" \
--data '{"schema": "{\"type\": \"string\"}"}' \
http://10.110.5.81:8081/subjects/trace-key/versions
{"id":1}

 

# Register a new version of a schema under the subject "trace-value"

$ curl -X POST -H "Content-Type: application/vnd.schemaregistry.v1+json" \
--data '{"schema": "{\"type\": \"string\"}"}' \
http://10.110.5.81:8081/subjects/trace-value/versions
{"id":1}

 

# List all subjects

curl -X GET http://10.110.5.81:8081/subjects
["trace-value","trace-key"]

 

# List all schema versions registered under the subject "trace-value"

curl -X GET http://10.110.5.81:8081/subjects/trace-value/versions
[1]

 

# Fetch a schema by globally unique id 1

curl -X GET http://10.110.5.81:8081/schemas/ids/1
{"schema":"\"string\""}

 

# Fetch version 1 of the schema registered under subject "trace-value"

curl -X GET http://10.110.5.81:8081/subjects/trace-value/versions/1
{"subject":"trace-value","version":1,"id":1,"schema":"\"string\""}

 

# Fetch the most recently registered schema under subject "trace-value"

curl -X GET http://10.110.5.81:8081/subjects/trace-value/versions/latest
{"subject":"trace-value","version":1,"id":1,"schema":"\"string\""}

 

# Delete version 3 of the schema registered under subject "trace-value"

curl -X DELETE http://10.110.5.81:8081/subjects/trace-value/versions/3
3

 

# Delete all versions of the schema registered under subject "trace-value"

curl -X DELETE http://10.110.5.81:8081/subjects/trace-value
[1, 2, 3, 4, 5]

 

# Check whether a schema has been registered under subject "trace-key"

curl -X POST -H "Content-Type: application/vnd.schemaregistry.v1+json" \
--data '{"schema": "{\"type\": \"string\"}"}' \
http://10.110.5.81:8081/subjects/trace-key
{"subject":"trace-key","version":1,"id":1,"schema":"\"string\""}

 

# Test compatibility of a schema with the latest schema under subject "trace-value"

curl -X POST -H "Content-Type: application/vnd.schemaregistry.v1+json" \
--data '{"schema": "{\"type\": \"string\"}"}' \
http://10.110.5.81:8081/compatibility/subjects/trace-value/versions/latest
{"is_compatible":true}

 

# Get top level config

curl -X GET http://10.110.5.81:8081/config
{"compatibilityLevel":"BACKWARD"}

 

# Update compatibility requirements globally

curl -X PUT -H "Content-Type: application/vnd.schemaregistry.v1+json" \
--data '{"compatibility": "NONE"}' \
http://10.110.5.81:8081/config
{"compatibility":"NONE"}

 

# Update compatibility requirements under the subject "trace-value"

curl -X PUT -H "Content-Type: application/vnd.schemaregistry.v1+json" \
--data '{"compatibility": "BACKWARD"}' \
http://10.110.5.81:8081/config/trace-value
{"compatibility":"BACKWARD"}

 
