Loading tens of millions of rows with Logstash

References:

https://blog.csdn.net/qq_16272049/article/details/80642894

https://yq.aliyun.com/articles/152043?t=t1

https://elasticsearch.cn/question/4015

https://discuss.elastic.co/t/how-should-i-use-sql-last-value-in-logstash/64595

https://segmentfault.com/a/1190000011784259

Change to the config directory
cd /root/software/logstash-5.5.2/config

Create the pipeline config file
vi sync_tabperson.conf

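input {
    # stdin is not required for the sync itself
    stdin { }
    jdbc {
        # MySQL JDBC driver jar and driver class
        jdbc_driver_library => "/root/mysql-connector-java-5.1.46.jar"
        jdbc_driver_class => "com.mysql.jdbc.Driver"
        jdbc_connection_string => "jdbc:mysql://127.0.0.1:3306/test"
        jdbc_user => "root"
        jdbc_password => "123456"
        jdbc_validate_connection => true
        jdbc_paging_enabled => true
        jdbc_page_size => 1000
        # the SQL statement is read from this file
        statement_filepath => "/root/software/logstash-5.5.2/config/jdbc.sql"
        # cron syntax: run once every minute
        schedule => "* * * * *"
        # track case_id (instead of the run timestamp) as :sql_last_value
        use_column_value => true
        tracking_column => "case_id"
        tracking_column_type => "numeric"
        # clean_run => true discards the saved :sql_last_value when Logstash starts,
        # so every restart reloads from the beginning
        clean_run => true
        last_run_metadata_path => "/root/software/logstash-5.5.2/config/my_info"
    }
}

output {
    elasticsearch {
        # replace public-ip with the public IP of the Elasticsearch host
        hosts => "public-ip:9200"
        index => "case_all"
        document_type => "mytype"
        # case_id is the document id, so reloaded rows overwrite instead of duplicating
        document_id => "%{case_id}"
    }
}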

vi /root/software/logstash-5.5.2/config/jdbc.sql

Note: this file must be UTF-8 encoded, otherwise Logstash fails with an error. For details, see the fix for the Ruby error in `split': invalid byte sequence in UTF-8 (ArgumentError):

https://blog.csdn.net/lmmzsn/article/details/78839219
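
The encoding requirement above can be checked before starting Logstash. The following is a minimal Python sketch, not part of the original setup: the helper names `is_valid_utf8` and `convert_to_utf8` are hypothetical, and GBK as the source encoding is only an assumption (it is a common default in Chinese-locale Windows editors); substitute whatever encoding the file was actually saved in.

```python
from pathlib import Path


def is_valid_utf8(path):
    """Return True if the file's bytes decode cleanly as UTF-8."""
    try:
        Path(path).read_bytes().decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


def convert_to_utf8(path, source_encoding="gbk"):
    """Rewrite the file in place as UTF-8.

    source_encoding is an assumption about how the file was saved;
    decoding raises UnicodeDecodeError if the guess is wrong.
    """
    text = Path(path).read_bytes().decode(source_encoding)
    Path(path).write_bytes(text.encode("utf-8"))
```

For example, if `is_valid_utf8("/root/software/logstash-5.5.2/config/jdbc.sql")` returns False, running `convert_to_utf8` on the same path (with the right source encoding) should let Logstash read the statement.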

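(table_name is a placeholder for the real table. Rows are ordered by case_id so that the last row read carries the highest id.)

SELECT
    *
FROM table_name cu
WHERE cu.case_id > :sql_last_value AND cu.case_id < (:sql_last_value + 1000)
GROUP BY
    cu.case_id
ORDER BY
    cu.case_id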
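
To see how the `:sql_last_value` window in the statement above behaves across scheduled runs, here is a small simulation using Python's built-in sqlite3 module. It is only a sketch under assumptions: the table `cases` and the functions `run_once` and `simulate` are made up for illustration, and it assumes that each scheduled run executes the statement once and then stores the `case_id` of the last row it read as the new `:sql_last_value`.

```python
import sqlite3

WINDOW = 1000  # same width as the "+ 1000" in jdbc.sql


def run_once(conn, last_value):
    """Simulate one scheduled run; return (rows_fetched, new_last_value)."""
    rows = conn.execute(
        "SELECT case_id FROM cases "
        "WHERE case_id > ? AND case_id < (? + ?) "
        "ORDER BY case_id",
        (last_value, last_value, WINDOW),
    ).fetchall()
    if rows:
        # assumed behaviour: the last row read becomes the new tracked value
        last_value = rows[-1][0]
    return len(rows), last_value


def simulate(ids, max_runs=100):
    """Load the given ids window by window; return (rows_loaded, final_last_value)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE cases (case_id INTEGER PRIMARY KEY)")
    conn.executemany("INSERT INTO cases VALUES (?)", [(i,) for i in ids])
    last_value, loaded = 0, 0
    for _ in range(max_runs):
        fetched, last_value = run_once(conn, last_value)
        if fetched == 0:
            break  # empty window: the tracked value no longer advances
        loaded += fetched
    conn.close()
    return loaded, last_value
```

With contiguous ids 1 to 2500, `simulate(range(1, 2501))` needs three runs and loads all 2500 rows. If the ids jump by 1000 or more (say 1 to 500, then 5000 onward), the window comes back empty, the tracked value stops advancing, and the later rows are never reached. Under the same assumption, a one-minute schedule moves at most 999 ids per run, so ten million contiguous ids would take about 10,011 runs, roughly a week; widening the window is the obvious knob if that is too slow.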

Start the import:


cd /root/software/logstash-5.5.2
bin/logstash -f config/sync_tabperson.conf
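
Once Logstash is running, one way to watch progress is to ask Elasticsearch how many documents the index holds through its `_count` endpoint. A minimal Python sketch follows; the function names are hypothetical, and the host is assumed to be the same host:port used in the output section of the config.

```python
import json
from urllib.request import urlopen


def count_url(host, index):
    """Build the _count URL for an index, e.g. host='127.0.0.1:9200'."""
    return f"http://{host}/{index}/_count"


def parse_count(body):
    """Extract the document count from a _count response body (str or bytes)."""
    return json.loads(body)["count"]


def fetch_count(host, index, timeout=5):
    """Query Elasticsearch and return the number of documents in the index."""
    with urlopen(count_url(host, index), timeout=timeout) as resp:
        return parse_count(resp.read())
```

Calling `fetch_count("127.0.0.1:9200", "case_all")` every minute or so and comparing the result with the row count in MySQL gives a rough idea of how far the sync has progressed.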

posted on 2018-12-14 17:44  ziyi_ang