Spark PyCharm Debugging

Spark Kafka Debugging

In your local Spark installation, locate the spark-defaults.conf configuration file and append the following line at the end:

spark.jars.packages org.apache.spark:spark-streaming-kafka-0-8_2.10:2.0.1

For example, in spark-streaming-kafka-0-10_2.11-2.3.0.jar, 2.11 is the Scala version and 2.3.0 is the Spark version; these should match the Scala and Spark versions of your installation.

This jar is required for working with Kafka (and related components) locally. Once configured, Spark will download the jar and its dependencies from the Maven repository on startup.
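If you prefer not to edit spark-defaults.conf, the same Maven coordinates can be supplied for a single run via spark-submit's --packages flag (a sketch; your_stream_job.py is a placeholder script name, not from the original post):

```shell
# One-off alternative to editing spark-defaults.conf: fetch the Kafka
# streaming package from Maven for this run only.
# (your_stream_job.py is a placeholder for your own script)
spark-submit --packages org.apache.spark:spark-streaming-kafka-0-8_2.10:2.0.1 your_stream_job.py
```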

import time
import os
import sys

from pyspark import SparkContext
from pyspark.streaming import StreamingContext
from pyspark.streaming.kafka import KafkaUtils
from operator import add

zkQuorum = 'localhost:2181'       # ZooKeeper quorum used by the Kafka receiver
topic = {'ztf': 1}                # topic name -> number of receiver threads
group_id = "test-consumer-group"  # Kafka consumer group


def main(ssc):
    # Minimal word count over Kafka messages; createStream yields
    # (key, value) pairs, where the value is the message body.
    kvs = KafkaUtils.createStream(ssc, zkQuorum, group_id, topic)
    counts = kvs.map(lambda kv: kv[1]) \
        .flatMap(lambda line: line.split(" ")) \
        .map(lambda word: (word, 1)) \
        .reduceByKey(add)
    counts.pprint()


if __name__ == "__main__":
    os.environ['SPARK_HOME'] = r'/opt/spark-2.2.3-bin-hadoop2.7'
    os.environ['JAVA_HOME'] = r'/opt/jdk8'
    # sys.path.append(r"/opt/spark-2.2.3-bin-hadoop2.7")
    sc = SparkContext(master="local[2]", appName="SparkCount")
    ssc = StreamingContext(sc, 2)  # 2-second micro-batch interval
    main(ssc)
    ssc.start()
    ssc.awaitTermination()
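A typical first streaming job counts words arriving from Kafka with a flatMap → map → reduceByKey chain. The core reduction can be sanity-checked without a cluster; below is a plain-Python equivalent using only the standard library (`word_count` is a name introduced here for illustration, not Spark API):

```python
from operator import add
from collections import defaultdict


def word_count(lines):
    """Mimic flatMap(split) -> map((word, 1)) -> reduceByKey(add) on a list of lines."""
    counts = defaultdict(int)
    for line in lines:
        for word in line.split(" "):
            counts[word] = add(counts[word], 1)  # same reducer as the Spark job
    return dict(counts)


print(word_count(["a b a", "b c"]))  # → {'a': 2, 'b': 2, 'c': 1}
```

This is the batch analogue of what the streaming job computes within each micro-batch.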


posted @ 2019-03-18 14:05 by 逐梦客!