Debugging Spark in PyCharm
Debugging Spark with Kafka
In your local Spark installation, locate the spark-defaults.conf configuration file and append the following line at the end:
spark.jars.packages org.apache.spark:spark-streaming-kafka-0-8_2.10:2.0.1
For example, in the jar name spark-streaming-kafka-0-10_2.11-2.3.0.jar, "2.11" is the Scala version and "2.3.0" is the Spark version.
This jar is required for working with Kafka and related components locally. Once the configuration is in place, Spark will fetch the relevant jars from the Maven repository.
import time
import os
import sys
from operator import add

from pyspark import SparkContext
from pyspark.streaming import StreamingContext
from pyspark.streaming.kafka import KafkaUtils

# Kafka / ZooKeeper connection settings
zkQuorum = 'localhost:2181'
topic = {'ztf': 1}            # topic name -> number of consumer threads
group_id = "test-consumer-group"


def main(ssc):
    # Stub: the streaming logic (e.g. a Kafka word count) goes here.
    # Note: ssc.start() will fail until at least one output operation
    # (such as pprint()) is registered on a DStream.
    pass


if __name__ == "__main__":
    # Point PyCharm's Python process at the local Spark and JDK installs
    os.environ['SPARK_HOME'] = r'/opt/spark-2.2.3-bin-hadoop2.7'
    os.environ['JAVA_HOME'] = r'/opt/jdk8'
    # sys.path.append(r"/opt/spark-2.2.3-bin-hadoop2.7")

    sc = SparkContext(master="local[2]", appName="SparkCount")
    ssc = StreamingContext(sc, 2)  # 2-second batch interval
    main(ssc)
    ssc.start()
    ssc.awaitTermination()
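To make the skeleton above actually do something, `main(ssc)` needs to build a DStream and register an output operation. Below is a minimal sketch of a Kafka word count using `KafkaUtils.createStream` from the spark-streaming-kafka-0-8 package mentioned earlier; the helper name `split_words` and the connection values (reused from the snippet above) are illustrative, not part of the original post.

```python
def split_words(line):
    """Pure helper: split one Kafka message into words (testable without Spark)."""
    return line.strip().split()


def main(ssc):
    # Imported here so the pure helper above works even without pyspark installed.
    from pyspark.streaming.kafka import KafkaUtils

    # Assumed connection settings, taken from the skeleton above.
    zkQuorum = 'localhost:2181'
    topic = {'ztf': 1}
    group_id = "test-consumer-group"

    # createStream yields (key, message) pairs; keep only the message text.
    stream = KafkaUtils.createStream(ssc, zkQuorum, group_id, topic)
    counts = (stream.map(lambda kv: kv[1])
                    .flatMap(split_words)
                    .map(lambda w: (w, 1))
                    .reduceByKey(lambda a, b: a + b))

    # pprint() is the output operation that lets ssc.start() proceed.
    counts.pprint()
```

With this `main`, each 2-second batch prints the word counts of the messages received on the `ztf` topic during that interval.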
