Spark Shell Programming
Starting Spark
[root@centos02 centos02]# cd $SPARK_HOME/sbin
[root@centos02 sbin]# ./start-all.sh
starting org.apache.spark.deploy.master.Master, logging to /opt/bigdata/spark/spark-2.3.3/logs/spark-centos02-org.apache.spark.deploy.master.Master-1-centos02.out
failed to launch: nice -n 0 /opt/bigdata/spark/spark-2.3.3/bin/spark-class org.apache.spark.deploy.master.Master --host centos02 --port 7077 --webui-port 8080
full log in /opt/bigdata/spark/spark-2.3.3/logs/spark-centos02-org.apache.spark.deploy.master.Master-1-centos02.out
localhost: starting org.apache.spark.deploy.worker.Worker, logging to /opt/bigdata/spark/spark-2.3.3/logs/spark-root-org.apache.spark.deploy.worker.Worker-1-centos02.out
[root@centos02 sbin]#
(The "failed to launch" line for the Master usually means a Master process is already running; check the log file it points to for details.)
Entering spark-shell
[root@centos02 sbin]# cd $SPARK_HOME/bin
[root@centos02 bin]# spark-shell
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/opt/bigdata/spark/spark-2.3.3/jars/slf4j-log4j12-1.7.16.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/opt/bigdata/hadoop/hadoop-2.8.5/share/hadoop/common/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/opt/bigdata/hive/hive-2.3.4/lib/log4j-slf4j-impl-2.6.2.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
19/09/10 18:27:13 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Setting default log level to "WARN".
To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).
Spark context Web UI available at http://centos02:4040
Spark context available as 'sc' (master = local[*], app id = local-1568111388307).
Spark session available as 'spark'.
Welcome to
      ____              __
     / __/__  ___ _____/ /__
    _\ \/ _ \/ _ `/ __/ '_/
   /___/ .__/\_,_/_/ /_/\_\   version 2.3.3
      /_/

Using Scala version 2.11.8 (Java HotSpot(TM) 64-Bit Server VM, Java 1.8.0_181)
Type in expressions to have them evaluated.
Type :help for more information.

scala>
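As the startup banner notes, the shell defaults to the "WARN" log level and you can adjust it with sc.setLogLevel. A quick way to quiet the warnings before continuing:

```scala
// Run inside spark-shell; `sc` is the SparkContext created at startup.
// Raises the log threshold so only errors are printed to the console.
sc.setLogLevel("ERROR")
```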
Some exceptions encountered along the way
Caused by: java.lang.RuntimeException: java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient
Caused by: java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient
Caused by: java.lang.reflect.InvocationTargetException: javax.jdo.JDOFatalInternalException: Error creating transactional connection factory
Caused by: javax.jdo.JDOFatalInternalException: Error creating transactional connection factory
Caused by: java.lang.reflect.InvocationTargetException: java.lang.NoClassDefFoundError: Could not initialize class org.apache.derby.jdbc.EmbeddedDriver
Caused by: java.lang.NoClassDefFoundError: Could not initialize class org.apache.derby.jdbc.EmbeddedDriver
Running Spark reported the error "Could not initialize class org.apache.derby.jdbc.EmbeddedDriver".
This happens because Spark cannot find Hive's configuration file hive-site.xml, so it falls back to the embedded Derby database for the metastore.
Copying the configuration file into Spark's conf directory resolves it.
# Copy hive/conf/hive-site.xml to the spark/conf directory
cp $HIVE_HOME/conf/hive-site.xml $SPARK_HOME/conf
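After copying the file, you can sanity-check from a fresh spark-shell that the Hive metastore is now reachable (assuming your hive-site.xml points at a working metastore):

```scala
// Run inside spark-shell; `spark` is the SparkSession created at startup.
// This should list the databases registered in the Hive metastore
// instead of failing with the Derby EmbeddedDriver error above.
spark.sql("show databases").show()
```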
Programming in the shell environment
scala> val sqlContext = new org.apache.spark.sql.SQLContext(sc)
warning: there was one deprecation warning; re-run with -deprecation for details
sqlContext: org.apache.spark.sql.SQLContext = org.apache.spark.sql.SQLContext@4e2824b1

scala> val df = sqlContext.read.json("file:///opt/Files/json/line.json");
19/09/10 19:01:51 WARN HiveConf: HiveConf of name hive.home does not exist
df: org.apache.spark.sql.DataFrame = [FAddDate: string, FAddTime: string ... 9 more fields]
scala> df.printSchema();
root
 |-- FAddDate: string (nullable = true)
 |-- FAddTime: string (nullable = true)
 |-- FAddTimeDate: string (nullable = true)
 |-- FCompanyID: long (nullable = true)
 |-- FID: long (nullable = true)
 |-- FIP: string (nullable = true)
 |-- FParentID: long (nullable = true)
 |-- FStatus: long (nullable = true)
 |-- FThreadID: long (nullable = true)
 |-- FUserName: string (nullable = true)
 |-- Fguid: string (nullable = true)
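Note that the deprecation warning above is because SQLContext is superseded in Spark 2.x: the SparkSession exposed in the shell as `spark` covers the same functionality. The same load can be written as:

```scala
// Equivalent modern form using the SparkSession (`spark`) directly,
// instead of constructing a deprecated SQLContext.
val df = spark.read.json("file:///opt/Files/json/line.json")
df.printSchema()
```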
scala> df.show();
+----------+--------+-------------------+----------+-----+--------------+---------+-------+---------+---------+--------------------+
| FAddDate|FAddTime| FAddTimeDate|FCompanyID| FID| FIP|FParentID|FStatus|FThreadID|FUserName| Fguid|
+----------+--------+-------------------+----------+-----+--------------+---------+-------+---------+---------+--------------------+
|2014-01-14|00:00:00|2014-01-14 00:00:00| 3067| 3067| 127.0.0.1| 0| 1| 44| admin|a730eb2b-d2a1-11e...|
|2014-09-01|10:15:34|2014-09-01 10:15:34| 36052|36052| 127.0.0.1| 0| 1| 2| f004|a730f0c3-d2a1-11e...|
|2014-09-01|19:06:18|2014-09-01 19:06:18| 3067|36198| 127.0.0.1| 3067| 1| 77| admin9|a730f213-d2a1-11e...|
|2014-09-01|20:01:46|2014-09-01 20:01:46| 36052|36215|210.79.115.102| 36052| 0| 82| clqtest|a730f31d-d2a1-11e...|
|2014-09-02|09:23:15|2014-09-02 09:23:15| 36052|36414| 127.0.0.1| 36052| 1| 78| csh00a|a730f40e-d2a1-11e...|
|2014-09-02|11:22:42|2014-09-02 11:22:42| 36052|36415| 127.0.0.1| 36052| 1| 46| A001|a730f4c9-d2a1-11e...|
|2014-09-02|13:40:40|2014-09-02 13:40:40| 36052|36463| 127.0.0.1| 36052| 1| 95| xq001|a730f6ce-d2a1-11e...|
|2014-09-02|17:33:45|2014-09-02 17:33:45| 3067|36580| 127.0.0.1| 3067| 1| 39| admin1|a730f77c-d2a1-11e...|
|2014-09-03|22:13:12|2014-09-03 22:13:12| 36052|36779| 127.0.0.1| 36052| 1| 10| ftest002|a730f812-d2a1-11e...|
|2014-10-27|17:59:32|2014-10-27 17:59:32| 3067|37402| 210.79.79.33| 3067| 1| 33| fhj1|a730f8a9-d2a1-11e...|
+----------+--------+-------------------+----------+-----+--------------+---------+-------+---------+---------+--------------------+
scala> df.createOrReplaceTempView("tuser")

scala> sqlContext.sql("select * from tuser").show()
+----------+--------+-------------------+----------+-----+--------------+---------+-------+---------+---------+--------------------+
|  FAddDate|FAddTime|       FAddTimeDate|FCompanyID|  FID|           FIP|FParentID|FStatus|FThreadID|FUserName|               Fguid|
+----------+--------+-------------------+----------+-----+--------------+---------+-------+---------+---------+--------------------+
|2014-01-14|00:00:00|2014-01-14 00:00:00|      3067| 3067|     127.0.0.1|        0|      1|       44|    admin|a730eb2b-d2a1-11e...|
|2014-09-01|10:15:34|2014-09-01 10:15:34|     36052|36052|     127.0.0.1|        0|      1|        2|     f004|a730f0c3-d2a1-11e...|
|2014-09-01|19:06:18|2014-09-01 19:06:18|      3067|36198|     127.0.0.1|     3067|      1|       77|   admin9|a730f213-d2a1-11e...|
|2014-09-01|20:01:46|2014-09-01 20:01:46|     36052|36215|210.79.115.102|    36052|      0|       82|  clqtest|a730f31d-d2a1-11e...|
|2014-09-02|09:23:15|2014-09-02 09:23:15|     36052|36414|     127.0.0.1|    36052|      1|       78|   csh00a|a730f40e-d2a1-11e...|
|2014-09-02|11:22:42|2014-09-02 11:22:42|     36052|36415|     127.0.0.1|    36052|      1|       46|     A001|a730f4c9-d2a1-11e...|
|2014-09-02|13:40:40|2014-09-02 13:40:40|     36052|36463|     127.0.0.1|    36052|      1|       95|    xq001|a730f6ce-d2a1-11e...|
|2014-09-02|17:33:45|2014-09-02 17:33:45|      3067|36580|     127.0.0.1|     3067|      1|       39|   admin1|a730f77c-d2a1-11e...|
|2014-09-03|22:13:12|2014-09-03 22:13:12|     36052|36779|     127.0.0.1|    36052|      1|       10| ftest002|a730f812-d2a1-11e...|
|2014-10-27|17:59:32|2014-10-27 17:59:32|      3067|37402|  210.79.79.33|     3067|      1|       33|     fhj1|a730f8a9-d2a1-11e...|
+----------+--------+-------------------+----------+-----+--------------+---------+-------+---------+---------+--------------------+

scala>
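Besides SQL against the temp view, the same data can be queried with the DataFrame API. As a sketch using the columns shown above, counting rows per FCompanyID:

```scala
// DataFrame API equivalent of a GROUP BY query on the temp view:
// select FCompanyID, count(FID) as cnt from tuser group by FCompanyID
import org.apache.spark.sql.functions.count

df.groupBy("FCompanyID")
  .agg(count("FID").as("cnt"))
  .show()
```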
Contents of line.json (each line is one complete JSON object, the JSON Lines layout that Spark's JSON reader expects by default)
{"Fguid":"a730eb2b-d2a1-11e9-a80c-5254003d609c","FID":3067,"FUserName":"admin","FParentID":0,"FCompanyID":3067,"FStatus":1,"FAddTimeDate":"2014-01-14 00:00:00","FAddDate":"2014-01-14","FAddTime":"00:00:00","FIP":"127.0.0.1","FThreadID":44}
{"Fguid":"a730f0c3-d2a1-11e9-a80c-5254003d609c","FID":36052,"FUserName":"f004","FParentID":0,"FCompanyID":36052,"FStatus":1,"FAddTimeDate":"2014-09-01 10:15:34","FAddDate":"2014-09-01","FAddTime":"10:15:34","FIP":"127.0.0.1","FThreadID":2}
{"Fguid":"a730f213-d2a1-11e9-a80c-5254003d609c","FID":36198,"FUserName":"admin9","FParentID":3067,"FCompanyID":3067,"FStatus":1,"FAddTimeDate":"2014-09-01 19:06:18","FAddDate":"2014-09-01","FAddTime":"19:06:18","FIP":"127.0.0.1","FThreadID":77}
{"Fguid":"a730f31d-d2a1-11e9-a80c-5254003d609c","FID":36215,"FUserName":"clqtest","FParentID":36052,"FCompanyID":36052,"FStatus":0,"FAddTimeDate":"2014-09-01 20:01:46","FAddDate":"2014-09-01","FAddTime":"20:01:46","FIP":"210.79.115.102","FThreadID":82}
{"Fguid":"a730f40e-d2a1-11e9-a80c-5254003d609c","FID":36414,"FUserName":"csh00a","FParentID":36052,"FCompanyID":36052,"FStatus":1,"FAddTimeDate":"2014-09-02 09:23:15","FAddDate":"2014-09-02","FAddTime":"09:23:15","FIP":"127.0.0.1","FThreadID":78}
{"Fguid":"a730f4c9-d2a1-11e9-a80c-5254003d609c","FID":36415,"FUserName":"A001","FParentID":36052,"FCompanyID":36052,"FStatus":1,"FAddTimeDate":"2014-09-02 11:22:42","FAddDate":"2014-09-02","FAddTime":"11:22:42","FIP":"127.0.0.1","FThreadID":46}
{"Fguid":"a730f6ce-d2a1-11e9-a80c-5254003d609c","FID":36463,"FUserName":"xq001","FParentID":36052,"FCompanyID":36052,"FStatus":1,"FAddTimeDate":"2014-09-02 13:40:40","FAddDate":"2014-09-02","FAddTime":"13:40:40","FIP":"127.0.0.1","FThreadID":95}
{"Fguid":"a730f77c-d2a1-11e9-a80c-5254003d609c","FID":36580,"FUserName":"admin1","FParentID":3067,"FCompanyID":3067,"FStatus":1,"FAddTimeDate":"2014-09-02 17:33:45","FAddDate":"2014-09-02","FAddTime":"17:33:45","FIP":"127.0.0.1","FThreadID":39}
{"Fguid":"a730f812-d2a1-11e9-a80c-5254003d609c","FID":36779,"FUserName":"ftest002","FParentID":36052,"FCompanyID":36052,"FStatus":1,"FAddTimeDate":"2014-09-03 22:13:12","FAddDate":"2014-09-03","FAddTime":"22:13:12","FIP":"127.0.0.1","FThreadID":10}
{"Fguid":"a730f8a9-d2a1-11e9-a80c-5254003d609c","FID":37402,"FUserName":"fhj1","FParentID":3067,"FCompanyID":3067,"FStatus":1,"FAddTimeDate":"2014-10-27 17:59:32","FAddDate":"2014-10-27","FAddTime":"17:59:32","FIP":"210.79.79.33","FThreadID":33}


