Spark SQL data sources: JSON files
1. Startup command
[root@cdh1 ~]# spark-shell
22/05/24 20:24:56 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Setting default log level to "WARN".
To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).
Spark context Web UI available at http://cdh1:4040
Spark context available as 'sc' (master = local[*], app id = local-1653395123656).
Spark session available as 'spark'.
Welcome to
      ____              __
     / __/__  ___ _____/ /__
    _\ \/ _ \/ _ `/ __/  '_/
   /___/ .__/\_,_/_/ /_/\_\   version 3.1.2
      /_/

Using Scala version 2.12.10 (Java HotSpot(TM) 64-Bit Server VM, Java 1.8.0_311)
Type in expressions to have them evaluated.
Type :help for more information.

scala>
# Spark SQL: inferring the schema using reflection
scala> val sqlContext = new org.apache.spark.sql.SQLContext(sc)
warning: there was one deprecation warning (since 2.0.0); for details, enable `:setting -deprecation' or `:replay -deprecation'
sqlContext: org.apache.spark.sql.SQLContext = org.apache.spark.sql.SQLContext@15c8c97
scala> import sqlContext._
import sqlContext._
scala> case class Demo(id: Int, name: String, age: Int)
defined class Demo
scala> val empl = sc.textFile("Demo.txt").map(_.split(",")).map(e => Demo(e(0).trim.toInt, e(1), e(2).trim.toInt)).toDF()
empl: org.apache.spark.sql.DataFrame = [id: int, name: string ... 1 more field]
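The split-and-convert step above can be tried on plain strings without Spark. The following is a minimal sketch, not the author's code: the sample lines and the `parseDemo` helper are hypothetical, and unlike the transcript it trims every field and rejects lines that do not have exactly three fields.

```scala
// Same case class as in the transcript.
case class Demo(id: Int, name: String, age: Int)

object ParseDemo {
  // Hypothetical helper: turn one comma-separated line into a Demo.
  // Returns None for a malformed line instead of throwing.
  def parseDemo(line: String): Option[Demo] =
    line.split(",").map(_.trim) match {
      case Array(id, name, age) =>
        scala.util.Try(Demo(id.toInt, name, age.toInt)).toOption
      case _ => None
    }

  def main(args: Array[String]): Unit = {
    // Made-up sample input, one record per line.
    val lines = Seq("1201, satish, 25", "1202, krishna, 28", "bad line")
    lines.flatMap(parseDemo).foreach(println)
  }
}
```

The contents of Demo.txt are not shown in this article. One possible explanation for an age such as 251202 is that two records ended up on the same line with no line break between them, because the transcript reads only the first three fields of each line. A strict parser like the one sketched here would reject such a line rather than silently merge values.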
scala> empl.registerTempTable("Demo")
warning: there was one deprecation warning (since 2.0.0); for details, enable `:setting -deprecation' or `:replay -deprecation'

scala> val allcolumn = sqlContext.sql("select * from Demo")
allcolumn: org.apache.spark.sql.DataFrame = [id: int, name: string ... 1 more field]

scala> allcolumn.show()
+----+-------+------+
|  id|   name|   age|
+----+-------+------+
|1201| satish|251202|
+----+-------+------+
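Creating the SQLContext and calling registerTempTable each produce a deprecation warning (since 2.0.0) in the transcript. A rough equivalent using the `spark` session and `createOrReplaceTempView` might look like the sketch below. It assumes the spark-sql library is on the classpath; the app name, the in-memory sample rows, and the `employeesAbove` helper are made up for illustration and stand in for Demo.txt.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}

// Top-level case class so Spark can derive its schema by reflection.
case class Demo(id: Int, name: String, age: Int)

object TempViewSketch {
  // Hypothetical helper: register the sample rows as a view and query it.
  def employeesAbove(spark: SparkSession, minId: Int): DataFrame = {
    import spark.implicits._ // enables toDF() on a Seq of case classes

    // Made-up rows standing in for the contents of Demo.txt.
    val empl = Seq(Demo(1201, "satish", 25), Demo(1202, "krishna", 28)).toDF()

    // Non-deprecated replacement for registerTempTable.
    empl.createOrReplaceTempView("Demo")

    spark.sql(s"select * from Demo where id > $minId")
  }

  def main(args: Array[String]): Unit = {
    // Inside spark-shell this session already exists as `spark`.
    val spark = SparkSession.builder()
      .appName("temp-view-sketch") // hypothetical name
      .master("local[*]")
      .getOrCreate()

    employeesAbove(spark, 1200).show()

    spark.stop()
  }
}
```

In spark-shell itself, only the body of `employeesAbove` is needed, since `spark` and its implicits are already available.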
scala> val onecolumn = sqlContext.sql("select * from Demo where id > 1200")
onecolumn: org.apache.spark.sql.DataFrame = [id: int, name: string ... 1 more field]

scala> onecolumn.show()
+----+-------+------+
|  id|   name|   age|
+----+-------+------+
|1201| satish|251202|
+----+-------+------+
scala> onecolumn.map(t=>"ID: "+t(0)).collect().foreach(println)
ID: 1201

scala> onecolumn.map(t=>"ID:"+t(0)+" NAME:"+t(1)).collect().foreach(println)
ID:1201 NAME: satish
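The calls above read columns by position (`t(0)`, `t(1)`). Because the schema comes from the `Demo` case class, the same rows can also be read by field name. The sketch below shows two options, `Row.getAs` and a typed `Dataset[Demo]`. It assumes spark-sql is on the classpath; the single sample row and the helper names `byColumnName` and `byField` are invented for illustration.

```scala
import org.apache.spark.sql.SparkSession

case class Demo(id: Int, name: String, age: Int)

object TypedRowsSketch {
  // Read a column from an untyped Row by name instead of by index.
  def byColumnName(spark: SparkSession): Array[String] = {
    import spark.implicits._
    val df = Seq(Demo(1201, "satish", 25)).toDF() // made-up row
    df.map(r => "ID: " + r.getAs[Int]("id")).collect()
  }

  // Convert to Dataset[Demo] so fields are checked at compile time.
  def byField(spark: SparkSession): Array[String] = {
    import spark.implicits._
    val ds = Seq(Demo(1201, "satish", 25)).toDF().as[Demo]
    ds.map(d => s"ID:${d.id} NAME:${d.name}").collect()
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("typed-rows-sketch") // hypothetical name
      .master("local[*]")
      .getOrCreate()

    byColumnName(spark).foreach(println)
    byField(spark).foreach(println)

    spark.stop()
  }
}
```

With the typed variant, a misspelled field name fails at compile time, whereas a wrong index or column name on a `Row` only fails when the job runs.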
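-- Inferring the schema using reflection (this article contains some errors.)
Author: M_Fight๑҉
Copyright of this article is shared by the author and 博客园 (cnblogs). Reposting is welcome, but unless the author agrees otherwise, this notice must be retained and a link to the original article must be given in a prominent position on the page; otherwise the author reserves the right to pursue legal liability.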
