Summary:
// Import the data
val data = spark.read.option("header", "true").csv("data/adult.csv")
// Preprocess the data
val assembler = new VectorAssembler()
  .setInputCols(Array("age", "
Read more
posted @ 2025-01-24 19:14
为20岁努力
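Summary:
Splitting into training and test sets
To evaluate model performance, we need to split the dataset into a training set and a test set.
// Randomly split the dataset into a training set and a test set
val Array(trainingData, testData) = data.randomSplit(Array(0.7, 0.3), seed = 1234L)
5. Create the logistic
Read more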
posted @ 2025-01-23 22:07
为20岁努力
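To make the idea behind the 70/30 split in the excerpt above concrete, here is a minimal sketch in plain Scala with no Spark dependency. It is not Spark's implementation: the object `RandomSplitSketch` and its `randomSplit` helper are hypothetical names introduced only for illustration. The sketch assumes each element is assigned independently to a bucket with probability proportional to that bucket's weight, and that weights are normalized when they do not sum to 1.

```scala
import scala.util.Random

object RandomSplitSketch {
  // Hypothetical helper (not Spark's API): assigns every element to exactly one
  // bucket, choosing the bucket with probability proportional to its weight.
  def randomSplit[A](data: Seq[A], weights: Seq[Double], seed: Long): Seq[Seq[A]] = {
    require(weights.nonEmpty && weights.forall(_ >= 0.0) && weights.sum > 0.0,
      "weights must be non-negative and sum to a positive value")
    val total = weights.sum
    // Normalized cumulative upper bounds, e.g. Seq(0.7, 0.3) becomes Seq(0.7, 1.0)
    val bounds = weights.scanLeft(0.0)(_ + _).tail.map(_ / total)
    // A fixed seed makes the split reproducible across runs
    val rng = new Random(seed)
    val tagged = data.map { x =>
      val u = rng.nextDouble()
      val idx = bounds.indexWhere(b => u < b)
      // Fall back to the last bucket if rounding leaves the final bound below u
      (if (idx >= 0) idx else weights.length - 1, x)
    }
    weights.indices.map(i => tagged.filter(_._1 == i).map(_._2))
  }

  def main(args: Array[String]): Unit = {
    val Seq(training, test) = randomSplit(1 to 1000, Seq(0.7, 0.3), seed = 1234L)
    println(s"training=${training.size}, test=${test.size}")
  }
}
```

Because assignment is random per element, the bucket sizes are only approximately 70% and 30% of the input; the same seed reproduces the same split.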
Summary:
// Create the StreamingContext
val ssc = new StreamingContext(sc, Seconds(5))
// Receive data from Flume
val flumeStream = FlumeUtils.createStream(ssc, "localhost", 44444)
//
Read more
posted @ 2025-01-22 22:26
为20岁努力
Summary:
// Create the SparkSession
val spark = SparkSession.builder()
  .appName("SparkSQLExample")
  .config("spark.master", "local")
  .getOrCreate()
// Create a DataFrame
val data
Read more
posted @ 2025-01-20 20:14
为20岁努力
Summary:
// Create the SparkContext
val sc = new SparkContext("local", "RDDExample")
// Create an RDD
val data = Array(1, 2, 3, 4, 5)
val distData = sc.parallelize(data, 2)
// Transformation
Read more
posted @ 2025-01-19 21:14
为20岁努力
Summary:
# Download Spark
wget https://downloads.apache.org/spark/spark-2.1.0/spark-2.1.0-bin-hadoop2.7.tgz
# Extract to the target directory
sudo tar -xzf spark-2.1.0-bin-hadoop2.7.tgz -C /us
Read more
posted @ 2025-01-18 16:23
为20岁努力
Summary:
# Install Java
sudo apt update
sudo apt install openjdk-8-jdk -y
# Configure the Java environment variables
echo "export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64" >> ~/.bashrc
echo "expor
Read more
posted @ 2025-01-17 18:14
为20岁努力
Summary:
import org.apache.spark.graphx.{Edge, Graph, VertexId}
import org.apache.spark.rdd.RDD
import org.apache.spark.{SparkConf, SparkContext}

object SparkGra
Read more
posted @ 2025-01-16 20:25
为20岁努力
Summary:
import org.apache.spark.{SparkConf, SparkContext}

object SparkBasic {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("S
Read more
posted @ 2025-01-15 20:25
为20岁努力
Summary:
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.sql.SparkSession
Read more
posted @ 2025-01-14 20:25
为20岁努力