2.4
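Today I studied Spark GraphX, Spark's library for graph computation. GraphX provides abstractions and operations for graph data and suits scenarios such as social networks and recommender systems. GraphX itself has no Python API, so the Python example below uses the separate GraphFrames package, which builds a graph from a vertex DataFrame and an edge DataFrame.
Code example: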
python
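from pyspark.sql import SparkSession
from graphframes import GraphFrame

# Requires the GraphFrames package on the Spark classpath
# (for example via spark-submit --packages).

# Start a local Spark session
spark = (
    SparkSession.builder
    .master("local")
    .appName("GraphX Example")
    .getOrCreate()
)

# Create the vertices; GraphFrames requires an "id" column
vertices = spark.createDataFrame([
    ("1", "Alice", 25),
    ("2", "Bob", 30),
    ("3", "Cathy", 28)
], ["id", "name", "age"])

# Create the edges; GraphFrames requires "src" and "dst" columns
edges = spark.createDataFrame([
    ("1", "2", "friend"),
    ("2", "3", "follow")
], ["src", "dst", "relationship"])

# Build the graph
graph = GraphFrame(vertices, edges)

# Show the graph's vertices and edges
graph.vertices.show()
graph.edges.show()

# Compute the degree of each vertex (incoming plus outgoing edges)
degrees = graph.degrees
degrees.show()

spark.stop()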
Output:
+---+-----+---+
| id| name|age|
+---+-----+---+
|  1|Alice| 25|
|  2|  Bob| 30|
|  3|Cathy| 28|
+---+-----+---+
+---+---+------------+
|src|dst|relationship|
+---+---+------------+
|  1|  2|      friend|
|  2|  3|      follow|
+---+---+------------+
+---+------+
| id|degree|
+---+------+
|  2|     2|
|  1|     1|
|  3|     1|
+---+------+
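
The degree table counts every edge that touches a vertex, whether it points in or out, which is why Bob (id 2) has degree 2. As a rough illustration of that counting rule, here is a minimal pure-Python sketch that needs no Spark. The helper name `compute_degrees` is hypothetical and not part of GraphFrames; it only mirrors the arithmetic on the same two edges.

```python
from collections import Counter

# Same edge list as the example: (src, dst, relationship).
edges = [
    ("1", "2", "friend"),
    ("2", "3", "follow"),
]


def compute_degrees(edge_list):
    """Hypothetical helper: return (in_degree, out_degree, degree) Counters by vertex id."""
    # In-degree: how many edges end at each vertex.
    in_degree = Counter(dst for _, dst, _ in edge_list)
    # Out-degree: how many edges start at each vertex.
    out_degree = Counter(src for src, _, _ in edge_list)
    # Total degree is the sum of the two.
    degree = in_degree + out_degree
    return in_degree, out_degree, degree


in_degree, out_degree, degree = compute_degrees(edges)
print(sorted(degree.items()))  # [('1', 1), ('2', 2), ('3', 1)]
```

In GraphFrames itself, the same split is available through `graph.inDegrees` and `graph.outDegrees`. A vertex with no incoming (or outgoing) edges does not appear in the corresponding result.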