2.4

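Today I studied Spark GraphX, Spark's library for graph computation. GraphX provides abstractions and operations for graph data and suits scenarios such as social networks and recommender systems. Note that GraphX itself has no Python API, so the Python example here uses GraphFrames, a separate DataFrame-based graph package for Spark.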

Code example:

python
from pyspark import SparkContext
from pyspark.sql import SparkSession
from graphframes import GraphFrame

# Initialize the SparkContext and wrap it in a SparkSession
sc = SparkContext("local", "GraphX Example")
spark = SparkSession(sc)

# Create the vertices and edges
# (GraphFrame expects an "id" column on vertices and "src"/"dst" columns on edges)
vertices = spark.createDataFrame([
    ("1", "Alice", 25),
    ("2", "Bob", 30),
    ("3", "Cathy", 28)
], ["id", "name", "age"])

edges = spark.createDataFrame([
    ("1", "2", "friend"),
    ("2", "3", "follow")
], ["src", "dst", "relationship"])

# Create the graph
graph = GraphFrame(vertices, edges)

# Show the graph's vertices and edges
graph.vertices.show()
graph.edges.show()

# Compute each vertex's degree (in-degree plus out-degree)
degrees = graph.degrees
degrees.show()

sc.stop()
Output:

+---+-----+---+
| id| name|age|
+---+-----+---+
|  1|Alice| 25|
|  2|  Bob| 30|
|  3|Cathy| 28|
+---+-----+---+

+---+---+------------+
|src|dst|relationship|
+---+---+------------+
|  1|  2|      friend|
|  2|  3|      follow|
+---+---+------------+

+---+------+
| id|degree|
+---+------+
|  2|     2|
|  1|     1|
|  3|     1|
+---+------+
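
To see where the numbers in the degree table come from, here is a minimal plain-Python sketch that recomputes them from the same edge list, without Spark. The helper name `degree_counts` is hypothetical and used only for illustration; it is not part of GraphFrames. The sketch assumes that a vertex's degree is its in-degree plus its out-degree, which is consistent with the output above (vertex 2 has one incoming and one outgoing edge).

```python
from collections import Counter

# Same edge list as in the example: (src, dst, relationship)
edges = [
    ("1", "2", "friend"),
    ("2", "3", "follow"),
]


def degree_counts(edge_list):
    """Hypothetical helper: return (in_degree, out_degree, degree) per vertex id."""
    # Each edge adds one outgoing edge to src and one incoming edge to dst
    out_degree = Counter(src for src, _dst, _rel in edge_list)
    in_degree = Counter(dst for _src, dst, _rel in edge_list)
    # Total degree is the sum of both directions
    degree = in_degree + out_degree
    return in_degree, out_degree, degree


in_degree, out_degree, degree = degree_counts(edges)

# Print the degree of every vertex that has at least one edge, ordered by id
for vertex_id, count in sorted(degree.items()):
    print(vertex_id, count)
```

In GraphFrames itself, the per-direction counts should be available as `graph.inDegrees` and `graph.outDegrees` alongside `graph.degrees`. As in this sketch, a vertex with no edges at all would not appear in the result.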

posted @ 2025-02-04 23:56  混沌武士丞