ZhangZhihui's Blog  

In Apache Spark, registerTempTable() is a legacy DataFrame method, deprecated since Spark 2.0, that registers a DataFrame as a temporary table (in effect, a view) so it can be queried with SQL. The PySpark 3.x API still ships it, but it emits a deprecation warning and simply delegates to createOrReplaceTempView().


📌 Legacy Usage (deprecated since Spark 2.0)

df = spark.read \
    .format("jdbc") \
    .option("url", "jdbc:postgresql://10.0.0.1:3433/zzhdb") \
    .option("dbtable", "people") \
    .option("user", "zzh") \
    .option("password", "pwd") \
    .load()

# Register as a temporary table
df.registerTempTable("people")

# Now you can query it using Spark SQL
result = spark.sql("SELECT name, age FROM people WHERE age > 20")
result.show()

 


🚨 Deprecation

  • registerTempTable("name") was replaced with createOrReplaceTempView("name").

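  • Reason: the name was misleading. Nothing is materialized as a table; the DataFrame's query plan is only registered under a name, which is what SQL calls a temporary view. Spark 2.0 renamed the method accordingly as part of the move from SQLContext to the unified SparkSession API.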
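
Because the replacement method takes the same single name argument, migrating old code is essentially a rename. As a rough sketch (the helper name migrate_source and the regular expression are illustrative assumptions, not part of Spark), a textual rewrite could look like this:

```python
import re
from typing import Tuple

# Matches a method call such as df.registerTempTable( with optional
# whitespace before the opening parenthesis.
LEGACY_CALL = re.compile(r"\.registerTempTable\s*\(")


def migrate_source(source: str) -> Tuple[str, int]:
    """Rewrite legacy calls; return (new_source, number_of_replacements)."""
    # re.subn returns the rewritten text together with the match count.
    return LEGACY_CALL.subn(".createOrReplaceTempView(", source)
```

This is a purely textual substitution, so it would also touch matching text inside comments or string literals; reviewing the resulting diff is advisable.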


📌 Modern Equivalent

df.createOrReplaceTempView("people")

result = spark.sql("SELECT name, age FROM people WHERE age > 20")
result.show()

 


📌 Difference Between Views

  1. createOrReplaceTempView(name)

    • Session-scoped (disappears when the SparkSession ends).

    • Registering the same name again silently replaces the existing view instead of raising an error.

  2. createGlobalTempView(name)

    • Shared by all SparkSessions within the same Spark application; dropped when the application terminates.

    • Registered under the system-reserved global_temp database, so queries must use the qualified name:

     
    SELECT * FROM global_temp.people;
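
The scoping difference can be checked with a small experiment. The sketch below assumes an active SparkSession is passed in as spark, that no permanent table named people exists in the current database, and Spark 2.2+ (for createOrReplaceGlobalTempView). The function name view_visibility and the sample rows are made up for illustration.

```python
def view_visibility(spark):
    """Return which "people" views each session can resolve.

    `spark` is assumed to be an active SparkSession; the rows are sample data.
    """
    df = spark.createDataFrame([("Ann", 31), ("Bob", 17)], ["name", "age"])

    # Session-scoped view: resolvable only through the session that created it.
    df.createOrReplaceTempView("people")
    # Application-scoped view: lives in the global_temp database.
    df.createOrReplaceGlobalTempView("people")

    # A second session in the same application, with its own temp views.
    other = spark.newSession()

    def can_query(session, table):
        try:
            session.sql(f"SELECT name FROM {table}").collect()
            return True
        except Exception:  # Spark raises AnalysisException for an unknown name
            return False

    return {
        "same session, people": can_query(spark, "people"),
        "new session, people": can_query(other, "people"),
        "new session, global_temp.people": can_query(other, "global_temp.people"),
    }
```

Called with a regular local session, this is expected to report that the plain name people resolves only in the original session, while global_temp.people also resolves in the new one.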

✅ In short:
registerTempTable() = old API (deprecated since Spark 2.0; avoid it in new code).
createOrReplaceTempView() = the correct replacement.

 

posted on 2025-10-11 12:56 by ZhangZhihuiAAA