ZhangZhihui's Blog  

In Apache Spark, registerTempTable() is a legacy DataFrame method, deprecated since Spark 2.0, that registers a DataFrame as a temporary table (in practice, a temporary view) so it can be queried with SQL. It is still callable in PySpark 3.x, where it emits a deprecation warning, but new code should not use it.


📌 Usage (before Spark 2.0)

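# Before Spark 2.0 the entry point was SQLContext rather than SparkSession.
# `sc` is assumed to be an existing SparkContext (the pyspark shell provides one).
from pyspark.sql import SQLContext

sqlContext = SQLContext(sc)
df = sqlContext.read.json("people.json")

# Register as a temporary table
df.registerTempTable("people")

# Now you can query it using Spark SQL
result = sqlContext.sql("SELECT name, age FROM people WHERE age > 20")
result.show()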

 


🚨 Deprecation

  • registerTempTable("name") was replaced with createOrReplaceTempView("name").

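  • Reason: the old name was misleading. The method creates a lazily evaluated view over the DataFrame rather than a materialized table, and it silently replaces any existing view with the same name. The new name states both facts and mirrors the SQL statement CREATE OR REPLACE TEMPORARY VIEW.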
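For code that has to run against both old and new Spark versions, the replacement can be wrapped in a small helper. The following is only a sketch: register_view is a hypothetical name, not part of Spark, and it relies on nothing more than checking whether the DataFrame object exposes createOrReplaceTempView.

```python
def register_view(df, name):
    """Register `df` under `name`, preferring the non-deprecated API.

    `register_view` is a hypothetical helper, not a Spark function.
    """
    create_view = getattr(df, "createOrReplaceTempView", None)
    if callable(create_view):
        # Spark 2.0 and later
        create_view(name)
    else:
        # Fallback for DataFrames from Spark releases older than 2.0
        df.registerTempTable(name)
```

Calling register_view(df, "people") then behaves like df.createOrReplaceTempView("people") on any Spark 2.0+ installation.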


📌 Modern Equivalent

df.createOrReplaceTempView("people")

result = spark.sql("SELECT name, age FROM people WHERE age > 20")
result.show()

 


📌 Difference Between Views

  1. createOrReplaceTempView(name)

    • Session-scoped (disappears when the SparkSession ends).

    • Replaces any existing temporary view with the same name in that session.

  2. createGlobalTempView(name)

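    • Shared by all SparkSessions in the same Spark application; dropped when the application terminates.

    • Fails if a global view with that name already exists; use createOrReplaceGlobalTempView(name) to overwrite it.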

    • Registered under the system database global_temp, so the name must be qualified in queries:

     
    SELECT * FROM global_temp.people;

✅ In short:
registerTempTable() = old API (deprecated since Spark 2.0).
createOrReplaceTempView() = the correct replacement.

 

posted on 2025-10-11 12:55  ZhangZhihuiAAA