Spark: errors when running pyspark and spark-sql

I am using Spark 2.3.0.

Error 1: pyspark fails on startup

root@hadoop102:/opt/module/spark/bin# pyspark
Python 3.8.10 (default, Feb  4 2025, 15:02:54) 
[GCC 9.4.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
Traceback (most recent call last):
  File "/opt/module/spark/python/pyspark/shell.py", line 31, in <module>
    from pyspark import SparkConf
  File "/opt/module/spark/python/pyspark/__init__.py", line 46, in <module>
    from pyspark.context import SparkContext
  File "/opt/module/spark/python/pyspark/context.py", line 31, in <module>
    from pyspark import accumulators
  File "/opt/module/spark/python/pyspark/accumulators.py", line 97, in <module>
    from pyspark.cloudpickle import CloudPickler
  File "/opt/module/spark/python/pyspark/cloudpickle.py", line 146, in <module>
    _cell_set_template_code = _make_cell_set_template_code()
  File "/opt/module/spark/python/pyspark/cloudpickle.py", line 127, in _make_cell_set_template_code
    return types.CodeType(
TypeError: an integer is required (got type bytes)

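Solution: lower the Python version that pyspark uses. The cloudpickle module bundled with Spark 2.3.0 calls types.CodeType with the argument list used before Python 3.8. Python 3.8 added a positional-only argument count to that constructor, so the old call passes a bytes object where an integer is now expected, which produces the TypeError above. You can either change the system default Python version, or add a line such as the following to spark-env.sh:

export PYSPARK_PYTHON=/usr/bin/python2.7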
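As a quick check before launching pyspark, the sketch below reports whether a given interpreter is new enough to hit the TypeError shown above. The function name and the 3.8 cutoff are assumptions made for illustration: the cutoff reflects only the CodeType signature change in Python 3.8, not the full list of Python versions that Spark 2.3.0 officially supports.

```python
import sys

# Python 3.8 added a positional-only argument count to types.CodeType.
# Assumption: this is the only incompatibility checked here.
CODETYPE_CHANGE = (3, 8)


def hits_codetype_error(version_info=None):
    """Return True if this Python version would trigger the cloudpickle TypeError."""
    if version_info is None:
        version_info = sys.version_info
    return tuple(version_info[:2]) >= CODETYPE_CHANGE


if __name__ == "__main__":
    major, minor = sys.version_info[:2]
    if hits_codetype_error():
        print("Python %d.%d: expect the CodeType TypeError; point "
              "PYSPARK_PYTHON at an older interpreter" % (major, minor))
    else:
        print("Python %d.%d: predates the CodeType change" % (major, minor))
```

Run it with the same interpreter that PYSPARK_PYTHON points to, since that is the one pyspark will start.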

Error 2: spark-sql fails on startup

root@hadoop102:/opt/module/spark/bin# spark-sql
2025-03-05 20:05:37 WARN NativeCodeLoader:60 - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
2025-03-05 20:05:37 INFO SecurityManager:54 - Changing view acls to: root
2025-03-05 20:05:37 INFO SecurityManager:54 - Changing modify acls to: root
2025-03-05 20:05:37 INFO SecurityManager:54 - Changing view acls groups to:
2025-03-05 20:05:37 INFO SecurityManager:54 - Changing modify acls groups to:
2025-03-05 20:05:37 INFO SecurityManager:54 - SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(root); groups with view permissions: Set(); users with modify permissions: Set(root); groups with modify permissions: Set()
java.lang.ClassNotFoundException: org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver
at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:348)
at org.apache.spark.util.Utils$.classForName(Utils.scala:235)
at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:836)
at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:197)
at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:227)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:136)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Failed to load main class org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver.
You need to build Spark with -Phive and -Phive-thriftserver.

2025-03-05 20:05:37 INFO ShutdownHookManager:54 - Shutdown hook called
2025-03-05 20:05:37 INFO ShutdownHookManager:54 - Deleting directory /tmp/spark-e7b42c80-ccd8-4337-8542-41495c392ec5

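Solution: download the source code for the matching Spark version and, as the error message says, build Spark with -Phive and -Phive-thriftserver.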
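SparkSQLCLIDriver belongs to Spark's hive-thriftserver module, so one way to confirm the diagnosis is to look for that module's jar in the installation. This is a minimal sketch, assuming the usual binary layout with a jars directory under the Spark home and a jar whose name begins with spark-hive-thriftserver. The function name is hypothetical, and /opt/module/spark is the path from the shell prompt above, used only as a fallback.

```python
import glob
import os


def find_thriftserver_jars(spark_home):
    """Return the spark-hive-thriftserver jars found under <spark_home>/jars."""
    pattern = os.path.join(spark_home, "jars", "spark-hive-thriftserver*.jar")
    return sorted(glob.glob(pattern))


if __name__ == "__main__":
    # Assumption: fall back to the install path seen in the prompt above.
    spark_home = os.environ.get("SPARK_HOME", "/opt/module/spark")
    jars = find_thriftserver_jars(spark_home)
    if jars:
        for jar in jars:
            print("found: " + jar)
    else:
        print("no spark-hive-thriftserver jar under " + spark_home +
              "; this build was likely made without -Phive-thriftserver")
```

If the list comes back empty, the installed distribution does not include the Hive Thrift server module, which matches the ClassNotFoundException in the log.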

 

Posted on 2025-03-06 13:43 by ifiwereaboy