PySpark Environment Setup with Anaconda3 4.4.0

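1. Installing Anaconda3

1.1 Download the installer from https://mirrors.tuna.tsinghua.edu.cn/anaconda/archive/

1.2 Change to the directory that contains the downloaded file and run the installer: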

$ sh ./Anaconda3-4.4.0-Linux-x86_64.sh

1.2.1 Press Enter to continue

Please, press ENTER to continue
>>> 

1.2.2 Press the space bar to page through the license until the prompt Please answer 'yes' or 'no': appears, then type yes

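1.2.3 Choose the installation directory. Here Anaconda3 is installed under the home directory of the current user (hadoop), so press Enter to accept the default shown below: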

Anaconda3 will now be installed into this location:
/home/hadoop/anaconda3

  - Press ENTER to confirm the location
  - Press CTRL-C to abort the installation
  - Or specify a different location below

[/home/hadoop/anaconda3] >>> 

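When the installation finishes, the installer asks whether to prepend Anaconda3 to PATH. Pressing Enter accepts the default (no); PATH is then set by hand in step 1.2.4: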

Do you wish the installer to prepend the Anaconda3 install location
to PATH in your /home/hadoop/.bashrc ? [yes|no]
[no] >>> 

You may wish to edit your .bashrc or prepend the Anaconda3 install location:

$ export PATH=/home/hadoop/anaconda3/bin:$PATH

Thank you for installing Anaconda3!

Share your notebooks and packages on Anaconda Cloud!
Sign up for free: https://anaconda.org

1.2.4 Configure environment variables

Open ~/.bash_profile, append the two export lines, then reload the file:

$ vi ~/.bash_profile
export PATH=/home/hadoop/anaconda3/bin:$PATH
export PYSPARK_PYTHON=/home/hadoop/anaconda3/bin/python
$ source ~/.bash_profile

Type "con" and press Tab twice. If "conda" appears among the completions, the configuration succeeded:

$ con
conda               conda-server        consoletype         continue            
conda-env           config_data         container-executor  convertquota   
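
The Tab-completion check above can also be done from Python. The following is a minimal sketch, assuming only the standard library; the helper name check_tools and the list of tool names are illustrative and not part of the original post.

```python
import os
import shutil


def check_tools(names=("conda", "python", "jupyter")):
    """Map each tool name to the full path found on PATH, or None if missing."""
    return {name: shutil.which(name) for name in names}


if __name__ == "__main__":
    # Print where each tool resolves, so a wrong or missing PATH entry is obvious.
    for tool, path in check_tools().items():
        print(f"{tool:10s} -> {path or 'NOT FOUND on PATH'}")
    # PYSPARK_PYTHON is only visible here if ~/.bash_profile has been sourced.
    print("PYSPARK_PYTHON =", os.environ.get("PYSPARK_PYTHON", "(not set)"))
```

If the configuration in step 1.2.4 took effect, conda and python should both resolve to paths under /home/hadoop/anaconda3/bin.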

3. Configuring PySpark

3.1 Start Jupyter Notebook (shipped with Anaconda3)

$ ~/anaconda3/bin/jupyter notebook --NotebookApp.ip='0.0.0.0' &   # the trailing & runs it in the background

3.2 Create and edit a Python file

import os
import sys

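# Adjust these three paths to match your own system.
os.environ["PYSPARK_PYTHON"] = "/home/hadoop/anaconda3/bin/python"  # Anaconda3 Python on your Linux system
os.environ["JAVA_HOME"] = "/usr/jvm/jdk1.8"                         # your own JAVA_HOME
os.environ["SPARK_HOME"] = "/home/hadoop/hdfs/spark"                # your own SPARK_HOME
os.environ["PYLIB"] = os.environ["SPARK_HOME"] + "/python/lib"

# Make Spark's bundled py4j and pyspark importable.
# The py4j version in the file name must match the one shipped with your Spark release.
sys.path.insert(0, os.environ["PYLIB"] + "/py4j-0.10.7-src.zip")
sys.path.insert(0, os.environ["PYLIB"] + "/pyspark.zip")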
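
The py4j file name in the snippet above is tied to one Spark release, so it fails as soon as SPARK_HOME points at a release that bundles a different py4j version. A minimal sketch of a more tolerant variant follows. It assumes the same $SPARK_HOME/python/lib layout used above; the function name add_pyspark_to_path is hypothetical and not part of Spark.

```python
import glob
import os
import sys


def add_pyspark_to_path(spark_home):
    """Prepend Spark's bundled pyspark and py4j archives to sys.path.

    Returns the list of archive paths that were located.
    """
    lib_dir = os.path.join(spark_home, "python", "lib")

    # The py4j version differs between Spark releases, so match it with a
    # wildcard instead of hard-coding a version such as 0.10.7.
    py4j_zips = sorted(glob.glob(os.path.join(lib_dir, "py4j-*-src.zip")))
    if not py4j_zips:
        raise FileNotFoundError(f"no py4j-*-src.zip found under {lib_dir}")

    pyspark_zip = os.path.join(lib_dir, "pyspark.zip")
    if not os.path.exists(pyspark_zip):
        raise FileNotFoundError(f"{pyspark_zip} does not exist")

    # If several py4j archives are present, take the lexicographically last one.
    archives = [pyspark_zip, py4j_zips[-1]]
    for archive in archives:
        if archive not in sys.path:
            sys.path.insert(0, archive)
    return archives
```

Calling add_pyspark_to_path(os.environ["SPARK_HOME"]) after the environment variables are set should make import pyspark work in the notebook, provided SPARK_HOME points at a real Spark installation.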

4. Installing Python Libraries Offline

4.1 Windows

Download site: https://www.lfd.uci.edu/~gohlke/pythonlibs/
Download the .whl file
Run pip install <file path>, for example:

pip install D:\pycodes\scipy-0.17.1-cp35-cp35m-win32.whl

4.2 Linux

The steps are the same as on Windows.
Download site: http://mirrors.aliyun.com/pypi/simple/
Download the .whl file
Run pip install <file path>, for example:

# Change to the directory that contains the file first
pip install scikit_learn-0.22-cp36-cp36m-manylinux1_x86_64.whl
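
A wheel only installs if its file name tags match the target interpreter and platform, so it helps to read those tags before copying a file to an offline machine. The sketch below splits a wheel file name into its fields following the standard wheel naming convention (name, version, optional build tag, Python tag, ABI tag, platform tag). The names WheelTags, parse_wheel_filename and matches_running_python are illustrative, and the version check is deliberately rough: it ignores the ABI and platform tags, which pip itself does check.

```python
import sys
from typing import NamedTuple


class WheelTags(NamedTuple):
    name: str
    version: str
    python_tag: str
    abi_tag: str
    platform_tag: str


def parse_wheel_filename(filename):
    """Split a wheel file name into its name, version and compatibility tags."""
    if not filename.endswith(".whl"):
        raise ValueError(f"not a wheel file: {filename}")
    parts = filename[: -len(".whl")].split("-")
    # 5 fields normally, 6 when an optional build tag follows the version.
    if len(parts) not in (5, 6):
        raise ValueError(f"unexpected wheel file name: {filename}")
    python_tag, abi_tag, platform_tag = parts[-3:]
    return WheelTags(parts[0], parts[1], python_tag, abi_tag, platform_tag)


def matches_running_python(tags):
    """Rough check: does the Python tag fit this interpreter's version?"""
    major, minor = sys.version_info.major, sys.version_info.minor
    accepted = {f"cp{major}{minor}", f"py{major}{minor}", f"py{major}"}
    # A Python tag may list several alternatives separated by dots, e.g. py2.py3.
    return bool(accepted & set(tags.python_tag.split(".")))


if __name__ == "__main__":
    tags = parse_wheel_filename("scikit_learn-0.22-cp36-cp36m-manylinux1_x86_64.whl")
    print(tags)
    print("matches this interpreter:", matches_running_python(tags))
```

For the two files used in this section, the Python tags are cp35 and cp36, so they target CPython 3.5 and CPython 3.6 respectively.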
Posted 2021-08-05 16:43 by xiaojy