Big Data Spark Real-Time Processing -- Environment Setup

  • OOTB
  • 1) VM storage location: D:\Spark\hadoop000\hadoop000. Rename it to spark000
  • 2) In VMware, use "File" in the top-left corner to open hadoop000.vmx inside the hadoop000 folder
  • 3) Click "Power on this virtual machine"
  • 4) In the dialog that pops up, click "I copied it"
  • 5) Configure the VM's network connectivity
[hadoop@hadoop000 ~]$ ifconfig
lo: flags=73<UP,LOOPBACK,RUNNING>  mtu 65536
inet 127.0.0.1  
[hadoop@hadoop000 ~]$ sudo -i
[root@hadoop000 ~]# cd /etc/sysconfig/network-scripts/
[root@hadoop000 network-scripts]# rm ifcfg-lo
[root@hadoop000 network-scripts]# ip addr
2: ens33: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN qlen 1000
link/ether 00:0c:29:ca:80:41 
[root@hadoop000 network-scripts]# vi ifcfg-eth0
HWADDR=00:0c:29:ca:80:41 
  • 6) Set the VM's network adapter to NAT mode: menu "Edit" -> "Virtual Network Editor" -> NAT mode -> "NAT Settings"
  • 7) Note the values in the NAT settings: subnet IP 192.168.131.0, subnet mask 255.255.255.0, gateway IP 192.168.131.2
  • Note: the last octet of IPADDR must differ from the IPs shown in the NAT settings.
[hadoop@hadoop000 ~]$ sudo -i
[root@hadoop000 ~]# cd /etc/sysconfig/network-scripts/
[root@hadoop000 network-scripts]# vi ifcfg-eth0
ONBOOT=yes
BOOTPROTO=static
IPADDR=192.168.131.66
NETMASK=255.255.255.0
GATEWAY=192.168.131.2
[root@hadoop000 network-scripts]# vi /etc/resolv.conf
nameserver 114.114.114.114
nameserver 114.114.114.115
[root@hadoop000 network-scripts]# reboot
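If you prefer not to reboot just to apply the interface change, restarting the network service usually works as well on CentOS 7 (a hedged alternative; the reboot above is also fine):
[root@hadoop000 network-scripts]# systemctl restart network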
  • 8) Test the network
[hadoop@hadoop000 ~]$ ping 192.168.131.66
[hadoop@hadoop000 ~]$ ping www.baidu.com
[hadoop@hadoop000 ~]$ ifconfig
eth0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
inet 192.168.131.66  netmask 255.255.255.0  broadcast 192.168.131.255
ether 00:0c:29:ca:80:41 
  • 9) Configure the hostname
[root@hadoop000 ~]# vi /etc/hostname
spark000
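On CentOS 7 the same change can also be applied immediately with hostnamectl, and hostname confirms it (an alternative to editing the file and rebooting; not part of the original steps):
[root@hadoop000 ~]# hostnamectl set-hostname spark000
[root@hadoop000 ~]# hostname
spark000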
  • The remaining spark000 name substitutions for a cluster-style configuration are below; this is only a single node, but I set them up anyway.

[root@hadoop000 sysconfig]# pwd
/etc/sysconfig
[root@hadoop000 sysconfig]# vi network
# Created by anaconda
NETWORKING=yes
HOSTNAME=spark000

[root@hadoop000 ~]# cd /etc
[root@hadoop000 etc]# vi hosts
192.168.131.66 spark000
192.168.131.66 localhost
127.0.0.1   localhost localhost.localdomain localhost4.localdomain4
::1         localhost localhost.localdomain localhost6.localdomain6

[hadoop@hadoop000 hadoop]$ pwd
/home/hadoop/app/hadoop-2.6.0-cdh5.16.2/etc/hadoop
  • 10) Create a directory for the local Maven repository
[hadoop@spark000 ~]$ mkdir maven_repos

 

  • JDK
  • 1) Place the jdk-8u202-linux-x64.tar.gz package in the software folder
  • 2) Extract the jdk1.8.0_202 archive into the app directory
  • 3) Configure environment variables in ~/.bash_profile
  • 4) Source ~/.bash_profile so the changes take effect
  • 5) Test by printing JAVA_HOME
  • 6) Check the Java version
[hadoop@spark000 ~]$ cd software/
[hadoop@spark000 software]$ ls
jdk-8u202-linux-x64.tar.gz
[hadoop@spark000 software]$ tar -zxvf jdk-8u202-linux-x64.tar.gz -C ~/app/
[hadoop@spark000 ~]$ cd app/
[hadoop@spark000 app]$ ls
jdk1.8.0_202
[hadoop@spark000 app]$ cd jdk1.8.0_202
[hadoop@spark000 jdk1.8.0_202]$ pwd
/home/hadoop/app/jdk1.8.0_202
[hadoop@spark000 jdk1.8.0_202]$ vi ~/.bash_profile
PATH=$PATH:$HOME/.local/bin:$HOME/bin
export JAVA_HOME=/home/hadoop/app/jdk1.8.0_202
export PATH=$JAVA_HOME/bin:$PATH
[hadoop@spark000 jdk1.8.0_202]$ source ~/.bash_profile
[hadoop@spark000 jdk1.8.0_202]$ echo $JAVA_HOME
/home/hadoop/app/jdk1.8.0_202
[hadoop@spark000 jdk1.8.0_202]$ java -version
java version "1.8.0_202"
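
As a quick sanity check (not part of the original steps), which java should now resolve to this JDK, because $JAVA_HOME/bin was prepended to PATH:
[hadoop@spark000 jdk1.8.0_202]$ which java
/home/hadoop/app/jdk1.8.0_202/bin/java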

 

  • Scala
  • 1) Download site: Download | The Scala Programming Language (scala-lang.org)
  • 2) Place the scala-2.12.10.tgz package in the software folder
  • 3) Extract the scala-2.12.10 archive into the app directory
  • 4) Configure environment variables in ~/.bash_profile
  • 5) Source ~/.bash_profile so the changes take effect
  • 6) Test by printing SCALA_HOME
  • 7) Launch the Scala REPL
[hadoop@spark000 ~]$ cd software/
[hadoop@spark000 software]$ ls
scala-2.12.10.tgz
[hadoop@spark000 software]$ tar -zxvf scala-2.12.10.tgz -C ~/app/
[hadoop@spark000 app]$ ls
scala-2.12.10
[hadoop@spark000 app]$ cd scala-2.12.10/
[hadoop@spark000 scala-2.12.10]$ pwd
/home/hadoop/app/scala-2.12.10
[hadoop@spark000 scala-2.12.10]$ vi ~/.bash_profile
export SCALA_HOME=/home/hadoop/app/scala-2.12.10
export PATH=$SCALA_HOME/bin:$PATH
[hadoop@spark000 scala-2.12.10]$ source ~/.bash_profile
[hadoop@spark000 scala-2.12.10]$ echo $SCALA_HOME
/home/hadoop/app/scala-2.12.10
[hadoop@spark000 scala-2.12.10]$ scala
Welcome to Scala 2.12.10 (Java HotSpot(TM) 64-Bit Server VM, Java 1.8.0_202).
Type in expressions for evaluation. Or try :help.

scala> 1 + 2
res0: Int = 3 
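
To leave the REPL, type :quit (or press Ctrl+D):
scala> :quit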

 

  • Maven
  • 1) Download site: Maven – Welcome to Apache Maven
  • 2) Place the apache-maven-3.6.3-bin.tar.gz package in the software folder
  • 3) Extract the maven-3.6.3 archive into the app directory
  • 4) Configure environment variables in ~/.bash_profile
  • 5) Source ~/.bash_profile so the changes take effect
  • 6) Test by printing MAVEN_HOME
  • 7) Configure settings.xml, pointing the local repository at the maven_repos folder created earlier
[hadoop@spark000 ~]$ cd software/
[hadoop@spark000 software]$ ls
apache-maven-3.6.3-bin.tar.gz
[hadoop@spark000 software]$ tar -zxvf apache-maven-3.6.3-bin.tar.gz -C ~/app/
[hadoop@spark000 ~]$ cd app/
[hadoop@spark000 app]$ ls
apache-maven-3.6.3 
[hadoop@spark000 app]$ cd apache-maven-3.6.3
[hadoop@spark000 apache-maven-3.6.3]$ pwd
/home/hadoop/app/apache-maven-3.6.3
[hadoop@spark000 apache-maven-3.6.3]$ vi ~/.bash_profile
export MAVEN_HOME=/home/hadoop/app/apache-maven-3.6.3
export PATH=$MAVEN_HOME/bin:$PATH
[hadoop@spark000 apache-maven-3.6.3]$ source ~/.bash_profile
[hadoop@spark000 apache-maven-3.6.3]$ mvn -version
Apache Maven 3.6.3
Maven home: /home/hadoop/app/apache-maven-3.6.3
[hadoop@spark000 apache-maven-3.6.3]$ cd conf/
[hadoop@spark000 conf]$ ls
logging  settings.xml  toolchains.xml
[hadoop@spark000 conf]$ cd
[hadoop@spark000 ~]$ cd maven_repos/
[hadoop@spark000 maven_repos]$ pwd
/home/hadoop/maven_repos
[hadoop@spark000 maven_repos]$ cd $MAVEN_HOME
[hadoop@spark000 apache-maven-3.6.3]$ cd conf/
[hadoop@spark000 conf]$ vi settings.xml
<localRepository>/home/hadoop/maven_repos</localRepository>
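
To double-check that Maven really picks up the new local repository, the help plugin can print the resolved value (a hedged verification, assuming the plugin can be downloaded on first use):
[hadoop@spark000 conf]$ mvn help:evaluate -Dexpression=settings.localRepository -q -DforceStdout
/home/hadoop/maven_repos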

 

  • Hadoop
  • 1) Download site: archive.cloudera.com
  • 2) CDH release: cdh5.16.2
  • 3) Full Hadoop package name: hadoop-2.6.0-cdh5.16.2.tar.gz
  • 4) Extract hadoop-2.6.0-cdh5.16.2.tar.gz into the app directory
  • 5) Configure environment variables in ~/.bash_profile
  • 6) Source ~/.bash_profile so the changes take effect
  • 7) Configure hadoop-env.sh
  • 8) Configure core-site.xml
  • 9) Configure hdfs-site.xml
  • 10) Configure mapred-site.xml
  • 11) Configure yarn-site.xml
  • 12) Configure slaves
  • 13) Format the filesystem (format only once)
  • The output should contain "Storage directory /home/hadoop/app/tmp/dfs5162/dfs/name has been successfully formatted"
  • 14) Start the HDFS daemons: namenode first, then datanode
  • 15) If anything goes wrong, check the corresponding namenode/datanode logs
  • 16) Start YARN: resourcemanager first, then nodemanager
  • 17) Test
  • 18) Shut down in order: nodemanager, resourcemanager, datanode, namenode
  • 19) If you want to use start-dfs.sh and start-yarn.sh instead, first set up passwordless SSH with ssh-keygen -t rsa (a minimal sketch follows this list)
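A minimal passwordless-SSH sketch for this single node, assuming the default key paths (only needed for start-dfs.sh / start-yarn.sh; the per-daemon scripts below work without it):
[hadoop@spark000 ~]$ ssh-keygen -t rsa
[hadoop@spark000 ~]$ cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
[hadoop@spark000 ~]$ chmod 600 ~/.ssh/authorized_keys
[hadoop@spark000 ~]$ ssh spark000 date
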
[hadoop@spark000 ~]$ cd software/
[hadoop@spark000 software]$ ls
hadoop-2.6.0-cdh5.16.2.tar.gz
[hadoop@spark000 software]$ tar -zxvf hadoop-2.6.0-cdh5.16.2.tar.gz -C ~/app/
[hadoop@spark000 app]$ ls
hadoop-2.6.0-cdh5.16.2
[hadoop@spark000 app]$ cd hadoop-2.6.0-cdh5.16.2/
[hadoop@spark000 hadoop-2.6.0-cdh5.16.2]$ pwd
/home/hadoop/app/hadoop-2.6.0-cdh5.16.2
[hadoop@spark000 hadoop-2.6.0-cdh5.16.2]$ vi ~/.bash_profile
export HADOOP_HOME=/home/hadoop/app/hadoop-2.6.0-cdh5.16.2
export PATH=$HADOOP_HOME/bin:$PATH
[hadoop@spark000 hadoop-2.6.0-cdh5.16.2]$ source ~/.bash_profile

[hadoop@spark000 hadoop]$ pwd
/home/hadoop/app/hadoop-2.6.0-cdh5.16.2/etc/hadoop
[hadoop@spark000 hadoop]$ echo $JAVA_HOME
/home/hadoop/app/jdk1.8.0_202

[hadoop@spark000 hadoop]$ cd /home/hadoop/app/hadoop-2.6.0-cdh5.16.2/etc/hadoop

[hadoop@spark000 hadoop]$ vi hadoop-env.sh
#export JAVA_HOME=${JAVA_HOME}
export JAVA_HOME=/home/hadoop/app/jdk1.8.0_202

[hadoop@spark000 hadoop]$ vi core-site.xml
<property>
        <name>fs.default.name</name>
        <value>hdfs://spark000:8020</value>
</property>
<property>
        <name>hadoop.tmp.dir</name>
        <value>/home/hadoop/app/tmp/dfs5162</value>
</property>

[hadoop@spark000 hadoop]$ vi hdfs-site.xml
<property>
  <name>dfs.replication</name>
  <value>1</value>
</property>
<property>
    <name>hadoop.tmp.dir</name>
    <value>/home/hadoop/app/tmp</value>
</property>

[hadoop@spark000 hadoop]$ cp mapred-site.xml.template mapred-site.xml
[hadoop@spark000 hadoop]$ vi mapred-site.xml
<property>
  <name>mapreduce.framework.name</name>
  <value>yarn</value>
</property>

[hadoop@spark000 hadoop]$ vi yarn-site.xml
<property>
  <name>yarn.nodemanager.aux-services</name>
  <value>mapreduce_shuffle</value>
</property>
<property>
  <name>yarn.nodemanager.local-dirs</name>
  <value>/home/hadoop/app/tmp/nm-local-dir</value>
 </property>

[hadoop@spark000 hadoop]$ vi slaves
spark000

[hadoop@spark000 bin]$ pwd
/home/hadoop/app/hadoop-2.6.0-cdh5.16.2/bin
[hadoop@spark000 bin]$ ./hdfs namenode -format

[hadoop@spark000 sbin]$ pwd
/home/hadoop/app/hadoop-2.6.0-cdh5.16.2/sbin
[hadoop@spark000 sbin]$ ./hadoop-daemon.sh start namenode
[hadoop@spark000 sbin]$ jps
8825 Jps
8751 NameNode

[hadoop@spark000 sbin]$ ./hadoop-daemon.sh start datanode
starting datanode, logging to /home/hadoop/app/hadoop-2.6.0-cdh5.16.2/logs/hadoop-hadoop-datanode-spark000.out
[hadoop@spark000 sbin]$ jps
9762 NameNode
9891 DataNode
9977 Jps

[hadoop@spark000 sbin]$ tail -200f /home/hadoop/app/hadoop-2.6.0-cdh5.16.2/logs/hadoop-hadoop-namenode-spark000.log
[hadoop@spark000 sbin]$ tail -200f /home/hadoop/app/hadoop-2.6.0-cdh5.16.2/logs/hadoop-hadoop-datanode-spark000.log

[hadoop@spark000 sbin]$ ./yarn-daemon.sh start resourcemanager
starting resourcemanager, logging to /home/hadoop/app/hadoop-2.6.0-cdh5.16.2/logs/yarn-hadoop-resourcemanager-spark000.out
[hadoop@spark000 sbin]$ ./yarn-daemon.sh start nodemanager
starting nodemanager, logging to /home/hadoop/app/hadoop-2.6.0-cdh5.16.2/logs/yarn-hadoop-nodemanager-spark000.out
[hadoop@spark000 sbin]$ jps
9762 NameNode
12050 Jps
9891 DataNode
11635 ResourceManager
11901 NodeManager

[hadoop@spark000 sbin]$ hadoop fs -mkdir /test
[hadoop@spark000 hadoop-2.6.0-cdh5.16.2]$ hadoop fs -put NOTICE.txt /test
[hadoop@spark000 hadoop-2.6.0-cdh5.16.2]$ hadoop fs -ls /test

[hadoop@spark000 mapreduce]$ pwd
/home/hadoop/app/hadoop-2.6.0-cdh5.16.2/share/hadoop/mapreduce
[hadoop@spark000 mapreduce]$ hadoop jar hadoop-mapreduce-examples-2.6.0-cdh5.16.2.jar pi 2 3
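
Reading the uploaded file back is another quick check, and assuming the default Hadoop 2.x ports and an open firewall, the web UIs should also be reachable at http://192.168.131.66:50070 (NameNode) and http://192.168.131.66:8088 (ResourceManager):
[hadoop@spark000 ~]$ hadoop fs -cat /test/NOTICE.txt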

[hadoop@spark000 sbin]$ ./yarn-daemon.sh stop nodemanager
[hadoop@spark000 sbin]$ ./yarn-daemon.sh stop resourcemanager
[hadoop@spark000 sbin]$ ./hadoop-daemon.sh stop datanode
[hadoop@spark000 sbin]$ ./hadoop-daemon.sh stop namenode

  

  • ZooKeeper
  • 1) Download site: https://archive.cloudera.com/p/cdh5/cdh/5/
  • 2) Try downloading zookeeper-3.4.5-cdh5.16.2.tar.gz with wget (a hedged sketch follows this list)
  • 3) Extract it into app
  • 4) Update ~/.bash_profile and source it
  • 5) Configure zoo.cfg
  • 6) ZooKeeper's default port is 2181; look for clientPort in zoo.cfg
  • 7) Start ZooKeeper: ./zkServer.sh start
  • 8) Connect with the client: ./zkCli.sh
  • 9) Inspect a znode: get /zookeeper/quota
  • 10) Exit the client: quit
  • 11) Stop the server: ./zkServer.sh stop
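A minimal download-and-extract sketch for steps 2) and 3), assuming the tarball sits directly under the archive path listed above (the exact URL, and whether Cloudera credentials are required, may differ):
[hadoop@spark000 software]$ wget https://archive.cloudera.com/p/cdh5/cdh/5/zookeeper-3.4.5-cdh5.16.2.tar.gz
[hadoop@spark000 software]$ tar -zxvf zookeeper-3.4.5-cdh5.16.2.tar.gz -C ~/app/
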
[hadoop@spark000 app]$ ls
zookeeper-3.4.5-cdh5.16.2

[hadoop@spark000 zookeeper-3.4.5-cdh5.16.2]$ pwd
/home/hadoop/app/zookeeper-3.4.5-cdh5.16.2

[hadoop@spark000 zookeeper-3.4.5-cdh5.16.2]$ vi ~/.bash_profile
export ZK_HOME=/home/hadoop/app/zookeeper-3.4.5-cdh5.16.2
export PATH=$ZK_HOME/bin:$PATH
[hadoop@spark000 zookeeper-3.4.5-cdh5.16.2]$ source ~/.bash_profile

[hadoop@spark000 zookeeper-3.4.5-cdh5.16.2]$ cd conf/
[hadoop@spark000 conf]$ cp zoo_sample.cfg zoo.cfg
[hadoop@spark000 conf]$ vi zoo.cfg
dataDir=/home/hadoop/app/tmp/zookeeper
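
ZooKeeper normally creates this data directory on first start, but creating it up front avoids permission surprises (an optional precaution, not in the original steps):
[hadoop@spark000 conf]$ mkdir -p /home/hadoop/app/tmp/zookeeper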

[hadoop@spark000 bin]$ pwd
/home/hadoop/app/zookeeper-3.4.5-cdh5.16.2/bin

[hadoop@spark000 bin]$ ./zkServer.sh start
[hadoop@spark000 bin]$ jps
17780 Jps
17758 QuorumPeerMain
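
The status sub-command is an extra check (not in the original steps); for this single-node setup it should report Mode: standalone:
[hadoop@spark000 bin]$ ./zkServer.sh status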

[hadoop@spark000 bin]$ ./zkCli.sh
[zk: localhost:2181(CONNECTED) 0] ls /
[zk: localhost:2181(CONNECTED) 1] ls /zookeeper
[zk: localhost:2181(CONNECTED) 2] ls /zookeeper/quota

[zk: localhost:2181(CONNECTED) 3] get /zookeeper/quota
[zk: localhost:2181(CONNECTED) 4] quit

[hadoop@spark000 bin]$ ./zkServer.sh stop