Installation and Configuration

Greenplum Cluster Deployment (6.11.2)

1. Resource Planning

1.1 Environment information

Download URL:
https://network.pivotal.io/products/vmware-greenplum#/releases/1462218

Item                      Configuration
OS kernel version         Linux 3.10.0-1127.el7.x86_64
OS resources              4 CPU cores, 8 GB RAM, 40 GB data disk
Greenplum version         greenplum-db-6.11.2

1.2 Resource plan

Host          IP                 Role          Plan
mdw       172.16.104.11         master       master node
sdw1      172.16.104.12         segment      seg1, seg2, mirror1, mirror2
sdw2      172.16.104.13         segment      seg3, seg4, mirror3, mirror4
sdw3      172.16.104.14         segment      seg5, seg6, mirror5, mirror6, master_standby

2. Operating System Configuration

2.1 Set the hostname

In Greenplum, the master host is conventionally named mdw and the segment hosts sdw. In this step, set the hostname on every host and then edit its /etc/hosts file.

# hostnamectl  set-hostname sdw1        // set the hostname
# hostname                              // verify the hostname
# reboot                                // reboot so the new hostname takes effect everywhere
# vi /etc/hosts                         // edit the hosts file
172.16.104.11 mdw
172.16.104.12 sdw1
172.16.104.13 sdw2
172.16.104.14 sdw3

Alternatively, append the entries with a heredoc:

cat >> /etc/hosts  << EOF
172.16.104.11 mdw
172.16.104.12 sdw1
172.16.104.13 sdw2
172.16.104.14 sdw3
EOF

2.2 Disable the firewall and SELinux

1. Disable the firewall

# systemctl status firewalld                // check the firewall status
# systemctl stop firewalld                  // stop the firewall
# systemctl disable firewalld               // keep it disabled at boot

2. Disable SELinux

# setenforce 0 && sed -i 's/SELINUX=enforcing/SELINUX=disabled/g' /etc/selinux/config           // disable SELinux
# getenforce
# sestatus                  // check SELinux status

2.3 Mount the data disks

Greenplum supports only the XFS file system for its data directories; mount the data disks with options that do not record file access times (noatime). An example of creating and mounting such a file system follows the comparison below.

cat /etc/fstab 
/dev/mapper/vgdata-data               /gpdata            xfs     rw,noatime,nodev,nobarrier,inode64,allocsize=16m        1 2


Compared with ext4, XFS has the following advantages:

XFS scales far better than ext4; ext4 performance drops noticeably once a single directory holds more than about 2 million files.
ext4 is a very stable traditional file system, but as storage requirements keep growing it is gradually falling behind.
For historical reasons ext4 uses 32-bit inode numbers, so it supports at most a little over 4 billion files, and a single file is limited to 16 TB.
XFS uses 64-bit space management, so a file system can scale to the EB range, and XFS manages its metadata with B+ trees.
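
For reference, a minimal sketch of creating and mounting such an XFS file system; the device name /dev/mapper/vgdata-data and the /gpdata mount point come from the fstab example above, so adjust them to your environment:

# mkfs.xfs /dev/mapper/vgdata-data
# mkdir -p /gpdata
# mount -o rw,noatime,nodev,nobarrier,inode64,allocsize=16m /dev/mapper/vgdata-data /gpdata
# mount | grep /gpdata          // confirm the mount options took effect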

2.4 Configure sysctl.conf

-- Edit the sysctl.conf file
# cat /etc/sysctl.conf

# kernel.shmall = _PHYS_PAGES / 2 =echo $(expr $(getconf _PHYS_PAGES) / 2)
kernel.shmall = 483888
# kernel.shmmax = kernel.shmall * PAGE_SIZE =echo $(expr $(getconf _PHYS_PAGES) / 2 \* $(getconf PAGE_SIZE))
kernel.shmmax = 1982005248
kernel.shmmni = 4096
vm.overcommit_memory = 2
vm.overcommit_ratio = 95
net.ipv4.ip_local_port_range = 10000 65535
kernel.sem = 500 2048000 200 40960
kernel.sysrq = 1
kernel.core_uses_pid = 1
kernel.msgmnb = 65536
kernel.msgmax = 65536
kernel.msgmni = 2048
net.ipv4.tcp_syncookies = 1
net.ipv4.conf.default.accept_source_route = 0
net.ipv4.tcp_max_syn_backlog = 4096
net.ipv4.conf.all.arp_filter = 1
net.core.netdev_max_backlog = 10000
net.core.rmem_max = 2097152
net.core.wmem_max = 2097152
vm.swappiness = 10
vm.zone_reclaim_mode = 0
vm.dirty_expire_centisecs = 500
vm.dirty_writeback_centisecs = 100
#vm.dirty_background_ratio = 0
#vm.dirty_ratio = 0
#vm.dirty_background_bytes = 1610612736
#vm.dirty_bytes = 4294967296
vm.dirty_background_ratio = 3
vm.dirty_ratio = 10

# If the system has more than 64 GB of RAM
# vm.dirty_background_ratio = 0
# vm.dirty_ratio = 0
# vm.dirty_background_bytes = 1610612736 # 1.5GB
# vm.dirty_bytes = 4294967296 # 4GB

# If the system has 64 GB of RAM or less
vm.dirty_background_ratio = 3
vm.dirty_ratio = 10


-- Compute and append the min_free_kbytes value (3% of total memory)
# awk 'BEGIN {OFMT = "%.0f";} /MemTotal/ {print "vm.min_free_kbytes =", $2 * .03;}' /proc/meminfo >> /etc/sysctl.conf

-- Apply the parameters
# sysctl -p
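
The kernel.shmall and kernel.shmmax values above are host-specific; they can be recomputed for the current machine with getconf, exactly as the comments in the file describe:

# echo "kernel.shmall = $(expr $(getconf _PHYS_PAGES) / 2)"
# echo "kernel.shmmax = $(expr $(getconf _PHYS_PAGES) / 2 \* $(getconf PAGE_SIZE))"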

2.5 Raise system resource limits

Increase the operating-system limits on open files and processes.

-- On RHEL/CentOS 6 edit /etc/security/limits.d/90-nproc.conf; on RHEL/CentOS 7 edit /etc/security/limits.d/20-nproc.conf
# cat /etc/security/limits.d/20-nproc.conf
*          soft    nproc     131072
root       soft    nproc     unlimited

-- The Linux pam_limits module sets user limits by reading limits.conf; the change takes effect after a reboot (or a new login)
# cat /etc/security/limits.conf
* soft nofile 1048576                    // * means all users; nofile is the maximum number of open files
* hard nofile 1048576
* soft nproc 1048576                     // nproc is the maximum number of processes
* hard nproc 1048576
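
After the next login, the limits can be verified with ulimit:

$ ulimit -n                 // maximum open files, should report 1048576
$ ulimit -u                 // maximum user processes, should report 1048576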

2.6 Disk I/O read-ahead

A data warehouse mainly stores historical data and performs large volumes of I/O (inserts, deletes, updates and queries), so a larger read-ahead block gives better read performance; 16384 is the minimum value Greenplum requires.
Note that the read-ahead parameter must be changed on every disk that stores data; use the check below to see which devices still need changing.

(1) Check the disk read-ahead setting; every value should be 16384:
blockdev --getra /dev/sd*
blockdev --getra /dev/dm-*

(2) Set it
Add these two lines to /etc/rc.d/rc.local:
vi /etc/rc.d/rc.local
blockdev --setra 16384 /dev/sd*
blockdev --setra 16384 /dev/dm-*

(3) Verify:
After the system reboots, run
blockdev --getra /dev/sd*
blockdev --getra /dev/dm-*
All values should be 16384.

Distribute to the other nodes (bulkcp.sh, /tmp/gpconfig/2rclocal.conf and allnodes.txt are site-specific helper files holding the two blockdev lines and the host list):
sh bulkcp.sh /tmp/gpconfig/2rclocal.conf /tmp/gpconfig/2rclocal.conf
gpssh -f /data1/gpinstall/config/allnodes.txt -e 'cat /tmp/gpconfig/2rclocal.conf >> /etc/rc.d/rc.local'
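
Note that on CentOS 7 /etc/rc.d/rc.local is not executable by default, so the lines added above will not run at boot unless the file is made executable:

# chmod +x /etc/rc.d/rc.local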

2.7 Disk I/O scheduler

  • SSD disks: the noop or deadline policy is recommended;
  • Spinning disks: the deadline policy is recommended;
    Note: from kernel 4.x onward the default scheduler is none (equivalent to noop), so no change is needed.
-- Temporary (lost after reboot)
# more  /sys/block/sda/queue/scheduler
noop [deadline] cfq
# echo deadline > /sys/block/sda/queue/scheduler

-- Permanent
The deadline policy balances I/O scheduling with process scheduling, so waiting processes are not repeatedly starved.
Note: run this on every node.

(1) Check the current scheduling policy:

[root@master ~] # cat /sys/block/sda/queue/scheduler
noop anticipatory deadline [cfq]

[root@master ~] # cat /sys/block/fd0/queue/scheduler
noop anticipatory deadline [cfq]
[root@master ~]# dmesg | grep -i scheduler

(2) Change the policy:
(temporary; lost after reboot)

[root@master ~]# echo deadline > /sys/block/sda/queue/scheduler
[root@master ~]# echo deadline > /sys/block/fd0/queue/scheduler

(Permanent; survives reboot)
Edit /boot/grub/menu.lst, find the kernel /vmlinuz-xxx line, and append elevator=deadline at the end:

vi /boot/grub/menu.lst
elevator=deadline

(3) Verify:
After the system boots normally, run cat /sys/block/*/queue/scheduler; you should see: noop anticipatory [deadline] cfq
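
On CentOS 7 there is no /boot/grub/menu.lst; a sketch of making the setting permanent with grubby instead (applies the argument to every installed kernel entry):

# grubby --update-kernel=ALL --args="elevator=deadline"
# reboot                    // verify afterwards with cat /sys/block/sda/queue/scheduler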

2.8 Disable Transparent Huge Pages

Transparent HugePages has been enabled by default since the SLES 11, RHEL 6, OL 6 and UEK2 kernels. It can cause node restarts and Greenplum performance problems.
grep AnonHugePages /proc/meminfo
If the output shows an AnonHugePages value greater than 0 kB, the system is using Transparent HugePages.

# cat /sys/kernel/mm/transparent_hugepage/enabled
[always] never
If the check above shows [always], Transparent HugePages is in use as well.
How to disable it:
Method 1:
Edit /etc/grub.conf and append transparent_hugepage=never to the kernel line:
title Oracle Linux Server (2.6.32-300.25.1.el6uek.x86_64)
root (hd0,0)
kernel /vmlinuz-2.6.32-300.25.1.el6uek.x86_64 ro root=LABEL=/ transparent_hugepage=never
 initrd /initramfs-2.6.32-300.25.1.el6uek.x86_64.img

Method 2: vi /etc/rc.d/boot.local
[SUSE]
if test -f /sys/kernel/mm/transparent_hugepage/enabled; then
   echo never > /sys/kernel/mm/transparent_hugepage/enabled
fi
if test -f /sys/kernel/mm/transparent_hugepage/defrag; then
   echo never > /sys/kernel/mm/transparent_hugepage/defrag
fi

[RHEL]
if test -f /sys/kernel/mm/redhat_transparent_hugepage/enabled; then
   echo never > /sys/kernel/mm/redhat_transparent_hugepage/enabled
fi
if test -f /sys/kernel/mm/redhat_transparent_hugepage/defrag; then
   echo never > /sys/kernel/mm/redhat_transparent_hugepage/defrag
fi

CentOS 7

Disable Transparent Huge Pages:

echo never > /sys/kernel/mm/transparent_hugepage/enabled
echo never > /sys/kernel/mm/transparent_hugepage/defrag

Then add the two commands above to /etc/rc.local.

Alternatively, add transparent_hugepage=never to the GRUB configuration.
Before CentOS 7:

Append transparent_hugepage=never to the end of the kernel line in /etc/grub.conf:
kernel /vmlinuz-2.6.32-220.el6.x86_64 ro ...... KEYTABLE=us rd_NO_DM transparent_hugepage=never

CentOS 7 and later:
# Append transparent_hugepage=never to the end of the GRUB_CMDLINE_LINUX line in /etc/default/grub
GRUB_CMDLINE_LINUX="crashkernel=auto …… rhgb quiet transparent_hugepage=never"
# Then regenerate the GRUB configuration with grub2-mkconfig
grub2-mkconfig -o /boot/grub2/grub.cfg
cat /boot/grub2/grub.cfg

2.9 Configure NTP time synchronization

# vi /etc/ntp.conf
# Add at the top of the server entries:
server mdw prefer               // prefer the master node
server smdw                     // then the standby node; if there is no standby, point this at the data-center time server
service ntpd restart            // restart ntpd after the change
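
Once ntpd is running, a quick way to confirm synchronization on each node:

# ntpq -p                   // the mdw peer should be listed, and marked with * once selected as the sync source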

2.10 NUMA can cause performance problems; disabling it is recommended

Before CentOS 7:
Append numa=off to the end of the kernel line in /etc/grub.conf (the example also keeps elevator=deadline from section 2.7):
kernel /vmlinuz-2.6.32-220.el6.x86_64 ro ...... KEYTABLE=us rd_NO_DM numa=off elevator=deadline

CentOS 7:
# Check: dmesg | grep scheduler
On CentOS 7 the default I/O scheduler is already deadline, so no change is needed:
cat /sys/block/sd*/queue/scheduler 
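
A quick way to confirm whether NUMA is active (numactl may need to be installed first with yum install -y numactl):

# numactl --hardware        // with numa=off the output shows only a single node (node 0)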

3. Greenplum Cluster Setup

3.1 Install dependency packages

 yum install -y apr apr-util bash bzip2 curl krb5 libcurl libevent libxml2 libyaml zlib openldap openssh openssl openssl-libs perl readline rsync R sed tar zip krb5-devel

Check that the packages are installed:
rpm -q --qf '%{NAME}- %{VERSION}-%{RELEASE}  (%{ARCH}) \n' \
apr \
apr-util \
bash \
bzip2 \
curl \
krb5 \
libcurl \
libevent \
libxml2 \
libyaml \
zlib \
openldap \
openssh \
openssl \
openssl-libs \
perl \
readline \
rsync \
R \
sed \
tar \
zip \
gcc

3.2 Create the database user (all hosts)

groupadd gpadmin -g 3030
useradd gpadmin -u 3030 -g 3030
passwd gpadmin
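
The same UID/GID must exist on every host. One way to create the account on the segment hosts from mdw (a sketch, assuming root can already SSH to them; run passwd on each host afterwards):

# for host in sdw1 sdw2 sdw3; do ssh root@${host} "groupadd -g 3030 gpadmin && useradd -u 3030 -g 3030 gpadmin"; done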

3.3 Install the Greenplum software

By default the package installs under /usr/local, and the greenplum-db symbolic link is created automatically.
yum -y install greenplum-db-6.12.0-rhel7-x86_64.rpm

Set ownership:
chown -R gpadmin:gpadmin /usr/local/greenplum-db

Directory layout:
greenplum_path.sh: Greenplum environment variable file
GPDB-LICENSE.txt: Greenplum license agreement
bin: management tools, client programs and server programs
demo: demo programs
docs: documentation
etc: sample OpenSSL configuration
ext: bundled programs used by some GP utilities
include: C header files
lib: library files
sbin: supporting and internal scripts and programs
share: shared files

Check the installation across all hosts:
# gpssh -f /home/gpadmin/allnodes -e ls -l $GPHOME

3.4 Configure environment variables (gpadmin user)

Add the following to .bashrc and .bash_profile:

vi .bashrc
vi .bash_profile
source /usr/local/greenplum-db/greenplum_path.sh
MASTER_DATA_DIRECTORY=/data/greenplum/data/master/gpseg-1
export MASTER_DATA_DIRECTORY

Set the Greenplum environment variables on all servers
-- Set the environment variables on the current server
# su - gpadmin
$ cat >> /home/gpadmin/.bash_profile << EOF
> source /usr/local/greenplum-db/greenplum_path.sh
> EOF

-- Sync the environment files to every server in the cluster
$ gpscp -f /usr/local/greenplum-db/seg_host /home/gpadmin/.bash_profile  gpadmin@=:/home/gpadmin/.bash_profile
$ gpscp -f /usr/local/greenplum-db/seg_host /home/gpadmin/.bashrc gpadmin@=:/home/gpadmin/.bashrc
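
Once passwordless SSH for gpadmin is in place (section 3.6), a quick check that the files landed on every segment host:

$ gpssh -f /usr/local/greenplum-db/seg_host -e 'tail -n 2 /home/gpadmin/.bash_profile'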


3.5 Create the all_host and seg_host files

# cat all_host              // all hosts in the cluster
mdw
sdw1
sdw2
sdw3
# cat seg_host              // segment hosts only
sdw1
sdw2
sdw3

3.6 Set up passwordless SSH trust

# ssh-keygen                // generate a key pair
# ssh-copy-id sdw1          // copy this server's public key to the other servers in the cluster
# ssh-copy-id sdw2
# ssh-copy-id sdw3

-- Use the gpssh-exkeys tool to establish N-to-N passwordless trust
# source /usr/local/greenplum-db/greenplum_path.sh
# gpssh-exkeys -f all_host
[STEP 1 of 5] create local ID and authorize on local host
  ... /root/.ssh/id_rsa file exists ... key generation skipped

[STEP 2 of 5] keyscan all hosts and update known_hosts file

[STEP 3 of 5] retrieving credentials from remote hosts
  ... send to sdw1
  ... send to sdw2
  ... send to sdw3

[STEP 4 of 5] determine common authentication file content

[STEP 5 of 5] copy authentication files to all remote hosts
  ... finished key exchange with sdw1
  ... finished key exchange with sdw2
  ... finished key exchange with sdw3

[INFO] completed successfully

-- Verify the trust
#  gpssh -f /usr/local/greenplum-db/all_host -e 'ls /usr/local/'

Set up passwordless login for the gpadmin user

# su - gpadmin
$ source /usr/local/greenplum-db/greenplum_path.sh
$ ssh-keygen
$ ssh-copy-id sdw1
$ ssh-copy-id sdw2
$ ssh-copy-id sdw3
$ gpssh-exkeys -f /usr/local/greenplum-db/all_host
[STEP 1 of 5] create local ID and authorize on local host
  ... /home/gpadmin/.ssh/id_rsa file exists ... key generation skipped

[STEP 2 of 5] keyscan all hosts and update known_hosts file

[STEP 3 of 5] retrieving credentials from remote hosts
  ... send to sdw1
  ... send to sdw2
  ... send to sdw3

[STEP 4 of 5] determine common authentication file content

[STEP 5 of 5] copy authentication files to all remote hosts
  ... finished key exchange with sdw1
  ... finished key exchange with sdw2
  ... finished key exchange with sdw3

[INFO] completed successfully

3.7 Create the master data directory (master and standby)

# mkdir -p /data/greenplum/data/master
# chown gpadmin:gpadmin /data/greenplum/data/master
# source /usr/local/greenplum-db/greenplum_path.sh
# gpssh -h  sdw3 -e 'mkdir -p /data/greenplum/data/master'                 // create the master data directory on the standby node as well
# gpssh -h  sdw3 -e 'chown gpadmin:gpadmin /data/greenplum/data/master'

3.8 Create the segment data directories

# gpssh -f /usr/local/greenplum-db/seg_host -e 'mkdir -p /data/greenplum/data1/primary'
# gpssh -f /usr/local/greenplum-db/seg_host -e 'mkdir -p /data/greenplum/data1/mirror'
# gpssh -f /usr/local/greenplum-db/seg_host -e 'mkdir -p /data/greenplum/data2/primary'
# gpssh -f /usr/local/greenplum-db/seg_host -e 'mkdir -p /data/greenplum/data2/mirror'
# gpssh -f /usr/local/greenplum-db/seg_host -e 'chown -R gpadmin /data/greenplum/data*'
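
A quick check that the directories exist with the right owner on every segment host:

# gpssh -f /usr/local/greenplum-db/seg_host -e 'ls -ld /data/greenplum/data1 /data/greenplum/data2'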

3.9 Test cluster performance

1. Network performance test
# gpcheckperf -f /usr/local/greenplum-db/seg_host -r N -d /tmp
2. Disk I/O performance test
# gpcheckperf -f /usr/local/greenplum-db/seg_host -r ds -D   -d /data/greenplum/data1/primary
3. Cluster clock check
# gpssh -f /usr/local/greenplum-db/all_host -e 'date'
[ mdw] date
[ mdw] 2020年 10月 29日 星期四 23:13:53 CST
[sdw2] date
[sdw2] 2020年 10月 29日 星期四 23:13:53 CST
[sdw3] date
[sdw3] 2020年 10月 29日 星期四 23:13:53 CST
[sdw1] date
[sdw1] 2020年 10月 29日 星期四 23:13:53 CST

3.10 Edit the initialization configuration file

# su - gpadmin
$ mkdir -p /home/gpadmin/gpconfigs
$ cp $GPHOME/docs/cli_help/gpconfigs/gpinitsystem_config /home/gpadmin/gpconfigs/gpinitsystem_config

-- Edit the configuration file
$ vi /home/gpadmin/gpconfigs/gpinitsystem_config
ARRAY_NAME="Greenplum Data Platform"
SEG_PREFIX=gpseg
PORT_BASE=6000
declare -a DATA_DIRECTORY=(/data/greenplum/data1/primary /data/greenplum/data2/primary)         // primary directories on the segment hosts
MASTER_HOSTNAME=mdw
MASTER_DIRECTORY=/data/greenplum/data/master                 // master data directory
MASTER_PORT=5432
TRUSTED_SHELL=ssh
CHECK_POINT_SEGMENTS=8
ENCODING=UNICODE
MIRROR_PORT_BASE=7000
declare -a MIRROR_DATA_DIRECTORY=(/data/greenplum/data1/mirror /data/greenplum/data2/mirror)    // mirror directories on the segment hosts
DATABASE_NAME=db_init           // database created during initialization

3.11 Initialize the cluster

1. Run the initialization
gpinitsystem -c gpinitsystem_config -h seg_host -B 8
Initialization copies the files under the installation directory to the same directory on the other nodes (so this step could also be done manually with scp -r). When initialization succeeds, the database is started automatically.

2. If initialization fails, remove the partially initialized system before trying again
-d MASTER_DATA_DIRECTORY : removes the master and all segment data directories.
-f (force) : terminates all processes and forces the deletion.
$ gpdeletesystem -d /data/greenplum/data/master/gpseg-1 -f

3. Set the gpadmin password
# su - gpadmin
[gpadmin@mdw ~]$ psql
psql (9.4.24)
Type "help" for help.

db_init=# ALTER USER gpadmin WITH PASSWORD 'gpadmin';
ALTER ROLE

4. Connecting to individual nodes
-- Connect to the master node (utility mode)
[gpadmin@mdw ~]$ PGOPTIONS='-c gp_session_role=utility' psql -h mdw -p5432 -d postgres
psql (9.4.24)
Type "help" for help.

-- Connect to a segment node (utility mode)
postgres=# \q
[gpadmin@mdw ~]$ PGOPTIONS='-c gp_session_role=utility' psql -h sdw1 -p6000 -d postgres
psql (9.4.24)
Type "help" for help.

5. Adjust configuration files
1) Edit pg_hba.conf
# vi /data/greenplum/data/master/gpseg-1/pg_hba.conf
2) Edit postgresql.conf
3) Reload the modified configuration
gpstop -u 
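
As an illustration (the entries and values below are examples, not required settings): a pg_hba.conf entry that allows password logins from the cluster subnet, and a postgresql.conf parameter changed with gpconfig instead of editing the file by hand, followed by the reload:

-- example pg_hba.conf entry; adjust the CIDR to your network
host    all    gpadmin    172.16.104.0/24    md5
-- example parameter change via gpconfig, then reload
$ gpconfig -c log_min_duration_statement -v 1000
$ gpstop -u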

3.12 Activate the master standby

gpinitstandby -s sdw3

Check the replication status:
select * from pg_stat_replication;
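
gpstate can also report the standby master details and its synchronization state:

$ gpstate -f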

Errors encountered during initialization

20201029:23:27:02:018002 gpinitsystem:mdw:gpadmin-[INFO]:-Start Function PARALLEL_COUNT
20201029:23:27:02:018002 gpinitsystem:mdw:gpadmin-[INFO]:-End Function PARALLEL_COUNT
20201029:23:27:02:018002 gpinitsystem:mdw:gpadmin-[INFO]:-Start Function PARALLEL_COUNT
20201029:23:27:02:018002 gpinitsystem:mdw:gpadmin-[INFO]:-Start Function PARALLEL_WAIT
20201029:23:27:02:018002 gpinitsystem:mdw:gpadmin-[INFO]:-Waiting for parallel processes batch [1], please wait...
2020-10-29 23:27:06.644931 CST,,,p12262,th767985792,,,,0,,,seg0,,,,,"LOG","00000","registering background worker ""sweeper process""",,,,,,,,"RegisterBackgroundWorker","bgworker.c",774,
2020-10-29 23:27:06.664880 CST,,,p12262,th767985792,,,,0,,,seg0,,,,,"WARNING","01000","Better set max_prepared_transactions greater than max_connections",,,,,,,,"tmShmemInit","cdbtm.c",1064,
2020-10-29 23:27:06.837265 CST,,,p12262,th767985792,,,,0,,,seg0,,,,,"FATAL","XX000","could not create semaphores: No space left on device","Failed system call was semget(7000031, 17, 03600).","This error does *not* mean that you have run out of disk space.  It occurs when either the system limit for the maximum number of semaphore sets (SEMMNI), or the system wide maximum number of semaphores (SEMMNS), would be exceeded.  You need to raise the respective kernel parameter.  Alternatively, reduce PostgreSQL's consumption of semaphores by reducing its max_connections parameter.
The PostgreSQL documentation contains more information about configuring your system for PostgreSQL.",,,,,,"InternalIpcSemaphoreCreate","pg_sema.c",126,1    0xbed33c postgres errstart (elog.c:557)
2    0x9f91e8 postgres PGSemaphoreCreate (pg_sema.c:113)
3    0xa70ad1 postgres InitProcGlobal (proc.c:259)
4    0xa5c8e5 postgres CreateSharedMemoryAndSemaphores (ipci.c:290)
5    0xa0d78b postgres PostmasterMain (postmaster.c:1337)
6    0x6b2e21 postgres main (main.c:205)
7    0x7fe82a83f555 libc.so.6 __libc_start_main + 0xf5
8    0x6beabc postgres <symbol not found> + 0x6beabc

2020-10-29 23:27:06.669080 CST,,,p15613,th423786624,,,,0,,,seg2,,,,,"LOG","00000","registering background worker ""sweeper process""",,,,,,,,"RegisterBackgroundWorker","bgworker.c",774,
2020-10-29 23:27:06.760203 CST,,,p15613,th423786624,,,,0,,,seg2,,,,,"WARNING","01000","Better set max_prepared_transactions greater than max_connections",,,,,,,,"tmShmemInit","cdbtm.c",1064,
2020-10-29 23:27:06.936564 CST,,,p15613,th423786624,,,,0,,,seg2,,,,,"FATAL","XX000","could not create semaphores: No space left on device","Failed system call was semget(7000031, 17, 03600).","This error does *not* mean that you have run out of disk space.  It occurs when either the system limit for the maximum number of semaphore sets (SEMMNI), or the system wide maximum number of semaphores (SEMMNS), would be exceeded.  You need to raise the respective kernel parameter.  Alternatively, reduce PostgreSQL's consumption of semaphores by reducing its max_connections parameter.
The PostgreSQL documentation contains more information about configuring your system for PostgreSQL.",,,,,,"InternalIpcSemaphoreCreate","pg_sema.c",126,1    0xbed33c postgres errstart (elog.c:557)
2    0x9f91e8 postgres PGSemaphoreCreate (pg_sema.c:113)
3    0xa70ad1 postgres InitProcGlobal (proc.c:259)
4    0xa5c8e5 postgres CreateSharedMemoryAndSemaphores (ipci.c:290)
5    0xa0d78b postgres PostmasterMain (postmaster.c:1337)
6    0x6b2e21 postgres main (main.c:205)
7    0x7f2615ffe555 libc.so.6 __libc_start_main + 0xf5
8    0x6beabc postgres <symbol not found> + 0x6beabc

Solution:
Reduce the following values; as the error message itself notes, raising the semaphore limits in kernel.sem or lowering max_connections would also address the SEMMNI/SEMMNS limit.
kernel.shmall = 483888
kernel.shmmax = 17179869184
