PG-流复制备份恢复工具(barman)

Barman

Barman (Backup and Recovery Manager) 是一个用Python语言实现的PostgreSQL灾难恢复管理工具,它由第二象限公司(2ndQuadrant)开源并维护。

http://docs.pgbarman.org/release/2.13/

功能

备份元数据、增量备份、保留策略、远程回复、WAL文件归档压缩和备份

分类

  • 仅流复制
  • 流复制+ssh/rsync

结构图

image-20210916151441528

Backup via rsync/SSH with WAL streaming (../../../assets/barman-architecture-scenario2b.png)

安装配置

依赖

  • Python >= 3.4
  • Python modules:
    • argcomplete
    • argh >= 0.21.2
    • psycopg2 >= 2.4.2
    • python-dateutil
    • setuptools
  • PostgreSQL >= 8.3
  • rsync >= 3.0.4 (optional for PostgreSQL >= 9.2)

创建用户(barman)

useradd -m -d /ups/app/postgresql/barman barman

这里

创建目录

mkdir -p /ups/app/postgresql/barman/{python,log,bkdir}
chown -R barman.barman /ups/app/postgresql/barman

安装依赖软件(barman用户执行)

python
su - barman

tar -xf Python-3.8.2.tgz
cd Python-3.8.2
./configure --prefix=/ups/app/postgresql/barman/python --enable-optimizations
make -j2 && make -j2 install

/ups/app/postgresql/barman/python/bin/pip3 install --upgrade pip
/ups/app/postgresql/barman/python/bin/python3 -V
安装argcomplete
/ups/app/postgresql/barman/python/bin/pip3 install argcomplete-1.12.3-py2.py3-none-any.whl
验证
/ups/app/postgresql/barman/python/bin/pip3 list

image-20210916110032631

安装argh
/ups/app/postgresql/barman/python/bin/pip3 install argh-0.26.2-py2.py3-none-any.whl
安装python_dateutil
/ups/app/postgresql/barman/python/bin/pip3 install python_dateutil-2.8.2-py2.py3-none-any.whl

Processing ./python_dateutil-2.8.2-py2.py3-none-any.whl
Collecting six>=1.5
Downloading six-1.16.0-py2.py3-none-any.whl (11 kB)
Installing collected packages: six, python-dateutil
Successfully installed python-dateutil-2.8.2 six-1.16.0

安装six
/ups/app/postgresql/barman/python/bin/pip3 install six-1.16.0-py2.py3-none-any.whl
安装psycopg2
tar -xf psycopg2-2.9.1.tar.gz
cd psycopg2-2.9.1
/ups/app/postgresql/barman/python/bin/python3 setup.py build_ext --pg-config /ups/app/postgresql/pgsql-12/bin/pg_config build
/ups/app/postgresql/barman/python/bin/python3 setup.py build_ext --pg-config /ups/app/postgresql/pgsql-12/bin/pg_config install
检查已安装python 包
ls -lrt /ups/app/postgresql/barman/python/lib/python3.8/site-packages

/ups/app/postgresql/barman/python/bin/pip3 list

image-20210916111327417

image-20210916111506296

安装barman

tar -xf barman-release-2.13.tar.gz
cd barman-release-2.13
/ups/app/postgresql/barman/python/bin/python3 setup.py build
/ups/app/postgresql/barman/python/bin/python3 setup.py install --user
验证环境
export PGHOME=/ups/app/postgresql/pgsql-12
export LD_LIBRARY_PATH=$PGHOME/lib

/ups/app/postgresql/barman/python/bin/barman -h
配置barman配置文件
cp -r barman-release-2.13/doc/barman.conf /ups/app/postgresql/barman/barman.conf
cp -r barman-release-2.13/doc/barman.d /ups/app/postgresql/barman/barman.d

配置PG Server端与barman(备份端)SSH互信

当PG Server服务器与备份端(barman)不在同一设备上,需要备份远程的PG Server数据的环境,需要配置SSH

创建ssh-key
export SSH='ssh -o ConnectTimeout=3 -o ConnectionAttempts=5 -o PasswordAuthentication=no -o StrictHostKeyChecking=no'

ssh-keygen -b 4096 -t rsa -P '' -f ~/.ssh/id_rsa
ssh-keygen -b 4096 -t dsa -P '' -f ~/.ssh/id_dsa
cat ~/.ssh/*.pub >> ~/.ssh/authorized_keys

${SSH} barman cat  ~/.ssh/*.pub >> ~/.ssh/authorized_keys
scp ~/.ssh/authorized_keys barman:~/.ssh/authorized_keys

chmod 600 ${HOME}/.ssh/authorized_keys

使用

用法

# /ups/app/postgresql/barman/python/bin/barman -h
usage: barman [-h] [-v] [-c CONFIG] [--color {never,always,auto}] [--log-level {NOTSET,DEBUG,INFO,WARNING,ERROR,CRITICAL}] [-q] [-d] [-f {console,json}]
              {archive-wal,backup,check,check-backup,cron,delete,diagnose,get-wal,list-backup,list-files,list-server,put-wal,rebuild-xlogdb,receive-wal,recover,show-backup,show-server,replication-status,status,switch-wal,switch-xlog,sync-info,sync-backup,sync-wals}
              ...

positional arguments:
  {archive-wal,backup,check,check-backup,cron,delete,diagnose,get-wal,list-backup,list-files,list-server,put-wal,rebuild-xlogdb,receive-wal,recover,show-backup,show-server,replication-status,status,switch-wal,switch-xlog,sync-info,sync-backup,sync-wals}
    archive-wal         Execute maintenance operations on WAL files for a given server. This command processes any incoming WAL files for the server and archives them along the catalogue.
    backup              Perform a full backup for the given server (supports 'all')
    check               Check if the server configuration is working. This command returns success if every checks pass, or failure if any of these fails
    check-backup        Make sure that all the required WAL files to check the consistency of a physical backup (that is, from the beginning to the end of the full backup) are correctly
                        archived. This command is automatically invoked by the cron command and at the end of every backup operation.
    cron                Run maintenance tasks (global command)
    delete              Delete a backup
    diagnose            Diagnostic command (for support and problems detection purpose)
    get-wal             Retrieve WAL_NAME file from SERVER_NAME archive. The content will be streamed on standard output unless the --output-directory option is specified.
    list-backup         List available backups for the given server (supports 'all')
    list-files          List all the files for a single backup
    list-server         List available servers, with useful information
    put-wal             Receive a WAL file from SERVER_NAME and securely store it in the incoming directory. The file will be read from standard input in tar format.
    rebuild-xlogdb      Rebuild the WAL file database guessing it from the disk content.
    receive-wal         Start a receive-wal process. The process uses the streaming protocol to receive WAL files from the PostgreSQL server.
    recover             Recover a server at a given time, name, LSN or xid
    show-backup         This method shows a single backup information
    show-server         Show all configuration parameters for the specified servers
    replication-status  Shows live information and status of any streaming client
    status              Shows live information and status of the PostgreSQL server
    switch-wal          Execute the switch-wal command on the target server
    switch-xlog         Execute the switch-wal command on the target server
    sync-info           Output the internal synchronisation status. Used to sync_backup with a passive node
    sync-backup         Command that synchronises a backup from a master to a passive node
    sync-wals           Command that synchronises WAL files from a master to a passive node

optional arguments:
  -h, --help            show this help message and exit
  -v, --version         show program's version number and exit
  -c CONFIG, --config CONFIG
                        uses a configuration file (defaults: ~/.barman.conf, /etc/barman.conf, /etc/barman/barman.conf)
  --color {never,always,auto}, --colour {never,always,auto}
                        Whether to use colors in the output (default: 'auto')
  --log-level {NOTSET,DEBUG,INFO,WARNING,ERROR,CRITICAL}
                        Override the default log level
  -q, --quiet           be quiet (default: False)
  -d, --debug           debug output (default: False)
  -f {console,json}, --format {console,json}
                        output format (default: 'console')

Barman by EnterpriseDB (www.enterprisedb.com)
[root@progs bin]# 

PostgreSQL服务端配置

配置pg_hba.conf文件
host    all             barman          0.0.0.0/0               trust
host    replication     barmstr         0.0.0.0/0               trust
postgresql.conf 文件配置
# WAL级别设置至少保证archive(v9.5为replica)以上的
wal_level = replica
# 开启归档模式
archive_mode = on
# 如果使用流复制的方式传输WAL到barman,archive_command则不是必须项
archive_command = ' '
# 最大WAL sender进程的数量,总数 > wal archive + streaming/logical replication
max_wal_senders = 5
# 最少保留WAL文件的数量,防止WAL被服务节点过早移除
wal_keep_segments = 64
创建数据库用户
创建连接PG的用户
  • 可以创建superuser 权限用户(barman)

    createuser -s -P barman
    # psql -c "create user barman with LOGIN superuser password 'barman123';"
    
  • 也可以创建普通用户并授权必要的权限

    createuser --replication -P barman
    psql <<EOF
    GRANT EXECUTE ON FUNCTION pg_start_backup(text, boolean, boolean) to barman;
    GRANT EXECUTE ON FUNCTION pg_stop_backup() to barman;
    GRANT EXECUTE ON FUNCTION pg_stop_backup(boolean, boolean) to barman;
    GRANT EXECUTE ON FUNCTION pg_switch_wal() to barman;
    GRANT EXECUTE ON FUNCTION pg_create_restore_point(text) to barman;
    
    GRANT pg_read_all_settings TO barman;
    GRANT pg_read_all_stats TO barman;
    EOF
    
    • 注意:barman switch-wal命令使用--force选项时,不生效
  • 测试登录

    psql -c "SELECT version()" -U barman postgres
    
创建streaming replication连接账号

当计划使用 WAL streaming or streaming backup功能时

  • 创建账号

    createuser -P --replication streaming_barman
    # psql -c "create user streaming_barman with LOGIN replication password 'barman123'"
    
  • 检查确认

    psql -U streaming_barman \
      -c "IDENTIFY_SYSTEM" \
      replication=1
      
    psql "dbname=postgres port=5432 replication=database user=streaming_barman password=barman123" -c "IDENTIFY_SYSTEM;"
    

    image-20210916130056330

barman配置

配置文件内容
  • global/general 配置
    • 配置文件默认路径:/etc/barman.conf
    • 配置参数项:包含主目录,系统用户,日志文件等
  • server 配置
    • 文件目录:/etc/barman.d下以.conf后缀名的文件

由于历史原因,可以使用一个文件包含全局和Server选项的配置文件。但不推荐

配置项类型
  • string
  • enum
  • integer
  • boolean, on/true/1 are accepted as well are off/false/0.
选项范围
  • 全局选项由 [barman] 标签标识

    [barman]
    ; ... global and global/server options go here
    
  • Server选项:在barman中标识PostgreSQL ID

    [pg]
    ; Configuration options for the
    ; server named 'pg' go here
    
创建软连接
sudo ln -s /ups/app/postgresql/barman/barman.conf /etc/barman.conf
sudo ln -s /ups/app/postgresql/barman/barman.d /etc/barman.d
barman.conf 配置
cat > /ups/app/postgresql/barman/barman.conf <<EOF
[barman]
barman_user = barman
configuration_files_directory = /ups/app/postgresql/barman/barman.d
;指定barman_home,即备份文件的存放路径
barman_home = /ups/app/postgresql/barman/bkdir
log_file = /ups/app/postgresql/barman/log/barman.log
log_level = INFO
compression = gzip
backup_options = concurrent_backup
minimum_redundancy = 0
; 备份集,备份集中数据,WAL文件的保留策略
retention_policy = RECOVERY WINDOW OF 1 WEEK
retention_policy_mode = auto
wal_retention_policy = main
; 指定目标数据库备份配置的所在目录
configuration_files_directory = /ups/app/postgresql/barman/barman.d
check_timeout = 5
EOF

streaming-server.conf 配置
cat > /ups/app/postgresql/barman/barman.d/pg12.conf <<EOF
[pg12]
description =  "pg12-5432 master database(Streaming Only)"
; Barman通过conninfo连接字符
conninfo = host=192.168.10.181 port=5432 user=barman dbname=postgres application_name=pg12
; 建立流复制的连接字符串和WAL流复制归档方式
streaming_conninfo = host=192.168.10.181 port=5432 user=streaming_barman dbname=postgres password=barman123 application_name=pg12
; 开启流复制方式的归档
streaming_archiver = on
; 通过插槽的方式管控WAL的流传输
slot_name = streaming_barman
create_slot = auto
; 建议针对数据库的备份方式
backup_method = postgres
backup_options = concurrent_backup
; 通过流复制传输的WAL存放的位置
streaming_wals_directory=/ups/app/postgresql/barman/streaming
minimum_redundancy = 2
retention_policy = RECOVERY WINDOW OF 1 DAYS
; 指定Barman服务器客户端psql的安装位置
path_prefix = "/ups/app/postgresql/pgsql-12/bin"
EOF
创建复制槽slot
barman --config /ups/app/postgresql/barman/barman.conf receive-wal --create-slot pg12

image-20210916143816476

在后台持续接收PostgreSQL服务端的WAL日志
barman receive-wal pg12 &

image-20210916170555865

备份前检查验证

验证备份服务配置的状态
barman --config /ups/app/postgresql/barman/barman.conf check pg12

确保所有状态OK后,才可以开启备份

备份操作

barman backup pg12

image-20210916202720547

恢复操作

恢复目标库
全量恢复
barman recover <target_database> <backup_id> <destination_directory> 
barman recover pg12 20210916T202157  ${PGDATA}
基于时间点(PITR)恢复
barman recover --target-time "2021-09-16 20:15:00" pg12 20210916T202157 ${PGDATA}

常用维护命令

#列出所有的备份服务信息(--minimal 参数仅显示服务名称,不包含描述信息)
$ barman list-server
#列出单个备份服务的所有配置信息
$ barman show-server <target_database>
#列出单个备份服务的备份集信息
$ barman list-backup <target_database>
 
# 查看单个备份集的信息
$ barman show-backup <target_database> <backup_id>
 
#检查备份服务配置状态
$ barman check <target_database>

# 检查备份服务的实时信息
$ barman status <target_database>
 
#以Json格式列出所有备份服务及备份集的详细信息
$ barman diagnose

#更多信息可以参考
barman --help

image-20210916203007489

故障排错

receive-wal running: FAILED

现象

image-20210916144647437

处理
# 可以尝试通过重建replication slot刷新WAL的接收状态
barman receive-wal --stop <target_database>
barman receive-wal --reset <target_database>
barman receive-wal --create-slot <target_database>
barman switch-wal <target_database>

WAL archive: FAILED

处理
barman cron

## 切换日志
barman switch-wal pg12  # pg 版本 >10
barman switch-xlog pg12 # pg 版本 <10

image-20210916202348213

检查结果

image-20210916202423996

posted @ 2021-09-16 20:44  KuBee  阅读(612)  评论(0编辑  收藏  举报