PostgreSQL pg_rman

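  • An online tool. The target database must be running in archive mode.
  • Supports online full backup, incremental backup, and archive (WAL) backup
    • An incremental backup builds on an existing full database backup
  • pg_rman itself follows the pg_start_backup(), copy, pg_stop_backup() backup sequence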

The copy step itself is a plain file-level copy… cp/fwrite.

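  • pg_start_backup()
    • text: the user-defined label for the backup, typically the name under which the backup dump file will be stored
    • boolean (fast): when true, pg_start_backup runs as quickly as possible. This forces an immediate checkpoint, which causes a spike in I/O and slows down any concurrently running queries.
  • pg_stop_backup()
    • boolean (wait_for_archive): when false, pg_stop_backup returns as soon as the backup is complete, without waiting for the WAL to be archived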

pg_rman overall architecture

[Figure: pg_rman overall architecture diagram]

Default configuration parameters:

  1. PGDATA
  2. BACKUP_PATH
  3. ARCLOG_PATH

pg_rman init

pg_rman show

pg_rman config --list

pg_rman backup -b full

  -b inc [incremental]

  -b arch [archive]

pg_rman restore

[New feature] pg_rman blockrecover --datafile tablespaceOid/databaseOid/relfilenode --block 0
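
The --datafile value above is a slash-separated triple of OIDs. As a rough sketch of how such a value could be split apart (the names `DatafileSpec` and `parse_datafile_spec` are hypothetical and are not taken from the pg_rman source):

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical holder for the three OIDs named in --datafile. */
typedef struct DatafileSpec
{
	unsigned int tablespace_oid;
	unsigned int database_oid;
	unsigned int relfilenode;
} DatafileSpec;

/*
 * Parse "tablespaceOid/databaseOid/relfilenode".
 * Returns 1 on success, 0 if the string is malformed.
 */
static int
parse_datafile_spec(const char *arg, DatafileSpec *out)
{
	char		trailing;
	int			n;

	assert(arg != NULL && out != NULL);

	/* The extra %c catches trailing garbage such as "1/2/3x". */
	n = sscanf(arg, "%u/%u/%u%c",
			   &out->tablespace_oid, &out->database_oid,
			   &out->relfilenode, &trailing);
	return n == 3;
}
```

A caller would reject the command line when the function returns 0, before touching any backup.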

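Backup retention policy

  1. Recovery window: a number of days. Default: 7.
  2. Backup count: redundancy-based retention, i.e. how many backups to keep. Default: 1.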
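
One way to read the two settings above is as a keep-or-delete test per backup. The sketch below is only an illustration: the function name `backup_is_retained` is hypothetical, and it assumes that a backup is kept when it satisfies either rule (it is among the newest N backups, or it is still inside the recovery window). How pg_rman actually combines the two rules should be checked against its source.

```c
#include <assert.h>
#include <time.h>

#define SECONDS_PER_DAY 86400

/*
 * Decide whether a backup should be kept.
 *   rank        0 for the newest backup, 1 for the next one, ...
 *   start_time  when the backup was taken
 *   now         current time
 *   keep_days   recovery window in days
 *   keep_count  number of newest backups to keep
 * Assumption: satisfying either rule is enough to keep the backup.
 */
static int
backup_is_retained(int rank, time_t start_time, time_t now,
				   int keep_days, int keep_count)
{
	double		age_seconds;

	assert(rank >= 0 && keep_days >= 0 && keep_count >= 0);

	/* Redundancy rule: the newest keep_count backups always stay. */
	if (rank < keep_count)
		return 1;

	/* Recovery-window rule: keep anything younger than keep_days. */
	age_seconds = difftime(now, start_time);
	return age_seconds <= (double) keep_days * SECONDS_PER_DAY;
}
```

With the defaults quoted above (7 days, 1 backup), the newest backup is always kept, and an older one is dropped once it is more than a week old.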

Source tree layout:

.
├── backup.c
├── blockrecover.c
├── catalog.c
├── COPYRIGHT
├── data.c
├── delete.c
├── dir.c
├── docs
├── expected
├── idxpagehdr.h
├── init.c
├── Makefile
├── parray.c
├── parray.h
├── pg_rman.c
├── pg_rman.h
├── pgsql_src
├── pgut
├── README.md
├── restore.c
├── script
├── show.c
├── sql
├── util.c
├── validate.c
└── xlog.c

pg_rman: a brief look at the source

Reading the code

 * +----------------+---------------------------------+  
 * | PageHeaderData | linp1 linp2 linp3 ...           |  
 * +-----------+----+---------------------------------+  
 * | ... linpN |                                      |  
 * +-----------+--------------------------------------+  
 * |               ^ pd_lower                         |  
 * |                                                  |  
 * |                     v pd_upper                   |  
 * +-------------+------------------------------------+  
 * |                     | tupleN ...                 |  
 * +-------------+------------------+-----------------+  
 * |       ... tuple3 tuple2 tuple1 | "special space" |  
 * +--------------------------------+-----------------+

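When data is flushed, it is persisted. The pd_lsn field in the page header records the end position, in the WAL file, of the REDO record generated by the most recent change to that page.

If the WAL flush LSN is greater than or equal to this pd_lsn, the change to the page is durable. In other words, every modification changes the block, including its LSN.

So by comparing the global LSN taken at the start of the earlier backup with the page LSN of the data about to be backed up, we can tell whether the page has been modified since then.

Modified pages are backed up and unmodified pages are skipped, which gives block-level incremental backup of the database.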
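
The rule just described boils down to a single comparison per page. The sketch below illustrates it under stated assumptions: `Lsn`, `make_lsn` and `page_needs_backup` are hypothetical names, a NULL reference LSN stands for a full backup, and a page whose LSN is zero is copied to be safe. The exact boundary condition used by pg_rman should be confirmed in data.c.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t Lsn;			/* hypothetical alias for a 64-bit WAL position */

/* Combine the high and low 32-bit halves into one 64-bit LSN. */
static Lsn
make_lsn(uint32_t xlogid, uint32_t xrecoff)
{
	return ((Lsn) xlogid << 32) | xrecoff;
}

/*
 * Return 1 if the page must be copied into the incremental backup.
 * prev_start_lsn is the start LSN of the previous backup, or NULL
 * when there is no previous backup (full backup: copy everything).
 */
static int
page_needs_backup(Lsn page_lsn, const Lsn *prev_start_lsn)
{
	if (prev_start_lsn == NULL)
		return 1;				/* full backup */
	if (page_lsn == 0)
		return 1;				/* assumption: unknown LSN, copy to be safe */
	assert(*prev_start_lsn != 0);
	return page_lsn >= *prev_start_lsn;
}
```

Because the test only reads the page header, the cost of skipping an unchanged page is one read and one comparison.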

Code related to incremental backup:

			pgBackupGetPath(prev_backup, prev_file_txt, lengthof(prev_file_txt),
				DATABASE_FILE_LIST);
			prev_files = dir_read_file_list(pgdata, prev_file_txt);

			/*
			 * Do backup only pages having larger LSN than previous backup.
			 */
			lsn = &prev_backup->start_lsn;
			xlogid = (uint32) (*lsn >> 32);
			xrecoff = (uint32) *lsn;
			elog(DEBUG, _("backup only the page updated after LSN(%X/%08X)"),
							xlogid, xrecoff);

		/* Construct the directory for this backup within BACKUP_PATH. */
		pgBackupGetPath(&current, path, lengthof(path), DATABASE_DIR);

		/* Save the files listed above. */
		backup_files(pgdata, path, files, prev_files, lsn, current.compress_data, NULL);
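
The debug message in the excerpt prints the LSN as two 32-bit halves in the form %X/%08X. A small stand-alone sketch of that split, for readers who want to match log output against LSN values (`format_lsn` is a hypothetical helper, not a pg_rman function):

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Write a 64-bit LSN into buf as "high/low", using the same
 * %X/%08X layout as the elog() call in the excerpt.
 * Returns the number of characters snprintf would have written.
 */
static int
format_lsn(uint64_t lsn, char *buf, size_t buflen)
{
	uint32_t	xlogid = (uint32_t) (lsn >> 32);	/* high 32 bits */
	uint32_t	xrecoff = (uint32_t) lsn;			/* low 32 bits */

	assert(buf != NULL && buflen > 0);
	return snprintf(buf, buflen, "%X/%08X",
					(unsigned int) xlogid, (unsigned int) xrecoff);
}
```

The low half is zero-padded to eight hex digits, so the two halves can always be separated again at the slash.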

[New] Block recovery code:

	for (loop = 0; loop <= brc.base_index; loop++)
	{
		backup = (pgBackup *) parray_get(backups, loop);

		/* don't use incomplete nor different timeline backup */
		if (backup->status != BACKUP_STATUS_OK || backup->tli != base_backup->tli)
			continue;
		if(-1 == brc.lastBackupIndex && HAVE_ARCLOG(backup) && brc.last_needed_index >= loop)
		{
			restore_archive_logs(backup,true);
		}
		/* use database backup only */
		if (BACKUP_MODE_INCREMENTAL > backup->backup_mode || brc.last_needed_index < loop)
			continue;

		elog(DEBUG, "found backup BK_KEY: \"%d\" can be used ",backup->backup_id);

		recoverBackup(backup,loop);
        =>  [[
            	for(loop = 0; loop < brc.rbNum; loop++)
                {
                    /*If this block has find a page,skip it*/
                    if(brc.pageArray[loop])
                    {
                        elog(DEBUG,"block \'%u\' has find it's page,skip.",brc.recoverBlock[loop]);
                        continue;
                    }
                    page = findPageInBackup(backup, brc.recoverBlock[loop]);
                    if(page)
                    {
                        brc.pageArray[loop] = page;
                        if(-1 == brc.lastBackupIndex)
                        {
                            brc.lastBackupIndex = backupindex;
                            elog(DEBUG,"Find last backup can be used:BK_KEY \'%d\'",backup->backup_id);
                        }
                    }
                }
        	]]
	}

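Issues:

  1. If a filenode file is arbitrarily enlarged so that its size is no longer a multiple of 8192, the extra bytes are counted as one more page. That page is incomplete. PostgreSQL does not enable checksum verification by default, so it only reports that the block number is invalid, and a blockrecover run cannot restore the page, because the filenode never contained a valid copy of it.
  2. When page data is modified arbitrarily, queries sometimes return incomplete data, i.e. the number of rows shown does not match the number inserted. PostgreSQL itself raises no proper warning about the corruption in this case. Enable checksums to verify the pages.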

Checksum failure warning:

WARNING:  01000: page verification failed, calculated checksum 11654 but expected 8293
  1. Determine the table's tuple count
  2. Determine the table's page count

Make sure checksums are enabled so that page data can be verified. This does not, however, resolve the issues described above.

posted @ 2019-09-10 09:09  Rocky_Ansi