代码改变世界

PostgreSQL 一主两备节点(两备节点为同步节点)故障恢复

2016-03-10 19:53  DataBases  阅读(3256)  评论(0编辑  收藏  举报

PostgreSQL  同步复制及故障恢复

10.2.208.10:node1:master

10.2.208.11:node2:standby1 同步

10.2.208.12:node3:standby2 同步

10.2.208.13:node4:备用机

node1,node2,node3,node4均安装PostgreSQL数据库,master初始化数据库,standby1与standby2从master pg_basebackup。

[replica@node1 bin]$ ./psql -h localhost -p 6432 -U replica -d postgres

psql (9.5.1)

Type "help" for help.

 

postgres=# LOG:  standby "node2" is now the synchronous standby with priority 1

 

postgres=# select version();

                                                 version                                                 

---------------------------------------------------------------------------------------------------------

 PostgreSQL 9.5.1 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.8.3 20140911 (Red Hat 4.8.3-9), 64-bit

(1 row)

 

postgres=# select pg_is_in_recovery();

 pg_is_in_recovery

-------------------

 f

(1 row)

 

postgres=# \x

Expanded display is on.

postgres=# select * from pg_stat_replication ;

-[ RECORD 1 ]----+------------------------------

pid              | 14818

usesysid         | 10

usename          | replica

application_name | node2

client_addr      | 10.2.208.11

client_hostname  |

client_port      | 44530

backend_start    | 2016-03-10 11:16:10.850257+08

backend_xmin     |

state            | streaming

sent_location    | 0/4000220

write_location   | 0/4000220

flush_location   | 0/4000220

replay_location  | 0/4000220

sync_priority    | 1

sync_state       | sync

-[ RECORD 2 ]----+------------------------------

pid              | 14820

usesysid         | 10

usename          | replica

application_name | node3

client_addr      | 10.2.208.12

client_hostname  |

client_port      | 32839

backend_start    | 2016-03-10 11:16:28.010007+08

backend_xmin     |

state            | streaming

sent_location    | 0/4000220

write_location   | 0/4000220

flush_location   | 0/4000220

replay_location  | 0/4000220

sync_priority    | 2

sync_state       | potential

[replica@node2 bin]$ ./psql -h localhost -p 6432 -U replica -d postgres

psql (9.5.1)

Type "help" for help.

 

postgres=# select version();

                                                 version                                                

---------------------------------------------------------------------------------------------------------

 PostgreSQL 9.5.1 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.8.3 20140911 (Red Hat 4.8.3-9), 64-bit

(1 row)

 

postgres=# select pg_is_in_recovery();

 pg_is_in_recovery

-------------------

 t

(1 row)

 

postgres=# \x

Expanded display is on.

postgres=# select * from pg_stat_replication ;

(0 rows)

[replica@node3 bin]$ ./psql -h localhost -p 6432 -U replica -d postgres

psql (9.5.1)

Type "help" for help.

 

postgres=# select version();

                                                 version                                                

---------------------------------------------------------------------------------------------------------

 PostgreSQL 9.5.1 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.8.3 20140911 (Red Hat 4.8.3-9), 64-bit

(1 row)

 

postgres=# select pg_is_in_recovery();

 pg_is_in_recovery

-------------------

 t

(1 row)

 

postgres=# \x

Expanded display is on.

postgres=# select * from pg_stat_replication ;

(0 rows)

[replica@node1 ~]$ ps -ef|grep postgres

replica  14731     1  0 11:10 pts/2    00:00:00 /home/replica/bin/postgres -D ../data

replica  14733 14731  0 11:10 ?        00:00:00 postgres: checkpointer process  

replica  14734 14731  0 11:10 ?        00:00:00 postgres: writer process  

replica  14735 14731  0 11:10 ?        00:00:00 postgres: wal writer process  

replica  14736 14731  0 11:10 ?        00:00:00 postgres: autovacuum launcher process 

replica  14737 14731  0 11:10 ?        00:00:00 postgres: archiver process   last was 000000010000000000000003.00000028.backup

replica  14738 14731  0 11:10 ?        00:00:00 postgres: stats collector process  

replica  14749 13945  0 11:10 pts/2    00:00:00 ./psql -h localhost -p 6432 -U replica -d postgres

replica  14750 14731  0 11:10 ?        00:00:00 postgres: replica postgres ::1(43486) idle

replica  14818 14731  0 11:16 ?        00:00:00 postgres: wal sender process replica 10.2.208.11(44530) streaming 0/4000300

replica  14820 14731  0 11:16 ?        00:00:00 postgres: wal sender process replica 10.2.208.12(32839) streaming 0/4000300

replica  15020 14302  0 11:29 pts/0    00:00:00 grep --color=auto postgres

[replica@node2 ~]$ ps -ef|grep postgres

replica  27194     1  0 11:16 pts/2    00:00:00 /home/replica/bin/postgres -D ../data

replica  27195 27194  0 11:16 ?        00:00:00 postgres: startup process   recovering 000000010000000000000004

replica  27196 27194  0 11:16 ?        00:00:00 postgres: checkpointer process  

replica  27197 27194  0 11:16 ?        00:00:00 postgres: writer process  

replica  27198 27194  0 11:16 ?        00:00:00 postgres: stats collector process  

replica  27199 27194  0 11:16 ?        00:00:00 postgres: wal receiver process   streaming 0/4000300

replica  27216 26525  0 11:18 pts/2    00:00:00 ./psql -h localhost -p 6432 -U replica -d postgres

replica  27217 27194  0 11:18 ?        00:00:00 postgres: replica postgres ::1(43081) idle

replica  27417 27091  0 11:30 pts/0    00:00:00 grep --color=auto postgres

[replica@node3 ~]$ ps -ef|grep postgres

replica  20747     1  0 11:16 pts/1    00:00:00 /home/replica/bin/postgres -D ../data

replica  20748 20747  0 11:16 ?        00:00:00 postgres: startup process   recovering 000000010000000000000004

replica  20749 20747  0 11:16 ?        00:00:00 postgres: checkpointer process  

replica  20750 20747  0 11:16 ?        00:00:00 postgres: writer process  

replica  20751 20747  0 11:16 ?        00:00:00 postgres: stats collector process  

replica  20752 20747  0 11:16 ?        00:00:00 postgres: wal receiver process   streaming 0/4000300

replica  20769 20161  0 11:18 pts/1    00:00:00 ./psql -h localhost -p 6432 -U replica -d postgres

replica  20770 20747  0 11:18 ?        00:00:00 postgres: replica postgres ::1(58250) idle

replica  20936 20653  0 11:31 pts/2    00:00:00 grep --color=auto postgres

模拟同步节点Standby1 (node2)故障

1 停止Standby1的数据库服务

[replica@node1 bin]$ ./psql -h localhost -p 6432 -U replica -d replica

psql (9.5.1)

Type "help" for help.

 

replica=# \d+

                          List of relations

 Schema |       Name       | Type  |  Owner  |  Size   | Description

--------+------------------+-------+---------+---------+-------------

 public | pgbench_accounts | table | replica | 3651 MB |

 public | pgbench_branches | table | replica | 48 kB   |

 public | pgbench_history  | table | replica | 0 bytes |

 public | pgbench_tellers  | table | replica | 160 kB  |

(4 rows)

 

replica=# LOG:  standby "node3" is now the synchronous standby with priority 2

replica=# select pg_is_in_recovery();

 pg_is_in_recovery

-------------------

 f

(1 row)

 

replica=# \x

Expanded display is on.

replica=# select * from pg_stat_replication ;

-[ RECORD 1 ]----+------------------------------

pid              | 14820

usesysid         | 10

usename          | replica

application_name | node3

client_addr      | 10.2.208.12

client_hostname  |

client_port      | 32839

backend_start    | 2016-03-10 11:16:28.010007+08

backend_xmin     |

state            | streaming

sent_location    | 0/DEF730B8

write_location   | 0/DEF730B8

flush_location   | 0/DEF730B8

replay_location  | 0/DEF730B8

sync_priority    | 2

sync_state       | sync

[replica@node1 ~]$ ps -ef|grep postgres

replica  14731     1  0 11:10 pts/2    00:00:00 /home/replica/bin/postgres -D ../data

replica  14733 14731  0 11:10 ?        00:00:00 postgres: checkpointer process  

replica  14734 14731  0 11:10 ?        00:00:00 postgres: writer process  

replica  14735 14731  0 11:10 ?        00:00:01 postgres: wal writer process  

replica  14736 14731  0 11:10 ?        00:00:00 postgres: autovacuum launcher process 

replica  14737 14731  0 11:10 ?        00:00:00 postgres: archiver process   last was 0000000100000000000000DD

replica  14738 14731  0 11:10 ?        00:00:00 postgres: stats collector process  

replica  14820 14731  0 11:16 ?        00:00:02 postgres: wal sender process replica 10.2.208.12(32839) streaming 0/DEF73198

replica  15378 14731  0 11:36 ?        00:00:00 postgres: replica replica ::1(43489) idle

replica  15541 14302  0 11:47 pts/0    00:00:00 grep --color=auto postgres

[replica@node3 bin]$ ./psql -h localhost -p 6432 -U replica -d postgres

psql (9.5.1)

Type "help" for help.

 

postgres=# select pg_is_in_recovery();

 pg_is_in_recovery

-------------------

 t

(1 row)

 

postgres=# \x

Expanded display is on.

postgres=# select * from pg_stat_replication ;

(0 rows)

[replica@node3 ~]$ ps -ef|grep postgres

replica  20747     1  0 11:16 pts/1    00:00:00 /home/replica/bin/postgres -D ../data

replica  20748 20747  0 11:16 ?        00:00:16 postgres: startup process   recovering 0000000100000000000000DE

replica  20749 20747  0 11:16 ?        00:00:03 postgres: checkpointer process  

replica  20750 20747  0 11:16 ?        00:00:00 postgres: writer process  

replica  20751 20747  0 11:16 ?        00:00:00 postgres: stats collector process  

replica  20752 20747  0 11:16 ?        00:00:10 postgres: wal receiver process   streaming 0/DEF73198

replica  21127 20161  0 11:48 pts/1    00:00:00 ./psql -h localhost -p 6432 -U replica -d postgres

replica  21128 20747  0 11:48 ?        00:00:00 postgres: replica postgres ::1(58252) idle

replica  21196 20653  0 11:50 pts/2    00:00:00 grep --color=auto postgres

[replica@node1 bin]$ ./psql -h localhost -p 6432 -U replica -d replica

psql (9.5.1)

Type "help" for help.

replica=# create table test_tb(id int);

CREATE TABLE

replica=# insert into test_tb values (1),(2),(3);

INSERT 0 3

replica=# select * from test_tb ;

 id

----

  1

  2

  3

(3 rows)

[replica@node3 bin]$ ./psql -h localhost -p 6432 -U replica -d replica

psql (9.5.1)

Type "help" for help.

 

replica=# select * from test_tb ;

 id

----

  1

  2

  3

(3 rows)

重新启动Standby1

[replica@node1 bin]$ ./psql -h localhost -p 6432 -U replica -d replica

psql (9.5.1)

Type "help" for help.

 

replica=# select * from test_tb ;

 id

----

  1

  2

  3

(3 rows)

 

replica=# LOG:  standby "node2" is now the synchronous standby with priority 1

 

replica=#

查看复制状态

[replica@node1 bin]$ ./psql -h localhost -p 6432 -U replica -d replica

psql (9.5.1)

Type "help" for help.

 

replica=# select pg_is_in_recovery();

 pg_is_in_recovery

-------------------

 f

(1 row)

replica=# \x

Expanded display is on.

replica=#  select * from pg_stat_replication ;

-[ RECORD 1 ]----+------------------------------

pid              | 15705

usesysid         | 10

usename          | replica

application_name | node2

client_addr      | 10.2.208.11

client_hostname  |

client_port      | 44533

backend_start    | 2016-03-10 11:56:10.103849+08

backend_xmin     |

state            | streaming

sent_location    | 0/DEF89618

write_location   | 0/DEF89618

flush_location   | 0/DEF89618

replay_location  | 0/DEF89618

sync_priority    | 1

sync_state       | sync

-[ RECORD 2 ]----+------------------------------

pid              | 14820

usesysid         | 10

usename          | replica

application_name | node3

client_addr      | 10.2.208.12

client_hostname  |

client_port      | 32839

backend_start    | 2016-03-10 11:16:28.010007+08

backend_xmin     |

state            | streaming

sent_location    | 0/DEF89618

write_location   | 0/DEF89618

flush_location   | 0/DEF89618

replay_location  | 0/DEF89618

sync_priority    | 2

sync_state       | potential

 

[replica@node1 ~]$ ps -ef|grep postgres

replica  14731     1  0 11:10 pts/2    00:00:00 /home/replica/bin/postgres -D ../data

replica  14733 14731  0 11:10 ?        00:00:00 postgres: checkpointer process  

replica  14734 14731  0 11:10 ?        00:00:00 postgres: writer process  

replica  14735 14731  0 11:10 ?        00:00:01 postgres: wal writer process  

replica  14736 14731  0 11:10 ?        00:00:00 postgres: autovacuum launcher process 

replica  14737 14731  0 11:10 ?        00:00:00 postgres: archiver process   last was 0000000100000000000000DD

replica  14738 14731  0 11:10 ?        00:00:00 postgres: stats collector process  

replica  14820 14731  0 11:16 ?        00:00:02 postgres: wal sender process replica 10.2.208.12(32839) streaming 0/DEF896F8

replica  15705 14731  0 11:56 ?        00:00:00 postgres: wal sender process replica 10.2.208.11(44533) streaming 0/DEF896F8

replica  15861 14731  0 12:01 ?        00:00:00 postgres: replica replica ::1(43491) idle

replica  15946 14302  0 12:07 pts/0    00:00:00 grep --color=auto postgres

[replica@node2 bin]$ ./psql -h localhost -p 6432 -U replica -d replica

psql (9.5.1)

Type "help" for help.

 

replica=# select pg_is_in_recovery();

 pg_is_in_recovery

-------------------

 t

(1 row)

 

replica=# \x

Expanded display is on.

replica=# select * from pg_stat_replication ;

(0 rows)

[replica@node2 ~]$ ps -ef|grep postgres

replica  27714     1  0 11:56 pts/2    00:00:00 /home/replica/bin/postgres -D ../data

replica  27715 27714  0 11:56 ?        00:00:00 postgres: startup process   recovering 0000000100000000000000DE

replica  27716 27714  0 11:56 ?        00:00:00 postgres: checkpointer process  

replica  27717 27714  0 11:56 ?        00:00:00 postgres: writer process  

replica  27718 27714  0 11:56 ?        00:00:00 postgres: stats collector process  

replica  27719 27714  0 11:56 ?        00:00:00 postgres: wal receiver process   streaming 0/DEF896F8

replica  27908 27714  0 12:05 ?        00:00:00 postgres: replica replica ::1(43084) idle

replica  27942 27091  0 12:08 pts/0    00:00:00 grep --color=auto postgres

[replica@node3 bin]$ ./psql -h localhost -p 6432 -U replica -d replica

psql (9.5.1)

Type "help" for help.

 

replica=# select pg_is_in_recovery();

 pg_is_in_recovery

-------------------

 t

(1 row)

 

replica=# \x

Expanded display is on.

replica=# select * from pg_stat_replication ;

(0 rows)

[replica@node3 ~]$ ps -ef|grep postgres

replica  20747     1  0 11:16 pts/1    00:00:00 /home/replica/bin/postgres -D ../data

replica  20748 20747  0 11:16 ?        00:00:16 postgres: startup process   recovering 0000000100000000000000DE

replica  20749 20747  0 11:16 ?        00:00:03 postgres: checkpointer process  

replica  20750 20747  0 11:16 ?        00:00:00 postgres: writer process  

replica  20751 20747  0 11:16 ?        00:00:00 postgres: stats collector process  

replica  20752 20747  0 11:16 ?        00:00:11 postgres: wal receiver process   streaming 0/DEF896F8

replica  21444 20747  0 12:06 ?        00:00:00 postgres: replica replica ::1(58254) idle

replica  21507 20653  0 12:10 pts/2    00:00:00 grep --color=auto postgres

模拟Standby2(node3)故障

[replica@node3 bin]$ ./pg_ctl -D ../data/ stop

查看状态

[replica@node1 bin]$ ./psql -h localhost -p 6432 -U replica

psql (9.5.1)

Type "help" for help.

 

replica=# select pg_is_in_recovery();

 pg_is_in_recovery

-------------------

 f

(1 row)

 

replica=# \x

Expanded display is on.

replica=# select * from pg_stat_replication ;

-[ RECORD 1 ]----+------------------------------

pid              | 15705

usesysid         | 10

usename          | replica

application_name | node2

client_addr      | 10.2.208.11

client_hostname  |

client_port      | 44533

backend_start    | 2016-03-10 11:56:10.103849+08

backend_xmin     |

state            | streaming

sent_location    | 0/DEF89A78

write_location   | 0/DEF89A78

flush_location   | 0/DEF89A78

replay_location  | 0/DEF89A78

sync_priority    | 1

sync_state       | sync

[replica@node1 ~]$ ps -ef|grep postgres

replica  14731     1  0 11:10 pts/2    00:00:00 /home/replica/bin/postgres -D ../data

replica  14733 14731  0 11:10 ?        00:00:00 postgres: checkpointer process  

replica  14734 14731  0 11:10 ?        00:00:00 postgres: writer process  

replica  14735 14731  0 11:10 ?        00:00:01 postgres: wal writer process  

replica  14736 14731  0 11:10 ?        00:00:00 postgres: autovacuum launcher process 

replica  14737 14731  0 11:10 ?        00:00:00 postgres: archiver process   last was 0000000100000000000000DD

replica  14738 14731  0 11:10 ?        00:00:00 postgres: stats collector process  

replica  15705 14731  0 11:56 ?        00:00:00 postgres: wal sender process replica 10.2.208.11(44533) streaming 0/DEF8A5F0

replica  16112 14731  0 12:17 ?        00:00:00 postgres: replica replica ::1(43492) idle

replica  16329 14731  0 12:22 ?        00:00:00 postgres: replica replica ::1(43493) idle

replica  16493 14731  0 12:27 ?        00:00:00 postgres: replica replica ::1(43494) idle

replica  17772 17733  0 13:34 pts/5    00:00:00 grep --color=auto postgres

[replica@node2 bin]$ ./psql -h localhost -p 6432 -U replica

psql (9.5.1)

Type "help" for help.

 

replica=# select pg_is_in_recovery();

 pg_is_in_recovery

-------------------

 t

(1 row)

replica=# \x

Expanded display is on.

replica=# select * from pg_stat_replication ;

(0 rows)

[replica@node2 ~]$ ps -ef|grep postgres

replica  27714     1  0 11:56 pts/2    00:00:00 /home/replica/bin/postgres -D ../data

replica  27715 27714  0 11:56 ?        00:00:00 postgres: startup process   recovering 0000000100000000000000DE

replica  27716 27714  0 11:56 ?        00:00:00 postgres: checkpointer process  

replica  27717 27714  0 11:56 ?        00:00:00 postgres: writer process  

replica  27718 27714  0 11:56 ?        00:00:00 postgres: stats collector process  

replica  27719 27714  0 11:56 ?        00:00:01 postgres: wal receiver process   streaming 0/DEF8A5F0

replica  27908 27714  0 12:05 ?        00:00:00 postgres: replica replica ::1(43084) idle

replica  28311 27714  0 12:28 ?        00:00:00 postgres: replica replica ::1(43085) idle

replica  29330 29291  0 13:36 pts/3    00:00:00 grep --color=auto postgres

[replica@node1 bin]$ ./psql -h localhost -p 6432 -U replica

psql (9.5.1)

Type "help" for help.

 

replica=# create table test_tb2(id int);

CREATE TABLE

replica=# insert into test_tb2 values (1),(2),(3);

INSERT 0 3

[replica@node2 bin]$ ./psql -h localhost -p 6432 -U replica

psql (9.5.1)

Type "help" for help.

 

replica=# select * from test_tb2;

 id

----

  1

  2

  3

(3 rows)

重新启动Standby2(node3)

[replica@node3 bin]$ ./pg_ctl -D ../data/ start

[replica@node1 bin]$ ./psql -h localhost -p 6432 -U replica

psql (9.5.1)

Type "help" for help.

 

replica=# select pg_is_in_recovery();

 pg_is_in_recovery

-------------------

 f

(1 row)

 

replica=# \x

Expanded display is on.

replica=# select * from pg_stat_replication ;

-[ RECORD 1 ]----+------------------------------

pid              | 15705

usesysid         | 10

usename          | replica

application_name | node2

client_addr      | 10.2.208.11

client_hostname  |

client_port      | 44533

backend_start    | 2016-03-10 11:56:10.103849+08

backend_xmin     |

state            | streaming

sent_location    | 0/DEFA14B0

write_location   | 0/DEFA14B0

flush_location   | 0/DEFA14B0

replay_location  | 0/DEFA14B0

sync_priority    | 1

sync_state       | sync

-[ RECORD 2 ]----+------------------------------

pid              | 17908

usesysid         | 10

usename          | replica

application_name | node3

client_addr      | 10.2.208.12

client_hostname  |

client_port      | 32845

backend_start    | 2016-03-10 13:40:58.96505+08

backend_xmin     |

state            | streaming

sent_location    | 0/DEFA14B0

write_location   | 0/DEFA14B0

flush_location   | 0/DEFA14B0

replay_location  | 0/DEFA14B0

sync_priority    | 2

sync_state       | potential

[replica@node1 ~]$ ps -ef|grep postgres

replica  14731     1  0 11:10 pts/2    00:00:00 /home/replica/bin/postgres -D ../data

replica  14733 14731  0 11:10 ?        00:00:00 postgres: checkpointer process  

replica  14734 14731  0 11:10 ?        00:00:00 postgres: writer process  

replica  14735 14731  0 11:10 ?        00:00:01 postgres: wal writer process  

replica  14736 14731  0 11:10 ?        00:00:00 postgres: autovacuum launcher process 

replica  14737 14731  0 11:10 ?        00:00:00 postgres: archiver process   last was 0000000100000000000000DD

replica  14738 14731  0 11:10 ?        00:00:00 postgres: stats collector process  

replica  15705 14731  0 11:56 ?        00:00:00 postgres: wal sender process replica 10.2.208.11(44533) streaming 0/DEFA14B0

replica  16112 14731  0 12:17 ?        00:00:00 postgres: replica replica ::1(43492) idle

replica  16329 14731  0 12:22 ?        00:00:00 postgres: replica replica ::1(43493) idle

replica  17908 14731  0 13:40 ?        00:00:00 postgres: wal sender process replica 10.2.208.12(32845) streaming 0/DEFA14B0

replica  17913 14731  0 13:41 ?        00:00:00 postgres: replica replica ::1(43496) idle

replica  17967 17733  0 13:45 pts/5    00:00:00 grep --color=auto postgres

[replica@node2 bin]$ ./psql -h localhost -p 6432 -U replica

psql (9.5.1)

Type "help" for help.

 

replica=# select pg_is_in_recovery();

 pg_is_in_recovery

-------------------

 t

(1 row)

 

replica=# \x

Expanded display is on.

replica=# select * from pg_stat_replication ;

(0 rows)

[replica@node2 ~]$ ps -ef|grep postgres

replica  27714     1  0 11:56 pts/2    00:00:00 /home/replica/bin/postgres -D ../data

replica  27715 27714  0 11:56 ?        00:00:00 postgres: startup process   recovering 0000000100000000000000DE

replica  27716 27714  0 11:56 ?        00:00:00 postgres: checkpointer process  

replica  27717 27714  0 11:56 ?        00:00:00 postgres: writer process  

replica  27718 27714  0 11:56 ?        00:00:00 postgres: stats collector process  

replica  27719 27714  0 11:56 ?        00:00:01 postgres: wal receiver process   streaming 0/DEFA1590

replica  27908 27714  0 12:05 ?        00:00:00 postgres: replica replica ::1(43084) idle

replica  29467 27714  0 13:46 ?        00:00:00 postgres: replica replica ::1(43087) idle

replica  29478 29291  0 13:47 pts/3    00:00:00 grep --color=auto postgres

[replica@node3 bin]$ ./psql -h localhost -p 6432 -U replica

psql (9.5.1)

Type "help" for help.

 

replica=# select pg_is_in_recovery();

 pg_is_in_recovery

-------------------

 t

(1 row)

 

replica=# \x

Expanded display is on.

replica=# select * from pg_stat_replication ;

(0 rows)

[replica@node3 ~]$ ps -ef|grep postgres

replica  22845     1  0 13:40 pts/1    00:00:00 /home/replica/bin/postgres -D ../data

replica  22846 22845  0 13:40 ?        00:00:00 postgres: startup process   recovering 0000000100000000000000DE

replica  22847 22845  0 13:40 ?        00:00:00 postgres: checkpointer process  

replica  22848 22845  0 13:40 ?        00:00:00 postgres: writer process  

replica  22849 22845  0 13:40 ?        00:00:00 postgres: stats collector process  

replica  22850 22845  0 13:40 ?        00:00:00 postgres: wal receiver process   streaming 0/DEFA1590

replica  22919 22845  0 13:48 ?        00:00:00 postgres: replica replica ::1(58256) idle

replica  23029 22982  0 13:49 pts/2    00:00:00 grep --color=auto postgres

 

故障恢复处理

模拟Standby2故障

1 停止node3[10.2.208.12]数据库的服务

pg_ctl –D ../data stop –m fast

2 更改master[node1][10.2.208.10]postgresql.conf文件

synchronous_standby_names = ' node2,node4'

pg_ctl –D ../data reload

3 node4[10.2.208.13]从master pg_basebackup

mv recovery.done recovery.conf

primary_conninfo = 'host=10.2.208.10 port=6432 user=replica password=replica application_name=node4'

4 启动node4

pg_ctl –D ../data start

[replica@node1 bin]$ ./psql -h localhost -p 6432 -U replica

psql (9.5.1)

Type "help" for help.

replica=# select pg_is_in_recovery();

 pg_is_in_recovery

-------------------

 f

(1 row)

replica=# \x

Expanded display is on.

replica=# select * from pg_stat_replication ;

-[ RECORD 1 ]----+------------------------------

pid              | 15705

usesysid         | 10

usename          | replica

application_name | node2

client_addr      | 10.2.208.11

client_hostname  |

client_port      | 44533

backend_start    | 2016-03-10 11:56:10.103849+08

backend_xmin     |

state            | streaming

sent_location    | 0/E0000060

write_location   | 0/E0000060

flush_location   | 0/E0000060

replay_location  | 0/E0000060

sync_priority    | 1

sync_state       | sync

-[ RECORD 2 ]----+------------------------------

pid              | 19907

usesysid         | 10

usename          | replica

application_name | node4

client_addr      | 10.2.208.13

client_hostname  |

client_port      | 34767

backend_start    | 2016-03-10 15:08:29.425353+08

backend_xmin     |

state            | streaming

sent_location    | 0/E0000060

write_location   | 0/E0000060

flush_location   | 0/E0000060

replay_location  | 0/E0000060

sync_priority    | 2

sync_state       | potential

模拟Standby1故障

1 停止Standby1 的数据库服务

pg_ctl –D ../data stop –m fast

2更改master[10.2.208.10]的postgresql.conf

synchronous_standby_names = ' node3,node4'

pg_ctl –D ../data reload

3在node4[10.2.208.13] 从master pg_basebackup

mv recovery.done recovery.conf

primary_conninfo = 'host=10.2.208.10 port=6432 user=replica password=replica application_name=node4'

4 启动node4

pg_ctl –D ../data start

master

replica=# select * from pg_stat_replication ;

-[ RECORD 1 ]----+------------------------------

pid              | 20668

usesysid         | 10

usename          | replica

application_name | node4

client_addr      | 10.2.208.13

client_hostname  |

client_port      | 34770

backend_start    | 2016-03-10 15:45:49.027116+08

backend_xmin     |

state            | streaming

sent_location    | 0/E4000060

write_location   | 0/E4000060

flush_location   | 0/E4000060

replay_location  | 0/E4000000

sync_priority    | 2

sync_state       | potential

-[ RECORD 2 ]----+------------------------------

pid              | 20210

usesysid         | 10

usename          | replica

application_name | node3

client_addr      | 10.2.208.12

client_hostname  |

client_port      | 32856

backend_start    | 2016-03-10 15:18:01.507366+08

backend_xmin     |

state            | streaming

sent_location    | 0/E4000060

write_location   | 0/E4000060

flush_location   | 0/E4000060

replay_location  | 0/E4000060

sync_priority    | 1

sync_state       | sync

 

模拟主库故障

1 停止master[node1]的数据库服务

pg_ctl –D ../data stop –m fast

2更改Standby1[node2]的postgresql.conf文件

synchronous_standby_names = 'node3,node4'

pg_ctl –D ../data reload

pg_ctl –D ../data promote

3更改Standby2[node3]的recovery.conf文件

primary_conninfo = 'host=10.2.208.11 port=6432 user=replica password=replica application_name=node3'

pg_ctl –D ../data restart

在node2上查看复制状态

select * from pg_stat_replication;

[replica@node2 bin]$ ./psql -h localhost -p 6432 -U replica

psql (9.5.1)

Type "help" for help.

replica=# LOG:  standby "node3" is now the synchronous standby with priority 1

replica=# \x

Expanded display is on.

replica=# select * from pg_stat_replication ;

-[ RECORD 1 ]----+-----------------------------

pid              | 31996

usesysid         | 10

usename          | replica

application_name | node3

client_addr      | 10.2.208.12

client_hostname  |

client_port      | 38750

backend_start    | 2016-03-10 16:21:42.23142+08

backend_xmin     |

state            | streaming

sent_location    | 0/E7000060

write_location   | 0/E7000060

flush_location   | 0/E7000060

replay_location  | 0/E7000060

sync_priority    | 1

sync_state       | sync

4在node4[10.2.208.13] 从node2[10.2.208.11] pg_basebackup

mv recovery.done recovery.conf

primary_conninfo = 'host=10.2.208.11 port=6432 user=replica password=replica application_name=node4'

4 启动node4

pg_ctl –D ../data start

 

 

replica=# select * from pg_stat_replication ;

-[ RECORD 1 ]----+------------------------------

pid              | 31996

usesysid         | 10

usename          | replica

application_name | node3

client_addr      | 10.2.208.12

client_hostname  |

client_port      | 38750

backend_start    | 2016-03-10 16:21:42.23142+08

backend_xmin     |

state            | streaming

sent_location    | 0/E7000060

write_location   | 0/E7000060

flush_location   | 0/E7000060

replay_location  | 0/E7000060

sync_priority    | 1

sync_state       | sync

-[ RECORD 2 ]----+------------------------------

pid              | 32031

usesysid         | 10

usename          | replica

application_name | node4

client_addr      | 10.2.208.13

client_hostname  |

client_port      | 53133

backend_start    | 2016-03-10 16:25:27.541883+08

backend_xmin     |

state            | streaming

sent_location    | 0/E7000060

write_location   | 0/E7000060

flush_location   | 0/E7000060

replay_location  | 0/E7000000

sync_priority    | 2

sync_state       | potential

1主节点+1同步节点+1异步节点:缺点,同步节点故障,主节点将不提供写操作(insert,update)

三节点级联复制:1主节点+1上游节点(同步)+1下游节点:上游节点故障,主节点将不提供写操作(insert ,update)

同步复制中的recovery.conf 中的application_name与postgresql.conf中的synchronous_standby_names可以使用IP地址代替hostname