PostgreSQL server-side problems and how to resolve them

Caused by: org.postgresql.util.PSQLException: ERROR: connection to the remote node 10.20.30.195:5435 failed with the following error: could not fork new process for connection: Resource temporarily unavailable

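The server is short of resources. Possible causes are insufficient memory, Linux shared memory segment settings that are too low, exceeding the overcommit limit, or hitting the process-count limit. For example, in /proc/meminfo: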

CommitLimit:    199412136 kB
Committed_AS:   199564684 kB  # once this approaches the limit, "Resource temporarily unavailable" may be reported
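
The comparison above can be automated. The following is a minimal sketch, not a monitoring tool: it parses /proc/meminfo-style text and reports how far Committed_AS is from CommitLimit. The function name `commit_headroom_kb` is hypothetical, and the sample text is the two lines shown above. Note that the kernel only enforces CommitLimit strictly when vm.overcommit_memory is set to 2, so treat the result as a hint.

```python
import re

def commit_headroom_kb(meminfo_text):
    """Return (CommitLimit, Committed_AS, headroom) in kB.

    meminfo_text is text in the format of /proc/meminfo.
    A headroom that is small or negative means the commit limit
    has been reached or exceeded.
    """
    values = {}
    for line in meminfo_text.splitlines():
        m = re.match(r"^(CommitLimit|Committed_AS):\s+(\d+)\s+kB", line)
        if m:
            values[m.group(1)] = int(m.group(2))
    limit = values["CommitLimit"]
    committed = values["Committed_AS"]
    return limit, committed, limit - committed

# Sample values taken from the output quoted in this post.
sample = """CommitLimit:    199412136 kB
Committed_AS:   199564684 kB"""

limit, committed, headroom = commit_headroom_kb(sample)
print(f"CommitLimit={limit} kB, Committed_AS={committed} kB, headroom={headroom} kB")
```

On a live server the same function can be fed the contents of /proc/meminfo, for example `commit_headroom_kb(open("/proc/meminfo").read())`.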

 

### Cause: org.postgresql.util.PSQLException: ERROR: out of memory
  Detail: Failed on request of size 40364 in memory context "CacheMemoryContext".
  Where: while executing command on 10.20.30.217:5437
; SQL []; ERROR: out of memory

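Insufficient memory, Linux shared memory segment settings that are too low, or exceeding the overcommit limit. Query pg_shmem_allocations to see how shared memory is actually allocated at the moment: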

select * from pg_shmem_allocations;
name                               |off        |size       |allocated_size|
-----------------------------------+-----------+-----------+--------------+
Buffer Blocks                      | 1084544896|60615032832|   60615032832|
Buffer Descriptors                 |  137435008|  947109888|     947109888|
<anonymous>                        |           |  572226944|     572226944|
DirtyPageQueue                     |63185582464|  295971840|     295971840|
Buffer IO Locks                    |61699577728|  236777472|     236777472|
PageWriter Data                    |62823015552|  177584504|     177584512|
Checkpointer Data                  |62645432320|  177583168|     177583232|
Checkpoint BufferIds               |61936355200|  147985920|     147985920|
Pagewriter BufferIds               |63000600064|  147985920|     147985920|
XLOG Ctl                           |     309504|  134358672|     134358784|
                                   |63514032256|   75728768|      75728768|
CandidateBuffers                   |63148585984|   29597184|      29597184|
BatchFileHead                      |63481561216|   13508608|      13508608|
CandidateFreeMap                   |63178183168|    7399296|       7399296|
Backend Activity Buffer            |62641214720|    2152448|       2152448|
Xact                               |  134668672|    2116320|       2116352|
KnownAssignedXids                  |62638078976|    1580280|       1580288|
Prepared Transaction Table         |62644056704|    1056016|       1056128|
Backend Status Array               |62640054400|     891248|        891264|
Backend SSL Status Buffer          |62643367168|     689456|        689536|
KnownAssignedXidsValid             |62639659264|     395070|        395136|
Subtrans                           |  136918784|     267008|        267008|
Shared Buffer Lookup Table         |62084341120|     263000|        263040|
ProcSignal                         |62645297664|     134536|        134656|
Backend Client Host Name Buffer    |62641080192|     134528|        134528|
Backend Application Name Buffer    |62640945664|     134528|        134528|
CommitTs                           |  136785024|     133568|        133632|
MultiXactMember                    |  137252608|     133568|        133632|
Serial                             |62632431360|     133568|        133632|
shmInvalBuffer                     |62645148800|     132056|        132096|
Notify                             |63495199744|      66816|         66816|
MultiXactOffset                    |  137185792|      66816|         66816|
lt_stat_activity                   |63496779392|      48800|         48896|
Shared MultiXact State             |  137386240|      48676|         48768|
Async Queue Control                |63495158016|      41612|         41728|
Background Worker Data             |62645112832|      35856|         35968|
PROCLOCK hash                      |62482482432|      33624|         33664|
PREDICATELOCK hash                 |62540721280|      33624|         33664|
SingleFlushState                   |63495086848|      32767|         32768|
BTree Vacuum State                 |63495132288|      24948|         24960|
Proc Array                         |62638054528|      24352|         24448|
PREDICATELOCKTARGET hash           |62517836032|      17240|         17280|
LOCK hash                          |62447690112|      17240|         17280|
PMSignalState                      |62645280896|      16668|         16768|
SingleFileHead                     |63495070464|      16384|         16384|
AutoVacuum Data                    |63495119616|       5368|          5376|
Fast Path Strong Relation Lock Data|62517831808|       4100|          4224|
CkptRedoState                      |63481554304|       3840|          3840|
pg_show_plans hash                 |63496828416|       2904|          2944|
pg_stat_statements hash            |63495270912|       2904|          2944|
SERIALIZABLEXID hash               |62615371648|       2904|          2944|
ReplicationSlot Ctl                |63495124992|       2480|          2560|
BatchFileCxts                      |63481558784|       2416|          2432|
Wal Receiver Ctl                   |63495129344|       2248|          2304|
Wal Sender Ctl                     |63495128192|       1040|          1152|
Sync Scan Locations List           |63495157248|        656|           768|
DW Batch Data                      |63481558144|        552|           640|
DW Single Data                     |63495069824|        552|           640|
ReplicationOriginState             |63495127552|        568|           640|
Logical Replication Launcher Data  |63495131648|        424|           512|
Control File                       |  134668288|        312|           384|
pg_show_plans                      |63496828288|         16|           128|
FinishedSerializableTransactions   |62632431232|         16|           128|
Proc Header                        |62632565120|        112|           128|
CommitTs shared                    |  136918656|         32|           128|
OldSnapshotControlData             |63495132160|         68|           128|
PredXactList                       |62605160448|         88|           128|
RWConflictPool                     |62617843840|         24|           128|
Buffer Strategy Status             |62447689984|         28|           128|
SerialControlData                  |62632564992|         12|           128|
pg_stat_statements                 |63495270784|         48|           128|
Cluster Encryption Key             |63495270656|         18|           128|
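
With this many rows it helps to total the sizes and list the largest consumers. The sketch below assumes the query output has been saved as text in the pipe-separated layout shown above. The helper names `parse_shmem_rows` and `summarize` are hypothetical. The same figures could also be obtained in SQL by aggregating over pg_shmem_allocations directly.

```python
def parse_shmem_rows(table_text):
    """Parse pipe-separated pg_shmem_allocations output.

    Returns a list of (name, size, allocated_size) tuples.
    Header and separator lines are skipped because their size
    column is not numeric. The "off" column is ignored, since it
    is empty for the <anonymous> row.
    """
    rows = []
    for line in table_text.splitlines():
        cols = [c.strip() for c in line.split("|")]
        if len(cols) < 4 or not cols[2].isdigit() or not cols[3].isdigit():
            continue
        rows.append((cols[0], int(cols[2]), int(cols[3])))
    return rows

def summarize(rows, top=3):
    """Return (total allocated bytes, the `top` largest rows by allocated_size)."""
    total = sum(r[2] for r in rows)
    largest = sorted(rows, key=lambda r: r[2], reverse=True)[:top]
    return total, largest

# A few rows copied from the output above, used as sample input.
sample = """name                               |off        |size       |allocated_size|
-----------------------------------+-----------+-----------+--------------+
Buffer Blocks                      | 1084544896|60615032832|   60615032832|
<anonymous>                        |           |  572226944|     572226944|
Xact                               |  134668672|    2116320|       2116352|
"""

total, largest = summarize(parse_shmem_rows(sample))
print(f"total allocated: {total} bytes ({total / 2**30:.2f} GiB)")
for name, size, allocated in largest:
    print(f"{name or '(unnamed)'}: {allocated} bytes")
```

In the full output, Buffer Blocks (the shared buffer pool) is by far the largest entry, so that is usually the first place to look when shared memory is too large for the host.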

Starting with LightDB 23.1, the startup log in lightdb.log includes the size of shared memory actually requested, as shown below:

[lightdb@hs-10-20-30-199 13.8-23.1]$ lt_ctl -D test_incre restart
waiting for server to shut down.................. done
server stopped
2023-05-12 21:37:01.398108T,,,,,postmaster,,00000,2023-05-12 21:37:01 CST,0,11193,LOG: LightDB autoprewarm: prewarm dbnum=0
2023-05-12 21:37:01.400707T,,,,,postmaster,,00000,2023-05-12 21:37:01 CST,0,11193,INFO: invoking IpcMemoryCreate(size=62444929024)
waiting for server to start....2023-05-12 21:37:02.261923T,,,,,postmaster,,00000,2023-05-12 21:37:01 CST,0,11193,LOG: redirecting log output to logging collector process
2023-05-12 21:37:02.261923T,,,,,postmaster,,00000,2023-05-12 21:37:01 CST,0,11193,HINT: Future log output will appear in directory "log".
done
server started
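
To pull the requested size out of such a startup log, a regular expression over the IpcMemoryCreate line is enough. This is a minimal sketch that assumes the log line format shown above. The function name `ipc_memory_size` is hypothetical.

```python
import re

# Matches the size reported by the "invoking IpcMemoryCreate(size=...)" log line.
IPC_SIZE_RE = re.compile(r"IpcMemoryCreate\(size=(\d+)\)")

def ipc_memory_size(log_text):
    """Return the requested shared memory size in bytes, or None if no such line is present."""
    m = IPC_SIZE_RE.search(log_text)
    return int(m.group(1)) if m else None

# Sample line copied from the startup log above.
line = ("2023-05-12 21:37:01.400707T,,,,,postmaster,,00000,2023-05-12 21:37:01 CST,"
        "0,11193,INFO: invoking IpcMemoryCreate(size=62444929024)")

size = ipc_memory_size(line)
print(f"requested shared memory: {size} bytes ({size / 2**30:.2f} GiB)")
```

Comparing this number with the free memory and the shared memory kernel settings on the host shows quickly whether the request can be satisfied.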

2021-12-31 14:03:21.441691C [unknown] lightdb@findptdis 10.20.30.193(12146) client backend SELECT 57P01[2021-12-31 13:45:14 UTC] 0 [71437] ERROR:  terminating connection due to administrator command

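The server process was restarted, stopped, or killed, whether in a normal or an abnormal way. This also covers the case where one specific backend process was terminated. The log line above carries SQLSTATE 57P01 (admin_shutdown).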

 

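Caused by a two-phase commit (2PC) that did not complete: the response to the rollback was not received, so the rollback was issued a second time.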

 

posted @ 2022-01-01 18:06  zhjh256