会话相关

查看系统中会话/连接/SQL与锁情况

1、查看当前活动的客户端连接数

 SELECT count(*) FROM pg_stat_activity WHERE NOT pid=pg_backend_pid();


查看会话:
SELECT * FROM pg_stat_activity;


查看锁信息
SELECT locktype, database, c.relname, l.relation, 
l.transactionid, l.pid, l.mode, l.granted, 
a.query 
        FROM pg_locks l, pg_class c, pg_stat_activity a 
        WHERE l.relation=c.oid AND l.pid=a.pid 
        ORDER BY c.relname;

2、查询客户端连接的情况

SELECT
	pid,
CASE
		
		WHEN waiting = 'f' THEN
		'already get lock,sql executing' 
		WHEN waiting = 't' THEN
		'waiting get lock,sql waiting execute' 
	END lock_satus,
	CURRENT_TIMESTAMP - LEAST ( query_start, xact_start ) AS runtime,
	substr( query, 1, 25 ) AS current_query 
FROM
	pg_stat_activity 
WHERE
	NOT pid = pg_backend_pid ( ) 
	AND STATE <> 'idle' 
	AND application_name <> 'pg_statsinfod' 
ORDER BY
	runtime DESC

3、查看持有锁和等待锁的一些信息(已经修改验证)

--reltype=0代表其为索引
SELECT
	locker.pid,
	pc.relname,
	locker.MODE,
	locker_act.application_name,
	LEAST ( query_start, xact_start ) start_time,
CASE
		
		WHEN locker.GRANTED = 'f' THEN
		'wait_lock' 
		WHEN locker.GRANTED = 't' THEN
		'get_lock' 
	END lock_satus,
	CURRENT_TIMESTAMP - LEAST ( query_start, xact_start ) AS runtime,
	locker_act.query 
FROM
	pg_locks locker,
	pg_stat_activity locker_act,
	pg_class pc 
WHERE
	locker.pid = locker_act.pid 
	AND NOT locker.pid = pg_backend_pid ( ) 
	AND application_name <> 'pg_statsinfod' 
	AND locker.relation = pc.oid 
	AND pc.reltype <> 0 --and pc.relname='t'
	
ORDER BY
	runtime DESC;

4、查询系统中正在执行的或者等待执行的事务

--注意其只是代表事务信息,系统中也有可能存在慢的查询

SELECT
	pc.relname lock_table,
	pc.oid,
	tans.pid,
CASE
		
		WHEN waiting = 'f' THEN
		'already get lock,sql executing' 
		WHEN waiting = 't' THEN
		'waiting get lock,sql waiting execute' 
	END lock_satus,
	LEAST ( query_start, xact_start ) query_start,
	CURRENT_TIMESTAMP - LEAST ( query_start, xact_start ) AS runtime,
	psa.query 
FROM
	pg_locks tans,
	pg_locks pl,
	pg_class pc,
	pg_stat_activity psa 
WHERE
	tans.transactionid IS NOT NULL 
	AND pc.oid = pl.relation 
	AND tans.pid = pl.pid 
	AND tans.pid = psa.pid 
	AND pc.reltype <> 0 
ORDER BY
	runtime DESC;

5、查看系统中正在执行的sql与lock_table有关的信息

SELECT
	locktype,
	pg_locks.pid,
	virtualtransaction,
	transactionid,
	nspname,
	relname,
	MODE,
	GRANTED,
CASE
		
		WHEN GRANTED = 'f' THEN
		'get_lock' 
		WHEN GRANTED = 't' THEN
		'wait_lock' 
	END lock_satus,
CASE
	
	WHEN waiting = 'f' THEN
	'already get lock,sql executing' 
	WHEN waiting = 't' THEN
	'waiting get lock,sql waiting execute' 
	END lock_satus,
	CURRENT_TIMESTAMP - LEAST ( query_start, xact_start ) AS runtime,
	CAST ( date_trunc( 'second', query_start ) AS TIMESTAMP ) AS query_start,
	substr( query, 1, 25 ) AS query 
FROM
	pg_locks
	LEFT OUTER JOIN pg_class ON ( pg_locks.relation = pg_class.oid )
	LEFT OUTER JOIN pg_namespace ON ( pg_namespace.oid = pg_class.relnamespace ),
	pg_stat_activity 
WHERE
	NOT pg_locks.pid = pg_backend_pid ( ) 
	AND pg_locks.pid = pg_stat_activity.pid 
	AND pg_class.relname = 't' --此处进行替换
	
ORDER BY
	query_start;

6、查看PostgreSQL正在执行的SQL

SELECT
	procpid,
	START,
	now( ) - START AS lap,
	current_query 
FROM
	(
	SELECT
		backendid,
		pg_stat_get_backend_pid ( S.backendid ) AS procpid,
		pg_stat_get_backend_activity_start ( S.backendid ) AS START,
		pg_stat_get_backend_activity ( S.backendid ) AS current_query 
	FROM
		( SELECT pg_stat_get_backend_idset ( ) AS backendid ) AS S 
	) AS S,
	pg_stat_activity pa 
WHERE
	current_query <> '<IDLE>' 
	AND procpid <> pg_backend_pid ( ) 
	AND pa.pid = s.procpid 
	AND pa.STATE <> 'idle' 
ORDER BY
	lap DESC;

procpid:进程 ID 
START:进程开始时间 
lap:经过时间 
current_query:执行中的 SQL 

怎样停止正在执行的 SQL 
SELECT pg_cancel_backend ( 进程 ID );
或者用系统函数 kill - 9 进程 ID;

--查找是否有waiting

ps -ef|grep postgres | grep wait

7、查看当前库表和索引的的大小并排序显示前20条

SELECT
	nspname,
	relname,
	relkind AS "type",
	pg_size_pretty ( pg_table_size ( C.oid ) ) AS SIZE,
	pg_size_pretty ( pg_indexes_size ( C.oid ) ) AS idxsize,
	pg_size_pretty ( pg_total_relation_size ( C.oid ) ) AS "total" 
FROM
	pg_class
	C LEFT JOIN pg_namespace N ON ( N.oid = C.relnamespace ) 
WHERE
	nspname NOT IN ( 'pg_catalog', 'information_schema' ) 
	AND nspname !~ '^pg_toast' 
	AND relkind IN ( 'r', 'i' ) 
ORDER BY
	pg_total_relation_size ( C.oid ) DESC 
	LIMIT 20;

**原文:https://blog.csdn.net/rudygao/article/details/49334001
**
抽空网上收罗Greenplum常用SQL查询语句整理备忘。欢迎各位留言补充。都是SQL命令以及数据字典的使用。熟悉数据字典非常重要。三个重要的schema:pg_catalog,pg_toolkit,information_schema,其中information_schema 中的数据字典都在视图中

目录
一、查看表某模式所有分布键信息
二、 数据库运行状态查询管理
1.greenplum查询正在运行的sql,session
2.终止执行的sql
3.查看greemplum资源队列状态
4.查看greemplum资源队列锁
5.查看greemplum资源队列优先级
6.查看greemplum所有连接 类似mysql SHOW PROCESSLIST
7.greemplum磁盘使用,通过SQL查看Greenplum中用了多少空间
8.查看greemplum节点状态
9.节点故障等历史信息
10.数据倾斜
11.greemplum表或索引大小 (占用空间)
12.greemplum表和索引大小(占用空间)
13.greemplum查看指定数据库大小(占用空间)
14.greemplum所有数据库大小(占用空间)
15.查看greemplum数据分布情况
三、查源数据
16.查看greemplum数据表更新时间
17.通过sql 获取greemplum表的预估数据量
18.通过sql 获取greemplum获取分布键
19.通过sql获取 greemplum指定表结构
20.显示哪些没有统计信息且可能需要ANALYZE的表
21.显示在系统表中被标记为掉线的Segment的信息
22.显示库中表的大小(单位G)
23.查询一个库中有多少表(如果有分区表不列出子分区)
24.查询某个用户对某个表有什么权限
25.查看分区表的信息
26.导入数据
27.远程导入数据
28.导数据指定分隔符(和mysql的select into outfile很像)
29.生成授权语句
30 .给用户授权
31、改变postgreSQL的默认schema

一、查看表某模式所有分布键信息

SELECT
aaa.nspname AS "模式名",
aaa.relname AS "表名",
aaa.table_comment AS "中文表明",
ccc.attname AS "分布键"
FROM
 (
  SELECT   aa.oid,
    obj_description (aa.oid) AS table_comment,
    aa.relname,
    bb.localoid,
    bb.attrnums,
    regexp_split_to_table (
     array_to_string (bb.attrnums, ','),
     ',' 
  ) att,
    dd.nspname   FROM   pg_class aa --原数据信息 最重要的表!
  LEFT
 JOIN gp_distribution_policy bb ON bb.localoid = aa.oid --分布键表
LEFT
 JOIN pg_namespace dd ON dd.oid = aa.relnamespace --模式
LEFT
 JOIN pg_inherits hh ON aa.oid = hh.inhrelid --继承表 
WHERE   dd.nspname = 'dim'  -- 替换成需要的模式
AND hh.inhrelid IS NULL
 ) aaa
LEFT JOIN pg_attribute ccc ON ccc.attrelid = aaa.oid
AND ccc.attnum = aaa.att
WHERE
ccc.attnum > 0
ORDER BY
aaa.relname;

二、 数据库运行状态查询管理

1.greenplum查询正在运行的sql,session

-- 方法1:

SELECT
    pid,  -- pid
    usename user_name, -- 执行的用户
    backend_start,  -- 会话开始时间
    query_start, -- 查询开始时间
    waiting,  -- 是否等待执行
    now() - query_start AS current_query_time, -- 累计执行时间
  now() - backend_start AS current_session_time,
    query,
    client_addr , datname
FROM
    pg_stat_activity
 WHERE      query != '<IDLE>'
ORDER BY current_query_time DESC;

-- 方法2(通过视图查)

SELECT
    procpid,
  START,
  now()   - START AS lap,
    current_query,
  -- count() over() count_num,
  t2.rolname,t3.rsqname,
  ip
FROM
    (
        SELECT
            backendid,
      pg_stat_get_backend_userid(S.backendid) as uid,
      pg_stat_get_backend_client_addr(S.backendid) as ip,
            pg_stat_get_backend_pid (S.backendid) AS procpid,
            pg_stat_get_backend_activity_start (S.backendid) AS START,
            pg_stat_get_backend_activity (S.backendid) AS current_query
        FROM
            (
                SELECT
                    pg_stat_get_backend_idset () AS backendid
            ) AS S
    ) AS t1 left join pg_authid  t2 on t1.uid=t2.oid
    left join pg_resqueue t3 on t2.rolresqueue=t3.oid
WHERE
    current_query!= '<IDLE>'
ORDER BY lap DESC;

-- 方法3(限定了角色和资源队列,查当前账号正在查询的语句)

SELECT
    rolname,
    rsqname,
    pid,
    GRANTED,
    current_query,
    datname
FROM
    pg_roles  t1,
    gp_toolkit.gp_resqueue_status t2 ,
    pg_locks t3 ,
    pg_stat_activity t4 
WHERE
    t1.rolresqueue = t3.objid
AND t3.objid=t2.queueid
and t4.procpid=t3.pid

2.终止执行的sql

select pg_terminate_backend(48988); --pid

3.查看greemplum资源队列状态

SELECT * FROM gp_toolkit.gp_resqueue_status;

4.查看greemplum资源队列锁

SELECT * FROM gp_toolkit.gp_locks_on_resqueue WHERE lorwaiting='true';

5.查看greemplum资源队列优先级

select * from gp_toolkit.gp_resq_priority_statement;

6.查看greemplum所有连接 类似mysql SHOW PROCESSLIST

select * from pg_stat_activity; -- 所有状态的连接

7.greemplum磁盘使用,通过SQL查看Greenplum中用了多少空间

select datname,pg_size_pretty(pg_database_size(datname)) from pg_database;

8.查看greemplum节点状态

select * from gp_segment_configuration tt
select * from gp_segment_configuration tt where tt.status='d'; -- 状态为down

9.节点故障等历史信息

select * from gp_configuration_history tt order by 1 desc ;

10.数据倾斜

SELECT
    t1.gp_segment_id,
    t1.count_tatol,
    round(t1.count_tatol-(AVG(t1.count_tatol)  over()) ,0)
FROM
    (
        SELECT
            gp_segment_id,
            COUNT (*) count_tatol
        FROM
           <tablename>  -- 要查的表
        GROUP BY
            gp_segment_id
    ) t1
    order by 3

11.greemplum表或索引大小 (占用空间)

select pg_size_pretty(pg_relation_size('gp_test'));

12.greemplum表和索引大小(占用空间)

select pg_size_pretty(pg_total_relation_size('gp_test'));

13.greemplum查看指定数据库大小(占用空间)

select pg_size_pretty(pg_database_size('postgres'));

14.greemplum所有数据库大小(占用空间)

select datname,pg_size_pretty(pg_database_size(datname)) from pg_database;

15.查看greemplum数据分布情况

select gp_segment_id,count(*) from gp_test group by gp_segment_id order by 1;

三、查源数据

16.查看greemplum数据表更新时间

SELECT
    *
FROM
    pg_stat_last_operation,
    pg_class
WHERE
    objid = oid
AND relname = 'base_common'; -- 表名

17.通过sql 获取greemplum表的预估数据量

select
    relname,
    reltuples::int as total
from
    pg_class
where
    relname = 'base_common'
    and relnamespace = (select oid from pg_namespace where nspname = 'positions');

18.通过sql 获取greemplum获取分布键

select n.nspname AS "模式名",c.relname AS "表名",case when p.policytype='p' then '分区表' when p.policytype='r' then '复制分布' end "分布策略", pg_get_table_distributedby(c.oid)  "分布键" ,c.reloptions "压缩类型"
from pg_class c
left join gp_distribution_policy p on c.oid=p.localoid
left join pg_namespace n on c.relnamespace=n.oid 
where c.relname in ('表名');


 SELECT string_agg(att.attname,',' order by attrnums) as distribution
        FROM gp_distribution_policy a,pg_attribute att
        WHERE a.localoid ='bi_data.schoolmate_relations'::regclass
      and a.localoid = att.attrelid
      and att.attnum = any(a.attrnums);

19.通过sql获取 greemplum指定表结构

 SELECT   
         attname, typname
     FROM
           pg_attribute
           INNER JOIN pg_class  ON pg_attribute.attrelid = pg_class.oid
           INNER JOIN pg_type   ON pg_attribute.atttypid = pg_type.oid
           INNER JOIN pg_namespace on pg_class.relnamespace=pg_namespace.oid  --
     WHERE
           pg_attribute.attnum > 0
       AND attisdropped <> 't'
       AND pg_namespace.nspname='resumes'
       AND pg_class.relname= 'base_common'
--     and pg_class.relname ~~* any(array['%some%', '%someelse']));
order by  pg_attribute.attnum

20.显示哪些没有统计信息且可能需要ANALYZE的表

SELECT * from gp_toolkit.gp_stats_missing ;

21.显示在系统表中被标记为掉线的Segment的信息

SELECT * from gp_toolkit.gp_pgdatabase_invalid;

22.显示库中表的大小(单位G)

SELECT sotdoid,sotdsize/1024/1024/1024 as sotdsize,sotdtoastsize,sotdadditionalsize,sotdschemaname,sotdtablename from gp_toolkit.gp_size_of_table_disk order by sotdsize desc;

23.查询一个库中有多少表(如果有分区表不列出子分区)

SELECT relname from pg_class a,pg_namespace b where relname not like '%prt%' and relkind ='r'  and a.relnamespace=b.oid and nspname not in ('pg_catalog','information_schema','gp_toolkit') and nspname not like '%pg_temp%';

24.查询某个用户对某个表有什么权限

select * from INFORMATION_SCHEMA.role_table_grants where grantee='user_name' and table_name='table';

25.查看分区表的信息

SELECT tablename,partitiontablename,partitiontype,partitionboundary from pg_partitions where tablename='table_name' order by partitionboundary desc;

26.导入数据

copy t1 from '/home/gpadmin/t1.txt' with delimiter '|' LOG ERRORS INTO INSERT_ERRS SEGMENT REJECT LIMIT 100;

27.远程导入数据

psql -h 1.1.1.1  -U user_name -d db_name -W -c "copy tb1 from stdin with delimiter '|'" < /home/gpadmin/tb1.txt

28.导数据指定分隔符(和mysql的select into outfile很像)

psql -d db_name -c "select * from tb1" -o tb1.txt -t -A -F $'/t'

29.生成授权语句

SELECT
	'grant select on ' || relname || ' to gk-gpdb-rw;' 
FROM
	pg_class A,
	pg_namespace b 
WHERE
	relname NOT LIKE'%prt%' 
	AND relkind = 'r' 
	AND has_table_privilege ( 'gk-gpdb-rw', A.oid, 'select' ) = 'f' 
	AND A.relnamespace = b.oid 
	AND nspname NOT IN ( 'pg_catalog', 'information_schema', 'gp_toolkit' ) 
	AND nspname NOT LIKE'%pg_temp%';

30 .给用户授权

SELECT
	'grant all on SCHEMA ' || tt.autnspname || ' to tuser;' AS grant_script 
FROM
	gp_toolkit.__gp_user_tables tt -- group by  tt.autnspname
	UNION-- all
SELECT
	'grant all on table ' || tt.autnspname || '.' || tt.autrelname || ' to tuser;' grant_script 
FROM
	gp_toolkit.__gp_user_tables tt;

31、改变postgreSQL的默认schema

-- Use this to show the current search_path
-- Should return: "$user",public
SHOW search_path;
-- Create another schema
CREATE SCHEMA my_schema;
GRANT ALL ON SCHEMA my_schema TO my_user;
-- To change search_path on a connection-level
SET search_path TO my_schema; 
-- To change search_path on a database-level
ALTER database "my_database" SET search_path TO my_schema;

查看历史sql:

SELECT
	*,
	(
	SELECT COUNT
		( * ) 
	FROM
		(
		SELECT LENGTH
			( query_text ),
			substr( query_text, 1, 501 ) AS query_text,
			tmid,
			ssid,
			ccnt,
			username,
			db,
			tsubmit,
			tstart,
			tfinish,
			exec_time,
			wait_time,
			cpu_elapsed,
			cpu_master_percent,
			cpu_segs_percent,
			status,
			skew_cpu,
			skew_rows,
			memory,
			disk_read_bytes,
			disk_write_bytes,
			spill_size,
			plan_gen,
			COST,
			rsgname,
			queuename,
			priority,
		CASE
				
				WHEN mark = 1 THEN
				1 ELSE 0 
			END AS is_from_metrics 
		FROM
			(
			SELECT RANK
				( ) OVER ( PARTITION BY q.tmid, q.ssid, q.ccnt ORDER BY q.mark ASC ) rn,
				q.* 
			FROM
				gpmetrics.queries_history AS q 
			WHERE
				'2022-10-18 13:33:09' <= tfinish 
				AND tstart <= '2022-10-18 14:33:14' 
				AND '2022-10-18 14:33:14' :: TIMESTAMP ( 0 ) WITHOUT TIME ZONE + INTERVAL '1 month' >= sample_time 
				AND '2022-10-18 13:33:09' :: TIMESTAMP ( 0 ) WITHOUT TIME ZONE - INTERVAL '1 month' <= sample_time 
				AND status IN ( 'done', 'abort', 'error' ) 
			) AS rnq2 
		WHERE
			rnq2.rn < 2 
		) AS rnq2 
	) AS total 
FROM
	(
	SELECT LENGTH
		( query_text ),
		substr( query_text, 1, 501 ) AS query_text,
		tmid,
		ssid,
		ccnt,
		username,
		db,
		tsubmit,
		tstart,
		tfinish,
		exec_time,
	wait_time,
	cpu_elapsed,C



### 查看sql详情:

SELECT LENGTH
		( query_text ),
		substr( query_text, 1, 100000 ) AS query_text,
		tmid,
		ssid,
		ccnt,
		username,
		db,
		tsubmit,
		tstart,
		tfinish,
		exec_time,
		wait_time,
		cpu_elapsed,
		cpu_master_percent,
		cpu_segs_percent,
		status,
		skew_cpu,
		skew_rows,
		memory,
		disk_read_bytes,
		disk_write_bytes,
		spill_size,
		plan_gen,
		COST,
		rsgname,
		queuename,
		priority,
	CASE
			
			WHEN mark = 1 THEN
			1 ELSE 0 
		END AS is_from_metrics 
	FROM
		(
		SELECT RANK
			( ) OVER ( PARTITION BY q.tmid, q.ssid, q.ccnt ORDER BY q.mark ASC ) rn,
			q.* 
		FROM
			gpmetrics.queries_history AS q 
		WHERE
			tmid =$1 
			AND ssid =$2 
			AND ccnt =$3 
			AND '2022-10-18 14:32:58' :: TIMESTAMP ( 0 ) WITHOUT TIME ZONE - INTERVAL '1 month' <= sample_time 
			AND '2022-10-18 14:32:58' :: TIMESTAMP ( 0 ) WITHOUT TIME ZONE + INTERVAL '1 month' >= sample_time 
		) AS rnq 
	WHERE
		rnq.rn < 2

查看完整sql

gpperfmon 库的public 对象下
select * from queries_now

准确的:
gpperfmon  gpmetrics
gpcc_queries_now

历史sql
select * from gpcc_queries_history where ctime>='2024-10-14 00:00:05' and ctime>='2024-10-15 00:00:05'



SELECT
    username as 用户名,
    db as 数据库,
    COST as 查询代价,
    tstart as 开始时间,
    tfinish as 结束时间,
    status as 状态,
    rows_out as 返回行数,
    error_msg as 是否失败,
    plan_gen as 执行计划生成方式,
    query_text as 查询SQL,
    EXTRACT(EPOCH FROM(tfinish - tstart )) as 消耗时间S,
    application_name as 应用名称,
    rsqname as 资源队列,
    cpu_master as 主节点CPU时间,
    cpu_segs as SEG节点CPU时间,
    cpu_master_percent as CPU使用百分比主,
    cpu_segs_percent as CPU使用百分比seg,
    skew_cpu as CPU的倾斜度,
    skew_rows as 数据行倾斜度,
    memory / 1024 / 1024 as 消耗的内存量MB,
    disk_read_bytes / 1024 / 1024 as 磁盘读取的数据量MB,
    disk_write_bytes / 1024 / 1024 as 写入磁盘的数据量MB,
    spill_size / 1024 / 1024 as 溢出到磁盘的数据量,
    rqpriority as 查询的优先级,
    peak_memory / 1024 / 1024 as 使用的峰值内存量
FROM
    gpcc_queries_history 
WHERE
    ctime >= '2024-10-14 00:00:05' 
    AND ctime < '2024-10-15 00:00:05' 
    AND EXTRACT ( EPOCH FROM ( tfinish - tstart ) ) > 10

数据膨胀优化

查询膨胀较大的表
select * from gp_toolkit.gp_bloat_diag order by bdirelpages desc, bdidiag;

查看表的大小
select pg_size_pretty(pg_relation_size('nc65_pcto_settlelist_b'));

执行优化(该命令会全局锁表)
vacuum full nc65_pcto_settlelist_b;
或者数据重分布
ALTER TABLE dwm_amb_allfy SET WITH (REORGANIZE=true) DISTRIBUTED BY (szlx_bm);
(重分布期间也会锁表)

psql 命令说明:

第一个数据库名表示在该库下执行查询,第二个数据库名表示将执行的结果拿到该库下再执行查询

psql -tc "select count(*) from  public.amb_cbo_items; " datawarehouse_test | psql -a datawarehouse_test

postgresql中的类型转换

ding_dept_id::text

ding_dept_id字段类型为int ,以上写法将其转换为text

查询没有主键或者唯一键的表

select
	tab.table_schema,
	tab.table_name
from
	information_schema.tables tab
left join information_schema.table_constraints tco on
	tab.table_schema = tco.table_schema
	and tab.table_name = tco.table_name
	and tco.constraint_type in ('PRIMARY KEY', 'UNIQUE')
where
	tab.table_type = 'BASE TABLE'
	and tab.table_schema not in ('pg_catalog', 'information_schema')
	and tco.constraint_name is null;

查询没有约束的表

SELECT
    t.table_schema,
    t.table_name
FROM
    information_schema.tables t
LEFT JOIN
    information_schema.key_column_usage kcu
    ON t.table_schema = kcu.table_schema
    AND t.table_name = kcu.table_name
WHERE
    t.table_type = 'BASE TABLE'
    AND kcu.constraint_name IS NULL;

posted @ 2025-06-26 15:46  数据库小白(专注)  阅读(28)  评论(0)    收藏  举报