会话相关
查看系统中会话/连接/SQL与锁情况
1、查看当前活动的客户端连接数
SELECT count(*) FROM pg_stat_activity WHERE NOT pid=pg_backend_pid();
查看会话:
SELECT * FROM pg_stat_activity;
查看锁信息
SELECT locktype, database, c.relname, l.relation,
l.transactionid, l.pid, l.mode, l.granted,
a.query
FROM pg_locks l, pg_class c, pg_stat_activity a
WHERE l.relation=c.oid AND l.pid=a.pid
ORDER BY c.relname;
2、查询客户端连接的情况
SELECT
pid,
CASE
WHEN waiting = 'f' THEN
'already get lock,sql executing'
WHEN waiting = 't' THEN
'waiting get lock,sql waiting execute'
END lock_satus,
CURRENT_TIMESTAMP - LEAST ( query_start, xact_start ) AS runtime,
substr( query, 1, 25 ) AS current_query
FROM
pg_stat_activity
WHERE
NOT pid = pg_backend_pid ( )
AND STATE <> 'idle'
AND application_name <> 'pg_statsinfod'
ORDER BY
runtime DESC
3、查看持有锁和等待锁的一些信息(已经修改验证)
--reltype=0代表其为索引
SELECT
locker.pid,
pc.relname,
locker.MODE,
locker_act.application_name,
LEAST ( query_start, xact_start ) start_time,
CASE
WHEN locker.GRANTED = 'f' THEN
'wait_lock'
WHEN locker.GRANTED = 't' THEN
'get_lock'
END lock_satus,
CURRENT_TIMESTAMP - LEAST ( query_start, xact_start ) AS runtime,
locker_act.query
FROM
pg_locks locker,
pg_stat_activity locker_act,
pg_class pc
WHERE
locker.pid = locker_act.pid
AND NOT locker.pid = pg_backend_pid ( )
AND application_name <> 'pg_statsinfod'
AND locker.relation = pc.oid
AND pc.reltype <> 0 --and pc.relname='t'
ORDER BY
runtime DESC;
4、查询系统中正在执行的或者等待执行的事务
--注意其只是代表事务信息,系统中也有可能存在慢的查询
SELECT
pc.relname lock_table,
pc.oid,
tans.pid,
CASE
WHEN waiting = 'f' THEN
'already get lock,sql executing'
WHEN waiting = 't' THEN
'waiting get lock,sql waiting execute'
END lock_satus,
LEAST ( query_start, xact_start ) query_start,
CURRENT_TIMESTAMP - LEAST ( query_start, xact_start ) AS runtime,
psa.query
FROM
pg_locks tans,
pg_locks pl,
pg_class pc,
pg_stat_activity psa
WHERE
tans.transactionid IS NOT NULL
AND pc.oid = pl.relation
AND tans.pid = pl.pid
AND tans.pid = psa.pid
AND pc.reltype <> 0
ORDER BY
runtime DESC;
5、查看系统中正在执行的sql与lock_table有关的信息
SELECT
locktype,
pg_locks.pid,
virtualtransaction,
transactionid,
nspname,
relname,
MODE,
GRANTED,
CASE
WHEN GRANTED = 'f' THEN
'get_lock'
WHEN GRANTED = 't' THEN
'wait_lock'
END lock_satus,
CASE
WHEN waiting = 'f' THEN
'already get lock,sql executing'
WHEN waiting = 't' THEN
'waiting get lock,sql waiting execute'
END lock_satus,
CURRENT_TIMESTAMP - LEAST ( query_start, xact_start ) AS runtime,
CAST ( date_trunc( 'second', query_start ) AS TIMESTAMP ) AS query_start,
substr( query, 1, 25 ) AS query
FROM
pg_locks
LEFT OUTER JOIN pg_class ON ( pg_locks.relation = pg_class.oid )
LEFT OUTER JOIN pg_namespace ON ( pg_namespace.oid = pg_class.relnamespace ),
pg_stat_activity
WHERE
NOT pg_locks.pid = pg_backend_pid ( )
AND pg_locks.pid = pg_stat_activity.pid
AND pg_class.relname = 't' --此处进行替换
ORDER BY
query_start;
6、查看PostgreSQL正在执行的SQL
SELECT
procpid,
START,
now( ) - START AS lap,
current_query
FROM
(
SELECT
backendid,
pg_stat_get_backend_pid ( S.backendid ) AS procpid,
pg_stat_get_backend_activity_start ( S.backendid ) AS START,
pg_stat_get_backend_activity ( S.backendid ) AS current_query
FROM
( SELECT pg_stat_get_backend_idset ( ) AS backendid ) AS S
) AS S,
pg_stat_activity pa
WHERE
current_query <> '<IDLE>'
AND procpid <> pg_backend_pid ( )
AND pa.pid = s.procpid
AND pa.STATE <> 'idle'
ORDER BY
lap DESC;
procpid:进程 ID
START:进程开始时间
lap:经过时间
current_query:执行中的 SQL
怎样停止正在执行的 SQL
SELECT pg_cancel_backend ( 进程 ID );
或者用系统函数 kill - 9 进程 ID;
--查找是否有waiting
ps -ef|grep postgres | grep wait
7、查看当前库表和索引的的大小并排序显示前20条
SELECT
nspname,
relname,
relkind AS "type",
pg_size_pretty ( pg_table_size ( C.oid ) ) AS SIZE,
pg_size_pretty ( pg_indexes_size ( C.oid ) ) AS idxsize,
pg_size_pretty ( pg_total_relation_size ( C.oid ) ) AS "total"
FROM
pg_class
C LEFT JOIN pg_namespace N ON ( N.oid = C.relnamespace )
WHERE
nspname NOT IN ( 'pg_catalog', 'information_schema' )
AND nspname !~ '^pg_toast'
AND relkind IN ( 'r', 'i' )
ORDER BY
pg_total_relation_size ( C.oid ) DESC
LIMIT 20;
**原文:https://blog.csdn.net/rudygao/article/details/49334001
**
抽空网上收罗Greenplum常用SQL查询语句整理备忘。欢迎各位留言补充。都是SQL命令以及数据字典的使用。熟悉数据字典非常重要。三个重要的schema:pg_catalog,pg_toolkit,information_schema,其中information_schema 中的数据字典都在视图中
目录
一、查看表某模式所有分布键信息
二、 数据库运行状态查询管理
1.greenplum查询正在运行的sql,session
2.终止执行的sql
3.查看greemplum资源队列状态
4.查看greemplum资源队列锁
5.查看greemplum资源队列优先级
6.查看greemplum所有连接 类似mysql SHOW PROCESSLIST
7.greemplum磁盘使用,通过SQL查看Greenplum中用了多少空间
8.查看greemplum节点状态
9.节点故障等历史信息
10.数据倾斜
11.greemplum表或索引大小 (占用空间)
12.greemplum表和索引大小(占用空间)
13.greemplum查看指定数据库大小(占用空间)
14.greemplum所有数据库大小(占用空间)
15.查看greemplum数据分布情况
三、查源数据
16.查看greemplum数据表更新时间
17.通过sql 获取greemplum表的预估数据量
18.通过sql 获取greemplum获取分布键
19.通过sql获取 greemplum指定表结构
20.显示哪些没有统计信息且可能需要ANALYZE的表
21.显示在系统表中被标记为掉线的Segment的信息
22.显示库中表的大小(单位G)
23.查询一个库中有多少表(如果有分区表不列出子分区)
24.查询某个用户对某个表有什么权限
25.查看分区表的信息
26.导入数据
27.远程导入数据
28.导数据指定分隔符(和mysql的select into outfile很像)
29.生成授权语句
30 .给用户授权
31、改变postgreSQL的默认schema
一、查看表某模式所有分布键信息
SELECT
aaa.nspname AS "模式名",
aaa.relname AS "表名",
aaa.table_comment AS "中文表明",
ccc.attname AS "分布键"
FROM
(
SELECT aa.oid,
obj_description (aa.oid) AS table_comment,
aa.relname,
bb.localoid,
bb.attrnums,
regexp_split_to_table (
array_to_string (bb.attrnums, ','),
','
) att,
dd.nspname FROM pg_class aa --原数据信息 最重要的表!
LEFT
JOIN gp_distribution_policy bb ON bb.localoid = aa.oid --分布键表
LEFT
JOIN pg_namespace dd ON dd.oid = aa.relnamespace --模式
LEFT
JOIN pg_inherits hh ON aa.oid = hh.inhrelid --继承表
WHERE dd.nspname = 'dim' -- 替换成需要的模式
AND hh.inhrelid IS NULL
) aaa
LEFT JOIN pg_attribute ccc ON ccc.attrelid = aaa.oid
AND ccc.attnum = aaa.att
WHERE
ccc.attnum > 0
ORDER BY
aaa.relname;
二、 数据库运行状态查询管理
1.greenplum查询正在运行的sql,session
-- 方法1:
SELECT
pid, -- pid
usename user_name, -- 执行的用户
backend_start, -- 会话开始时间
query_start, -- 查询开始时间
waiting, -- 是否等待执行
now() - query_start AS current_query_time, -- 累计执行时间
now() - backend_start AS current_session_time,
query,
client_addr , datname
FROM
pg_stat_activity
WHERE query != '<IDLE>'
ORDER BY current_query_time DESC;
-- 方法2(通过视图查)
SELECT
procpid,
START,
now() - START AS lap,
current_query,
-- count() over() count_num,
t2.rolname,t3.rsqname,
ip
FROM
(
SELECT
backendid,
pg_stat_get_backend_userid(S.backendid) as uid,
pg_stat_get_backend_client_addr(S.backendid) as ip,
pg_stat_get_backend_pid (S.backendid) AS procpid,
pg_stat_get_backend_activity_start (S.backendid) AS START,
pg_stat_get_backend_activity (S.backendid) AS current_query
FROM
(
SELECT
pg_stat_get_backend_idset () AS backendid
) AS S
) AS t1 left join pg_authid t2 on t1.uid=t2.oid
left join pg_resqueue t3 on t2.rolresqueue=t3.oid
WHERE
current_query!= '<IDLE>'
ORDER BY lap DESC;
-- 方法3(限定了角色和资源队列,查当前账号正在查询的语句)
SELECT
rolname,
rsqname,
pid,
GRANTED,
current_query,
datname
FROM
pg_roles t1,
gp_toolkit.gp_resqueue_status t2 ,
pg_locks t3 ,
pg_stat_activity t4
WHERE
t1.rolresqueue = t3.objid
AND t3.objid=t2.queueid
and t4.procpid=t3.pid
2.终止执行的sql
select pg_terminate_backend(48988); --pid
3.查看greemplum资源队列状态
SELECT * FROM gp_toolkit.gp_resqueue_status;
4.查看greemplum资源队列锁
SELECT * FROM gp_toolkit.gp_locks_on_resqueue WHERE lorwaiting='true';
5.查看greemplum资源队列优先级
select * from gp_toolkit.gp_resq_priority_statement;
6.查看greemplum所有连接 类似mysql SHOW PROCESSLIST
select * from pg_stat_activity; -- 所有状态的连接
7.greemplum磁盘使用,通过SQL查看Greenplum中用了多少空间
select datname,pg_size_pretty(pg_database_size(datname)) from pg_database;
8.查看greemplum节点状态
select * from gp_segment_configuration tt
select * from gp_segment_configuration tt where tt.status='d'; -- 状态为down
9.节点故障等历史信息
select * from gp_configuration_history tt order by 1 desc ;
10.数据倾斜
SELECT
t1.gp_segment_id,
t1.count_tatol,
round(t1.count_tatol-(AVG(t1.count_tatol) over()) ,0)
FROM
(
SELECT
gp_segment_id,
COUNT (*) count_tatol
FROM
<tablename> -- 要查的表
GROUP BY
gp_segment_id
) t1
order by 3
11.greemplum表或索引大小 (占用空间)
select pg_size_pretty(pg_relation_size('gp_test'));
12.greemplum表和索引大小(占用空间)
select pg_size_pretty(pg_total_relation_size('gp_test'));
13.greemplum查看指定数据库大小(占用空间)
select pg_size_pretty(pg_database_size('postgres'));
14.greemplum所有数据库大小(占用空间)
select datname,pg_size_pretty(pg_database_size(datname)) from pg_database;
15.查看greemplum数据分布情况
select gp_segment_id,count(*) from gp_test group by gp_segment_id order by 1;
三、查源数据
16.查看greemplum数据表更新时间
SELECT
*
FROM
pg_stat_last_operation,
pg_class
WHERE
objid = oid
AND relname = 'base_common'; -- 表名
17.通过sql 获取greemplum表的预估数据量
select
relname,
reltuples::int as total
from
pg_class
where
relname = 'base_common'
and relnamespace = (select oid from pg_namespace where nspname = 'positions');
18.通过sql 获取greemplum获取分布键
select n.nspname AS "模式名",c.relname AS "表名",case when p.policytype='p' then '分区表' when p.policytype='r' then '复制分布' end "分布策略", pg_get_table_distributedby(c.oid) "分布键" ,c.reloptions "压缩类型"
from pg_class c
left join gp_distribution_policy p on c.oid=p.localoid
left join pg_namespace n on c.relnamespace=n.oid
where c.relname in ('表名');
SELECT string_agg(att.attname,',' order by attrnums) as distribution
FROM gp_distribution_policy a,pg_attribute att
WHERE a.localoid ='bi_data.schoolmate_relations'::regclass
and a.localoid = att.attrelid
and att.attnum = any(a.attrnums);
19.通过sql获取 greemplum指定表结构
SELECT
attname, typname
FROM
pg_attribute
INNER JOIN pg_class ON pg_attribute.attrelid = pg_class.oid
INNER JOIN pg_type ON pg_attribute.atttypid = pg_type.oid
INNER JOIN pg_namespace on pg_class.relnamespace=pg_namespace.oid --
WHERE
pg_attribute.attnum > 0
AND attisdropped <> 't'
AND pg_namespace.nspname='resumes'
AND pg_class.relname= 'base_common'
-- and pg_class.relname ~~* any(array['%some%', '%someelse']));
order by pg_attribute.attnum
20.显示哪些没有统计信息且可能需要ANALYZE的表
SELECT * from gp_toolkit.gp_stats_missing ;
21.显示在系统表中被标记为掉线的Segment的信息
SELECT * from gp_toolkit.gp_pgdatabase_invalid;
22.显示库中表的大小(单位G)
SELECT sotdoid,sotdsize/1024/1024/1024 as sotdsize,sotdtoastsize,sotdadditionalsize,sotdschemaname,sotdtablename from gp_toolkit.gp_size_of_table_disk order by sotdsize desc;
23.查询一个库中有多少表(如果有分区表不列出子分区)
SELECT relname from pg_class a,pg_namespace b where relname not like '%prt%' and relkind ='r' and a.relnamespace=b.oid and nspname not in ('pg_catalog','information_schema','gp_toolkit') and nspname not like '%pg_temp%';
24.查询某个用户对某个表有什么权限
select * from INFORMATION_SCHEMA.role_table_grants where grantee='user_name' and table_name='table';
25.查看分区表的信息
SELECT tablename,partitiontablename,partitiontype,partitionboundary from pg_partitions where tablename='table_name' order by partitionboundary desc;
26.导入数据
copy t1 from '/home/gpadmin/t1.txt' with delimiter '|' LOG ERRORS INTO INSERT_ERRS SEGMENT REJECT LIMIT 100;
27.远程导入数据
psql -h 1.1.1.1 -U user_name -d db_name -W -c "copy tb1 from stdin with delimiter '|'" < /home/gpadmin/tb1.txt
28.导数据指定分隔符(和mysql的select into outfile很像)
psql -d db_name -c "select * from tb1" -o tb1.txt -t -A -F $'/t'
29.生成授权语句
SELECT
'grant select on ' || relname || ' to gk-gpdb-rw;'
FROM
pg_class A,
pg_namespace b
WHERE
relname NOT LIKE'%prt%'
AND relkind = 'r'
AND has_table_privilege ( 'gk-gpdb-rw', A.oid, 'select' ) = 'f'
AND A.relnamespace = b.oid
AND nspname NOT IN ( 'pg_catalog', 'information_schema', 'gp_toolkit' )
AND nspname NOT LIKE'%pg_temp%';
30 .给用户授权
SELECT
'grant all on SCHEMA ' || tt.autnspname || ' to tuser;' AS grant_script
FROM
gp_toolkit.__gp_user_tables tt -- group by tt.autnspname
UNION-- all
SELECT
'grant all on table ' || tt.autnspname || '.' || tt.autrelname || ' to tuser;' grant_script
FROM
gp_toolkit.__gp_user_tables tt;
31、改变postgreSQL的默认schema
-- Use this to show the current search_path
-- Should return: "$user",public
SHOW search_path;
-- Create another schema
CREATE SCHEMA my_schema;
GRANT ALL ON SCHEMA my_schema TO my_user;
-- To change search_path on a connection-level
SET search_path TO my_schema;
-- To change search_path on a database-level
ALTER database "my_database" SET search_path TO my_schema;
查看历史sql:
SELECT
*,
(
SELECT COUNT
( * )
FROM
(
SELECT LENGTH
( query_text ),
substr( query_text, 1, 501 ) AS query_text,
tmid,
ssid,
ccnt,
username,
db,
tsubmit,
tstart,
tfinish,
exec_time,
wait_time,
cpu_elapsed,
cpu_master_percent,
cpu_segs_percent,
status,
skew_cpu,
skew_rows,
memory,
disk_read_bytes,
disk_write_bytes,
spill_size,
plan_gen,
COST,
rsgname,
queuename,
priority,
CASE
WHEN mark = 1 THEN
1 ELSE 0
END AS is_from_metrics
FROM
(
SELECT RANK
( ) OVER ( PARTITION BY q.tmid, q.ssid, q.ccnt ORDER BY q.mark ASC ) rn,
q.*
FROM
gpmetrics.queries_history AS q
WHERE
'2022-10-18 13:33:09' <= tfinish
AND tstart <= '2022-10-18 14:33:14'
AND '2022-10-18 14:33:14' :: TIMESTAMP ( 0 ) WITHOUT TIME ZONE + INTERVAL '1 month' >= sample_time
AND '2022-10-18 13:33:09' :: TIMESTAMP ( 0 ) WITHOUT TIME ZONE - INTERVAL '1 month' <= sample_time
AND status IN ( 'done', 'abort', 'error' )
) AS rnq2
WHERE
rnq2.rn < 2
) AS rnq2
) AS total
FROM
(
SELECT LENGTH
( query_text ),
substr( query_text, 1, 501 ) AS query_text,
tmid,
ssid,
ccnt,
username,
db,
tsubmit,
tstart,
tfinish,
exec_time,
wait_time,
cpu_elapsed,C
### 查看sql详情:
SELECT LENGTH
( query_text ),
substr( query_text, 1, 100000 ) AS query_text,
tmid,
ssid,
ccnt,
username,
db,
tsubmit,
tstart,
tfinish,
exec_time,
wait_time,
cpu_elapsed,
cpu_master_percent,
cpu_segs_percent,
status,
skew_cpu,
skew_rows,
memory,
disk_read_bytes,
disk_write_bytes,
spill_size,
plan_gen,
COST,
rsgname,
queuename,
priority,
CASE
WHEN mark = 1 THEN
1 ELSE 0
END AS is_from_metrics
FROM
(
SELECT RANK
( ) OVER ( PARTITION BY q.tmid, q.ssid, q.ccnt ORDER BY q.mark ASC ) rn,
q.*
FROM
gpmetrics.queries_history AS q
WHERE
tmid =$1
AND ssid =$2
AND ccnt =$3
AND '2022-10-18 14:32:58' :: TIMESTAMP ( 0 ) WITHOUT TIME ZONE - INTERVAL '1 month' <= sample_time
AND '2022-10-18 14:32:58' :: TIMESTAMP ( 0 ) WITHOUT TIME ZONE + INTERVAL '1 month' >= sample_time
) AS rnq
WHERE
rnq.rn < 2
查看完整sql
gpperfmon 库的public 对象下
select * from queries_now
准确的:
gpperfmon gpmetrics
gpcc_queries_now
历史sql
select * from gpcc_queries_history where ctime>='2024-10-14 00:00:05' and ctime>='2024-10-15 00:00:05'
SELECT
username as 用户名,
db as 数据库,
COST as 查询代价,
tstart as 开始时间,
tfinish as 结束时间,
status as 状态,
rows_out as 返回行数,
error_msg as 是否失败,
plan_gen as 执行计划生成方式,
query_text as 查询SQL,
EXTRACT(EPOCH FROM(tfinish - tstart )) as 消耗时间S,
application_name as 应用名称,
rsqname as 资源队列,
cpu_master as 主节点CPU时间,
cpu_segs as SEG节点CPU时间,
cpu_master_percent as CPU使用百分比主,
cpu_segs_percent as CPU使用百分比seg,
skew_cpu as CPU的倾斜度,
skew_rows as 数据行倾斜度,
memory / 1024 / 1024 as 消耗的内存量MB,
disk_read_bytes / 1024 / 1024 as 磁盘读取的数据量MB,
disk_write_bytes / 1024 / 1024 as 写入磁盘的数据量MB,
spill_size / 1024 / 1024 as 溢出到磁盘的数据量,
rqpriority as 查询的优先级,
peak_memory / 1024 / 1024 as 使用的峰值内存量
FROM
gpcc_queries_history
WHERE
ctime >= '2024-10-14 00:00:05'
AND ctime < '2024-10-15 00:00:05'
AND EXTRACT ( EPOCH FROM ( tfinish - tstart ) ) > 10
数据膨胀优化
查询膨胀较大的表
select * from gp_toolkit.gp_bloat_diag order by bdirelpages desc, bdidiag;
查看表的大小
select pg_size_pretty(pg_relation_size('nc65_pcto_settlelist_b'));
执行优化(该命令会全局锁表)
vacuum full nc65_pcto_settlelist_b;
或者数据重分布
ALTER TABLE dwm_amb_allfy SET WITH (REORGANIZE=true) DISTRIBUTED BY (szlx_bm);
(重分布期间也会锁表)
psql 命令说明:
第一个数据库名表示在该库下执行查询,第二个数据库名表示将执行的结果拿到该库下再执行查询
psql -tc "select count(*) from public.amb_cbo_items; " datawarehouse_test | psql -a datawarehouse_test
postgresql中的类型转换
ding_dept_id::text
ding_dept_id字段类型为int ,以上写法将其转换为text
查询没有主键或者唯一键的表
select
tab.table_schema,
tab.table_name
from
information_schema.tables tab
left join information_schema.table_constraints tco on
tab.table_schema = tco.table_schema
and tab.table_name = tco.table_name
and tco.constraint_type in ('PRIMARY KEY', 'UNIQUE')
where
tab.table_type = 'BASE TABLE'
and tab.table_schema not in ('pg_catalog', 'information_schema')
and tco.constraint_name is null;
查询没有约束的表
SELECT
t.table_schema,
t.table_name
FROM
information_schema.tables t
LEFT JOIN
information_schema.key_column_usage kcu
ON t.table_schema = kcu.table_schema
AND t.table_name = kcu.table_name
WHERE
t.table_type = 'BASE TABLE'
AND kcu.constraint_name IS NULL;

浙公网安备 33010602011771号