Greenplum 实战-添加拓展插件
Greenplum 实战-添加拓展插件
概述
PostgreSQL提供了丰富的扩展,如模糊查询 pg_trgm、中文全文检索 zhparser、字符串相似度判断 fuzzystrmatch、数组相似度 smlar、位图计算 roaringbitmap等等。Greenplum 使用了PostgreSQL内核,理论上兼容PG扩展包。本文将介绍在Greenplum上怎么安装扩展插件。
快速入门
查看自带插件列表
已装插件列表
通过查询pg_extension视图,可看到已装的扩展。gp默认只安装了plpgsql插件
SELECT * from pg_extension;
extname | extowner | extnamespace | extrelocatable | extversion | extconfig | extcondition
-----------+----------+--------------+----------------+------------+-----------+--------------
plpgsql | 10 | 11 | f | 1.0 | |
plpythonu | 10 | 11 | f | 1.0 | |
pg_trgm | 10 | 2200 | t | 1.1 | |
可安装的插件列表
SELECT * from pg_available_extensions;
-----------------------------+-----------------+-------------------+-----------------------------------------------------------------------------------------------
advanced_password_check | 1.0 | | Advanced Password Check
amcheck | 1.0 | | functions for verifying relation integrity
plperlu | 1.0 | | PL/PerlU untrusted procedural language
greenplum_fdw | 1.0 | | foreign-data wrapper for remote Greenplum servers
btree_gin | 1.0 | | support for indexing common datatypes in GIN
plpgsql | 1.0 | 1.0 | PL/pgSQL procedural language
pageinspect | 1.8 | | inspect the contents of database pages at a low level
citext | 1.0 | | data type for case-insensitive character strings
plpython2u | 1.0 | | PL/Python2U untrusted procedural language
dataflow | 1.0 | | Extension which provides extra formatters and types for dataflow
plpythonu | 1.0 | 1.0 | PL/PythonU untrusted procedural language
dblink | 1.1 | | connect to other PostgreSQL databases from within a database
postgres_fdw | 1.0 | | foreign-data wrapper for remote PostgreSQL servers
diskquota | 1.0 | | Disk Quota Main Program
pg_trgm | 1.1 | 1.1 | text similarity measurement and index searching based on trigrams
fuzzystrmatch | 1.0 | | determine similarities and distance between strings
gp_array_agg | 1.0.0 | | Parallel aggregate implementation for Greenplum
gp_distribution_policy | 1.0 | | check distribution policy in a GPDB cluster
gp_internal_tools | 1.0.0 | | Different internal tools for Greenplum
hstore | 1.3 | | data type for storing sets of (key, value) pairs
gp_legacy_string_agg | 1.0.0 | | Legacy one-argument string_agg implementation for Greenplum
gpss | 1.0 | | Extension which implements kinds of gpss formaters and protocol buffer
gp_parallel_retrieve_cursor | 1.0 | | Retrieve results of cursor in parallel
sslinfo | 1.0 | | information about SSL certificates
gp_pitr | 1.0 | | Distributed point-in-time-recovery functions
pgcrypto | 1.1 | | cryptographic functions
gp_sparse_vector | 1.0.1 | | SParse vector implementation for GreenPlum
gp_wlm | 0.1 | | Greenplum Workload Manager Extension
metrics_collector | 1.0 | | Greenplum Metrics Collector Extension
orafce | 3.7 | | Functions and operators that emulate a subset of functions and packages from the Oracle RDBMS
plperl | 1.0 | | PL/Perl procedural language
tablefunc | 1.0 | | functions that manipulate whole tables, including crosstab
(32 rows)
插件详情
可看到当前拥有的扩展清单、扩展版本号、是否已安装
SELECT * from pg_available_extension_versions;
name | version | installed | superuser | relocatable | schema | requires | comment
-----------------------------+---------+-----------+-----------+-------------+------------+----------+-----------------------------------------------------------------------------------------------
advanced_password_check | 1.0 | f | t | t | | | Advanced Password Check
amcheck | 1.0 | f | t | t | | | functions for verifying relation integrity
plperlu | 1.0 | f | t | f | pg_catalog | | PL/PerlU untrusted procedural language
greenplum_fdw | 1.0 | f | t | t | | | foreign-data wrapper for remote Greenplum servers
btree_gin | 1.0 | f | t | t | | | support for indexing common datatypes in GIN
plpgsql | 1.0 | t | f | f | pg_catalog | | PL/pgSQL procedural language
pageinspect | 1.8 | f | t | t | | | inspect the contents of database pages at a low level
pageinspect | 1.2 | f | t | t | | | inspect the contents of database pages at a low level
citext | 1.0 | f | t | t | | | data type for case-insensitive character strings
citext | 1.1 | f | t | t | | | data type for case-insensitive character strings
plpython2u | 1.0 | f | t | f | pg_catalog | | PL/Python2U untrusted procedural language
dataflow | 1.0 | f | t | t | | | Extension which provides extra formatters and types for dataflow
plpythonu | 1.0 | t | t | f | pg_catalog | | PL/PythonU untrusted procedural language
dblink | 1.1 | f | t | t | | | connect to other PostgreSQL databases from within a database
postgres_fdw | 1.0 | f | t | t | | | foreign-data wrapper for remote PostgreSQL servers
diskquota | 1.0 | f | t | t | | | Disk Quota Main Program
pg_trgm | 1.1 | t | t | t | | | text similarity measurement and index searching based on trigrams
fuzzystrmatch | 1.0 | f | t | t | | | determine similarities and distance between strings
gp_array_agg | 1.0.0 | f | t | t | | | Parallel aggregate implementation for Greenplum
gp_distribution_policy | 1.0 | f | t | t | | | check distribution policy in a GPDB cluster
gp_internal_tools | 1.0.0 | f | t | t | | | Different internal tools for Greenplum
hstore | 1.3 | f | t | t | | | data type for storing sets of (key, value) pairs
gp_legacy_string_agg | 1.0.0 | f | t | t | | | Legacy one-argument string_agg implementation for Greenplum
gpss | 1.0 | f | t | t | | | Extension which implements kinds of gpss formaters and protocol buffer
gp_parallel_retrieve_cursor | 1.0 | f | t | f | pg_catalog | | Retrieve results of cursor in parallel
sslinfo | 1.0 | f | t | t | | | information about SSL certificates
gp_pitr | 1.0 | f | t | f | | | Distributed point-in-time-recovery functions
pgcrypto | 1.1 | f | t | t | | | cryptographic functions
gp_sparse_vector | 1.0.1 | f | t | t | | | SParse vector implementation for GreenPlum
gp_wlm | 0.1 | f | t | f | gpmetrics | | Greenplum Workload Manager Extension
metrics_collector | 1.0 | f | t | f | gpmetrics | | Greenplum Metrics Collector Extension
orafce | 3.7 | f | t | f | | | Functions and operators that emulate a subset of functions and packages from the Oracle RDBMS
plperl | 1.0 | f | f | f | pg_catalog | | PL/Perl procedural language
tablefunc | 1.0 | f | t | t | | | functions that manipulate whole tables, including crosstab
(34 rows)
安装插件
加载已添加的扩展
通过命令 CREATE EXTENSION,加载pg_available_extensions视图里的扩展到数据库中。执行成功,即可直接使用。
相关命令介绍详见:http://docs-cn.greenplum.org/v6/ref_guide/sql_commands/CREATE_EXTENSION.html#topic1
CREATE EXTENSION [ IF NOT EXISTS ] extension_name
[ WITH ] [ SCHEMA schema_name ]
[ VERSION version ]
[ FROM old_version ]
[ CASCADE ]
示例:
CREATE EXTENSION fuzzystrmatch;
安装外部插件
需要先下载扩展包,解压到gp数据库的扩展包目录下,执行编译脚本。安装成功后,在可用插件中就可以看到该插件信息。接下来通过CREATE EXTENSION加装插件(过程同上),就完成了。
下载扩展包
如果扩展在未在视图pg_available_extension_versions中找到,就得先进行插件安装。GP团队已收集了常用的插件(实际就是PostgreSQL的扩展插件),统一维护在gp源码包中。可以通过下载源码包获得需要的插件,或单独下载需要的三方扩展包。以模糊查询 pg_trgm插件为例。
下载gp源码包,并解压
wget https://github.com/greenplum-db/gpdb/archive/refs/tags/6.18.2.tar.gz \
&& tar -zxvf gpdb-6.18.2.tar.gz
或者https://github.com/greenplum-db/gpdb-archive/archive/refs/heads/main.zip
拷贝扩展包到安装目录
gp源码包解压后,找到需要的扩展包。扩展包路径在gpdb-6.18.2/contrib下,进到该目录可看到可用的扩展包


复制扩展包到GP数据库扩展包安装目录($GPHOME/share/postgresql/contrib),当然也可以把插件全部拷贝过去,这里以模糊查询 pg_trgm插件为例,其路径为 gpdb-6.18.2/contrib/pg_trgm
cp -r gpdb-6.18.2/contrib/pg_trgm $GPHOME/share/postgresql/contrib
编译并安装扩展包
进到GP数据库扩展包安装目录,进到需安装的扩展包里,执行编译安装命令
cd $GPHOME/share/postgresql/contrib/pg_trgm

注意请切换gpadmin用户(与GP安装同用户)安装,如果当前不是请授权文件夹权限并切换到gpadmin用户
chown -R gpadmin:gpadmin $GPHOME/share/postgresql/contrib/pg_trgm
su gpadmin
编译并安装,注意要带上USE_PGXS=1,直接执行make会报错。见附录
make USE_PGXS=1 && make USE_PGXS=1 install
验证插件安装是否成功
安装成功后,会在$GPHOME/share/postgresql/extension目录下生成插件相关的几个文件

并且在可用插件列表中也会出现。因为笔者已安装,固直接截图插件详情里的。验证成功后,注意这时别急,如果gp为集群环境,请在sement上也进行安装
SELECT * from pg_available_extensions;
sement节点机器安装扩展插件
集群环境下,sement机器必须也安装插件。还记得gp安装时的gpscp、gpssh命令么,没错,这时可以用这两个工具批量操作。进行相同安装操作,或可直接拷贝安装成功后生成的文件(注意不单单是$GPHOME/share/postgresql/extension目录下的,没实践不展开)。此处以安装包为例。
打包插件包并拷贝到sement节点上
# 打包压缩
cd $GPHOME/share/postgresql/contrib \
&& tar -czvf pg_trgm.tar.gz ./pg_trgm
# 拷贝到segment上,seghost文件为segment机器列表
gpscp -f <seghost_list> ./pg_trgm.tar.gz=:$GPHOME/share/postgresql/contrib/pg_trgm.tar.gz
# 示例
gpscp -f /home/gpadmin/config/seghosts ./pg_trgm.tar.gz=:/opt/gpdb/share/postgresql/contrib/pg_trgm.tar.gz
执行安装脚本
# 1、ssh打通segment,批量安装
gpssh -f /home/gpadmin/config/seghosts
# 检查下segment是否都连上了
pwd
# 2、进入插件包目录
cd $GPHOME/share/postgresql/contrib
# 3、解压缩并进入插件安装包目录下
tar -zxvf pg_trgm.tar.gz && cd pg_trgm
# 4、编译并安装,等待安装完成
make USE_PGXS=1 && make USE_PGXS=1 install
# 5、等待安装完成,退出
exit
到此segment也都装上了,外部插件已成功添加。
加载扩展到数据库中
通过命令 CREATE EXTENSION 加载插件。同加载已添加的扩展一样
CREATE EXTENSION pg_trgm
使用
插件安装并加载完成,就可以直接使用了。不同的插件,作用的范围不同,新增的特性会在不同的地方体现,详细的说明,请自行翻阅插件本身的文档。
以模糊查询插件为例,具体使用详见:http://postgres.cn/docs/13/pgtrgm.html
select show_trgm('hello 这个傻猿不太冷 怎么这么帅');
{0x83a7ab,0x8df69b,0xaf9d36,0xaf7274,0xb6a741,0xc4ce61,0x19d70f,0x193489,0x1a5c56," h"," he",0x48c484,0x5c7a09,0x61e652,ell,hel,llo,"lo ",0x76572a,0x7e334d}
看到以下内容,就大功告成了。好了,你可以尽情摸索新插件带来的新特性
删除扩展
通过命令 DROP EXTENSION
相关命令介绍详见:http://docs-cn.greenplum.org/v6/ref_guide/sql_commands/DROP_EXTENSION.html#topic1
DROP EXTENSION [ IF EXISTS ] name [, ...] [ CASCADE | RESTRICT ]
示例:
DROP EXTENSION fuzzystrmatch;
更改已安装的扩展定义
相关命令介绍详见:http://docs-cn.greenplum.org/v6/ref_guide/sql_commands/ALTER_EXTENSION.html#topic1
ALTER EXTENSION name UPDATE [ TO new_version ]
ALTER EXTENSION name SET SCHEMA new_schema
ALTER EXTENSION name ADD member_object
ALTER EXTENSION name DROP member_object
where member_object is:
ACCESS METHOD object_name |
AGGREGATE aggregate_name ( aggregate_signature ) |
CAST (source_type AS target_type) |
COLLATION object_name |
CONVERSION object_name |
DOMAIN object_name |
EVENT TRIGGER object_name |
FOREIGN DATA WRAPPER object_name |
FOREIGN TABLE object_name |
FUNCTION function_name ( [ [ argmode ] [ argname ] argtype [, ...] ] ) |
MATERIALIZED VIEW object_name |
OPERATOR operator_name (left_type, right_type) |
OPERATOR CLASS object_name USING index_method |
OPERATOR FAMILY object_name USING index_method |
[ PROCEDURAL ] LANGUAGE object_name |
SCHEMA object_name |
SEQUENCE object_name |
SERVER object_name |
TABLE object_name |
TEXT SEARCH CONFIGURATION object_name |
TEXT SEARCH DICTIONARY object_name |
TEXT SEARCH PARSER object_name |
TEXT SEARCH TEMPLATE object_name |
TRANSFORM FOR type_name LANGUAGE lang_name |
TYPE object_name |
VIEW object_name
and aggregate_signature is:
* | [ argmode ] [ argname ] argtype [ , ... ] |
[ [ argmode ] [ argname ] argtype [ , ... ] ]
ORDER BY [ argmode ] [ argname ] argtype [ , ... ]
附录
GP支持插件列表
完整清单详见: https://github.com/greenplum-db/gpdb/tree/02f2a3f39abefa8f7cb24c392d59d0d7be8b8495/contrib

阿里云-RDS PostgreSQL 插件说明&实践文档(白嫖)
地址:https://help.aliyun.com/document_detail/142340.html
阿里云上已经帮我们收集了大量插件说明文档,还整理了很多有用实践文档。除了pg的一些实操,在这神奇的网站上,大家还可以找到很多有用的实践文档。不得不感谢”996“工程师们做出的无私贡献(看吧,我没在摸鱼)。一时白嫖一时爽,一直白嫖一直爽。
RoaringBitmap插件地址
https://github.com/zeromax007/gpdb-roaringbitmap
make 时报错:…/…/src/Makefile.global: 没有那个文件或目录
RoaringBitmap插件地址
https://github.com/zeromax007/gpdb-roaringbitmap
常见编译报错处理
make 时报错:…/…/src/Makefile.global: 没有那个文件或目录

看Makefile文件即可知道如何解决

解决方案:带上参数USE_PGXS即可
make USE_PGXS=1
make 时报错:gcc问题
一般都是没有安装gcc的原因,安装下gcc
sudo yum install gcc
报错:gp相关命令不存在
一般重新应用下环境变量即可
source ~/.base_profile
报错:source环境变量,但是gp命令不生效
已经执行了source ~/.base_profile,但是gp命令还是报错:找不到。更神奇的是在特定目录下,可以执行gp命令,其他目录都不可用。
出现这情况,请检查下配置文件的GP安装主路径是否正确。可通过env命令检查当前生效的环境变量信息
env
如果发现gp的path跟实际的有出入,请做出相应更改。可能是GPHOME变量有问题,在环境变量中指定GPHOME变量
# 查看 GPHOME变量
echo $GPHOME
# 如果与实际不符,手动指定GPHOME
vim ~/.bash_profile
# 重新执行 srouce命令
source ~/.bash_profile
转自:
https://blog.csdn.net/y408526120/article/details/121962414

浙公网安备 33010602011771号