StarRocks
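I. StarRocks is an open-source, next-generation, high-speed MPP database for all analytics scenarios. It uses an elastic MPP architecture and efficiently supports multidimensional analysis, real-time analysis, high-concurrency analysis, and other analytical workloads over large data volumes. StarRocks performs well: it is fully vectorized and is reported to be on average 3 to 5 times faster than comparable products.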
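II. Positioning. The traditional Hadoop-centered big data ecosystem struggles to meet enterprise needs in performance, timeliness, operational complexity, and flexibility. OLAP databases face a growing number of challenges, and it is hard for a single database to fit most workloads. As an MPP analytical database, StarRocks can handle PB-scale data, offers flexible modeling, and uses optimizations such as a vectorized engine, materialized views, bitmap indexes, and sparse indexes to build a fast, unified storage system for the analytics layer. The positioning is shown in the figure: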

III. Architecture

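1) FE Leader:
- The Leader is elected automatically from the Followers. An election requires more than half of the Follower nodes in the cluster to be alive. If the Leader fails, the Followers start a new round of election.
- The Leader FE serves metadata reads and writes. Only the Leader writes metadata; Followers and Observers have read-only access. Followers and Observers route metadata write requests to the Leader. After the Leader applies an update, it syncs the change to the Followers and Observers through BDB JE. A metadata write counts as successful only once more than half of the Followers have synced it.
2) FE Follower:
- Has read-only access to metadata, with no write permission. It syncs data asynchronously by replaying the Leader's metadata log.
- Takes part in Leader election. More than half of the Follower nodes must be alive for an election to proceed.
3) FE Observer:
- Mainly used to scale out the query concurrency of the cluster. Deployment is optional.
- Does not take part in Leader election, so it adds no election pressure to the cluster.
- Syncs data asynchronously by replaying the Leader's metadata log.
4) BE:
- Storage: all BE nodes are fully symmetric peers. The FE assigns data to BE nodes according to a set policy. The BE writes imported data in the storage format and builds the related indexes.
- SQL execution: a SQL statement is first planned into logical execution units according to its semantics, then split into physical execution units according to how the data is distributed. Physical execution units run on the nodes that store the data, which enables local computation, avoids transferring and copying data, and yields very high query performance.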
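The "more than half of the Followers" rule above can be made concrete with a small calculation. This is a minimal sketch, not StarRocks code: the function names are hypothetical, and it assumes the Follower count includes the FE that currently acts as Leader.

```python
def majority(follower_count: int) -> int:
    """Smallest number of Followers that is strictly more than half."""
    if follower_count < 1:
        raise ValueError("follower_count must be at least 1")
    return follower_count // 2 + 1


def can_elect_leader(alive_followers: int, follower_count: int) -> bool:
    """True when more than half of the Follower-role FEs are alive."""
    return alive_followers >= majority(follower_count)


if __name__ == "__main__":
    # Show how many Followers must survive for a few cluster sizes.
    for total in (1, 2, 3, 5):
        print(total, "followers need", majority(total), "alive")
```

With 2 Followers both must stay alive, so an even count tolerates no more failures than the next lower odd count. This is why Follower counts such as 1, 3, or 5 are the usual choice.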
IV. Installation
1) Download: https://releases.starrocks.io/starrocks/StarRocks-2.5.10.tar.gz
2) Deploying a cluster with Docker

FE:
FROM java:8
ADD ./StarRocks-2.5.10.tar.gz .
RUN mv StarRocks-2.5.10 starrocks
RUN mkdir -p /starrocks/fe/meta
CMD /starrocks/fe/bin/start_fe.sh
BE:
FROM java:8
ADD ./StarRocks-2.5.10.tar.gz .
RUN mv StarRocks-2.5.10 starrocks
RUN mkdir -p /starrocks/be/storage
CMD "/starrocks/be/bin/start_be.sh"
docker-compose.yml:
version: '3'
services:
  xbd-starrocks-fe-1:
    build:
      context: ./
      dockerfile: Dockerfile-StarRocks-FE
    image: xbd-starrocks-fe
    container_name: xbd-starrocks-fe-1
    restart: always
    privileged: true
    ports:
      - 8030:8030
      - 9030:9030
    networks:
      - starrocks
  xbd-starrocks-fe-2:
    build:
      context: ./
      dockerfile: Dockerfile-StarRocks-FE
    image: xbd-starrocks-fe
    container_name: xbd-starrocks-fe-2
    restart: always
    privileged: true
    ports:
      - 8031:8030
      - 9031:9030
    command: "/starrocks/fe/bin/start_fe.sh --helper xbd-starrocks-fe-1:9010"
    depends_on:
      - xbd-starrocks-fe-1
    networks:
      - starrocks
  xbd-starrocks-be-1:
    build:
      context: ./
      dockerfile: Dockerfile-StarRocks-BE
    image: xbd-starrocks-be
    container_name: xbd-starrocks-be-1
    restart: always
    privileged: true
    volumes:
      - /var/lib/starrocks-1:/starrocks/be/storage
    networks:
      - starrocks
  xbd-starrocks-be-2:
    build:
      context: ./
      dockerfile: Dockerfile-StarRocks-BE
    image: xbd-starrocks-be
    container_name: xbd-starrocks-be-2
    restart: always
    privileged: true
    volumes:
      - /var/lib/starrocks-2:/starrocks/be/storage
    networks:
      - starrocks
networks:
  starrocks:
    external: true
Run the following in the directory that contains docker-compose.yml:
docker-compose up -d
Wait for the services to start, then connect with a MySQL client (the password is empty):
mysql -h xbd-starrocks-fe-1 -P 9030 -u root -p
Once connected, run:
alter system add follower "xbd-starrocks-fe-2:9010";
alter system add backend "xbd-starrocks-be-1:9050";
alter system add backend "xbd-starrocks-be-2:9050";
The nodes are now part of the cluster.
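When the node list grows, the registration statements can be generated rather than typed by hand. The sketch below is only an illustration: the helper names are hypothetical, and the default ports 9010 and 9050 are simply the ones used in the statements above.

```python
from typing import Iterable, List

VALID_ROLES = ("FOLLOWER", "OBSERVER", "BACKEND")


def add_node_sql(role: str, host: str, port: int) -> str:
    """Build one ALTER SYSTEM ADD statement for the given role."""
    role = role.upper()
    if role not in VALID_ROLES:
        raise ValueError(f"unsupported role: {role}")
    if '"' in host or not 0 < port < 65536:
        raise ValueError("invalid host or port")
    return f'ALTER SYSTEM ADD {role} "{host}:{port}";'


def bootstrap_sql(followers: Iterable[str], backends: Iterable[str],
                  edit_log_port: int = 9010,
                  heartbeat_port: int = 9050) -> List[str]:
    """Statements that register every extra FE follower and every BE."""
    statements = [add_node_sql("FOLLOWER", h, edit_log_port) for h in followers]
    statements += [add_node_sql("BACKEND", h, heartbeat_port) for h in backends]
    return statements


if __name__ == "__main__":
    for stmt in bootstrap_sql(["xbd-starrocks-fe-2"],
                              ["xbd-starrocks-be-1", "xbd-starrocks-be-2"]):
        print(stmt)
```

The printed statements can be pasted into the MySQL client session or passed to `mysql -e`.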
Other commands:
alter system add follower/observer/backend "host:editLogPort/editLogPort/port"; # add a node
alter system drop follower/observer "host:editLogPort/editLogPort"; # remove an FE node
alter system decommission backend "host:port"; # remove a BE node
SHOW PROC '/frontends'; # FE information
SHOW PROC '/backends'; # BE information
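The output of the two SHOW PROC statements can also be checked from a script. The sketch below parses the tab-separated text that the MySQL client prints in batch mode (`mysql --batch -e "SHOW PROC '/backends'"`). The function names are hypothetical, and the column names `Alive` and `IP` are assumptions about the result set, so check them against the header your version actually prints.

```python
from typing import Dict, List


def parse_batch_output(text: str) -> List[Dict[str, str]]:
    """Turn tab-separated output (header row first) into a list of dicts."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    if not lines:
        return []
    header = lines[0].split("\t")
    return [dict(zip(header, ln.split("\t"))) for ln in lines[1:]]


def dead_nodes(rows: List[Dict[str, str]],
               alive_column: str = "Alive",   # assumed column name
               name_column: str = "IP") -> List[str]:  # assumed column name
    """Names of the nodes whose alive column is not 'true'."""
    return [row.get(name_column, "")
            for row in rows
            if row.get(alive_column, "").lower() != "true"]


if __name__ == "__main__":
    # Made-up sample data, only to show the expected shape of the input.
    sample = ("BackendId\tIP\tAlive\n"
              "10001\txbd-starrocks-be-1\ttrue\n"
              "10002\txbd-starrocks-be-2\tfalse\n")
    print(dead_nodes(parse_batch_output(sample)))
```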

Viewing the cluster in the web UI


V. Installing with the official images (bug workaround)
version: '3'
services:
  xbd-starrocks-fe:
    image: starrocks/fe-ubuntu:3.1.3
    container_name: xbd-starrocks-fe
    restart: always
    privileged: true
    volumes:
      - /var/lib/starrocks/fe/meta:/opt/starrocks/fe/meta
    ports:
      - 8030:8030
      - 9030:9030
    command:
      - /bin/bash
      - -c
      - |
        rm -rf /opt/starrocks/fe/bin/fe.pid
        /opt/starrocks/fe/bin/start_fe.sh
    networks:
      starrocks:
        ipv4_address: 172.16.0.2
  xbd-starrocks-be-1:
    image: starrocks/be-ubuntu:3.1.3
    container_name: xbd-starrocks-be-1
    restart: always
    privileged: true
    volumes:
      - /var/lib/starrocks/be-1/storage:/opt/starrocks/be/storage
    command:
      - /bin/bash
      - -c
      - |
        rm /opt/starrocks/be/bin/be.pid
        /opt/starrocks/be/bin/start_be.sh
    networks:
      starrocks:
        ipv4_address: 172.16.0.3
  xbd-starrocks-be-2:
    image: starrocks/be-ubuntu:3.1.3
    container_name: xbd-starrocks-be-2
    restart: always
    privileged: true
    volumes:
      - /var/lib/starrocks/be-2/storage:/opt/starrocks/be/storage
    command:
      - /bin/bash
      - -c
      - |
        rm /opt/starrocks/be/bin/be.pid
        /opt/starrocks/be/bin/start_be.sh
    networks:
      starrocks:
        ipv4_address: 172.16.0.4
  xbd-starrocks-init:
    image: mysql:8.0.29
    restart: 'no'
    container_name: xbd-starrocks-init
    privileged: true
    command:
      - /bin/bash
      - -c
      - |
        sleep 15s;
        mysql -h 'xbd-starrocks-fe' -u root -P9030 -e "alter system add backend 'xbd-starrocks-be-1:9050';alter system add backend 'xbd-starrocks-be-2:9050';SET PROPERTY FOR 'root' 'max_user_connections'='1024';SET PASSWORD = PASSWORD('root');"
    depends_on:
      - xbd-starrocks-fe
      - xbd-starrocks-be-1
      - xbd-starrocks-be-2
    networks:
      starrocks:
        ipv4_address: 172.16.0.24
# docker network create --subnet=172.16.0.0/24 starrocks
networks:
  starrocks:
    external: true
Issue: when several FEs were added, their cluster IDs turned out to be inconsistent, so the follower could not be added.
Cluster deployment:
version: '3'
services:
  xbd-starrocks-fe-1:
    image: starrocks/fe-ubuntu:3.1.3
    container_name: xbd-starrocks-fe-1
    restart: always
    privileged: true
    volumes:
      - /var/lib/starrocks/fe-1/meta:/opt/starrocks/fe/meta
    ports:
      - 8030:8030
      - 9030:9030
    command:
      - /bin/bash
      - -c
      - |
        rm -rf /opt/starrocks/fe/bin/fe.pid
        /opt/starrocks/fe/bin/start_fe.sh
    networks:
      starrocks:
        ipv4_address: 172.16.0.2
  xbd-starrocks-fe-2:
    image: starrocks/fe-ubuntu:3.1.3
    container_name: xbd-starrocks-fe-2
    restart: always
    privileged: true
    volumes:
      - /var/lib/starrocks/fe-2/meta:/opt/starrocks/fe/meta
    ports:
      - 8031:8030
      - 9031:9030
    command:
      - /bin/bash
      - -c
      - |
        rm -rf /opt/starrocks/fe/bin/fe.pid
        /opt/starrocks/fe/bin/start_fe.sh --helper xbd-starrocks-fe-1:9010
    networks:
      starrocks:
        ipv4_address: 172.16.0.3
  xbd-starrocks-be-1:
    image: starrocks/be-ubuntu:3.1.3
    container_name: xbd-starrocks-be-1
    restart: always
    privileged: true
    volumes:
      - /var/lib/starrocks/be-1/storage:/opt/starrocks/be/storage
    command:
      - /bin/bash
      - -c
      - |
        rm /opt/starrocks/be/bin/be.pid
        /opt/starrocks/be/bin/start_be.sh
    networks:
      starrocks:
        ipv4_address: 172.16.0.4
  xbd-starrocks-be-2:
    image: starrocks/be-ubuntu:3.1.3
    container_name: xbd-starrocks-be-2
    restart: always
    privileged: true
    volumes:
      - /var/lib/starrocks/be-2/storage:/opt/starrocks/be/storage
    command:
      - /bin/bash
      - -c
      - |
        rm /opt/starrocks/be/bin/be.pid
        /opt/starrocks/be/bin/start_be.sh
    networks:
      starrocks:
        ipv4_address: 172.16.0.5
  xbd-starrocks-init:
    image: mysql:8.0.29
    restart: 'no'
    container_name: xbd-starrocks-init
    privileged: true
    command:
      - /bin/bash
      - -c
      - |
        sleep 15s;
        mysql -h 'xbd-starrocks-fe-1' -u root -P9030 -e "alter system add follower 'xbd-starrocks-fe-2:9010';alter system add backend 'xbd-starrocks-be-1:9050';alter system add backend 'xbd-starrocks-be-2:9050';SET PROPERTY FOR 'root' 'max_user_connections'='1024';SET PASSWORD = PASSWORD('root');"
    depends_on:
      - xbd-starrocks-fe-1
      - xbd-starrocks-fe-2
      - xbd-starrocks-be-1
      - xbd-starrocks-be-2
    networks:
      starrocks:
        ipv4_address: 172.16.0.24
# docker network create --subnet=172.16.0.0/24 starrocks
networks:
  starrocks:
    external: true
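Problem: the FEs are tightly bound to each other. After a restart they fail to come back up, and removing --helper does not help either.
VI. Troubleshooting
1) The StarRocks FE cannot cope with a change of its IP address, so after the IP changed the FE kept failing to start with the following message: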
wait globalStateMgr to be ready. FE type: INIT. is ready: false

Fix: start the FE in FQDN mode. Official documentation: https://docs.starrocks.io/zh/docs/administration/enable_fqdn/
/opt/starrocks/fe/bin/start_fe.sh --host_type FQDN
In my case the FE runs on k8s, where the hosts file can be inspected:

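Restarts no longer cause problems, because the nodes are bound by domain name rather than by IP.
Note: on k8s, the BE must be deployed behind a headless Service (clusterIP: None).
2) Final solution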
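A headless Service matters because it gives each pod a stable DNS name of its own, and that name is what an FQDN-mode node registers under. The sketch below builds those names following the standard Kubernetes pattern for StatefulSet pods. The StatefulSet, Service, and namespace names are hypothetical, and `cluster.local` is only the default cluster domain.

```python
from typing import List


def pod_fqdn(pod: str, service: str, namespace: str = "default",
             cluster_domain: str = "cluster.local") -> str:
    """DNS name of one pod behind a headless Service."""
    return f"{pod}.{service}.{namespace}.svc.{cluster_domain}"


def statefulset_fqdns(name: str, service: str, replicas: int,
                      namespace: str = "default") -> List[str]:
    """FQDNs of all pods of a StatefulSet (pods are named <name>-<ordinal>)."""
    return [pod_fqdn(f"{name}-{i}", service, namespace)
            for i in range(replicas)]


if __name__ == "__main__":
    # Hypothetical names: a 2-replica BE StatefulSet in namespace "olap".
    for fqdn in statefulset_fqdns("starrocks-be", "starrocks-be-headless", 2, "olap"):
        print(f"alter system add backend '{fqdn}:9050';")
```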
version: "3"
services:
  xbd-starrocks-fe:
    image: starrocks/fe-ubuntu:3.1.3
    hostname: xbd-starrocks-fe
    container_name: xbd-starrocks-fe
    restart: always
    privileged: true
    environment:
      - TZ=Asia/Shanghai
      - HOST_TYPE=FQDN
    volumes:
      - /var/lib/starrocks/fe/meta:/opt/starrocks/fe/meta
    ports:
      - 8030:8030
      - 9030:9030
    command:
      - /bin/bash
      - -c
      - |
        /opt/starrocks/fe_entrypoint.sh xbd-starrocks-fe
    networks:
      - starrocks
  xbd-starrocks-be-1:
    image: starrocks/be-ubuntu:3.1.3
    hostname: xbd-starrocks-be-1
    container_name: xbd-starrocks-be-1
    restart: always
    privileged: true
    environment:
      - TZ=Asia/Shanghai
    volumes:
      - /var/lib/starrocks/be-1/storage:/opt/starrocks/be/storage
    command:
      - /bin/bash
      - -c
      - |
        rm /opt/starrocks/be/bin/be.pid
        /opt/starrocks/be/bin/start_be.sh
    networks:
      - starrocks
  xbd-starrocks-be-2:
    image: starrocks/be-ubuntu:3.1.3
    hostname: xbd-starrocks-be-2
    container_name: xbd-starrocks-be-2
    restart: always
    privileged: true
    environment:
      - TZ=Asia/Shanghai
    volumes:
      - /var/lib/starrocks/be-2/storage:/opt/starrocks/be/storage
    command:
      - /bin/bash
      - -c
      - |
        rm /opt/starrocks/be/bin/be.pid
        /opt/starrocks/be/bin/start_be.sh
    networks:
      - starrocks
  xbd-starrocks-init:
    image: mysql:8.0.29
    restart: 'no'
    container_name: xbd-starrocks-init
    privileged: true
    command:
      - /bin/bash
      - -c
      - |
        sleep 60s;
        mysql -h 'xbd-starrocks-fe' -u root -P9030 -e "alter system add backend 'xbd-starrocks-be-1:9050';alter system add backend 'xbd-starrocks-be-2:9050';SET PROPERTY FOR 'root' 'max_user_connections'='1024';SET PASSWORD = PASSWORD('root');"
    depends_on:
      - xbd-starrocks-fe
      - xbd-starrocks-be-1
      - xbd-starrocks-be-2
    networks:
      - starrocks
networks:
  starrocks:
    external: true
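The init service above waits a fixed 60 seconds before it runs the ALTER SYSTEM statements, which is either too long or too short depending on the host. One possible alternative is to poll the FE query port until it accepts connections. The sketch below is an illustration with a hypothetical function name, and an open TCP port is only a necessary condition: it does not prove that the FE has finished initializing.

```python
import socket
import time


def wait_for_port(host: str, port: int, timeout: float = 60.0,
                  interval: float = 1.0) -> bool:
    """Retry a TCP connect until it succeeds or `timeout` seconds elapse."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            # create_connection raises OSError on refusal, timeout, or DNS failure.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)


if __name__ == "__main__":
    # Host name and port are the ones used in the compose file above.
    ready = wait_for_port("xbd-starrocks-fe", 9030, timeout=120.0)
    print("FE query port reachable:", ready)
```

Running this before the `mysql -e` call makes the init step start as soon as the port opens and fail visibly if it never does.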
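VII. In practice, the shared-nothing deployment (storage and compute on the same nodes) consumed too much memory, and queries became slow while reads and writes ran at the same time. The deployment was later switched to shared-data mode (storage and compute separated), which also makes it possible to manage hot and cold data.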

Deployment:
version: "3.8"
x-starrocks-common: &starrocks-common
  restart: always
  extra_hosts:
    - hdfs-namenode:127.0.0.1
    - hdfs-datanode:127.0.0.1
  environment:
    - TZ=Asia/Shanghai
x-starrocks-cn-common: &starrocks-cn-common
  <<: *starrocks-common
  build:
    context: ./
    dockerfile: ./Dockerfile-starrocks-be
  image: starrocks-cn
  command:
    - /bin/bash
    - -c
    - |
      sed -i '/mem_limit=40%/d' /opt/starrocks/cn/conf/cn.conf
      echo 'mem_limit=40%' >> /opt/starrocks/cn/conf/cn.conf
      rm /opt/starrocks/cn/bin/cn.pid
      /opt/starrocks/cn/bin/start_cn.sh
services:
  starrocks-fe:
    <<: *starrocks-common
    build:
      context: ./
      dockerfile: ./Dockerfile-starrocks-fe
    image: starrocks-fe
    container_name: starrocks-fe
    ports:
      - 8030:8030
      - 9030:9030
    volumes:
      - /opt/apps/data/starrocks/cn-fe/meta:/opt/starrocks/fe/meta
    command:
      - /bin/bash
      - -c
      - |
        sed -i '/run_mode=shared_data/d' /opt/starrocks/fe/conf/fe.conf
        sed -i '/max_routine_load_task_num_per_be=128/d' /opt/starrocks/fe/conf/fe.conf
        sed -i '/routine_load_task_consume_second=30/d' /opt/starrocks/fe/conf/fe.conf
        sed -i '/max_routine_load_batch_size=8589934592/d' /opt/starrocks/fe/conf/fe.conf
        sed -i '/routine_load_unstable_threshold_second=604800/d' /opt/starrocks/fe/conf/fe.conf
        sed -i '/cloud_native_storage_type=HDFS/d' /opt/starrocks/fe/conf/fe.conf
        sed -i '/cloud_native_hdfs_url=hdfs:\/\/hdfs-namenode:8020\/starrocks\//d' /opt/starrocks/fe/conf/fe.conf
        echo 'run_mode=shared_data' >> /opt/starrocks/fe/conf/fe.conf
        echo 'max_routine_load_task_num_per_be=128' >> /opt/starrocks/fe/conf/fe.conf
        echo 'routine_load_task_consume_second=30' >> /opt/starrocks/fe/conf/fe.conf
        echo 'max_routine_load_batch_size=8589934592' >> /opt/starrocks/fe/conf/fe.conf
        echo 'routine_load_unstable_threshold_second=604800' >> /opt/starrocks/fe/conf/fe.conf
        echo 'cloud_native_storage_type=HDFS' >> /opt/starrocks/fe/conf/fe.conf
        echo 'cloud_native_hdfs_url=hdfs://hdfs-namenode:8020/starrocks/' >> /opt/starrocks/fe/conf/fe.conf
        rm -rf /opt/starrocks/fe/bin/fe.pid
        /opt/starrocks/fe/bin/start_fe.sh --host_type FQDN
    networks:
      starrocks:
        ipv4_address: 172.100.0.10
  starrocks-cn-1:
    <<: *starrocks-cn-common
    container_name: starrocks-cn-1
    volumes:
      - /opt/apps/data/starrocks/cn-1/storage:/opt/starrocks/cn/storage
    networks:
      starrocks:
        ipv4_address: 172.100.0.11
  starrocks-cn-2:
    <<: *starrocks-cn-common
    container_name: starrocks-cn-2
    volumes:
      - /opt/apps/data/starrocks/cn-2/storage:/opt/starrocks/cn/storage
    networks:
      starrocks:
        ipv4_address: 172.100.0.12
# --- networks ----
networks:
  starrocks:
    driver: bridge
    ipam:
      config:
        - subnet: 172.100.0.0/24
# --- networks ----
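The startup commands above keep fe.conf and cn.conf idempotent by deleting a line with sed and then appending it again with echo. Because the sed pattern matches the full `key=value` text, changing a value later would leave the old line in place next to the new one. The sketch below shows the same idea keyed on the setting name instead. It is an illustration with a hypothetical helper name, not part of StarRocks.

```python
def upsert_conf(text: str, key: str, value: str) -> str:
    """Drop every active `key=...` line, then append `key=value`."""
    kept = []
    for line in text.splitlines():
        is_comment = line.lstrip().startswith("#")
        name = line.split("=", 1)[0].strip()
        if "=" in line and not is_comment and name == key:
            continue  # remove the old setting, whatever its value was
        kept.append(line)
    kept.append(f"{key}={value}")
    return "\n".join(kept) + "\n"


if __name__ == "__main__":
    conf = "priority_networks=172.100.0.0/24\nrun_mode=shared_nothing\n"
    conf = upsert_conf(conf, "run_mode", "shared_data")
    conf = upsert_conf(conf, "cloud_native_storage_type", "HDFS")
    print(conf, end="")
```

Applying the function twice with the same arguments leaves the text unchanged, which is the property the sed and echo pairs are after.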
Nodes are now added as compute nodes:
alter system add compute node '172.100.0.11:9050';
alter system add compute node '172.100.0.12:9050';
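Note: in practice, when the nodes were addressed by their in-container domain names, the gRPC connection failed with an error, so fixed IP addresses are used instead.
Note also: HDFS is used as the storage backend here. S3-compatible object storage such as MinIO is another option.
Dependencies: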
Dockerfile-starrocks-fe:
FROM starrocks/fe-ubuntu:3.1.17
COPY ./driver /opt/starrocks
Dockerfile-starrocks-be:
FROM starrocks/cn-ubuntu:3.1.17
COPY ./driver /opt/starrocks
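Note: the driver directory holds additional drivers, for example the one needed to support the Dameng (DM) database. They are required by external resources.
VIII. To fit StarRocks into a clustered environment, for example HDFS running in HA mode (reference: link), some adjustments are needed for HDFS. An example follows:
1. Prepare the two HDFS configuration files, core-site.xml and hdfs-site.xml. See your HDFS HA cluster configuration for the details.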
core-site.xml
<configuration>
  <property><name>ha.zookeeper.quorum</name><value>xbd-zk-1:2181,xbd-zk-2:2182,xbd-zk-3:2183</value></property>
  <property><name>fs.defaultFS</name><value>hdfs://hdfs-cluster</value></property>
</configuration>
hdfs-site.xml
<configuration>
  <property><name>dfs.namenode.http-address.hdfs-cluster.xbd-nn-2</name><value>xbd-nn-2:9871</value></property>
  <property><name>dfs.nameservices</name><value>hdfs-cluster</value></property>
  <property><name>dfs.namenode.http-address.hdfs-cluster.xbd-nn-1</name><value>xbd-nn-1:9870</value></property>
  <property><name>dfs.namenode.http-address</name><value>0.0.0.0:9870</value></property>
  <property><name>dfs.ha.fencing.methods</name><value>shell(/bin/true)</value></property>
  <property><name>dfs.namenode.rpc-address.hdfs-cluster.xbd-nn-2</name><value>xbd-nn-2:8021</value></property>
  <property><name>dfs.ha.automatic-failover.enabled</name><value>true</value></property>
  <property><name>dfs.namenode.rpc-address.hdfs-cluster.xbd-nn-1</name><value>xbd-nn-1:8020</value></property>
  <property><name>dfs.permissions.enable</name><value>false</value></property>
  <property><name>dfs.namenode.name.dir</name><value>/data</value></property>
  <property><name>dfs.namenode.shared.edits.dir</name><value>qjournal://xbd-jn-1:8485;xbd-jn-2:8486;xbd-jn-3:8487/hdfs-cluster</value></property>
  <property><name>dfs.namenode.rpc-address</name><value>0.0.0.0:8020</value></property>
  <property><name>dfs.client.failover.proxy.provider.hdfs-cluster</name><value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value></property>
  <property><name>dfs.ha.namenodes.hdfs-cluster</name><value>xbd-nn-1,xbd-nn-2</value></property>
</configuration>
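Before baking hdfs-site.xml into the images, it can help to check that the client-side HA keys are all present, since a missing NameNode address only surfaces later as a connection failure. The sketch below is a rough check with hypothetical function names. It covers only the keys an HDFS client needs to resolve a nameservice such as hdfs://hdfs-cluster, not the full server configuration.

```python
import xml.etree.ElementTree as ET
from typing import Dict, List


def load_properties(xml_text: str) -> Dict[str, str]:
    """Read <property><name>/<value> pairs from a Hadoop *-site.xml document."""
    props: Dict[str, str] = {}
    for prop in ET.fromstring(xml_text).findall("property"):
        name = prop.findtext("name")
        if name:
            props[name.strip()] = (prop.findtext("value") or "").strip()
    return props


def missing_ha_keys(props: Dict[str, str]) -> List[str]:
    """List the client-side HA keys that are absent for each nameservice."""
    services = [s.strip() for s in props.get("dfs.nameservices", "").split(",") if s.strip()]
    if not services:
        return ["dfs.nameservices"]
    missing: List[str] = []
    for ns in services:
        nn_key = f"dfs.ha.namenodes.{ns}"
        namenodes = [n.strip() for n in props.get(nn_key, "").split(",") if n.strip()]
        if not namenodes:
            missing.append(nn_key)
        for nn in namenodes:
            rpc_key = f"dfs.namenode.rpc-address.{ns}.{nn}"
            if rpc_key not in props:
                missing.append(rpc_key)
        provider_key = f"dfs.client.failover.proxy.provider.{ns}"
        if provider_key not in props:
            missing.append(provider_key)
    return missing


if __name__ == "__main__":
    with open("hdfs-site.xml", encoding="utf-8") as fh:
        print(missing_ha_keys(load_properties(fh.read())) or "all HA keys present")
```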
2. Copy the files into the target directories.
Dockerfile-starrocks-fe:/opt/starrocks/fe/conf
FROM starrocks/fe-ubuntu:3.3.16
MAINTAINER xbd
COPY ./core-site.xml /opt/starrocks/fe/conf
COPY ./hdfs-site.xml /opt/starrocks/fe/conf
Dockerfile-starrocks-cn:/opt/starrocks/cn/conf
FROM starrocks/cn-ubuntu:3.3.16
MAINTAINER xbd
COPY ./core-site.xml /opt/starrocks/cn/conf
COPY ./hdfs-site.xml /opt/starrocks/cn/conf
3. A simple configuration for connecting StarRocks to the HA cluster:
version: "3.8"
x-starrocks-common: &starrocks-common
  restart: always
  network_mode: host
  user: root
  privileged: true
  environment:
    - TZ=Asia/Shanghai
x-starrocks-cn-common: &starrocks-cn-common
  <<: *starrocks-common
  build:
    context: ./
    dockerfile: ./Dockerfile/Dockerfile-starrocks-cn
  image: csp-starrocks-cn:3.3
services:
  xbd-starrocks-fe:
    <<: *starrocks-common
    build:
      context: ./
      dockerfile: ./Dockerfile/Dockerfile-starrocks-fe
    image: csp-starrocks-fe:3.3
    container_name: xbd-starrocks-fe
    hostname: xbd-starrocks-fe
    volumes:
      - /opt/apps/data/starrocks/fe-1/meta:/opt/starrocks/fe/meta
    command:
      - /bin/bash
      - -c
      - |
        sed -i "/run_mode=shared_data/d" /opt/starrocks/fe/conf/fe.conf
        sed -i "/max_routine_load_task_num_per_be=128/d" /opt/starrocks/fe/conf/fe.conf
        sed -i "/routine_load_task_consume_second=30/d" /opt/starrocks/fe/conf/fe.conf
        sed -i "/max_routine_load_batch_size=8589934592/d" /opt/starrocks/fe/conf/fe.conf
        sed -i "/routine_load_unstable_threshold_second=604800/d" /opt/starrocks/fe/conf/fe.conf
        sed -i "/cloud_native_storage_type=HDFS/d" /opt/starrocks/fe/conf/fe.conf
        sed -i "/cloud_native_hdfs_url=hdfs:\/\/hdfs-cluster\/starrocks\//d" /opt/starrocks/fe/conf/fe.conf
        echo "run_mode=shared_data" >> /opt/starrocks/fe/conf/fe.conf
        echo "max_routine_load_task_num_per_be=128" >> /opt/starrocks/fe/conf/fe.conf
        echo "routine_load_task_consume_second=30" >> /opt/starrocks/fe/conf/fe.conf
        echo "max_routine_load_batch_size=8589934592" >> /opt/starrocks/fe/conf/fe.conf
        echo "routine_load_unstable_threshold_second=604800" >> /opt/starrocks/fe/conf/fe.conf
        echo "cloud_native_storage_type=HDFS" >> /opt/starrocks/fe/conf/fe.conf
        echo "cloud_native_hdfs_url=hdfs://hdfs-cluster/starrocks/" >> /opt/starrocks/fe/conf/fe.conf
        rm -rf /opt/starrocks/fe/bin/fe.pid
        /opt/starrocks/fe/bin/start_fe.sh
  xbd-starrocks-cn:
    <<: *starrocks-cn-common
    container_name: xbd-starrocks-cn
    volumes:
      - /opt/apps/data/starrocks/cn/storage:/opt/starrocks/cn/storage
    command:
      - /bin/bash
      - -c
      - |
        sed -i "/mem_limit=40%/d" /opt/starrocks/cn/conf/cn.conf
        echo "mem_limit=40%" >> /opt/starrocks/cn/conf/cn.conf
        rm /opt/starrocks/cn/bin/cn.pid
        /opt/starrocks/cn/bin/start_cn.sh
