[InfluxDB/openGemini] TSDB reports "InfluxDBException: user is locked" when querying data
1 Problem Description
- When InfluxDB data is queried through the Query API, the query fails and the log shows the following InfluxDB error:
...
org.influxdb.InfluxDBException: user is locked
	at org.influxdb.InfluxDBException.buildExceptionFromErrorMessage(InfluxDBException.java:161) ~[influxdb-java-2.22.jar!/:?]
	at org.influxdb.InfluxDBException.buildExceptionForErrorState(InfluxDBException.java:173) ~[influxdb-java-2.22.jar!/:?]
	at org.influxdb.impl.InfluxDBImpl.execute(InfluxDBImpl.java:846) ~[influxdb-java-2.22.jar!/:?]
	at org.influxdb.impl.InfluxDBImpl.executeQuery(InfluxDBImpl.java:833) ~[influxdb-java-2.22.jar!/:?]
	at org.influxdb.impl.InfluxDBImpl.query$original$NqPAZts7(InfluxDBImpl.java:559) ~[influxdb-java-2.22.jar!/:?]
	at org.influxdb.impl.InfluxDBImpl.query$original$NqPAZts7$accessor$6GQc4J6p(InfluxDBImpl.java) ~[influxdb-java-2.22.jar!/:?]
	at org.influxdb.impl.InfluxDBImpl$auxiliary$JDCxBf2K.call(Unknown Source) ~[influxdb-java-2.22.jar!/:?]
	at org.apache.skywalking.apm.agent.core.plugin.interceptor.enhance.InstMethodsInter.intercept(InstMethodsInter.java:86) ~[skywalking-agent.jar:8.9.0]
	at org.influxdb.impl.InfluxDBImpl.query(InfluxDBImpl.java) ~[influxdb-java-2.22.jar!/:?]
...

Or:
# curl -v http://127.0.0.1:8086/query -u Xxxuser:'xxxx' --data-urlencode "q=show users"
{"error":"user is locked"}
# curl -v http://127.0.0.1:8086/query -u Xxxuser:'xxxx' --data-urlencode "q=SET PASSWORD FOR otherUser = 'xxxx';"
{"error":"user is locked"}
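2 Problem Analysis
Because InfluxDB and openGemini show the same symptoms and follow the same database code logic for this problem, they are analyzed together and referred to below as the TSDB (time-series database).
- Root cause: during the afternoon upgrade, the password in the NACOS configuration file that the data-writing Flink job depends on was misconfigured, and the job sent high-frequency requests with that wrong password, which locked the database user.
3 Solution
Step1 Stop the programs that send requests with the wrong password
- Method 1: Stop the program that writes data to InfluxDB
- Method 2: On the servers that host the TSDB cluster processes serving external reads and writes, start the firewall and allow access only from IPs inside the TSDB cluster. In openGemini, this means the ts-sql component/process.
- Start the firewall
# Start the firewall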
systemctl status firewalld
systemctl start firewalld
systemctl status firewalld
# Allow access only from IPs inside the TSDB cluster (a quick, temporary way to block the erroneous external requests)
firewall-cmd --permanent --add-rich-rule="rule family='ipv4' source address='xx.xx.xx.01' accept"
firewall-cmd --permanent --add-rich-rule="rule family='ipv4' source address='xx.xx.xx.02' accept"
firewall-cmd --permanent --add-rich-rule="rule family='ipv4' source address='xx.xx.xx.03' accept"
# Reload the firewall configuration so that the rules take effect
firewall-cmd --reload
# Check the firewall state and the configured rules
firewall-cmd --state
firewall-cmd --zone=public --list-rich-rules
firewall-cmd --list-all
# From another node, access the machine running ts-sql to verify that the firewall is effective
curl -v http://xx.xx.xx.xx:8086/ping
- Change the password (optional step for openGemini)
# Temporarily disable authentication for the ts-sql component
vim /usr/local/opengemini/gemini-deploy/ts-sql-8086/conf/ts-sql.toml
[http]
auth-enabled = false
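# Change the password (method 1)
# Incorrect: credentials are still passed with -u
curl -v http://127.0.0.1:8086/query -u Xxxuser:'xxxx' --data-urlencode "q=SET PASSWORD FOR otherUser = 'xxxx';"
# Correct: no credentials while authentication is disabled
curl -v http://127.0.0.1:8086/query --data-urlencode "q=SET PASSWORD FOR otherUser = 'xxxx';"
# Change the password (method 2)
# Incorrect: credentials are still passed with -u / --password
/root/go/src/opengemini/build/ts-cli --database monitor --host xx.xx.xx.xx --port 8086 -u admin --password {password}
# Correct: no credentials while authentication is disabled
/root/go/src/opengemini/build/ts-cli --database monitor --host xx.xx.xx.xx --port 8086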
> SET PASSWORD FOR admin = 'xxxxxx'
# Re-enable authentication for the ts-sql component
vim /usr/local/opengemini/gemini-deploy/ts-sql-8086/conf/ts-sql.toml
[http]
auth-enabled = true
# Verify that the new password works
curl -G http://10.37.19.242:8086/query -u admin:'RK0Ym4&U94' --data-urlencode "q=show users"
- Stop the firewall
systemctl stop firewalld
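Step2 After confirming that every program sending wrong-password requests has stopped, wait 30 seconds; the user is then unlocked automatically
Once those requests stop, the database unlocks the user automatically after 30 s by default (if they keep running, even attempts to reset the password may continue to fail).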
4 Summary and Related Operations
How to observe ts-sql anomalies
- Watch the real-time ts-sql log for wrong-password requests
tail -100f /usr/local/opengemini/gemini-log/ts-sql-8086/sql.error.log
Create a new user
# /root/go/src/opengemini/build/ts-cli --database monitor --host xx.xx.xx.xx --port 8086 -u admin --password {password}
CREATE USER rwuser WITH PASSWORD 'xxxx';
SHOW USERS;
GRANT ALL ON xxx_db TO rwuser;
SHOW GRANTS FOR rwuser;
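Also update the password of the ts-monitor component (optional for openGemini)
If the TSDB used by openGemini's ts-monitor is the affected TSDB cluster instance itself (rather than a separately created openGemini instance), the TSDB connection settings in ts-monitor must be updated as well.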
vim /usr/local/gemini-deploy/ts-monitor/conf/ts-monitor.toml
[query]
  # query for some DDL. Report for these data to monitor cluster.
  # - SHOW DATABASES
  # - SHOW MEASUREMENTS
  # - SHOW SERIES CARDINALITY FROM mst
  query-enable = true
  http-endpoint = "xx.xx.xx.xx:8086"
  query-interval = "5m"
  username = "XxxUser"
  password = "XxxPassword"
  # https-enable = false
[report]
  # Address for metric data to be reported.
  address = "xx.xx.xx.xx:8086"
  # Database name for metric data to be reported.
  database = "monitor_xxx"
  rp = "autogen"
  rp-duration = "168h"
  username = "XxxUser"
  password = "XxxPassword"
How the TSDB cluster locks and unlocks users
Using openGemini v1.2.0 as an example
- lib/util/lifted/influx/meta/errors.go
lib/util/lifted/influx/meta/errors.go:187: ErrUserLocked = errors.New("user is locked")
https://github.com/openGemini/openGemini/blob/v1.2.0/lib/util/lifted/influx/meta/errors.go

ErrUserLocked is referenced in the following places:
$ grep -nR "ErrUserLocked"
app/ts-store/run/server_test.go:477:    return nil, meta2.ErrUserLocked
engine/shard_test.go:6616:      return nil, meta2.ErrUserLocked
lib/httpserver/handler.go:99:                                   if err == meta2.ErrUserLocked { 【focus of this analysis】
lib/metaclient/meta_client.go:1969:             return nil, meta2.ErrUserLocked
lib/util/lifted/influx/httpd/handler.go:1993:                                   if err == meta2.ErrUserLocked {
lib/util/lifted/influx/meta/errors.go:186:      // ErrUserLocked is returned when a user that is locked.
lib/util/lifted/influx/meta/errors.go:187:      ErrUserLocked = errors.New("user is locked")
- lib/httpserver/handler.go#Authenticate
lib/httpserver/handler.go:99: if err == meta2.ErrUserLocked {
func Authenticate(inner func(http.ResponseWriter, *http.Request), client meta.MetaClient, requireAuthentication bool) http.Handler {
    ...
	_, err = client.Authenticate(creds.Username, creds.Password) // client is the meta.MetaClient parameter
	if err != nil {
		errMsg := "authorization failed"
		if err == meta2.ErrUserLocked {
			errMsg = err.Error()
		}
		err := errno.NewError(errno.HttpUnauthorized)
		log := logger.NewLogger(errno.ModuleHTTP)
		log.Error(errMsg, zap.Error(err))
		util.HttpError(w, err.Error(), http.StatusUnauthorized)
		return
	}
    ...
}
- /lib/metaclient/meta_client.go#Authenticate(username, password string)
This method internally calls isLockedUser.


- /lib/metaclient/meta_client.go#isLockedUser
func (c *Client) isLockedUser(u string) bool {
	c.muAuthData.RLock()
	defer c.muAuthData.RUnlock()
	if v, ok := c.authFailRcds[u]; ok {
		if len(v.occurTimeLst) >= maxLoginLimit { // Key point: maxLoginLimit / occurTimeLst (the configuration and logic behind locking and unlocking)
			c.logger.Info("The user has been locked.", zap.String("user", u))
			return true
		}
	}
	return false
}
- /lib/metaclient/meta_client.go#maxLoginLimit/authFailCacheLimit/lockUserTime/maxLoginValidTime
https://github.com/openGemini/openGemini/blob/v1.2.0/lib/metaclient/meta_client.go

File path: /lib/metaclient/meta_client.go
const (
    ...
	//for lock user
	maxLoginLimit      = 5    //Maximum number of login attempts
	authFailCacheLimit = 200  //Size of the channel for processing authentication failures.
	lockUserTime       = 30   //User Lock Duration, in seconds.
	maxLoginValidTime  = 3600 //Validity duration of login records, in seconds.
    ...
)
Article link: https://www.cnblogs.com/johnnyzen
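Copyright notice: unless otherwise stated, all posts on this blog are licensed under the BY-NC-SA license. Please credit the source when reposting.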