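Daily Ops Notes
I. Copying from CSDN
javascript:document.body.contentEditable='true';document.designMode='on'; # makes the page editable; then copy the text with Ctrl+X
II. npm dependency issues
1. Generating npm dependencies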
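``` sh
yum install npm    # some applications require a specific Node version; confirm first
npm config set registry=https://registry.npm.taobao.org    # mirror registry; this legacy domain has since been replaced by https://registry.npmmirror.com
npm i -g cnpm; cnpm -v; npm install -g cgr    # optional
npm i    # or: cnpm i. Installs the dependencies; plain npm is normally enough, and they are usable once installed
```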
Common problems
1) spawn git ENOENT    # git is not installed
2) near heap limit Allocation failed - JavaScript heap out of memory
export NODE_OPTIONS="--max-old-space-size=4096"    # heap limit in MB; size it to the host's memory
3) Cannot find module "node:path"    # upgrade Node.js together with npm; the node: import prefix needs a newer Node release
4) You installed esbuild for another platform than the one you're currently using.
This won't work because esbuild is written with native code and needs to
install a platform-specific binary executable.
The fishx framework's dependencies are OS-specific, so dependencies installed on Windows cannot be reused. Check out the code on a Linux VM, build it there, then copy the dependencies out.
5) cnpm certificate has expired
npm config set strict-ssl false    # disables TLS certificate verification; use only as a workaround
6) '@ccmany/theme@^0.7.24' is not in the npm registry.    # private dependency
npm config set registry https://registry-cnpm.dayu.work/
npm config set @ccmany:registry=http://127.0.0.1:8081/repository/npm-public/
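Several of the problems above come down to a missing tool (for example git in problem 1). A minimal preflight sketch, assuming plain POSIX sh; the function name missing_tools is hypothetical and not part of npm:

```sh
# Print every command from the argument list that is not on PATH.
# Returns 1 if at least one command is missing, 0 otherwise.
missing_tools() {
    local rc=0 cmd
    for cmd in "$@"; do
        if ! command -v "$cmd" >/dev/null 2>&1; then
            echo "$cmd"
            rc=1
        fi
    done
    return $rc
}

# Example: missing_tools git node npm || echo "install the tools listed above first"
```

Running it before npm i gives a clearer message than the ENOENT error raised in the middle of an install.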
III. Maven dependencies
Note: delete the "_remote.repositories" and "*.lastUpdated" files first, to rule out problems in the local repository itself
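The cleanup mentioned in the note can be sketched as a single find call. This assumes GNU or BSD find (for -delete); the default path ~/.m2/repository and the function name clean_maven_markers are assumptions, not taken from these notes:

```sh
# Delete Maven resolution marker files under a local repository.
# $1: repository root (assumed default: ~/.m2/repository)
clean_maven_markers() {
    local repo=${1:-$HOME/.m2/repository}
    # -print lists each file as it is removed; jars and poms are left alone
    find "$repo" -type f \( -name '_remote.repositories' -o -name '*.lastUpdated' \) -print -delete
}

# Example: clean_maven_markers /path/to/local/repository
```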
Example: a.txt contains the following (the entries can be taken from the missing dependencies that Maven reports)
com.alibaba.cloud:spring-cloud-starter-alibaba-nacos-discovery:jar:1.5.0.RELEASE
com.alibaba.cloud:spring-cloud-starter-alibaba-nacos-config:jar:1.5.0.RELEASE
com.alibaba.cloud:spring-cloud-starter-alibaba-nacos-discovery:jar:1.5.0.RELEASE
Problem: downloading http://0.0.0.0/...
Even when -s is given, Maven still reads conf/settings.xml in its installation directory; comment out the offending entry there
Generate the install commands
awk -F':' '{print "mvn install:install-file -Dfile="$2"-"$4".jar -DgroupId="$1" -DartifactId="$2" -Dversion="$4" -Dpackaging=jar"}' a.txt
Note: some artifacts are pom rather than jar; adjust -Dfile and -Dpackaging for those
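Since some coordinates are pom rather than jar, one option is to take the packaging from the third field of each coordinate instead of hard-coding jar. A minimal sketch in POSIX sh; the function names are hypothetical, and it assumes each artifact file sits in the current directory as artifactId-version.packaging:

```sh
# Turn "groupId:artifactId:packaging:version" into an install-file command.
install_cmd() {
    local group artifact packaging version
    IFS=':' read -r group artifact packaging version <<EOF
$1
EOF
    echo "mvn install:install-file -Dfile=${artifact}-${version}.${packaging} -DgroupId=${group} -DartifactId=${artifact} -Dversion=${version} -Dpackaging=${packaging}"
}

# Print one command per coordinate in a file, skipping blank lines.
install_cmds_from_file() {
    local line
    while IFS= read -r line || [ -n "$line" ]; do
        if [ -n "$line" ]; then
            install_cmd "$line"
        fi
    done < "$1"
}

# Example: install_cmds_from_file a.txt > install.sh
```

The commands are only printed, so they can be reviewed before being run.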
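Export dependencies with mvn:
mvn dependency:copy-dependencies -DoutputDirectory=./lib -DincludeScope=runtime
outputDirectory: where the dependencies are copied (absolute or relative path).
includeScope: optional dependency scope filter; runtime selects runtime and compile dependencies, and leaving it out includes every scope.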
IV. Python dependencies
./bin/pip3 install elasticsearch sentence_transformers einops gunicorn flask    # install packages
./bin/pip3 freeze > requirements.txt    # write the dependency list
./bin/pip3 download -r requirements.txt -d tmp    # download the packages into tmp/
./bin/pip3 install --no-index --find-links=/home/es/tmp -r /home/es/requirements.txt    # install on the new host, offline
Error "ModuleNotFoundError: No module named '_ssl'": reconfigure and rebuild Python with SSL support
./configure --with-ssl-default-suites=openssl --enable-optimizations
V. blackbox_exporter
1. Install blackbox_exporter
https://github.com/prometheus/blackbox_exporter/releases/download/v0.20.0-rc.0/blackbox_exporter-0.20.0-rc.0.linux-amd64.tar.gz
2. Create the target definition file
[root@master1 prometheus]# cat file_sd/file.yml
- targets:
    - "127.0.0.1:3306"
  labels:
    node: "127.0.0.1:3306"
- targets:
    - "127.0.0.1:9090"
  labels:
    node: "127.0.0.1:9090"
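When there are many ports to probe, the target file can be generated instead of typed by hand. A minimal sketch, assuming every entry uses the host:port string for both the target and the node label, as in the file above; the function name file_sd_entries is hypothetical:

```sh
# Emit one file_sd entry per "host:port" argument.
file_sd_entries() {
    local target
    for target in "$@"; do
        printf '%s\n' "- targets:" "    - \"$target\"" "  labels:" "    node: \"$target\""
    done
}

# Example: file_sd_entries 127.0.0.1:3306 127.0.0.1:9090 > file_sd/file.yml
```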
3. Edit the Prometheus configuration
  - job_name: 'blackbox_tcp_connect'    # checks whether the given ports are reachable
    scrape_interval: 10s
    metrics_path: /probe
    params:
      module: [tcp_connect]
    file_sd_configs:
      - files:
          - ./file_sd/*.yml
        refresh_interval: 1m
    relabel_configs:
      - source_labels: [__address__]
        target_label: __param_target
      - source_labels: [__param_target]
        target_label: instance
      - target_label: __address__
        replacement: 192.168.74.128:9115    # host and port where blackbox_exporter runs
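The relabeling moves the original target into the target query parameter and points the scrape at the exporter, so each scrape is effectively a request to the exporter's /probe endpoint. A small sketch that builds the same URL for a manual check; the function name probe_url is hypothetical:

```sh
# Build the probe URL for one target.
# $1: blackbox_exporter address, $2: module, $3: probed target
probe_url() {
    echo "http://$1/probe?module=$2&target=$3"
}

# Example (needs a running exporter):
#   curl -s "$(probe_url 192.168.74.128:9115 tcp_connect 127.0.0.1:3306)" | grep probe_success
```

Requesting the URL by hand is a quick way to tell an exporter problem from a relabeling problem.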
4. Create the Prometheus rules
- alert: BlackboxProbeFailed
  expr: probe_success == 0
  for: 1m
  labels:
    severity: critical
  annotations:
    summary: Blackbox probe failed (instance {{ $labels.instance }})
    description: "Blackbox detected a problem on the host's Tomcat port 8080 \n VALUE = {{ $value }}\n LABELS = {{ $labels }}"
5. Create the Grafana dashboards
https://grafana.com/grafana/dashboards/9965
https://grafana.com/grafana/dashboards/11543
6. Alert test
curl 'https://oapi.dingtalk.com/robot/send?access_token=xxx' -H 'Content-Type: application/json' -d '{"msgtype": "text","text": {"content":"我就是我, 是不一样的烟火"}}'
VI. ffmpeg
ffmpeg -i out.mp4 -vf delogo=x=40:y=10:w=360:h=120:show=0,delogo=x=1525:y=10:w=360:h=120:show=0 -c:v libx264 -c:a copy correst.mp4
ffplay -i out.mp4 -ss 57 -vf delogo=x=40:y=10:w=360:h=120:show=1,delogo=x=1525:y=10:w=360:h=120:show=1
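-ss: start time
-t: duration
x: horizontal position of the box's top-left corner (0 is the left edge)
y: vertical position of the box's top-left corner (0 is the top edge)
show: 1 draws a rectangle around the area (useful for previewing with ffplay); 0 removes the logo without drawing it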
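With more than one logo the filter string gets long and easy to mistype. A minimal sketch that assembles the delogo chain from x:y:w:h boxes; the function name delogo_chain and the box notation are assumptions made for illustration:

```sh
# Build a comma-separated delogo filter chain.
# $1: show flag (0 or 1); remaining arguments: boxes written as x:y:w:h
delogo_chain() {
    local show=$1 box x y w h chain=""
    shift
    for box in "$@"; do
        IFS=':' read -r x y w h <<EOF
$box
EOF
        chain="${chain}${chain:+,}delogo=x=${x}:y=${y}:w=${w}:h=${h}:show=${show}"
    done
    echo "$chain"
}

# Example: ffplay -i out.mp4 -ss 57 -vf "$(delogo_chain 1 40:10:360:120 1525:10:360:120)"
```

Previewing with show=1 and encoding with show=0 then differ only in the first argument.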
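Rename videos by matching their duration:
#!/bin/bash
# Usage: <this script> <list file>
file=$1

# Parse the list file into "<name>#<mm:ss>" lines in ${file}.result
generaresult() {
    : > "${file}.result"
    sed -i 's/[[:space:]]//g' "$file"          # strip all whitespace in place
    for i in $(cat "$file"); do
        count=$(echo "$i" | wc -c)              # length plus the trailing newline
        if [[ $i == P* ]]; then                 # remember the latest P* line as the title
            title=$i
        fi
        if [ "$count" -ge 10 ]; then            # long lines are names
            name=$i
        fi
        if [ "$count" -ge 4 ] && [ "$count" -le 6 ] && [[ $i != P* ]]; then
            time=$i                             # short lines are durations (mm:ss)
            echo "====>$title--->$name-->$time"
            echo "$name#$time" >> "${file}.result"    # write the result file
        fi
    done
}

echo "============rename===="
# Rename each mp4 to the name whose listed duration is within 1 second of the real one
rename() {
    for m in *.mp4; do
        echo "==>$m"
        vtime=$(ffprobe -v error -select_streams v:0 -show_entries stream=duration -of default=noprint_wrappers=1:nokey=1 "$m" | cut -d'.' -f1)
        for n in $(cat "${file}.result"); do
            fname=$(echo "$n" | awk -F'#' '{print $1}')
            ftime=$(echo "$n" | awk -F'#' '{print $2}')
            ftime1=$(echo "$ftime" | cut -d':' -f1)
            ftime2=$(echo "$ftime" | cut -d':' -f2)
            ftime3=$((10#$ftime1 * 60 + 10#$ftime2))   # 10# forces base 10, so 08 and 09 are valid
            diff=$((vtime - ftime3))
            abs_num=${diff#-}
            if [ "$abs_num" -le 1 ]; then
                mv "$m" "${fname}.mp4"
                break                           # the file is renamed; stop matching it
            fi
        done
    done
}

#generaresult
rename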
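The matching step in the script turns an mm:ss string into seconds and accepts a difference of at most one second. A standalone sketch of just that arithmetic in POSIX sh; the function names are hypothetical. Leading zeros are stripped so that 08 and 09 are not read as invalid octal numbers, and an empty result falls back to 0:

```sh
# Convert "mm:ss" to seconds.
to_seconds() {
    local mm=${1%%:*} ss=${1##*:}
    mm=${mm#0}    # "08" -> "8", "0" -> ""
    ss=${ss#0}
    echo $(( ${mm:-0} * 60 + ${ss:-0} ))
}

# Succeed when two second counts differ by at most 1.
within_one_second() {
    local diff=$(( $1 - $2 ))
    [ "${diff#-}" -le 1 ]
}

# Example: within_one_second "$(to_seconds 08:09)" 490 && echo match
```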
VII. OS issues
Commands fail because of pcre: libpcre.so.1 is reported missing
1. The initramfs contains libpcre.so.1; extract it and copy it into place
2. /usr/lib/dracut/skipcpio /boot/initramfs-3.10.0-1160.el7.x86_64.img |zcat | cpio -div
3. dd if=usr/lib64/libpcre.so.1 of=/usr/lib64/libpcre.so.1
dd if=usr/lib64/libz.so.1 of=/usr/lib64/libz.so.1
dd if=usr/lib64/libz.so.1.2.7 of=/usr/lib64/libz.so.1.2.7
yum reinstall zlib pcre
Problem caused by installing zlib:
[root@localhost tmp]# rpm -qa |grep zlib
rpm: error while loading shared libraries: libz.so.1: cannot open shared object file: No such file or directory
cp -arp usr/lib64/libz.so.1* /usr/lib64/
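After an online disk expansion, df -h shows no change although the new disk size is already in effect
resize2fs /dev/vdb    # grows an ext2/3/4 filesystem; XFS would need xfs_growfs instead
cron jobs do not run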
Nov 14 11:10:01 ops2 crond[1203048]: (root) PAM ERROR (Module is unknown)
Nov 14 11:10:01 ops2 crond[1203048]: (root) FAILED to authorize user with PAM (Module is unknown)
Add root to /etc/cron.allow
Edit /etc/pam.d/cron
session sufficient pam_loginuid.so
Logs: /var/log/secure, /var/log/cron
Change pam_tally.so to pam_faillock.so
#!/bin/bash
for ip in 10.40.168.224 10.40.168.226 10.40.168.85; do
    echo "====>$ip"
    ssh root@$ip "date"
done
VIII. Terminal session recording
mkdir -p /usr/local/script
chmod 777 -R /usr/local/script
cat > /usr/local/script.sh << 'E'
#!/bin/bash
LOGPATH="/usr/local/script/"
TRM=$(tty | sed 's/\///g')    # user terminal, slashes removed
U=$(whoami)                   # user
T=$(date +%Y%m%d_%H%M%S)      # time
FILE="${U}_${TRM}_${T}"
script -t${LOGPATH}${FILE}.time -f ${LOGPATH}${FILE}.log
E
chmod +x /usr/local/script.sh
echo '/usr/local/script.sh' >> /etc/profile
Replay (note: every pause the user made is reproduced during replay)
scriptreplay -t $TIMING_FILE $LOG_FILE
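IX. Java
Finding which HSF interfaces a Java application provides
1. Unpack the war and go into the lib directory
2. Ask the developers which jar provides the feature, unpack the war, and look up the path inside the jar; if the matching class file is there, the interface exists
3. Search the logs for matching keywords
4. Check the service start time: whether it falls before or after startup
https://blog.51cto.com/hmtk520/2067043 # Java memory issues
List classes: jar -tf <jar file>    # lists the class files in the archive
Replace a single file inside a war: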
[root@localhost tmp]# jar tvf a.war |grep wdk.properties
9044 Thu Mar 27 15:20:18 CST 2025 WEB-INF/classes/wdk.properties
[root@localhost tmp]# jar xvf a.war WEB-INF/classes/wdk.properties
inflated: WEB-INF/classes/wdk.properties
[root@localhost tmp]# ls -l
total 318048
-rw-r--r-- 1 root root 325676004 Mar 27 16:09 jsdptest-eap.war
drwxr-xr-x 3 root root 4096 Mar 27 16:11 WEB-INF
[root@localhost tmp]# vi WEB-INF/classes/wdk.properties    # edit the file
[root@localhost tmp]# jar -uvf a.war WEB-INF/classes/wdk.properties
X. Oracle issues
Oracle starts its services automatically
To change passwords later, log in with: connect sys/<password>@127.0.0.1/pdborcl as sysdba. If no database instance is specified, the login goes to Oracle's default database
sqlplus / as sysdba    # login method 1
connect sys/<password>@127.0.0.1/pdborcl as sysdba
sqlplus USER/PASSWD@localhost:port/SCHEMA
ORA-01033:
sqlplus / as sysdba
startup mount    # reports that the connection has not been closed
shutdown immediate    # close the connection
startup mount    # now starts normally
alter database open;    # the semicolon is required; on ORA-01589 the database must be opened with the RESETLOGS or NORESETLOGS option
startup    # start
Error "invalid username/password; logon denied"
alter user master_oper identified by master_pwd;
Reference: https://blog.csdn.net/beihuanlihe130/article/details/108149030
ORA-28000: the account is locked
alter user MASTER_OPER account unlock;
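XI. DB2 restart
Stop sequence: 1) close the connections; 2) stop the instance (this stops the databases automatically)
Start sequence: 1) start the instance; 2) then activate the database (it usually comes up with the instance)
Note: switch to user db2inst1 before doing anything
1. Close the connections
db2 list applications    # list active connections; a restart drops them, so confirm before proceeding
db2 force application all    # close the connections
db2 list applications    # confirm that no connections remain
2. Restart the instance
db2stop    # stop the instance
db2start    # start the instance
3. Activate the database
db2 list active databases    # check whether the target database is listed; if not, activate it; if it is already active, nothing more is needed
#db2 activate database $DATABASE    # activate the database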
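XII. Kafka log cleanup
# Make sure Kafka is stopped. If the disk is completely full, delete some logs by hand first, otherwise Kafka will not start
ps -ef |grep kafka
# Edit the configuration file
cd /data/kafka/kafka_2.12-2.3.0/
vim config/server.properties    # add the three options below: retention time, enable the log cleaner, and how long delete markers are retained
log.retention.hours=1
log.cleaner.enable=true
log.cleaner.delete.retention.ms=500
# Start Kafka; if startup fails with a ZooKeeper connection error, start ZooKeeper first
./bin/kafka-server-start.sh -daemon ./config/server.properties
tail -f logs/kafkaServer.out
# Check that segments have been marked for deletion
find / -name '*.deleted'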
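The three settings can also be applied without opening an editor. A minimal sketch, assuming GNU sed (for -i) and simple values that contain no | or & characters; the function name set_prop is hypothetical:

```sh
# Set key=value in a properties file: replace an existing uncommented line,
# or append the setting if the key is not present yet.
set_prop() {
    local file=$1 key=$2 value=$3 re
    # Escape the dots in the key so they match literally in the regex
    re=$(printf '%s' "$key" | sed 's/\./\\./g')
    if grep -q "^${re}=" "$file"; then
        sed -i "s|^${re}=.*|${key}=${value}|" "$file"
    else
        echo "${key}=${value}" >> "$file"
    fi
}

# Example:
#   set_prop config/server.properties log.retention.hours 1
#   set_prop config/server.properties log.cleaner.enable true
#   set_prop config/server.properties log.cleaner.delete.retention.ms 500
```

Replacing in place rather than always appending avoids leaving two conflicting lines for the same key.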
XIII. Alibaba Cloud SLB default type
By default, the SLB that Kubernetes creates is an Internet-facing (public) instance
https://help.aliyun.com/zh/ack/ack-managed-and-ack-dedicated/user-guide/configure-an-ingress-controller-to-use-an-internal-facing-slb-instance?spm=a2c4g.11186623.help-menu-85222.d_2_3_4_1_5.6b212529HdP1Sc&scm=20140722.H_151506._.OR_help-T_cn~zh-V_1#e2ac121dab926
A Service that creates an intranet (private) SLB instead:
apiVersion: v1
kind: Service
metadata:
  name: nginx-ingress-lb-intranet
  namespace: kube-system
  labels:
    app: nginx-ingress-lb
  annotations:
    service.beta.kubernetes.io/alibaba-cloud-loadbalancer-address-type: intranet    # sets the load balancer address type to private network
spec:
  type: LoadBalancer
  externalTrafficPolicy: "Cluster"
  ports:
  - port: 80
    name: http
    targetPort: 80
  - port: 443
    name: https
    targetPort: 443
  selector:
    app: ingress-nginx
XIV. Common database SQL
1. Dump statement
mysqldump --all-databases --single-transaction --master-data=2 --triggers --routines --host=127.0.0.1 --port=3306 --user=root > dbpaas-metadb.sql
2. Disk space
1) Data files use a lot of disk space.
Parameter: general_log
SELECT table_schema AS 'database', table_name, SUM(data_length + index_length + data_free)/1024/1024 AS 'size_mb', SUM(data_free)/1024/1024 AS 'fragment_mb'
FROM information_schema.TABLES
WHERE table_name='general_log'
GROUP BY table_schema, table_name;
TRUNCATE TABLE mysql.general_log;
SELECT file_name, concat(TOTAL_EXTENTS,'M') as 'File_size' FROM INFORMATION_SCHEMA.FILES order by TOTAL_EXTENTS DESC;
2) Log files use a lot of disk space. Without a proper log backup policy, large-transaction SQL can make the logs grow quickly
3) Temporary files use a lot of disk space.
This is usually caused by temporary tables created for sorting, grouping, or joins in queries, or by log cache files written before a large transaction commits.
show processlist;
explain select * from alarm group by created_on order by default;
4) System files use a lot of disk space.
Oversized system files are mostly caused by oversized undo files. When a query on an InnoDB table runs for a long time while the table receives many changes, the system generates a large amount of undo data, which takes up a lot of storage.
3. SQL statistics
https://cloud.tencent.com/developer/article/1776619
# Size of each database
select table_schema,
concat(round(sum(data_length/1024/1024/1024),2),'GB') as data_length,
concat(round(sum(index_length/1024/1024/1024),2),'GB') as index_length,
concat(round(sum(data_free/1024/1024/1024),2),'GB') as data_free
from information_schema.tables GROUP by table_schema ;
# Size of a single table
select table_schema,
concat(round(sum(data_length/1024/1024/1024),2),'GB') as data_length,
concat(round(sum(index_length/1024/1024/1024),2),'GB') as index_length,
concat(round(sum(data_free/1024/1024/1024),2),'GB') as data_free
from information_schema.tables where table_schema='DB_Name' and table_name='Table_Name';
# All tables in one database, sorted by size (ordered by data_free)
select
table_schema as 'database',
table_name as 'table',
table_rows as 'rows',
truncate(data_length/1024/1024, 2) as 'data size (MB)',
truncate(index_length/1024/1024, 2) as 'index size (MB)',
truncate(data_free/1024/1024, 2) as 'fragment size (MB)'
from information_schema.tables where table_schema='DB_NAME'
order by data_free desc, data_length desc limit 10;
黑马腾空^_^