How to send AWS CloudWatch alarms to a DingTalk robot

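I have recently been improving the CloudWatch monitoring for our overseas services on AWS, and found that CloudWatch alarm notifications have to go through AWS's SNS service. The channels SNS supports directly include SMS and email, but we wanted to push the alarms to a group robot in a DingTalk group, and that requires AWS's Lambda function service.

Create a Lambda function, then choose SNS as its trigger.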

 

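The Python script follows. Its event variable holds the message passed in by SNS. One annoyance is that the CloudWatch alarm text is messy, so it needs a little tidying up.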
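
To make the "tidying up" concrete, here is a minimal sketch of pulling the datapoint values and the threshold out of a NewStateReason string. The sample text and the helper name summarize_reason are assumptions for illustration; the exact wording CloudWatch emits can differ, so treat the patterns as a starting point rather than a complete parser.

```python
import re

# Illustrative NewStateReason text (an assumption); real alarm wording may differ.
SAMPLE_REASON = (
    "Threshold Crossed: 1 datapoint [2.5 (15/04/19 02:40:00)] "
    "was greater than or equal to the threshold (1.0)."
)


def summarize_reason(reason):
    """Return (datapoint_values, threshold) extracted from a reason string."""
    values = []
    # Datapoints sit inside square brackets, separated by commas.
    bracket = re.search(r"\[(.*?)\]", reason)
    if bracket:
        for item in bracket.group(1).split(","):
            tokens = item.split()
            if tokens:
                # The numeric value is the first token of each item.
                values.append(float(tokens[0]))
    # The threshold is the parenthesised number right after the word "threshold".
    match = re.search(r"threshold \(([-+0-9.eE]+)\)", reason)
    threshold = float(match.group(1)) if match else None
    return values, threshold
```

Anchoring the threshold pattern on the word "threshold" avoids picking up the timestamp, which also sits in parentheses.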

 

# -*- coding: utf-8 -*-
import json
import os
import re
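# botocore.vendored.requests no longer ships in current botocore releases;
# package the requests library with the function (or in a layer) and import it directly.
import requests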


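def size_b_to_other(size):
    """Convert a size in bytes to a human-readable unit."""
    units = ['B', 'KB', 'MB', 'GB', 'TB']
    # Reject non-positive sizes
    if size <= 0:
        return False

    # Floor-divide by 1024 until the value fits the current unit;
    # stop at the last unit so the value and its label stay consistent
    for unit in units[:-1]:
        if size < 1024:
            return '{} {}'.format(size, unit)
        size //= 1024

    return '{} {}'.format(size, units[-1])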
    
    
def lambda_handler(event, context):
    token = os.getenv('token')
    url = "https://oapi.dingtalk.com/robot/send?access_token="
    headers = {'Content-Type': 'application/json'}
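    # Extract the fields we need
    Sns = event['Records'][0]['Sns']
    Subject = Sns['Subject']
    if "ALARM" in Subject:
        title = "======Alarm======"
    elif "OK" in Subject:
        title = "======Recovered======"
    else:
        # Any other state (e.g. INSUFFICIENT_DATA) would otherwise leave title undefined
        title = "======Notification======"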
    Timestamp = Sns['Timestamp']
    Message = Sns['Message']
    Message = json.loads(Message)
    try:
        NewStateReason = Message['NewStateReason']
        AlarmDescription = Message['AlarmDescription']
        # Convert the CloudWatch values into friendlier units
        datapoint = re.findall(r'\[(.*?)\]', NewStateReason)
        threshold = re.findall(r'\((.*?)\)', NewStateReason)
        count = datapoint[0].count(",")
        if count == 0:
            # Single datapoint: the number is the first token in each match
            datapoint = float(datapoint[0].split()[0])
            threshold = float(threshold[1].split()[0])
            if threshold > 1000:
                datapoint = size_b_to_other(datapoint)
                threshold = size_b_to_other(threshold)
        else:
            # Several datapoints: the threshold is in the last parenthesised group(s)
            i = threshold[-1]
            pattern = re.compile(r'^[0-9]+\.[0-9]')
            if pattern.search(i):
                threshold = threshold[-1]
            else:
                threshold = threshold[-2] + threshold[-1]

        # Build the message text
        content = (title + "\nAlarm subject: " + Subject
                   + "\nAlarm time: " + Timestamp
                   + "\nReason: " + NewStateReason
                   + "\nSummary: current value=" + str(datapoint)
                   + ", reached " + str(count + 1) + " consecutive time(s), threshold=" + str(threshold)
                   + "\nNotes: " + str(AlarmDescription))

    except Exception:
        # Fall back to dumping the raw message when parsing fails
        Message = json.dumps(Message, sort_keys=True, indent=2)
        content = (title + "\nAlarm subject: " + Subject
                   + "\nDetails: " + Message
                   + "\nNotes: [failed to parse message]")

    data = {
        "msgtype": "text",
        "text": {
            "content": content
        }
    }
    
    data = json.dumps(data)
    # Set a timeout so a slow webhook cannot hold the function until it times out
    response = requests.post(url + token, data=data, headers=headers, timeout=5)
    return response.text
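
If bundling the requests library is inconvenient, the same POST can probably be done with the standard library alone. The sketch below is a swapped-in alternative, not the script's own method; the helper names build_dingtalk_request and send_dingtalk_text are hypothetical, while the webhook URL and the text payload shape are taken from the script above.

```python
import json
import urllib.request

DINGTALK_URL = "https://oapi.dingtalk.com/robot/send?access_token="


def build_dingtalk_request(token, content):
    """Build (but do not send) a POST request carrying a DingTalk text message."""
    payload = {"msgtype": "text", "text": {"content": content}}
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        DINGTALK_URL + token,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send_dingtalk_text(token, content, timeout=5):
    """Send the message and return the response body as text."""
    request = build_dingtalk_request(token, content)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8")
```

Splitting request construction from sending keeps the payload easy to inspect without touching the network.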

Finally, test it.
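
One way to test without waiting for a real alarm is to hand the handler a hand-built event. The sketch below builds a minimal SNS-style event containing only the fields the script reads; the function name build_test_event, the alarm name, and the sample reason text are all made up for illustration, and a real SNS event carries many more fields.

```python
import json

# Illustrative reason text (an assumption); real alarm wording may differ.
SAMPLE_REASON = (
    "Threshold Crossed: 1 datapoint [2.5 (15/04/19 02:40:00)] "
    "was greater than or equal to the threshold (1.0)."
)


def build_test_event(state="ALARM", reason=SAMPLE_REASON,
                     description="hypothetical test alarm"):
    """Return a minimal SNS-style event with only the fields the handler reads."""
    message = {
        "AlarmName": "demo-alarm",
        "AlarmDescription": description,
        "NewStateValue": state,
        "NewStateReason": reason,
    }
    return {
        "Records": [
            {
                "Sns": {
                    "Subject": '{}: "demo-alarm" in a test region'.format(state),
                    "Timestamp": "2019-04-15T02:40:00.000Z",
                    # SNS delivers the CloudWatch payload as a JSON string, not a dict.
                    "Message": json.dumps(message),
                }
            }
        ]
    }
```

Passing build_test_event() to lambda_handler locally, or pasting its JSON dump as a test event in the Lambda console, should exercise the same code path; with a real token configured this will post to the group.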

 

posted @ 2019-04-15 10:48  三木燕