Azure China Application Gateway Performance Monitoring

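The Azure Portal for the China region currently does not support diagnostic logging, log display, or alerting for Application Gateway. The Application Gateway documentation on www.azure.cn once carried an article on configuring diagnostic logs, but it was a direct translation of the Azure Global article and has since been removed.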

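This means that customers in China who run websites or APIs behind AppGW cannot monitor the gateway's usage directly, and so cannot judge when to scale it. The only way to see the data is to open a ticket with 21V and ask for performance charts, which is far from real time and inconvenient. Customers have complained about this many times.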

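Last weekend the Microsoft engineering team applied a hotfix to Application Gateway in the China region that brings over part of the global functionality: diagnostic logs are now generated and written to Blob Storage. The rest of this post shows how to use it.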

1. Enabling and checking Application Gateway diagnostic logs

In the China region, diagnostic logs for AppGW can currently be enabled only through the following PowerShell command: Set-AzureRmDiagnosticSetting -ResourceId <String> -StorageAccountId <String> -Enabled <Boolean>

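ResourceId identifies the resource whose diagnostic logging is being enabled, which in this post is the Application Gateway. StorageAccountId identifies the storage account the logs are written to, chosen by the user. For example, after logging in to my Azure China test subscription in PowerShell, I enabled diagnostic logging on an AppGW as follows: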

PS C:\Users\huxu> Set-AzureRmDiagnosticSetting -ResourceId /subscriptions/3fd8e8ff-2373-4d1c-8587-bddcd13a1ba0/resourceGroups/DWTEST01/providers/Microsoft.Network/applicationGateways/aapgwmon01 -StorageAccountId /subscriptions/3fd8e8ff-2373-4d1c-8587-bddcd13a1ba0/resourceGroups/dwtest01/providers/Microsoft.Storage/storageAccounts/appgwtest -Enabled $true

StorageAccountId   : /subscriptions/3fd8e8ff-2373-4d1c-8587-bddcd13a1ba0/resourceGroups/dwtest01/providers/Microsoft.Storage/storageAccounts/appgwtest
ServiceBusRuleId   :
StorageAccountName :
Metrics
    Enabled   : True
    Timegrain : PT1M
    RetentionPolicy
        Enabled : False
        Days    : 0
Logs
    Enabled  : True
    Category : ApplicationGatewayAccessLog
    RetentionPolicy
        Enabled : False
        Days    : 0

    Enabled  : True
    Category : ApplicationGatewayPerformanceLog
    RetentionPolicy
        Enabled : False
        Days    : 0

    Enabled  : True
    Category : ApplicationGatewayFirewallLog
    RetentionPolicy
        Enabled : False
        Days    : 0
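The two IDs passed to the command follow the usual Azure Resource Manager path layout (subscription, resource group, provider, resource type, name). As a rough illustration, the small Python sketch below assembles them from their parts. The helper name arm_resource_id is hypothetical and only formats a string; it does not call Azure.

```python
def arm_resource_id(subscription_id, resource_group, provider, resource_type, name):
    """Format an Azure Resource Manager resource ID (string formatting only)."""
    return (
        f"/subscriptions/{subscription_id}"
        f"/resourceGroups/{resource_group}"
        f"/providers/{provider}/{resource_type}/{name}"
    )


# Values taken from the command in this post
subscription = "3fd8e8ff-2373-4d1c-8587-bddcd13a1ba0"
gateway_id = arm_resource_id(
    subscription, "DWTEST01", "Microsoft.Network", "applicationGateways", "aapgwmon01"
)
storage_id = arm_resource_id(
    subscription, "dwtest01", "Microsoft.Storage", "storageAccounts", "appgwtest"
)

print(gateway_id)   # value for -ResourceId
print(storage_id)   # value for -StorageAccountId
```

The printed strings can be pasted into the -ResourceId and -StorageAccountId parameters.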

 

After a while, log files appear in this storage account.

The log files are in JSON format. An access log file looks like this:

{
    "records":
    [
        {
             "resourceId": "/SUBSCRIPTIONS/1D753E0D-7C47-4EC0-9FEC-EB512CD37742/RESOURCEGROUPS/KD-WEB-RG-PE/PROVIDERS/MICROSOFT.NETWORK/APPLICATIONGATEWAYS/APPGWKD",
             "operationName": "ApplicationGatewayAccess",
             "time": "2017-06-12T06:50:23Z",
             "category": "ApplicationGatewayAccessLog",
             "properties": {"instanceId":"ApplicationGatewayRole_IN_1","clientIP":"211.156.223.30","clientPort":44107,"httpMethod":"POST","requestUri":"/expressApi/ems/managePush","requestQuery":"X-AzureApplicationGateway-CACHE-HIT=0&SERVER-ROUTED=172.16.5.12&X-AzureApplicationGateway-LOG-ID=9e84951a-62ca-4903-b66a-86a8cff8eee8&SERVER-STATUS=200","userAgent":"Java/1.6.0_20","httpStatus":200,"httpVersion":"HTTP/1.1","receivedBytes":668,"sentBytes":335,"timeTaken":80,"sslEnabled":"off"}
        },
        {
             "resourceId": "/SUBSCRIPTIONS/1D753E0D-7C47-4EC0-9FEC-EB512CD37742/RESOURCEGROUPS/KD-WEB-RG-PE/PROVIDERS/MICROSOFT.NETWORK/APPLICATIONGATEWAYS/APPGWKD",
             "operationName": "ApplicationGatewayAccess",
             "time": "2017-06-12T06:50:22Z",
             "category": "ApplicationGatewayAccessLog",
             "properties": {"instanceId":"ApplicationGatewayRole_IN_2","clientIP":"211.156.223.30","clientPort":43661,"httpMethod":"POST","requestUri":"/expressApi/ems/managePush","requestQuery":"X-AzureApplicationGateway-CACHE-HIT=0&SERVER-ROUTED=172.16.5.12&X-AzureApplicationGateway-LOG-ID=9b79883f-639a-45c1-aa26-4797295b62a0&SERVER-STATUS=200","userAgent":"Java/1.6.0_20","httpStatus":200,"httpVersion":"HTTP/1.1","receivedBytes":668,"sentBytes":335,"timeTaken":125,"sslEnabled":"off"}
        }
    ]
}
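In the access log sample, the requestQuery field carries the backend address (SERVER-ROUTED) and backend status (SERVER-STATUS) as query-string pairs. A minimal sketch of how these records could be summarized per backend is shown here, using only the Python standard library. The function name summarize_access is hypothetical, and the embedded data is a trimmed copy of the two records above rather than a full log.

```python
from collections import defaultdict
from urllib.parse import parse_qs

# Trimmed copy of the two access log records shown in this post
access_log = {
    "records": [
        {"time": "2017-06-12T06:50:23Z",
         "properties": {
             "requestQuery": "X-AzureApplicationGateway-CACHE-HIT=0&SERVER-ROUTED=172.16.5.12&SERVER-STATUS=200",
             "httpStatus": 200, "timeTaken": 80}},
        {"time": "2017-06-12T06:50:22Z",
         "properties": {
             "requestQuery": "X-AzureApplicationGateway-CACHE-HIT=0&SERVER-ROUTED=172.16.5.12&SERVER-STATUS=200",
             "httpStatus": 200, "timeTaken": 125}},
    ]
}


def summarize_access(records):
    """Count requests and average timeTaken per (backend, httpStatus)."""
    stats = defaultdict(lambda: {"count": 0, "total_time": 0})
    for record in records:
        props = record["properties"]
        query = parse_qs(props.get("requestQuery", ""))
        # Fall back to "unknown" when the field is missing
        backend = query.get("SERVER-ROUTED", ["unknown"])[0]
        key = (backend, props["httpStatus"])
        stats[key]["count"] += 1
        stats[key]["total_time"] += props["timeTaken"]
    return {
        key: {"count": v["count"], "avg_time_taken": v["total_time"] / v["count"]}
        for key, v in stats.items()
    }


summary = summarize_access(access_log["records"])
for (backend, status), row in summary.items():
    print(backend, status, row["count"], row["avg_time_taken"])
```

For the two sample records this reports one backend, 172.16.5.12, with two requests and an average timeTaken of 102.5.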

 

The performance log looks like this:

 

{
    "records":
    [
        {
             "resourceId": "/SUBSCRIPTIONS/1D753E0D-7C47-4EC0-9FEC-EB512CD37742/RESOURCEGROUPS/KD-WEB-RG-PE/PROVIDERS/MICROSOFT.NETWORK/APPLICATIONGATEWAYS/APPGWKD",
             "operationName": "ApplicationGatewayPerformance",
             "time": "2017-06-12T06:54:00Z",
             "category": "ApplicationGatewayPerformanceLog",
             "properties": {"instanceId":"ApplicationGatewayRole_IN_2","healthyHostCount":2,"unHealthyHostCount":0,"requestCount":0,"latency":22,"failedRequestCount":0,"throughput":0}
        },
        {
             "resourceId": "/SUBSCRIPTIONS/1D753E0D-7C47-4EC0-9FEC-EB512CD37742/RESOURCEGROUPS/KD-WEB-RG-PE/PROVIDERS/MICROSOFT.NETWORK/APPLICATIONGATEWAYS/APPGWKD",
             "operationName": "ApplicationGatewayPerformance",
             "time": "2017-06-12T06:54:00Z",
             "category": "ApplicationGatewayPerformanceLog",
             "properties": {"instanceId":"ApplicationGatewayRole_IN_0","healthyHostCount":2,"unHealthyHostCount":0,"requestCount":0,"latency":6,"failedRequestCount":0,"throughput":0}
        }
    ]
}

 

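The China region currently gets a cut-down version of the feature: Application Gateway logs cannot be managed or displayed in the portal, and custom metrics such as per-instance CPU and memory are not available, so only the default metrics shown in the logs above can be seen. A performance log record is generated once per minute for each instance. requestCount and throughput are per-second averages over the past minute, and latency is the average response latency of the backend servers.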
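Since the portal offers neither display nor alerting, one option is to roll the per-instance records up per minute and flag suspicious minutes in a script. The sketch below is one possible way to do that, not an official method. It assumes that summing the per-instance requestCount and throughput values gives a gateway-wide figure, and the threshold LATENCY_LIMIT and the function names are hypothetical. The data is a trimmed copy of the two performance records above.

```python
from collections import defaultdict

# Trimmed copy of the two performance log records shown in this post
perf_log = {
    "records": [
        {"time": "2017-06-12T06:54:00Z",
         "properties": {"instanceId": "ApplicationGatewayRole_IN_2", "healthyHostCount": 2,
                        "unHealthyHostCount": 0, "requestCount": 0, "latency": 22,
                        "failedRequestCount": 0, "throughput": 0}},
        {"time": "2017-06-12T06:54:00Z",
         "properties": {"instanceId": "ApplicationGatewayRole_IN_0", "healthyHostCount": 2,
                        "unHealthyHostCount": 0, "requestCount": 0, "latency": 6,
                        "failedRequestCount": 0, "throughput": 0}},
    ]
}

LATENCY_LIMIT = 20  # hypothetical threshold, in the same unit as the log's latency field


def per_minute(records):
    """Combine the records of all instances that share a timestamp."""
    minutes = defaultdict(lambda: {"requestCount": 0, "throughput": 0,
                                   "maxLatency": 0, "unHealthyHostCount": 0})
    for record in records:
        props = record["properties"]
        slot = minutes[record["time"]]
        slot["requestCount"] += props["requestCount"]   # assumed additive across instances
        slot["throughput"] += props["throughput"]       # assumed additive across instances
        slot["maxLatency"] = max(slot["maxLatency"], props["latency"])
        slot["unHealthyHostCount"] = max(slot["unHealthyHostCount"],
                                         props["unHealthyHostCount"])
    return dict(minutes)


def flag_minutes(minutes, latency_limit):
    """Return timestamps where latency exceeds the limit or a backend is unhealthy."""
    return [t for t, m in sorted(minutes.items())
            if m["maxLatency"] > latency_limit or m["unHealthyHostCount"] > 0]


minutes = per_minute(perf_log["records"])
flagged = flag_minutes(minutes, LATENCY_LIMIT)
print(flagged)
```

With the sample data, the single minute 2017-06-12T06:54:00Z is flagged because instance IN_2 reports a latency of 22, above the illustrative limit of 20.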

 

2. Visualizing Application Gateway diagnostic logs with Python

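It took real effort from the engineering team to get this far, and more functionality will be a long wait. Customers keep complaining, though: are they supposed to read JSON by eye during system performance testing?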

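There are many ways to visualize the log files. The official documentation, for example, describes converting the JSON to tables and presenting it with Power BI. Here I show how to do it with Python's data processing and plotting packages, which I find simple and flexible.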

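The first step is to fetch the log files from Blob Storage with Python. The official documentation covers this in detail, so I will not repeat it here.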
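For readers who want a starting point anyway, here is a rough sketch of the download step. It makes several assumptions that are not from this post: it uses the current azure-storage-blob package (BlobServiceClient), it assumes the logs land in a container named insights-logs-<category in lower case>, it assumes the China blob endpoint suffix core.chinacloudapi.cn, and it assumes each blob has the "records" wrapper shown in the samples above. The function names are hypothetical.

```python
import json


def diagnostics_container(category):
    """Container name, assuming the 'insights-logs-<category>' naming pattern."""
    return "insights-logs-" + category.lower()


def download_records(account_name, account_key, category,
                     endpoint_suffix="core.chinacloudapi.cn"):
    """Download every blob in the category's container and merge their records."""
    # Imported here so the helper above can be used without the SDK installed
    from azure.storage.blob import BlobServiceClient

    service = BlobServiceClient(
        account_url=f"https://{account_name}.blob.{endpoint_suffix}",
        credential=account_key,
    )
    container = service.get_container_client(diagnostics_container(category))

    records = []
    for blob in container.list_blobs():
        payload = container.download_blob(blob.name).readall()
        # Assumes the {"records": [...]} layout shown in this post
        records.extend(json.loads(payload)["records"])
    return records


print(diagnostics_container("ApplicationGatewayPerformanceLog"))
```

A call such as download_records("appgwtest", "<account key>", "ApplicationGatewayPerformanceLog") would then return a list of record dictionaries; the account key placeholder has to be replaced with a real key.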

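Then plot the downloaded JSON file. For example, the code here plots the latency and the number of healthy backend servers for one instance, ApplicationGatewayRole_IN_2: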

 

import datetime as dt
import json
import os

import matplotlib.dates as dat
import matplotlib.pyplot as plt
import matplotlib.ticker as tic
import pandas as pd
from matplotlib.font_manager import FontProperties

# Load the log file
os.chdir('C:\\Work\\AppGW0613')
with open('PerformanceLog.json', 'r') as f:
    data = json.load(f)

# Collect time, latency and healthy host count for one instance only.
# Records are sorted by timestamp so the x axis runs from oldest to newest.
instance = 'ApplicationGatewayRole_IN_2'
list1, list2, list3 = [], [], []
for record in sorted(data['records'], key=lambda r: r['time']):
    props = record['properties']
    if props['instanceId'] == instance:
        list1.append(dt.datetime.strptime(record['time'], '%Y-%m-%dT%H:%M:%SZ'))
        list2.append(props['latency'])
        list3.append(props['healthyHostCount'])

plt.figure(figsize=(8, 20))

# Latency chart
ax = plt.subplot(211)
plt.xlabel("Time")
plt.ylabel("Latency")
ax.set_ylim(bottom=0, top=25)
ax.yaxis.set_major_locator(tic.MultipleLocator(5))
ax.set_title(instance, fontproperties=FontProperties(size=14))
ax.xaxis.set_major_formatter(dat.DateFormatter('%Y-%m-%d %H:%M:%S'))
plt.xticks(pd.date_range(list1[0], list1[-1], freq='1min'))
plt.plot(list1, list2, "g")
ax.xaxis_date()
for label in ax.get_xticklabels():
    label.set_rotation(20)
    label.set_horizontalalignment('right')

# Healthy backend server count chart
ax2 = plt.subplot(212)
ax2.set_ylim(bottom=0, top=10)
plt.xlabel("Time")
plt.ylabel("HealthyHost")
ax2.yaxis.set_major_locator(tic.MultipleLocator(1))
ax2.xaxis.set_major_formatter(dat.DateFormatter('%Y-%m-%d %H:%M:%S'))
plt.xticks(pd.date_range(list1[0], list1[-1], freq='1min'))
plt.plot(list1, list3, "r")
ax2.xaxis_date()
for label in ax2.get_xticklabels():
    label.set_rotation(20)
    label.set_horizontalalignment('right')

plt.show()

 

 

The result looks like this:

 

Plotting with Python is convenient and easy, though my charts are a bit ugly. I hope this helps :)

 

posted on 2017-06-18 10:00 by Huajun0323
