AWMS SQL query

AWMS 用户报告系统完整指南

概述

本文档详细说明了AWMS网页端"用户/用户组群"报告的SQL实现和Python调用方法。


1. 网页端报告类型对应关系

1.1 用户级别报告

网页端显示 RDL文件名 存储过程 文件位置
各用户的使用量摘要 NONCR_UsageSummaryAtUserLevel rpt_sp_NonCR_UsageSummaryByTranTypeAtUserLevel rpt/procedures/rpt_sp_NonCR_UsageSummaryByTranTypeAtUserLevel.sql
各用户服务类型的使用量 NONCR_UsageSummaryByTranTypeAtUserLevel rpt_sp_NonCR_UsageSummaryByTranTypeAtUserLevel 同上
用户列表 UserUserGroupList rpt_sp_UserUserGroupList rpt/procedures/rpt_sp_UserUserGroupList.sql
各用户的交易详情 NonCR_UserTransactionDetails rpt_sp_NonCR_UserTransactionDetails rpt/procedures/rpt_sp_NonCR_UserTransactionDetails.sql

1.2 用户组群级别报告

网页端显示 RDL文件名 存储过程 文件位置
各用户组群的使用量摘要 NonCR_UsageSummaryByUserGroup rpt_sp_NonCR_UserGroupUsageSummary rpt/procedures/rpt_sp_NonCR_UserGroupUsageSummary.sql
各用户组群服务类型的使用量 NonCR_UserGroupSummaryByTranType rpt_sp_NonCR_UsageByUserGroupTransactionType rpt/procedures/rpt_sp_NonCR_UsageByUserGroupTransactionType.sql

⭐ = 最常用的报告类型


2. 核心存储过程详解

2.1 rpt_sp_NonCR_UsageSummaryByTranTypeAtUserLevel

用途: 各用户服务类型的使用量(不含成本信息)

文件: ./db_script/SQLServer/clean/rpt/procedures/rpt_sp_NonCR_UsageSummaryByTranTypeAtUserLevel.sql

调用示例:

EXEC rpt.rpt_sp_NonCR_UsageSummaryByTranTypeAtUserLevel
    @TransactionType = '1,3,4,5',    -- 1:Print, 3:FaxIn, 4:Copy, 5:FaxOut
    @StartDate = '2025-12-01',
    @EndDate = '2025-12-31',
    @UserIDs = 'All',
    @UserGroupIDs = 'All',
    @DeviceGroupIDs = 'All',
    @DeviceIDs = 'All',
    @IncludeUsersInSubGroups = 0

返回字段:

NimUniqueID          -- 用户唯一ID
NetworkID            -- 网络ID (用户名)
FirstName            -- 名
LastName             -- 姓
TransactionType      -- 服务类型 (资源键,如 "FXA_PRINT_0001")
TransactionTypeId    -- 服务类型ID (1-8)
ColorPages           -- 彩色页数
BWPages              -- 黑白页数
TotalPages           -- 总页数
TotalJobs            -- 总作业数

数据来源:

  • 主表: rms.jobLogInfo (作业日志)
  • 详情表: rms.jobLogDetailOutputInfo (输出), rms.jobLogDetailInputInfo (输入)
  • 用户表: nim.NIM_USER (用户信息)
  • 资源表: sf.sf_lookup_items (类型映射)

2.2 rpt_sp_UsageByUserGroupTransactionType

用途: 各用户组群服务类型的使用量(含成本信息)

文件: ./db_script/SQLServer/clean/rpt/procedures/rpt_sp_UsageByUserGroupTransactionType.sql

调用示例:

EXEC rpt.rpt_sp_UsageByUserGroupTransactionType
    @TransactionType = '1,3,4,5',
    @StartDate = '2025-12-01',
    @EndDate = '2025-12-31',
    @SplitBySubUserGroup = 0,
    @DeviceGroupIDs = 'All',
    @DeviceIDs = 'All',
    @UserGroupIDs = 'All',
    @DecimalScale = 2

返回字段:

ParentUserGroupName  -- 父用户组群名称
UserGroupName        -- 用户组群名称
UserGroupID          -- 用户组群ID
TransactionType      -- 服务类型
TransactionTypeId    -- 服务类型ID
ColorPages           -- 彩色页数
BWPages              -- 黑白页数
TotalPages           -- 总页数
TotalJobs            -- 总作业数
TotalCost            -- 总成本 (字符串格式)
TotalCostAsMoney     -- 总成本 (数值格式)

数据来源:

  • 交易表: cst.cst_transactions (计费交易)
  • 交易详情: cst.cst_transaction_details (交易页数详情)
  • 用户表: nim.NIM_USER, nim.NIM_GROUP (用户/组群)

3. 作业类型映射

3.1 服务类型 (SF_JobType)

ID 英文 中文 说明
1 Print 打印 从客户端发送的打印作业
2 Scan 扫描 扫描到邮箱/文件夹等
3 Fax 传真接收 接收传真
4 Copy 复印 复印作业
5 Fax 传真发送 发送传真
6 File Transfer 文件传输
7 Email Transfer 邮件传输
8 Job Flow 作业流程

资源键格式: FXA_{TYPE}_0001

  • FXA_PRINT_0001 = 打印
  • FXA_SCAN_0001 = 扫描
  • FXA_COPY_0001 = 复印
  • FXA_FAX_0001 = 传真

3.2 颜色模式

描述
1 黑白 (B/W)
2 全彩色 (Full Color)
>2 其他彩色模式 (2-Color, Single Color等)

页数计算规则:

-- 彩色页数 = SUM(CASE WHEN color_mode > 1 THEN impressions ELSE 0 END)
-- 黑白页数 = SUM(CASE WHEN color_mode = 1 THEN impressions ELSE 0 END)

4. Python 实现

4.1 快速开始

import pyodbc

# 数据库连接
conn = pyodbc.connect(
    'DRIVER={ODBC Driver 11 for SQL Server};'
    'SERVER=192.168.1.101;'
    'DATABASE=AWMS2;'
    'UID=sa;'
    'PWD=P@ssw0rd'
)

# 调用存储过程
cursor = conn.cursor()
cursor.execute("""
    EXEC rpt.rpt_sp_NonCR_UsageSummaryByTranTypeAtUserLevel
        @TransactionType = ?,
        @StartDate = ?,
        @EndDate = ?,
        @UserIDs = 'All',
        @UserGroupIDs = 'All',
        @DeviceGroupIDs = 'All',
        @DeviceIDs = 'All'
""", '1,3,4,5', '2025-12-01', '2025-12-31')

# 获取结果
for row in cursor:
    print(f"用户: {row.NetworkID}, 类型: {row.TransactionType}, 页数: {row.TotalPages}")

4.2 完整脚本

参考 generate_user_reports.py,该脚本提供:

  1. get_user_usage_by_transaction_type() - 获取用户服务类型使用量
  2. get_user_group_usage_by_transaction_type() - 获取用户组群服务类型使用量
  3. export_to_excel() - 导出到Excel
  4. export_to_csv() - 导出到CSV

使用示例:

# 生成过去1个月的报告
from datetime import datetime, timedelta

end_date = datetime.now()
start_date = end_date - timedelta(days=30)

# 获取数据
data = get_user_usage_by_transaction_type(
    start_date=start_date.strftime('%Y-%m-%d'),
    end_date=end_date.strftime('%Y-%m-%d'),
    transaction_types='1,3,4,5'
)

# 导出到Excel
export_to_excel(data, '各用户服务类型使用量.xlsx')

5. 数据表关系

5.1 核心表关系图

nim.NIM_USER (用户表)
    |
    | UNIQUE_ID
    |
    v
rms.jobLogInfo (作业日志主表)
    |
    | autoNumber (PK)
    |
    +---> rms.jobLogDetailOutputInfo (输出详情 - Print/Copy)
    |
    +---> rms.jobLogDetailInputInfo (输入详情 - Scan/Fax)
            |
            | jobLogID (FK)
            |
            v
        impressions, color_mode, size

5.2 用户组群关系

nim.NIM_GROUP (用户组群表)
    |
    | NIM_UNIQUE_ID
    |
    v
nim.NIM_GROUP_MATTR_UNIQUE_MEMBER (组群成员关系表)
    |
    | UNIQUE_MEMBER
    |
    v
nim.NIM_USER (用户表)

6. 网页端 vs CSV导出

6.1 数据来源对比

特性 网页报告 (存储过程) CSV导出脚本
数据源 cst_transactions + jobLog rms.jobLogInfo 直接查询
聚合级别 已聚合 (按用户/类型) 原始明细
成本信息 包含 (使用计费系统) 不包含
用户关联 通过NIM_USER表 直接从jobLogInfo
复杂度 高 (处理调整、重复) 低 (直接查询)
用途 财务报表、成本分析 日志导出、数据备份

6.2 何时使用哪种方式

使用网页报告存储过程:

  • 需要成本信息
  • 需要聚合数据(按用户/组群/类型)
  • 生成财务报表
  • 需要处理特殊逻辑(如Fax配对)

使用直接CSV导出:

  • 需要原始作业日志
  • 需要完整的作业详情
  • 数据备份或迁移
  • 自定义数据分析

7. 常见问题

Q1: 为什么用户名显示为空?

A: 存储过程使用 nim.NIM_USER 表进行关联,如果用户未在NIM系统中注册,FirstNameLastName 会为空。此时应使用 NetworkID 字段(对应 jobLogInfo.user_name)。

Q2: 彩色页数和黑白页数如何计算?

A: 根据详情表中的 color_mode 字段:

  • color_mode = 1 → 黑白
  • color_mode > 1 → 彩色

不同作业类型使用不同的详情表:

  • Print/Copy (1,3): 使用 jobLogDetailOutputInfo
  • Scan/FaxIn (2,3): 使用 jobLogDetailInputInfo

Q3: TransactionType 为什么显示为 "FXA_PRINT_0001"?

A: 这是资源键 (Resource Key),用于多语言支持。实际类型ID在 TransactionTypeId 字段中。

Q4: 如何只获取特定用户的数据?

A: 使用 @UserIDs 参数,传入逗号分隔的用户ID列表:

@UserIDs = 'user001,user002,user003'

Q5: 日期范围如何设置?

A:

@StartDate = '2025-12-01'  -- 开始日期 (包含)
@EndDate = '2025-12-31'    -- 结束日期 (包含)

使用 jobLogInfo.start_time 进行筛选。


8. 生成的文件

8.1 Python脚本

文件名 描述
generate_user_reports.py 用户报告生成脚本 ⭐
generate_csv_final.py CSV导出脚本 (原始日志)
AWMS_User_Report_Analysis.md 本分析文档

8.2 输出文件

文件名 描述
用户服务类型使用量_YYYY-MM-DD_YYYY-MM-DD.xlsx Excel格式报告
用户服务类型使用量_YYYY-MM-DD_YYYY-MM-DD.csv CSV格式报告
各用户服务类型的使用量.xlsx 用户提供的参考文件

9. 关键文件位置

./db_script/
├── SQLServer/clean/
│   ├── rpt/procedures/                 # 报告存储过程 ⭐
│   │   ├── rpt_sp_NonCR_UsageSummaryByTranTypeAtUserLevel.sql
│   │   ├── rpt_sp_UsageByUserGroupTransactionType.sql
│   │   ├── rpt_sp_NonCR_UserGroupUsageSummary.sql
│   │   └── ...
│   ├── rms/tables/                     # 核心表定义
│   │   ├── rms_create_table.sql        # deviceInfo等
│   │   └── rms_create_joblog_table.sql # jobLogInfo等
│   └── cst/tables/                     # 计费系统表
│       └── cst_transactions.sql
├── SQLObjects/StoredProcedure/         # 其他存储过程
│   └── cst_cst_get_joblog_information.sql
└── SQLServer/clean/rpt/data/           # 报告元数据
    └── rpt_reports_data.sql            # 报告类型定义

10. 总结

  1. "各用户服务类型的使用量" 对应存储过程 rpt_sp_NonCR_UsageSummaryByTranTypeAtUserLevel
  2. "各用户组群服务类型的使用量" 对应存储过程 rpt_sp_UsageByUserGroupTransactionType
  3. 使用 Python 脚本 generate_user_reports.py 可直接生成相同格式的报告
  4. 数据来源是 rms.jobLogInfo 表,通过 nim.NIM_USER 关联用户信息
  5. 页数根据作业类型从不同的详情表 (OutputInfo/InputInfo) 获取

快速开始:

# 运行报告生成脚本
python generate_user_reports.py

这将生成与网页端完全相同的Excel和CSV报告。

generate_user_reports.py

"""
AWMS 用户报告生成脚本
生成与网页端相同的用户/用户组群服务类型使用量报告

作者: geyee & Claude Code
 
"""

import pyodbc
import pandas as pd
from datetime import datetime, timedelta
import csv

# 数据库配置
DB_CONFIG = {
    'server': '192.168.1.101',
    'database': 'AWMS2',
    'username': 'sa',
    'password': 'P@ssw0rd',
    'driver': '{ODBC Driver 11 for SQL Server}'
}

# 作业类型映射
JOB_TYPES = {
    1: 'Print',      # 打印
    2: 'Scan',       # 扫描
    3: 'Fax',        # 传真(接收)
    4: 'Copy',       # 复印
    5: 'Fax',        # 传真(发送)
    6: 'File Transfer',
    7: 'Email Transfer',
    8: 'Job Flow'
}


def get_connection():
    """获取数据库连接"""
    conn_str = (
        f"DRIVER={DB_CONFIG['driver']};"
        f"SERVER={DB_CONFIG['server']};"
        f"DATABASE={DB_CONFIG['database']};"
        f"UID={DB_CONFIG['username']};"
        f"PWD={DB_CONFIG['password']}"
    )
    return pyodbc.connect(conn_str)


def get_user_usage_by_transaction_type(
    start_date,
    end_date,
    transaction_types='1,3,4,5',
    user_ids='All',
    user_group_ids='All',
    device_group_ids='All',
    device_ids='All'
):
    """
    获取用户服务类型使用量报告

    对应网页端: 用户 - 按交易类型的使用量
    存储过程: rpt.rpt_sp_NonCR_UsageSummaryByTranTypeAtUserLevel

    参数:
        start_date: 开始日期 (YYYY-MM-DD)
        end_date: 结束日期 (YYYY-MM-DD)
        transaction_types: 作业类型,逗号分隔 (1:Print, 2:Scan, 3:Fax, 4:Copy, 5:Fax)
        user_ids: 用户ID列表,逗号分隔,默认'All'
        user_group_ids: 用户组群ID列表,逗号分隔,默认'All'
        device_group_ids: 设备组群ID列表,逗号分隔,默认'All'
        device_ids: 设备ID列表,逗号分隔,默认'All'

    返回:
        list[dict]: 报告数据
    """
    conn = get_connection()
    cursor = conn.cursor()

    # 调用存储过程
    sql = """
    EXEC rpt.rpt_sp_NonCR_UsageSummaryByTranTypeAtUserLevel
        @TransactionType = ?,
        @StartDate = ?,
        @EndDate = ?,
        @UserIDs = ?,
        @UserGroupIDs = ?,
        @DeviceGroupIDs = ?,
        @DeviceIDs = ?,
        @IncludeUsersInSubGroups = 0
    """

    try:
        cursor.execute(sql, transaction_types, start_date, end_date,
                      user_ids, user_group_ids, device_group_ids, device_ids)

        # 获取列名
        columns = [column[0] for column in cursor.description]

        # 获取数据
        rows = []
        for row in cursor:
            rows.append(dict(zip(columns, row)))

        return rows

    except Exception as e:
        print(f"查询失败: {e}")
        return []
    finally:
        conn.close()


def get_user_group_usage_by_transaction_type(
    start_date,
    end_date,
    transaction_types='1,3,4,5',
    user_group_ids='All',
    device_group_ids='All',
    device_ids='All',
    split_by_sub_group=False
):
    """
    获取用户组群服务类型使用量报告

    对应网页端: 用户组群 - 按交易类型的使用量
    存储过程: rpt.rpt_sp_UsageByUserGroupTransactionType

    参数:
        start_date: 开始日期
        end_date: 结束日期
        transaction_types: 作业类型,逗号分隔
        user_group_ids: 用户组群ID列表,逗号分隔
        device_group_ids: 设备组群ID列表,逗号分隔
        device_ids: 设备ID列表,逗号分隔
        split_by_sub_group: 是否按子组群分组

    返回:
        list[dict]: 报告数据
    """
    conn = get_connection()
    cursor = conn.cursor()

    sql = """
    EXEC rpt.rpt_sp_UsageByUserGroupTransactionType
        @TransactionType = ?,
        @StartDate = ?,
        @EndDate = ?,
        @SplitBySubUserGroup = ?,
        @DeviceGroupIDs = ?,
        @DeviceIDs = ?,
        @UserGroupIDs = ?
    """

    try:
        cursor.execute(sql, transaction_types, start_date, end_date,
                      int(split_by_sub_group), device_group_ids,
                      device_ids, user_group_ids)

        columns = [column[0] for column in cursor.description]
        rows = []
        for row in cursor:
            rows.append(dict(zip(columns, row)))

        return rows

    except Exception as e:
        print(f"查询失败: {e}")
        return []
    finally:
        conn.close()


def export_to_excel(data, output_path, sheet_name='Sheet1'):
    """导出数据到Excel"""
    if not data:
        print("没有数据可导出")
        return

    df = pd.DataFrame(data)

    # 重命名列(中文友好)
    column_mapping = {
        'NimUniqueID': '用户ID',
        'NetworkID': '网络ID',
        'FirstName': '名',
        'LastName': '姓',
        'TransactionType': '服务类型',
        'TransactionTypeId': '服务类型ID',
        'ColorPages': '彩色页数',
        'BWPages': '黑白页数',
        'TotalPages': '总页数',
        'TotalJobs': '总作业数',
        'ParentUserGroupName': '父用户组群',
        'UserGroupName': '用户组群',
        'UserGroupID': '用户组群ID',
        'TotalCost': '总成本',
        'TotalCostAsMoney': '总成本(数值)'
    }

    df = df.rename(columns=column_mapping)

    # 导出到Excel
    df.to_excel(output_path, index=False, sheet_name=sheet_name)
    print(f"已导出到: {output_path}")


def export_to_csv(data, output_path):
    """导出数据到CSV"""
    if not data:
        print("没有数据可导出")
        return

    if not data:
        return

    # 获取所有字段名
    fieldnames = list(data[0].keys())

    with open(output_path, 'w', newline='', encoding='utf-8-sig') as f:
        writer = csv.DictWriter(f, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(data)

    print(f"已导出到: {output_path}")


def main():
    """主函数 - 演示报告生成"""

    # 设置日期范围(过去1个月)
    end_date = datetime.now()
    start_date = end_date - timedelta(days=30)

    start_date_str = start_date.strftime('%Y-%m-%d')
    end_date_str = end_date.strftime('%Y-%m-%d')

    print("=" * 60)
    print("AWMS 用户报告生成工具")
    print("=" * 60)
    print(f"查询期间: {start_date_str} 至 {end_date_str}")
    print()

    # 1. 生成用户服务类型使用量报告
    print("[1] 生成用户服务类型使用量报告...")
    user_data = get_user_usage_by_transaction_type(
        start_date=start_date_str,
        end_date=end_date_str,
        transaction_types='1,3,4,5'  # Print, Fax, Copy
    )

    if user_data:
        print(f"   找到 {len(user_data)} 条记录")

        # 导出到Excel
        excel_file = f"用户服务类型使用量_{start_date_str}_{end_date_str}.xlsx"
        export_to_excel(user_data, excel_file, '用户服务类型使用量')

        # 导出到CSV
        csv_file = f"用户服务类型使用量_{start_date_str}_{end_date_str}.csv"
        export_to_csv(user_data, csv_file)

        # 显示前5条记录
        print("\n   前5条记录:")
        for i, row in enumerate(user_data[:5], 1):
            print(f"   {i}. {row.get('FirstName', '')} {row.get('LastName', '')} - "
                  f"{row.get('TransactionType', '')}: {row.get('TotalPages', 0)} 页 "
                  f"({row.get('ColorPages', 0)} 彩色, {row.get('BWPages', 0)} 黑白)")

    print()

    # 2. 生成用户组群服务类型使用量报告
    print("[2] 生成用户组群服务类型使用量报告...")
    group_data = get_user_group_usage_by_transaction_type(
        start_date=start_date_str,
        end_date=end_date_str,
        transaction_types='1,3,4,5'
    )

    if group_data:
        print(f"   找到 {len(group_data)} 条记录")

        # 导出到Excel
        excel_file = f"用户组群服务类型使用量_{start_date_str}_{end_date_str}.xlsx"
        export_to_excel(group_data, excel_file, '用户组群服务类型使用量')

        # 显示前5条记录
        print("\n   前5条记录:")
        for i, row in enumerate(group_data[:5], 1):
            print(f"   {i}. {row.get('UserGroupName', '')} - "
                  f"{row.get('TransactionType', '')}: {row.get('TotalPages', 0)} 页")

    print()
    print("=" * 60)
    print("报告生成完成!")
    print("=" * 60)


if __name__ == "__main__":
    main()

返回已注册用户的完整信息

 SELECT *
  FROM rms.jobLogInfo jl
  INNER JOIN nim.NIM_USER u
      ON u.UNIQUE_ID COLLATE DATABASE_DEFAULT = jl.user_id COLLATE DATABASE_DEFAULT

generate_user_usage_report.py

import pyodbc
import csv
import datetime

# Configuration
DB_CONFIG = {
    'server': '192.168.1.101',
    'database': 'AWMS2',
    'username': 'sa',
    'password': 'P@ssw0rd',
    'driver': '{ODBC Driver 11 for SQL Server}'
}

OUTPUT_FILE = 'user_usage_report.csv'

def get_connection():
    conn_str = f"DRIVER={DB_CONFIG['driver']};SERVER={DB_CONFIG['server']};DATABASE={DB_CONFIG['database']};UID={DB_CONFIG['username']};PWD={DB_CONFIG['password']}"
    return pyodbc.connect(conn_str)

def generate_report(start_date, end_date, transaction_types='1,3,4,5'):
    try:
        conn = get_connection()
        cursor = conn.cursor()

        # Stored Procedure: rpt.rpt_sp_NonCR_UsageSummaryByTranTypeAtUserLevel
        # Parameters (in correct order):
        # @TransactionType VARCHAR(10),
        # @StartDate DATETIME,
        # @EndDate DATETIME,
        # @UserIDs VARCHAR(8000),
        # @UserGroupIDs VARCHAR(8000),
        # @DeviceGroupIDs VARCHAR(4000),
        # @DeviceIDs VARCHAR(4000),
        # @IncludeUsersInSubGroups BIT

        # Prepare parameters in correct order
        params = (
            transaction_types,  # @TransactionType
            start_date,         # @StartDate
            end_date,           # @EndDate
            'All',              # @UserIDs
            'All',              # @UserGroupIDs
            'All',              # @DeviceGroupIDs
            'All',              # @DeviceIDs
            0                   # @IncludeUsersInSubGroups
        )

        print(f"Executing Stored Procedure with params: {params}")

        # Execute SP
        sql = "{CALL rpt.rpt_sp_NonCR_UsageSummaryByTranTypeAtUserLevel (?, ?, ?, ?, ?, ?, ?, ?)}"
        cursor.execute(sql, params)
        
        rows = cursor.fetchall()
        
        if not rows:
            print("No data returned.")
            return

        # Get column names
        columns = [col[0] for col in cursor.description]
        print(f"Columns: {columns}")
        
        # Write to CSV
        with open(OUTPUT_FILE, 'w', newline='', encoding='utf-8-sig') as f:
            writer = csv.writer(f)
            writer.writerow(columns)
            for row in rows:
                writer.writerow(row)
                
        print(f"Report generated: {OUTPUT_FILE}")
        print(f"Total rows: {len(rows)}")

    except Exception as e:
        print(f"Error: {e}")
    finally:
        if 'conn' in locals():
            conn.close()

if __name__ == "__main__":
    # Default to last 30 days
    end_date = datetime.date.today().strftime('%Y-%m-%d')
    start_date = (datetime.date.today() - datetime.timedelta(days=30)).strftime('%Y-%m-%d')
    
    # Or use specific dates if needed
    # start_date = '2025-12-01'
    # end_date = '2025-12-31'
    
    print(f"Generating report for period: {start_date} to {end_date}")
    generate_report(start_date, end_date)

analyze_bw_color_usage.py

"""
AWMS 黑白彩色打印统计分析脚本
分析用户服务类型使用量报告中的彩色和黑白打印数据
"""

import csv
import os

# CSV文件路径
CSV_FILE = r'.\user_usage_report.csv'


def load_data(filepath):
    """加载CSV数据"""
    data = []
    with open(filepath, 'r', encoding='utf-8-sig') as f:
        reader = csv.DictReader(f)
        for row in reader:
            if row['TotalPages'] != '0':  # 排除0页记录
                data.append(row)
    return data


def get_user_name(record):
    """获取用户显示名称"""
    # 优先使用姓名
    if record['FirstName'] and record['LastName']:
        return f"{record['LastName']} {record['FirstName']}"
    # 其次使用NetworkID
    elif record['NetworkID'] and record['NetworkID'] != '':
        return f"用户{record['NetworkID']}"
    # 最后使用NimUniqueID
    elif record['NimUniqueID'] and record['NimUniqueID'] != '':
        return f"用户{record['NimUniqueID']}"
    else:
        return '匿名'


def analyze_total(data):
    """总体统计分析"""
    total_color = sum(int(r['ColorPages']) for r in data)
    total_bw = sum(int(r['BWPages']) for r in data)
    total_all = total_color + total_bw

    print('=' * 60)
    print('=== 黑白 vs 彩色打印统计 ===')
    print('=' * 60)
    print('\n【总体统计】')
    print(f'  彩色页数: {total_color:,} 页')
    print(f'  黑白页数: {total_bw:,} 页')
    print(f'  总页数:   {total_all:,} 页')
    print(f'  彩色占比: {total_color/total_all*100:.1f}%')
    print(f'  黑白占比: {total_bw/total_all*100:.1f}%')

    return total_color, total_bw, total_all


def analyze_by_type(data):
    """按服务类型统计分析"""
    by_type = {}
    for r in data:
        t = r['TransactionType']
        if t not in by_type:
            by_type[t] = {'color': 0, 'bw': 0, 'jobs': 0}
        by_type[t]['color'] += int(r['ColorPages'])
        by_type[t]['bw'] += int(r['BWPages'])
        by_type[t]['jobs'] += int(r['TotalJobs'])

    print('\n【按服务类型统计】')
    for t, stats in sorted(by_type.items()):
        total = stats['color'] + stats['bw']
        if total > 0:
            print(f'  {t}:')
            print(f'    彩色: {stats["color"]:4d} 页 ({stats["color"]/total*100:5.1f}%)')
            print(f'    黑白: {stats["bw"]:4d} 页 ({stats["bw"]/total*100:5.1f}%)')
            print(f'    总计: {total:4d} 页 ({stats["jobs"]:3d} 个作业)')


def analyze_by_user(data, top_n=10):
    """按用户统计分析"""
    user_stats = {}

    for r in data:
        user = get_user_name(r)

        if user not in user_stats:
            user_stats[user] = {'color': 0, 'bw': 0, 'jobs': 0}

        user_stats[user]['color'] += int(r['ColorPages'])
        user_stats[user]['bw'] += int(r['BWPages'])
        user_stats[user]['jobs'] += int(r['TotalJobs'])

    # 按总页数排序
    sorted_users = sorted(
        user_stats.items(),
        key=lambda x: x[1]['color'] + x[1]['bw'],
        reverse=True
    )

    print(f'\n【按用户统计 - Top {top_n}】')
    for i, (user, stats) in enumerate(sorted_users[:top_n], 1):
        total = stats['color'] + stats['bw']
        color_pct = stats['color']/total*100 if total > 0 else 0
        bw_pct = stats['bw']/total*100 if total > 0 else 0
        print(f'  {i:2d}. {user:25s} | 彩色:{stats["color"]:4d}页 ({color_pct:5.1f}%) | 黑白:{stats["bw"]:4d}页 ({bw_pct:5.1f}%) | 总计:{total:4d}页 | 作业:{stats["jobs"]:3d}')

    return sorted_users


def analyze_pure_users(sorted_users):
    """分析纯彩色/纯黑白用户"""
    pure_color = [(u, s) for u, s in sorted_users if s['bw'] == 0 and s['color'] > 0]
    pure_bw = [(u, s) for u, s in sorted_users if s['color'] == 0 and s['bw'] > 0]

    print('\n【纯彩色打印用户】')
    print(f'  共 {len(pure_color)} 位用户只使用彩色打印')
    for user, stats in pure_color[:10]:
        print(f'    - {user}: {stats["color"]} 页')

    print('\n【纯黑白打印用户】')
    print(f'  共 {len(pure_bw)} 位用户只使用黑白打印')
    for user, stats in pure_bw[:10]:
        print(f'    - {user}: {stats["bw"]} 页')


def analyze_color_heavy_users(sorted_users, threshold_pct=50):
    """分析彩色打印占比较高的用户"""
    print(f'\n【彩色打印占比 >= {threshold_pct}% 的用户】')

    color_heavy = []
    for user, stats in sorted_users:
        total = stats['color'] + stats['bw']
        if total > 0:
            color_pct = stats['color'] / total * 100
            if color_pct >= threshold_pct:
                color_heavy.append((user, stats, color_pct))

    color_heavy.sort(key=lambda x: x[2], reverse=True)

    for user, stats, pct in color_heavy[:10]:
        total = stats['color'] + stats['bw']
        print(f'  - {user:25s}: 彩色 {pct:5.1f}% ({stats["color"]}页/{total}页) | 作业数: {stats["jobs"]}')


def main():
    """主函数"""
    # 检查文件是否存在
    if not os.path.exists(CSV_FILE):
        print(f"错误: 找不到文件 {CSV_FILE}")
        return

    # 加载数据
    print(f"正在加载数据: {CSV_FILE}")
    data = load_data(CSV_FILE)
    print(f"加载了 {len(data)} 条记录\n")

    # 执行各项分析
    analyze_total(data)
    analyze_by_type(data)
    sorted_users = analyze_by_user(data, top_n=15)
    analyze_pure_users(sorted_users)
    analyze_color_heavy_users(sorted_users, threshold_pct=30)

    print('\n' + '=' * 60)
    print('分析完成!')
    print('=' * 60)


if __name__ == '__main__':
    main()


从sql server 数据库中获取最新的最新的1000条作业记录

import pyodbc
import csv
import datetime

# Configuration
DB_CONFIG = {
    'server': '192.168.1.101',
    'database': 'AWMS2',
    'username': 'sa',
    'password': 'P@ssw0rd',
    'driver': '{ODBC Driver 11 for SQL Server}'
}

OUTPUT_FILE = 'generated_report_final.csv'
LIMIT = 1000

CSV_HEADERS = [
    "Printer Name", "Branch No.", "MIB Type", "Product Name", "Job ID", "Service Type", 
    "Host Name", "User Name", "Document Name", "Job Owner Name", "Job Client ID", 
    "Print Status", "Port", "Data Format", "IP Address", "Output Tray", "N Up", 
    "Input Tray", "Logging Date", "Log Collection Time", "Printer Date", "Printer Time", 
    "Print Start Date", "Print Start Time", "Print End Date", "Print End Time", 
    "Total Printed Impressions", "Total Printed Impressions Since Power On", "Error", 
    "Printing Output Color", "Printing Size", "Pages", "Paper Type", "Paper Size", 
    "Sheets", "Job Template Name", "Copies", "Staple Count", "Original Output Color", 
    "Original Pages", "Original Type", "Original Size", "Recipient IP Address", 
    "Mailbox No.", "Fax Recipient Phone Number", "Fax Remote Terminal Name", "F Code", 
    "Fax Line", "Fax Start Date", "Fax Start Time", "Fax End Date", "Fax End Time", 
    "Fax Duration", "Remote ID", "Fax Images Sent", "Fax Images Size", 
    "Fax Speed Dial Number", "Detailed Service Type", "Related Job ID", "Account ID", 
    "Card No.", "Meter 1", "Meter 2", "Meter 3", "Billing Meter Number", 
    "Recipient Info", "Staple Type", "Fold Type", "Fold Count", "Punch Type", 
    "Punch Count", "Bind Count", "UUID", "Document Number", "Cause of Error", 
    "Job Error Code", "Fax Sender ID", "Fax Dial-In Number", "Fax Recipient Information", 
    "Fax Call Type", "Fax Transmission Speed", "Fax Reconnection Flag", 
    "Name in Address Book", "Send Fax Status", "Fax Communication Result", "Stored Fax Pages"
]

# Mappings from JobLogResource_en.properties
MAPPINGS = {
    'jobtype': {
        1: "Print", 2: "Scan", 3: "Fax", 4: "Copy", 5: "Fax", 6: "File Transfer", 7: "Email Transfer", 8: "Job Flow"
    },
    'jobstatus': {
        1: "Completed", 2: "Completed (with warning)", 3: "Aborted", 4: "Stopped (System)", 
        5: "Completed (with error)", 6: "Aborted", 7: "Aborted", 8: "Stopped (System)", 
        100: "Unknown"
    },
    'protocol': {
        1: "LPD", 2: "Port9100", 3: "IPP", 4: "SMB", 5: "FTP", 10: "USB", 13: "WSD Print", 
        20: "JobTemplateService", 21: "USB2", 22: "WSD Print"
    },
    'documentformat': {
        1: "HP-PCL", 2: "PostScript", 3: "PDF", 4: "TIFF", 5: "XPS", 6: "JPEG", 
        11: "PDF", 12: "HP-PCLXL", 13: "ART", 14: "PLW", 15: "KS", 16: "KSSM", 17: "XJCL", 18: "JFIF"
    },
    'jobcolormode': {
        1: "B/W", 2: "Full Color", 3: "2-Color", 4: "Single Color", 5: "Few Colors", 6: "Low-Priced Output Color", 7: "Auto"
    },
    'paper_size': {
        -3: "Unknown", -1: "Others",
        1: "A0", 2: "A1", 3: "A2", 4: "A3", 5: "A4", 6: "A5", 7: "A6", 8: "A7", 9: "A8", 10: "A9", 11: "A10",
        12: "ISO B0", 13: "ISO B1", 14: "ISO B2", 15: "ISO B3", 16: "ISO B4", 17: "ISO B5", 18: "ISO B6",
        21: "B0", 22: "B1", 23: "B2", 24: "B3", 25: "B4", 26: "B5", 27: "B6",
        30: "11x17\"(Ledger)", 31: "8.5x11\"(Letter)", 32: "8.5x14\"(Legal)",
        40: "Postcard"
    },
    'nup': {
        1: "1 Up", 2: "2 Up", 4: "4 Up", 8: "8 Up", 16: "16 Up", 32: "32 Up"
    },
    'outputtype': {
        1: "Plain", 2: "Transparencies", 3: "Envelope", 4: "Labels", 5: "Forms", 6: "Coated", 7: "Tracing paper", 8: "Heavyweight", 9: "Recycled"
    }
}

def get_connection():
    conn_str = f"DRIVER={DB_CONFIG['driver']};SERVER={DB_CONFIG['server']};DATABASE={DB_CONFIG['database']};UID={DB_CONFIG['username']};PWD={DB_CONFIG['password']}"
    return pyodbc.connect(conn_str)

def row_to_dict(cursor, row):
    return {col[0]: val for col, val in zip(cursor.description, row)}

def map_value(category, key, default=None):
    if default is None:
        default = str(key)
    return MAPPINGS.get(category, {}).get(key, default)

def format_date(dt):
    if not dt: return ""
    return dt.strftime("%Y/%m/%d")

def format_time(dt):
    if not dt: return ""
    return dt.strftime("%H:%M:%S")

def get_ip_address(job, dev_info):
    # deviceInfo.hostName stores the IP address of the device (e.g., 192.168.1.100)
    # jobLogInfo.host_address stores the client IP (e.g., 10.10.10.100)
    # Combined: DeviceIP\ClientIP
    
    device_ip = dev_info.get('hostName', '')
    client_ip = job.get('host_address', '')
    
    if device_ip and client_ip:
        # Avoid duplication if client_ip already contains device_ip (unlikely but safe)
        if client_ip.startswith(device_ip):
            return client_ip
        return f"{device_ip}\\{client_ip}"
    
    return client_ip or device_ip

def main():
    try:
        conn = get_connection()
        cursor = conn.cursor()

        # Step 0: Fetch Device Info
        # Removed ipAddress as it does not exist in rms.deviceInfo
        print("Step 0: Fetching Device Info (rms.deviceInfo)...")
        sql_device = "SELECT UUIDdeviceInfo, DisplayName, modelName, serialNumber, hostName FROM rms.deviceInfo"
        cursor.execute(sql_device)
        device_rows = cursor.fetchall()
        devices = {}
        for row in device_rows:
            d = row_to_dict(cursor, row)
            devices[d['UUIDdeviceInfo']] = d
        print(f"Loaded {len(devices)} devices.")

        # Step 1: Query main table
        print("Step 1: Querying main table (rms.jobLogInfo)...")
        sql_main = f"SELECT TOP {LIMIT} * FROM rms.jobLogInfo ORDER BY autoNumber DESC"
        cursor.execute(sql_main)
        main_rows = cursor.fetchall()
        
        if not main_rows:
            print("No jobs found.")
            return

        jobs = [row_to_dict(cursor, row) for row in main_rows]
        job_log_ids = [job['autoNumber'] for job in jobs]
        ids_str = ",".join(map(str, job_log_ids))
        
        print(f"Collected {len(job_log_ids)} Job Log IDs.")

        # Step 2: Query Detail Tables
        output_details = {}
        input_details = {}

        if job_log_ids:
            print("Step 2a: Querying Output Detail...")
            sql_out = f"SELECT * FROM rms.jobLogDetailOutputInfo WHERE jobLogID IN ({ids_str})"
            cursor.execute(sql_out)
            for row in cursor.fetchall():
                d = row_to_dict(cursor, row)
                output_details[d['jobLogID']] = d

            print("Step 2b: Querying Input Detail...")
            sql_in = f"SELECT * FROM rms.jobLogDetailInputInfo WHERE jobLogID IN ({ids_str})"
            cursor.execute(sql_in)
            for row in cursor.fetchall():
                d = row_to_dict(cursor, row)
                input_details[d['jobLogID']] = d

        # Step 3: Assemble and Write
        print("Step 3: Assembling data and writing to CSV...")
        with open(OUTPUT_FILE, 'w', newline='', encoding='utf-8-sig') as f:
            writer = csv.writer(f)
            writer.writerow(CSV_HEADERS)

            for job in jobs:
                job_id = job['autoNumber']
                out_info = output_details.get(job_id, {})
                in_info = input_details.get(job_id, {})
                
                # Device Info Lookup
                dev_uuid = job.get('device_uuid')
                dev_info = devices.get(dev_uuid, {})
                
                printer_name = dev_info.get('DisplayName', '')
                product_name = dev_info.get('modelName', '')
                branch_no = "1 of 1"
                mib_type = "FX MIB(SOAP)"
                
                j_type = job.get('job_type')
                
                # Logic for Impressions/Color/Size
                if j_type == 2: # Scan
                    pages = in_info.get('impressions')
                    color = map_value('jobcolormode', in_info.get('color_mode'), "")
                    size_val = in_info.get('size')
                    paper_type = map_value('outputtype', in_info.get('paper_type'), "")
                else: # Print(1), FaxIn(3), Copy(4), FaxOut(5), FileTransfer(6)
                    pages = out_info.get('impressions')
                    color = map_value('jobcolormode', out_info.get('color_mode'), "")
                    size_val = out_info.get('size')
                    paper_type = map_value('outputtype', out_info.get('paper_type'), "")
                
                size = map_value('paper_size', size_val, "")

                # Fax Logic
                is_fax_in = j_type == 3
                is_fax_out = j_type == 5
                
                fax_phone = job.get('fax_out_phone_number') if is_fax_out else job.get('fax_in_phone_number')
                fax_remote = job.get('fax_out_remote_name') if is_fax_out else job.get('fax_in_remote_name')
                f_code = job.get('f_code_out') if is_fax_out else job.get('f_code_in')
                fax_line = job.get('fax_out_line') if is_fax_out else job.get('fax_in_line')
                fax_start = job.get('fax_out_start_time') if is_fax_out else job.get('fax_in_start_time')
                fax_end = job.get('fax_out_complete_time') if is_fax_out else job.get('fax_in_complete_time')
                fax_dur = job.get('fax_out_duration') if is_fax_out else job.get('fax_in_duration')
                remote_id = job.get('fax_out_remote_id') if is_fax_out else job.get('fax_in_remote_id')
                fax_res = job.get('fax_out_result') if is_fax_out else job.get('fax_in_result')

                row_data = [
                    printer_name, # Printer Name
                    branch_no, # Branch No.
                    mib_type, # MIB Type
                    product_name, # Product Name
                    job.get('jobID'), # Job ID
                    map_value('jobtype', j_type), # Service Type
                    job.get('host_name'), # Host Name
                    job.get('user_name'), # User Name
                    job.get('document_name'), # Document Name
                    job.get('user_id'), # Job Owner Name
                    job.get('job_client_id'), # Job Client ID
                    map_value('jobstatus', job.get('job_status')), # Print Status
                    map_value('protocol', job.get('protocol')), # Port
                    map_value('documentformat', job.get('document_format')), # Data Format
                    get_ip_address(job, dev_info), # IP Address
                    out_info.get('output_tray_no'), # Output Tray
                    map_value('nup', out_info.get('n_up'), ""), # N Up
                    out_info.get('input_tray_no'), # Input Tray
                    format_date(job.get('start_time')), # Logging Date
                    format_time(job.get('start_time')), # Log Collection Time
                    format_date(job.get('start_time')), # Printer Date
                    format_time(job.get('start_time')), # Printer Time
                    format_date(job.get('start_time')), # Print Start Date
                    format_time(job.get('start_time')), # Print Start Time
                    format_date(job.get('complete_time')), # Print End Date
                    format_time(job.get('complete_time')), # Print End Time
                    job.get('totalImpressions'), # Total Printed Impressions
                    job.get('totalImpressionsPowerOn'), # Total Printed Impressions Since Power On
                    job.get('fault_code'), # Error
                    color, # Printing Output Color
                    size, # Printing Size
                    pages, # Pages
                    paper_type, # Paper Type
                    "", # Paper Size
                    out_info.get('sheets'), # Sheets
                    "", # Job Template Name
                    job.get('copies'), # Copies
                    "", # Staple Count
                    "", # Original Output Color
                    "", # Original Pages
                    "", # Original Type
                    "", # Original Size
                    "", # Recipient IP Address
                    job.get('mailbox_number'), # Mailbox No.
                    fax_phone, # Fax Recipient Phone Number
                    fax_remote, # Fax Remote Terminal Name
                    f_code, # F Code
                    fax_line, # Fax Line
                    format_date(fax_start), # Fax Start Date
                    format_time(fax_start), # Fax Start Time
                    format_date(fax_end), # Fax End Date
                    format_time(fax_end), # Fax End Time
                    fax_dur, # Fax Duration
                    remote_id, # Remote ID
                    "", # Fax Images Sent
                    "", # Fax Images Size
                    job.get('fax_speed_number'), # Fax Speed Dial Number
                    job.get('job_detail_type'), # Detailed Service Type
                    "", # Related Job ID
                    job.get('account_id'), # Account ID
                    job.get('card_id'), # Card No.
                    job.get('meter_1'), # Meter 1
                    job.get('meter_2'), # Meter 2
                    job.get('meter_3'), # Meter 3
                    "", # Billing Meter Number
                    "", # Recipient Info
                    job.get('staple_type'), # Staple Type
                    job.get('fold_type'), # Fold Type
                    job.get('fold_count'), # Fold Count
                    job.get('punch_type'), # Punch Type
                    job.get('punch_count'), # Punch Count
                    job.get('bind_count'), # Bind Count
                    job.get('uuid'), # UUID
                    "", # Document Number
                    job.get('fault_reason'), # Cause of Error
                    "", # Job Error Code
                    job.get('fax_sender_number'), # Fax Sender ID
                    job.get('fax_dial_in_number'), # Fax Dial-In Number
                    "", # Fax Recipient Information
                    "", # Fax Call Type
                    "", # Fax Transmission Speed
                    job.get('fax_resend'), # Fax Reconnection Flag
                    job.get('fax_address_book_name'), # Name in Address Book
                    job.get('fax_out_status'), # Send Fax Status
                    fax_res, # Fax Communication Result
                    ""  # Stored Fax Pages
                ]
                writer.writerow(row_data)

        print(f"Report generated: {OUTPUT_FILE}")

    except Exception as e:
        print(f"Error: {e}")

if __name__ == "__main__":
    main()

posted @ 2026-01-02 21:56  geyee  阅读(20)  评论(0)    收藏  举报