model_infer.py

class ModelInfer()

1. `__init__`

def __init__(self):
    self._service_url_gen = self._cyclic_url_sequence()

    self._total_num = 0  # total image count, used to decide whether a whole train (usually 84 images) has finished inference
    self.end_logo_counter = 0
    self.bad_result_counter = 0

    self._session = None
    self._dinghan_result = None

    self.confidence_table = None

    self.process_start_time = None

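Purpose: initialize all runtime state of the inference object.

Variable	Meaning
`_service_url_gen`	Round-robin generator over the model service URLs
`_total_num`	Total number of images for the current train
`end_logo_counter`	Number of images finished so far
`bad_result_counter`	Number of failed inferences
`_session`	aiohttp session
`_dinghan_result`	Dinghan result payload
`_java_result`	Java result payload (not set in the `__init__` excerpt above; `java_result_init` stores it)
`confidence_table`	Confidence table (unused)
`process_start_time`	Start time of the processing run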

2. `set_base_para`

def set_base_para(self, base_para):
    self.requestId = base_para["requestId"]
    self.systemId = base_para["systemId"]
    self.inspectionStartTime = base_para["inspectionStartTime"]
    self.inspectionEndTime = base_para["inspectionEndTime"]
    self.trainId = base_para["trainId"]
    self.lineCode = base_para["lineCode"]
    self.yardName = base_para["yardName"]
    self.direction = base_para["direction"]

    # Pass in the parameters that the result payloads need
    self.dinghan_result_init(self.requestId, self.inspectionStartTime, self.trainId)
    self.java_result_init(self.requestId, self.inspectionStartTime, self.trainId)

Purpose: set the basic information for one task (one train).

self.requestId = base_para["requestId"]
self.systemId = base_para["systemId"]
self.inspectionStartTime = base_para["inspectionStartTime"]
self.inspectionEndTime = base_para["inspectionEndTime"]
self.trainId = base_para["trainId"]
self.lineCode = base_para["lineCode"]
self.yardName = base_para["yardName"]
self.direction = base_para["direction"]

Reads the task information from the base_para dictionary
and stores it as object attributes for the later steps.

self.dinghan_result_init(self.requestId, self.inspectionStartTime, self.trainId)

Initializes the skeleton of the result reported to the Dinghan system.

self.java_result_init(self.requestId, self.inspectionStartTime, self.trainId)

Initializes the skeleton of the result reported to the Java system.
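
The snippet below is a minimal sketch of the input `set_base_para` expects. It checks a `base_para` dictionary for the eight keys read above. The helper name `missing_base_para_keys` and every sample value are assumptions for illustration only; the original method indexes the dictionary directly, so a missing key would surface as a `KeyError`.

```python
# The eight keys that set_base_para reads from base_para.
REQUIRED_KEYS = (
    "requestId", "systemId", "inspectionStartTime", "inspectionEndTime",
    "trainId", "lineCode", "yardName", "direction",
)


def missing_base_para_keys(base_para):
    """Return the required keys absent from base_para (hypothetical helper)."""
    return [key for key in REQUIRED_KEYS if key not in base_para]


# Sample values are made up; only the key names come from the code above.
sample = {
    "requestId": "req-001",
    "systemId": "sys-01",
    "inspectionStartTime": "20251223153045",
    "inspectionEndTime": "20251223153145",
    "trainId": "T01",
    "lineCode": "L15",
    "yardName": "yard-a",
    "direction": 1,
}

incomplete = {k: v for k, v in sample.items() if k != "trainId"}
print(missing_base_para_keys(sample))      # nothing missing
print(missing_base_para_keys(incomplete))  # reports trainId
```

Checking up front like this is optional; it only makes a malformed request fail with a clearer message than a bare `KeyError`.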

3. `dinghan_result_init`

def dinghan_result_init(self, requestId, pass_time, train_code):
    pass_time_formatted = f"{pass_time[:4]}-{pass_time[4:6]}-{pass_time[6:8]} {pass_time[8:10]}:{pass_time[10:12]}:{pass_time[12:14]}"

    self._dinghan_result = {
        "key": settings.DINGHAN_REPORT_KEY,
        "uploaddatetime": None,
        "data": {
            "guid": requestId,
            "traincode": train_code,
            "order": 0,
            "direction": 1,
            "passdatetime": pass_time_formatted,
            "currentstatus": 1,
            "kind": 1,
            "systype": [
                {
                    "code": "1",
                    "componentcode": None,
                    "checkitemcode": None,
                    "currentstatus": 1
                }
            ],
            "items": []}}

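Purpose: initialize the JSON structure of the result reported to Dinghan.

It does two things:

First, it converts the time string from YYYYMMDDHHMMSS to YYYY-MM-DD HH:MM:SS, the time format the Dinghan interface requires.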
```
pass_time_formatted = f"{pass_time[:4]}-{pass_time[4:6]}-{pass_time[6:8]} {pass_time[8:10]}:{pass_time[10:12]}:{pass_time[12:14]}"
```
Slice	Meaning	Example value
`pass_time[:4]`	Year	`2025`
`pass_time[4:6]`	Month	`12`
`pass_time[6:8]`	Day	`23`
`pass_time[8:10]`	Hour	`15`
`pass_time[10:12]`	Minute	`30`
`pass_time[12:14]`	Second	`45`

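Second, it builds the result JSON skeleton required by the fixed Dinghan protocol:
- `"key": settings.DINGHAN_REPORT_KEY`: fixed key used to authenticate against the Dinghan interface.
- `"uploaddatetime": None`: the upload time is filled in when the result is actually reported.
- `"guid": requestId`: unique identifier of this task.
- `"traincode": train_code`: train number.
- `"order": 0`: train order.
- `"direction": 1`: running direction (hard-coded here; `self.direction` from `set_base_para` is not passed in).
- `"passdatetime": pass_time_formatted`: time the train passed, after formatting.
- `"currentstatus": 1`: current status.
- `"kind": 1`: task type.
- `"systype": [...]`: fixed Dinghan field describing the system type, component, and check-item status.
  - `"code": "1"`: system type code. "1" stands for the detection system in use (the exact meaning is defined by the Dinghan system).
  - `"componentcode": None`: component code.
  - `"checkitemcode": None`: check-item code.
  - `"currentstatus": 1`: current status. 1 means normal / valid (the exact meaning is defined by the Dinghan system).
- `"items": []`: holds the final fault detection results, appended one by one later.
The result is stored in: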
```
self._dinghan_result
```
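
As a small sketch of the time conversion described above, the two functions below produce the same string for a valid input. The slicing version mirrors `dinghan_result_init`; the `strptime` version is an alternative (not what the original code does) that also rejects malformed timestamps. Both function names are made up for this example.

```python
from datetime import datetime


def format_pass_time_by_slicing(pass_time: str) -> str:
    """Same slicing as dinghan_result_init; performs no validation."""
    return (f"{pass_time[:4]}-{pass_time[4:6]}-{pass_time[6:8]} "
            f"{pass_time[8:10]}:{pass_time[10:12]}:{pass_time[12:14]}")


def format_pass_time_checked(pass_time: str) -> str:
    """Alternative: parse first, so an invalid timestamp raises ValueError."""
    parsed = datetime.strptime(pass_time, "%Y%m%d%H%M%S")
    return parsed.strftime("%Y-%m-%d %H:%M:%S")


print(format_pass_time_by_slicing("20251223153045"))
print(format_pass_time_checked("20251223153045"))
```

The slicing version will happily format a string such as `20251332010101` (month 13), while the parsing version raises `ValueError` for it.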

4. `java_result_init`

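Purpose: initialize the JSON structure of the result reported to the Java system.

The logic is essentially the same as dinghan_result_init:

"taskid": requestId: unique task identifier on the Java side

The result is stored in: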

self._java_result

5. `remove_same_result_with_different_camera`

Purpose: remove duplicates of the same fault captured by different cameras.

Calls the postprocessing function analyze.remove_approximate_result:

self._dinghan_result["data"]["items"] = analyze.remove_approximate_result(self._dinghan_result["data"]["items"])
self._java_result["items"] = analyze.remove_approximate_result(self._java_result["items"])

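Deduplicates the fault list in the Dinghan result; the matching rule lives inside remove_approximate_result
Applies the same deduplication to the fault list in the Java result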

6. `dinghan_add_items`

def dinghan_add_items(self, item):
    if item is not None:
        self._dinghan_result["data"]["items"].append(item)

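Purpose: add one fault item to the Dinghan result.

Checks that item is not None, so that empty results are not added to the list.
Appends one fault / detection result to the items list of the Dinghan result.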

7. `jave_add_items`

def jave_add_items(self, item, image_url):
    if item is not None:
        # item["check"][0]["url"] = image_url  (this change is not used for now)
        self._java_result["items"].append(item)

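Purpose: add one fault item to the Java result.

Works like dinghan_add_items, only on a different target. The image_url parameter is currently unused because the line that would use it is commented out.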

8. `_get_next_infer_url`

How the two functions work together

_cyclic_url_sequence: produces the URLs
_get_next_infer_url: consumes them

# Works with the generator to hand out a URL
def _get_next_infer_url(self):
    return next(self._service_url_gen)

Purpose: get the next model inference service URL.

Internally it advances the round-robin URL generator:

next(self._service_url_gen)

The call returns a tuple (infer_url, url_index).

9. `_cyclic_url_sequence`

Purpose: return the model service URLs in order, looping forever.

# Generator that hands out a URL to use, looping forever
@staticmethod
def _cyclic_url_sequence():
    period = len(settings.SERVICE_URL)
    n = 0
    while True:
        yield (settings.SERVICE_URL[n], n)
        n = (n + 1) % period

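period = len(settings.SERVICE_URL): total number of URLs.
n = 0: index of the current URL.
while True: loops forever.
yield (settings.SERVICE_URL[n], n): returns the current URL and its index.
- Returns a 2-tuple: (url, index)
  - url: the model service address to use now.
  - index: position of that URL in the list.
- yield suspends the function; the next call resumes from this point.
n = (n + 1) % period: moves to the next URL and wraps back to the first after the last one.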
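
To make the round-robin behaviour concrete, here is a self-contained sketch that runs the same generator logic over a placeholder list and compares it with an equivalent built on `itertools.cycle`. The URLs in `SERVICE_URL` are invented stand-ins for `settings.SERVICE_URL`, and the `itertools` variant is an alternative, not what the original code uses.

```python
from itertools import cycle, islice

# Placeholder URLs standing in for settings.SERVICE_URL.
SERVICE_URL = ["http://host-a/infer", "http://host-b/infer", "http://host-c/infer"]


def cyclic_url_sequence(urls):
    """Yield (url, index) pairs forever, wrapping around at the end of the list."""
    period = len(urls)
    n = 0
    while True:
        yield urls[n], n
        n = (n + 1) % period


gen = cyclic_url_sequence(SERVICE_URL)
first_five = [next(gen) for _ in range(5)]

# Equivalent sequence built from the standard library.
alt_five = list(islice(cycle((url, i) for i, url in enumerate(SERVICE_URL)), 5))

print([index for _, index in first_five])
```

Because every caller pulls from the same generator, requests are spread evenly across the services regardless of which image triggered them. With an empty URL list the first `next()` would fail, so the list is assumed to be non-empty.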

10. `set_total_num`

def set_total_num(self, total_num):
    self._total_num = total_num
    self.bad_result_counter = 0

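Purpose: set the total number of images for the current train and reset the failure counter bad_result_counter (end_logo_counter is not reset in this method).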

11. `set_session`

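Purpose: set the aiohttp session object.

The session is created outside the class and reused inside it, which avoids opening new connections for every request.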

12. `delete_old_images`

def delete_old_images(self):
    # Get the current time and compute the threshold time
    current_time = datetime.now(settings.TIME_ZONE)
    threshold_time = current_time - timedelta(days=settings.DELETE_TIME_CYCLE)

    # Convert the threshold time to a compact timestamp string
    threshold_str = threshold_time.strftime("%Y%m%d%H%M%S")

    # Counters for statistics
    total_files = 0
    deleted_files = 0

    # Walk all entries in the directory
    for filename in os.listdir(settings.DINGHAN_IMG_DIR):
        filepath = os.path.join(settings.DINGHAN_IMG_DIR, filename)

        # Skip directories, process files only
        if not os.path.isfile(filepath):
            continue

        total_files += 1

        # Look for a timestamp (14 consecutive digits) in the file name
        timestamp_match = re.search(r'(\d{14})', filename)

        if timestamp_match:
            timestamp_str = timestamp_match.group(1)

            try:
                # Parse the timestamp string into a datetime object
                file_time = datetime.strptime(timestamp_str, "%Y%m%d%H%M%S")
                # Attach the time zone
                file_time = settings.TIME_ZONE.localize(file_time)

                # Compare the file timestamp with the threshold time
                if file_time < threshold_time:
                    # Delete the file
                    os.remove(filepath)
                    deleted_files += 1
                    # print(f"Deleted: {filename} (time: {file_time.strftime('%Y-%m-%d %H:%M:%S')})")

            except ValueError:
                # Invalid timestamp format, skip
                continue

    if deleted_files:
        process_logger.info(f"[WARRING] Delete {deleted_files} old images before {threshold_str}")

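Purpose: periodically delete expired image files.

Logic:

Compute a time threshold (for example, 7 days ago)

Walk the image directory

Extract YYYYMMDDHHMMSS from each file name

Delete the file if that time is earlier than the threshold

Compute the deletion threshold
```
current_time = datetime.now(settings.TIME_ZONE)
threshold_time = current_time - timedelta(days=settings.DELETE_TIME_CYCLE)
```
- current_time: the current time (timezone-aware)
- DELETE_TIME_CYCLE: retention period in days (for example, 7)
- threshold_time: any file timestamped earlier than this is deleted
Convert the threshold to a string (used only for logging)
```
threshold_str = threshold_time.strftime("%Y%m%d%H%M%S")
```
- Format: 20251222083045
- Only used when writing the log line
- Not involved in the actual comparison
Initialize the counters
```
total_files = 0
deleted_files = 0
```
- total_files: number of files actually examined in the directory
- deleted_files: number of files finally deleted
Note: total_files is incremented but never read in this function; it is a statistic only.

Walk the image directory

for filename in os.listdir(settings.DINGHAN_IMG_DIR):
    filepath = os.path.join(settings.DINGHAN_IMG_DIR, filename)

Iterates over the Dinghan image directory
and builds the full path of each entry

Process files only, skip subdirectories
```
if not os.path.isfile(filepath):
    continue
```
- A directory → skipped
- Only files are considered for deletion
Count the files
```
total_files += 1
```
- Incremented by 1 for every file examined
Extract the timestamp from the file name
```
timestamp_match = re.search(r'(\d{14})', filename)
```
Meaning of the regular expression
- \d{14}: 14 consecutive digits
- The format is assumed to be:
```
YYYYMMDDHHMMSS
```
Example

File name:
```
20251222083045_camera3.jpg
```
Extracted value:
```
20251222083045
```
If the file name contains no timestamp
```
if timestamp_match:
    ...
```
- No match → the file is simply ignored
- It is not deleted
- No error is raised

Parse the timestamp string

file_time = datetime.strptime(timestamp_str, "%Y%m%d%H%M%S")

Turns 20251222083045
into a datetime object

Attach the time zone
```
file_time = settings.TIME_ZONE.localize(file_time)
```
Why is this step needed?
- threshold_time is timezone-aware
- Python cannot compare an aware datetime with a naive one
- So the file time is given the same time zone first
Otherwise the comparison raises TypeError, which the except ValueError clause below would not catch.
Decide whether to delete
```
if file_time < threshold_time:
```
Meaning:

The time in the file name is earlier than the retention limit → delete
Perform the deletion
```
os.remove(filepath)
deleted_files += 1
```
- Deletes the file directly
- Counts the deletion
Exception guard
```
except ValueError:
    continue
```
Possible failure scenario
- The 14 digits in the file name are not a valid date and time
- For example:
```
20251332010101.jpg
```
How it is handled:
- The file is skipped
- Cleanup of the other files is not affected

Log after the deletion finishes

if deleted_files:
    process_logger.info(
        f"[WARRING] Delete {deleted_files} old images before {threshold_str}"
    )

A log entry is written only if files were actually deleted
The log line contains:
- the number of deleted files
- the deletion threshold time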
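
The decision at the heart of `delete_old_images` can be isolated into a pure function, which makes it easy to try without touching the file system. The sketch below uses only the standard library: `TZ` is an assumed fixed UTC+8 offset standing in for `settings.TIME_ZONE`, `is_expired` is a made-up name, and `replace(tzinfo=...)` takes the place of the `localize` call in the original (the two are interchangeable only for a fixed offset like this one).

```python
import re
from datetime import datetime, timedelta, timezone

# Assumed fixed offset standing in for settings.TIME_ZONE.
TZ = timezone(timedelta(hours=8))


def is_expired(filename, now, keep_days):
    """Return True if the 14-digit timestamp in filename is older than keep_days."""
    match = re.search(r"\d{14}", filename)
    if match is None:
        return False  # no timestamp in the name: never delete
    try:
        file_time = datetime.strptime(match.group(0), "%Y%m%d%H%M%S")
    except ValueError:
        return False  # 14 digits that do not form a valid date/time
    file_time = file_time.replace(tzinfo=TZ)  # make it aware before comparing
    return file_time < now - timedelta(days=keep_days)


now = datetime(2025, 12, 30, 12, 0, 0, tzinfo=TZ)
for name in ("20251222083045_camera3.jpg", "20251229083045_camera3.jpg",
             "20251332010101.jpg", "readme.txt"):
    print(name, is_expired(name, now, keep_days=7))
```

Passing `now` in as a parameter, instead of calling `datetime.now` inside, is a design choice made here so that the function gives the same answer on every run.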

13. `image_preprocess`

# Preprocessing before inference
def image_preprocess(self, image_path):
    image_root, image_fname = os.path.split(image_path)
    img = cv2.imdecode(np.fromfile(image_path, dtype=np.uint8), -1)
    H, W = img.shape[:2]
    del img  # reduce memory usage
    gc.collect()
    with open(image_path, 'rb') as f:
        img_data = f.read()
    img_base64 = base64.b64encode(img_data).decode('utf-8')
    return img_base64, H, W

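Purpose: preprocess an image before model inference.

Steps:

Read the image with cv2.imdecode

Get the image height and width

Release the image array immediately to reduce memory usage

Read the raw binary data

Convert it to a base64 string

```
image_root, image_fname = os.path.split(image_path)
```
Splits the image path into:
- image_root: the directory path
- image_fname: the file name
Neither value is used again in this function.
```
img = cv2.imdecode(np.fromfile(image_path, dtype=np.uint8), -1)
```
Reads the image from disk and decodes it into an OpenCV ndarray:
- np.fromfile(...): reads the image file as raw bytes (works around the problem with Chinese characters in Windows paths)
- cv2.imdecode(..., -1): decodes the image
  - -1 keeps the original format (channel count, alpha channel, and so on)
```
H, W = img.shape[:2]
```
Gets the image height H and width W
```
del img  # reduce memory usage
gc.collect()
```
- Deletes the img variable, dropping the reference to the large array
- Calls the garbage collector manually:
  - intended for large images / highly concurrent inference
  - meant to keep memory from growing steadily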
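
The last two steps of `image_preprocess` (read the raw bytes, then base64-encode them) can be tried in isolation. The sketch below writes a few sample bytes to a temporary file and encodes it; the helper name `file_to_base64` and the sample bytes are assumptions for illustration, and no real image or OpenCV call is involved.

```python
import base64
import os
import tempfile


def file_to_base64(path):
    """Read a file as raw bytes and return them as a base64 text string."""
    with open(path, "rb") as f:
        return base64.b64encode(f.read()).decode("utf-8")


# Sample bytes only; this is not a valid JPEG.
payload = b"\xff\xd8\xff\xe0 sample bytes standing in for an image file"

with tempfile.TemporaryDirectory() as tmp:
    sample_path = os.path.join(tmp, "sample.jpg")
    with open(sample_path, "wb") as f:
        f.write(payload)
    encoded = file_to_base64(sample_path)

print(len(payload), len(encoded))
```

The point worth noticing is that the base64 string carries the file bytes unchanged, not the decoded pixel array: decoding with OpenCV is done only to learn `H` and `W`. Base64 output is roughly a third larger than the input.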

14. `all_task_end`

def all_task_end(self):
    self.end_logo_counter += 1
    if self.end_logo_counter == self._total_num:
        process_logger.info(
            f"[DONE] Whole Train {self._total_num} imgs DONE within {self.bad_result_counter} imgs FAILED for RequestId {self.requestId}")
        return True
    else:
        return False

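Purpose: decide whether inference for the whole train has finished.

Logic:

Each call increments the completion counter by 1

Equal to _total_num → return True

Otherwise return False

```
self.end_logo_counter += 1
```
- Every call to this function adds 1 to end_logo_counter.
```
if self.end_logo_counter == self._total_num:
```
Checks whether the number of completed tasks equals the total number of tasks.
```
process_logger.info(
    f"[DONE] Whole Train {self._total_num} imgs DONE within {self.bad_result_counter} imgs FAILED for RequestId {self.requestId}")
```
When all tasks are complete:
- A summary log line is written
- The log line contains:
  - total number of images: self._total_num
  - number of failed images: self.bad_result_counter
  - unique identifier of this request: self.requestId
```
return True
```
Means: all tasks have finished
```
else:
    return False
```
If the completed count has not reached the total yet, some tasks are still outstanding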
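
The counting pattern behind `all_task_end` fits in a few lines. The class below is a stripped-down sketch with invented names (`CompletionCounter`, `task_end`); it leaves out logging and the failure counter.

```python
class CompletionCounter:
    """Minimal stand-in for the end_logo_counter / _total_num pair."""

    def __init__(self, total):
        self.total = total
        self.done = 0

    def task_end(self):
        """Record one finished task and report whether it was the last one."""
        self.done += 1
        return self.done == self.total


counter = CompletionCounter(total=3)
results = [counter.task_end() for _ in range(3)]
extra_call = counter.task_end()  # one call too many

print(results, extra_call)
```

Because the check uses `==` rather than `>=`, exactly one call returns `True`, provided the method is called once per image (failed images included). A call beyond the total returns `False` again, as `extra_call` shows.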

15. `clean_imgs_cache`

def clean_imgs_cache(self, img_path):
    try:
        # Read the train pass time, used to delete images that are no longer needed
        img_path = os.path.basename(img_path)
        pass_time_prefix = img_path[:img_path.find("_")]

        clean_dirs = [settings.DOWNLOAD_LOCAL_DIR, settings.ERROR_IMG_DIR]
        for clean_dir in clean_dirs:
            for file in os.listdir(clean_dir):
                if file.startswith(pass_time_prefix):
                    os.remove(os.path.join(clean_dir, file))
    except Exception:
        process_logger.exception("Get unexcept error when clean imgs cache:")

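Purpose: clean up cached images that are no longer needed (./temp_data).

Based on the image name prefix (the train pass time):

Delete the images of the same batch from the OBS download directory

Delete the images of the same batch from the error directory

Exceptions are caught and logged.

img_path = os.path.basename(img_path)

Keeps only the file name and drops the directory path. For example:

/data/img/20231223103045_001.jpg
→ 20231223103045_001.jpg

pass_time_prefix = img_path[:img_path.find("_")]

Takes the part of the file name before the first _, which gives the time prefix / train pass time.

20231223103045_001.jpg
→ pass_time_prefix = "20231223103045"

```
clean_dirs = [settings.DOWNLOAD_LOCAL_DIR, settings.ERROR_IMG_DIR]
```
Defines the two directories to clean:
- the cache directory for normally downloaded images
- the error image directory
```
for clean_dir in clean_dirs:
    for file in os.listdir(clean_dir):
        if file.startswith(pass_time_prefix):
            os.remove(os.path.join(clean_dir, file))
```
- for clean_dir in clean_dirs: goes through the two cache directories in turn
- for file in os.listdir(clean_dir): goes through every file name in the current directory
- if file.startswith(pass_time_prefix): checks whether the file name starts with the time prefix of the current image. If it does, then:
  - it belongs to the same train pass / the same request as the current image
```
os.remove(os.path.join(clean_dir, file))
```
Removes all cached images belonging to this train pass in one go.

except Exception:
    process_logger.exception("Get unexcept error when clean imgs cache:")

The exception is caught, not re-raised: process_logger.exception logs the message together with the traceback, and the method returns normally.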
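
One detail of the prefix extraction is easy to miss: `str.find` returns `-1` when the file name has no underscore, and slicing with `[:-1]` then drops only the last character. The sketch below reproduces the original expression and, next to it, a stricter variant based on `str.partition`. Both function names are hypothetical, and the stricter variant is a suggestion rather than part of the original code.

```python
import os


def pass_time_prefix(img_path):
    """Same expression as clean_imgs_cache: text before the first underscore."""
    name = os.path.basename(img_path)
    return name[:name.find("_")]


def pass_time_prefix_strict(img_path):
    """Hypothetical variant: return None when the name has no underscore."""
    name = os.path.basename(img_path)
    prefix, sep, _ = name.partition("_")
    return prefix if sep else None


print(pass_time_prefix("/data/img/20231223103045_001.jpg"))
print(pass_time_prefix("nounderscore.jpg"))
print(pass_time_prefix_strict("nounderscore.jpg"))
```

With the original expression, an unexpected file name yields a prefix such as `nounderscore.jp`, which matches little or nothing in the cache directories, so the practical effect is a silent no-op rather than a wrong deletion. The strict variant makes that case explicit so the caller can skip the cleanup.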

16. `result_postprocess`

# Postprocessing of the inference result
def result_postprocess(self, ret, result, width, height):
    try:
        img_path = result["result"]["faults"][0]["path"]
        if ret:

            analyze.check_error(result)  # check for an error code
            result = analyze.remove_outside_boundingbox(result, width)  # drop detection boxes that cross the boundary
            # all_sub_detect_json = analyze.get_short_image_results(result)
            all_sub_detect_json = analyze.get_short_image_results_middle(result)
            # Combine the sub-image fault information with the long-image information
            assembled_result = analyze.assemble_results(all_sub_detect_json, result, width, height)
            # Rearrange into the Dinghan format
            dinghan_item = remap_dinghan_item(assembled_result)
            return dinghan_item
        else:
            process_logger.info(
                f"[WARRING] Infer Failed when infer for img {img_path}, RequestId {self.requestId} ")
            self.bad_result_counter += 1
            return None
    except Exception:
        process_logger.exception("Get unexcept error when postprocess:")
        self.bad_result_counter += 1
        return None

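Purpose: the core postprocessing logic for model inference results.

Normal flow (inference succeeded):

Check for an error code (check_error).

Remove out-of-bounds detection boxes (remove_outside_boundingbox).

Get the sub-image detection results (get_short_image_results_middle).

Merge the sub-image and original-image detection results (assemble_results).

Convert to the Dinghan format (remap_dinghan_item).

Return the converted result.

Failure flow (inference failed):

Write a log entry, increment the failure counter, return None.

Exception handling:

If any exception occurs along the way, log it, increment the failure counter, return None.

```
def result_postprocess(self, ret, result, width, height):
```
- ret: flag indicating whether inference succeeded (True or False)
- result: the result data returned by inference
- width, height: image width and height, possibly used later to adjust box positions or sizes
img_path = result["result"]["faults"][0]["path"]: extracts the image path (img_path) from the inference result. This line is inside the try block, so a result that is None or has an unexpected shape ends up in the exception branch.
analyze.check_error(result): checks whether the inference result contains an error code.

result = analyze.remove_outside_boundingbox(result, width)  # drop detection boxes that cross the boundary

Removes out-of-bounds detection boxes from the inference result.

all_sub_detect_json = analyze.get_short_image_results_middle(result)

Gets the detection results for the sub-images.

```
assembled_result = analyze.assemble_results(all_sub_detect_json, result, width, height)
```
- Calls analyze.assemble_results() to merge the sub-image results with the original-image result.
- all_sub_detect_json: detection results of the sub-images
- result: detection result of the original image
- width, height: used to adjust the merged result, possibly to unify the coordinate system.
```
dinghan_item = remap_dinghan_item(assembled_result)
```
Calls remap_dinghan_item(assembled_result) to rearrange the result into the format the Dinghan system requires.
```
return dinghan_item
```
If inference succeeded, returns the converted dinghan_item.
```
else:
    process_logger.info(
        f"[WARRING] Infer Failed when infer for img {img_path}, RequestId {self.requestId} ")
    self.bad_result_counter += 1
    return None
```
If ret == False, inference failed:
- A warning log entry names the image whose inference failed.
- The counter bad_result_counter is incremented.
- None is returned to signal that there is no valid result.

except Exception:
    process_logger.exception("Get unexcept error when postprocess:")
    self.bad_result_counter += 1
    return None

If any exception occurs during postprocessing:

The exception is logged with its traceback
The counter bad_result_counter is incremented
None is returned to signal the error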

17. `get_infer_result_dict`

async def get_infer_result_dict(self, image_path, img_base64):
    input_dict = {
        "input_image_info": [
            {
                "path": image_path,
                "image": img_base64
            }
        ],
        "line": "line15"
    }
    # Build an empty result to return when inference fails
    bad_result = {
        "result": {
            "faults": [
                {
                    "path": image_path,
                    "fault_info": []
                }
            ]
        }
    }
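    # Network fluctuations can trigger an SSL Error when the inference request is sent, so a for loop provides automatic resending
    # The argument to range is the upper limit of attempts; if the connection still fails at the limit, give up and return the failure result
    # Note: when an SSL Error occurs, the server-side model never received the request and did not run inference, so no resources are consumed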
    for _ in range(5):
        try:
            infer_url, url_idx = self._get_next_infer_url()
            async with self._session.post(infer_url, json=input_dict) as response:
                code = response.status
                response_json = await response.json()
                if code == 200:
                    # Save the matching JSON in the log directory to keep the raw inference record
                    try:
                        infer_result_json_path = os.path.join(settings.LOG_SAVE_DIR,
                                                              os.path.basename(image_path).replace(".jpg", ".json"))
                        with open(infer_result_json_path, 'w', encoding='utf-8') as f:
                            json.dump(response_json, f, ensure_ascii=False, indent=4)
                    except Exception as e:
                        process_logger.info(f"[WARRING] Can not write {infer_result_json_path} {e}")
                    process_logger.info(f"[DONE] infer {image_path} DONE wiht {url_idx}th url")
                    return True, response_json
                else:
                    process_logger.info(
                        f"[FAILED] infer failed with {url_idx}th url! code:{code}, response:{response_json}")
                    return False, bad_result
        except aiohttp.client_exceptions.ServerDisconnectedError:
            # process_logger.info(f"[PROCESS] infer FAILED with SSL Error (already try reconnect) : {e}")
            continue  # caught the expected error type, resend
        except aiohttp.client_exceptions.ClientOSError:
            # process_logger.info(f"[PROCESS] infer FAILED with SSL Error (already try reconnect) : {e}")
            continue  # caught the expected error type, resend
    # Reached the maximum number of attempts without a result
    process_logger.info(f"[WARRING] Can't get infer result until max try times for image {image_path}")
    return False, bad_result

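Purpose: call the model inference service asynchronously.

Key points:

At most 5 attempts

Inference service URLs are rotated automatically

Network / SSL exceptions are caught

On success:

Save the raw result JSON

Return (True, result)

On failure:

Return (False, bad_result)

```
async def get_infer_result_dict(self, image_path, img_base64):
```
This is an asynchronous function that fetches the inference result over HTTP. Parameters:
- image_path: path of the image, used for records and logging.
- img_base64: Base64 encoding of the image, sent in the inference request.
Build the inference request payload
```
input_dict = {
    "input_image_info": [
        {
            "path": image_path,
            "image": img_base64
        }
    ],
    "line": "line15"
}
```
- input_dict is the input data of the inference request.
- It contains the image path image_path and the Base64-encoded image img_base64, the two inputs the server needs.
- "line": "line15": the rail line, hard-coded here.
Build the failure result template
```
bad_result = {
    "result": {
        "faults": [
            {
                "path": image_path,
                "fault_info": []
            }
        ]
    }
}
```
- If inference fails, the function returns bad_result.
- bad_result has the same shape as a successful response, so downstream code can index it in the same way. It carries the image path, and its fault_info list is empty, meaning no faults are reported for this image.
Automatic retry
```
for _ in range(5):
```
Allows at most 5 attempts in total.
Send the inference request asynchronously
```
infer_url, url_idx = self._get_next_infer_url()
async with self._session.post(infer_url, json=input_dict) as response:
    code = response.status
    response_json = await response.json()
```
- infer_url, url_idx = self._get_next_infer_url(): gets the URL of an inference service.
  - infer_url is the inference service URL.
  - url_idx is the index of that URL, so the log can show which URL was used.
- async with sends the HTTP POST request asynchronously with input_dict as the payload.
- Once the response arrives, the status code response.status is read and the body is parsed as JSON with response.json(). The body is parsed before the status check, so an error response whose body is not JSON raises here, and that exception is not one of the two types retried below.
Check whether inference succeeded.
```
if code == 200:
```
- An HTTP status code of 200 means inference succeeded, and the result handling begins.

Record the inference result

try:
    infer_result_json_path = os.path.join(settings.LOG_SAVE_DIR, os.path.basename(image_path).replace(".jpg", ".json"))
    with open(infer_result_json_path, 'w', encoding='utf-8') as f:
        json.dump(response_json, f, ensure_ascii=False, indent=4)
except Exception as e:
    process_logger.info(f"[WARRING] Can not write {infer_result_json_path} {e}")

On success the inference result is saved as a JSON file. The file is named after the image file, with the extension changed from .jpg to .json.
json.dump() writes the result in JSON format. If saving fails, the exception is caught and logged, and the function still returns the result.

Log and return the result
```
process_logger.info(f"[DONE] infer {image_path} DONE wiht {url_idx}th url")
return True, response_json
```
- Logs that inference finished, including the image path and the index of the URL used.
- Returns True and response_json: inference succeeded and produced a result.

Handling a failed inference

else:
    process_logger.info(f"[FAILED] infer failed with {url_idx}th url! code:{code}, response:{response_json}")
    return False, bad_result

If the status code is not 200, inference failed: the details are logged and False and bad_result are returned. A non-200 response is not retried.

Handling exceptions

except aiohttp.client_exceptions.ServerDisconnectedError:
    continue  # caught the expected error type, resend
except aiohttp.client_exceptions.ClientOSError:
    continue  # caught the expected error type, resend

Network-related errors of type ServerDisconnectedError or ClientOSError are caught and trigger an automatic retry.
The continue statement moves on to the next loop iteration, which resends the request to the next URL in the rotation.

Maximum number of attempts reached without success
```
process_logger.info(f"[WARRING] Can't get infer result until max try times for image {image_path}")
return False, bad_result
```
If all 5 attempts fail, a warning is logged and False and bad_result are returned.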
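
The retry logic above can be exercised without a network by swapping the HTTP call for a fake coroutine. The sketch below is a simplified model, not the original implementation: `post_with_retry`, `flaky_send` and `always_down` are invented names, the built-in `ConnectionError` stands in for the two aiohttp exception types, and the URLs are placeholders.

```python
import asyncio
from itertools import cycle


async def post_with_retry(send, urls, payload, fallback, max_attempts=5):
    """Retry on connection errors only; any HTTP response ends the loop."""
    for _ in range(max_attempts):
        url = next(urls)  # rotate to the next service on every attempt
        try:
            status, body = await send(url, payload)
        except ConnectionError:
            continue  # transient network failure: try again
        if status == 200:
            return True, body
        return False, fallback  # non-200 responses are not retried
    return False, fallback  # attempts exhausted


fallback = {"result": {"faults": [{"path": "img.jpg", "fault_info": []}]}}

calls = []
async def flaky_send(url, payload):
    """Fake transport: fails twice, then answers with HTTP 200."""
    calls.append(url)
    if len(calls) < 3:
        raise ConnectionError("simulated disconnect")
    return 200, {"result": {"faults": []}}

attempts = []
async def always_down(url, payload):
    """Fake transport that never connects."""
    attempts.append(url)
    raise ConnectionError("simulated outage")

ok, body = asyncio.run(post_with_retry(flaky_send, cycle(["url-0", "url-1"]), {}, fallback))
failed = asyncio.run(post_with_retry(always_down, cycle(["url-0", "url-1"]), {}, fallback))
print(ok, calls, failed[0], len(attempts))
```

Returning a fallback with the same shape as a real response is what lets the caller index the result in the same way on success and on failure.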

18. `send_result_to_dinghan`

def send_result_to_dinghan(self):
    with open(os.path.join(settings.LOG_SAVE_DIR, f"result_to_dinghan_{self.inspectionStartTime}.json"), 'w',
              encoding='utf-8') as f:
        json.dump(self._dinghan_result, f, ensure_ascii=False, indent=4)
    headers = {
        'Content-Type': 'application/json',
        'Access-Control-Allow-Origin': '*',  # set request header
    }
    certificate_path = './dinghan_server.crt'  # path to the certificate file
    private_key_path = './dinghan_server.key'  # path to the private key file
    try:
        self._dinghan_result["uploaddatetime"] = datetime.now(settings.TIME_ZONE).strftime("%Y-%m-%d %H:%M:%S")
        response = requests.post(settings.API_REPORT_TO_DINGHAN, headers=headers, json=self._dinghan_result,
                                 cert=(certificate_path, private_key_path),
                                 verify=False)
        code = response.status_code
        response_json = response.json()
        if code == 200:
            process_logger.info(f"[SUCCESS] Send request {self.requestId} to DINGHAN")
        else:
            process_logger.info(
                f"[FAILED] Send request {self.requestId} to DINGHAN failed, code:{code}, response:{response_json}")
    except Exception as e:
        process_logger.info(f"get error when send result to DINGHAN: {e}")

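Purpose: report the final result to the Dinghan system.

The method does the following:

Save the result: write the result to a local JSON file whose name contains the inspection start time.

Prepare the request: build the request headers and fill in the upload timestamp field of the result.

Send the POST request:

Send the result to the remote Dinghan server over HTTPS.

Present a certificate and private key for the TLS connection.

Handle the response: log success or failure according to the returned status code.

Handle exceptions: if anything goes wrong while sending, log the error message.

```
def send_result_to_dinghan(self):
```
Defines the method send_result_to_dinghan, whose job is to send the result (self._dinghan_result) to the remote Dinghan server and keep a local JSON copy.
Save the result locally.
```
with open(os.path.join(settings.LOG_SAVE_DIR, f"result_to_dinghan_{self.inspectionStartTime}.json"), 'w',
          encoding='utf-8') as f:
    json.dump(self._dinghan_result, f, ensure_ascii=False, indent=4)
```
- Path: the file is placed in settings.LOG_SAVE_DIR and named result_to_dinghan_<inspectionStartTime>.json, where inspectionStartTime is the time the inspection started.
- Writing the JSON file: self._dinghan_result is written to that file.
  - ensure_ascii=False: non-ASCII characters are written as-is instead of as \uXXXX escapes.
  - indent=4: pretty-prints with 4 spaces of indentation for readability.
Set the request headers and certificate paths
```
headers = {
    'Content-Type': 'application/json',
    'Access-Control-Allow-Origin': '*'  # set request header
}
certificate_path = './dinghan_server.crt'  # path to the certificate file
private_key_path = './dinghan_server.key'  # path to the private key file
```
- Request headers:
  - 'Content-Type': 'application/json': declares the request body as JSON.
  - 'Access-Control-Allow-Origin': '*': this is a CORS header that servers send in responses for browsers to read; on an outgoing server-to-server request it normally has no effect.
- Certificate file paths:
  - certificate_path and private_key_path point to the certificate and private key that requests presents to the Dinghan server through the cert argument, that is, a client-side certificate (despite the "server" in the file names).
Send the HTTP request
```
try:
    self._dinghan_result["uploaddatetime"] = datetime.now(settings.TIME_ZONE).strftime("%Y-%m-%d %H:%M:%S")
    response = requests.post(settings.API_REPORT_TO_DINGHAN, headers=headers, json=self._dinghan_result,
                             cert=(certificate_path, private_key_path),
                             verify=False)
```
- Add the timestamp: the "uploaddatetime" field of self._dinghan_result is set to the current time in the format "YYYY-MM-DD HH:MM:SS". datetime.now(settings.TIME_ZONE) returns the current time in the configured time zone. Because the local file was written before this line runs, the saved copy still has uploaddatetime set to null.
- Send the POST request:
  - settings.API_REPORT_TO_DINGHAN is the URL of the Dinghan server API (presumably defined in settings.py).
  - requests.post() sends the HTTP POST request with headers, json (the result) and cert (certificate path and private key path).
  - verify=False: disables verification of the server's TLS certificate (common with self-signed certificates in development; enabling verification is advisable in production).
Handle the response
```
code = response.status_code
response_json = response.json()
```
- response.status_code: the HTTP status code of the response.
- response.json(): parses the response body as JSON.

Log according to the response

if code == 200:
    process_logger.info(f"[SUCCESS] Send request {self.requestId} to DINGHAN")
else:
    process_logger.info(
        f"[FAILED] Send request {self.requestId} to DINGHAN failed, code:{code}, response:{response_json}")

Success: if the status code is 200, a success entry containing requestId is logged.
Failure: if the status code is not 200, a failure entry with the status code and response body is logged.

Exception handling
```
except Exception as e:
    process_logger.info(f"get error when send result to DINGHAN: {e}")
```
- If an exception occurs while sending the request (network error, SSL error, and so on), it is caught and logged with the exception message.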
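
Two details of this method are easy to check in isolation: the effect of `ensure_ascii`, and the fact that the local copy is serialized before `uploaddatetime` is set. The sketch below uses `json.dumps` on a tiny stand-in dictionary; the field values are invented and the fixed timestamp is an assumption made so the example is reproducible.

```python
import json

# Tiny stand-in for self._dinghan_result; values are invented.
result = {"key": "demo-key", "uploaddatetime": None, "data": {"traincode": "鼎汉-01"}}

# 1. Serialized before the timestamp is set, like the local log file.
saved_copy = json.dumps(result, ensure_ascii=False)

# 2. Timestamp filled in just before sending (fixed value for the example).
result["uploaddatetime"] = "2025-12-23 15:30:45"
sent_copy = json.dumps(result, ensure_ascii=False)

# Same data with the default ensure_ascii=True: non-ASCII becomes \uXXXX escapes.
escaped_copy = json.dumps(result)

print(saved_copy)
print(sent_copy)
print(escaped_copy)
```

If the local file is meant to be an exact record of what was sent, moving the file write after the timestamp assignment would close the gap; as written, the two differ in that one field.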

19. `send_resutl_to_java`

def send_resutl_to_java(self):
    with open(os.path.join(settings.LOG_SAVE_DIR, f"result_to_java_{self.inspectionStartTime}.json"), 'w',
              encoding='utf-8') as f:
        json.dump(self._java_result, f, ensure_ascii=False, indent=4)
    headers = {"Content-Type": "application/json"}
    try:
        # No faults
        if not len(self._java_result["items"]):
            response = requests.post(settings.API_REPORT_TO_JAVA_NORMAL, headers=headers,
                                     json=self._java_result["taskid"], verify=False)
        # Faults found
        else:
            response = requests.post(settings.API_REPORT_TO_JAVA, headers=headers, json=self._java_result,
                                     verify=False)
        code = response.status_code
        response_json = response.json()
        if code == 200:
            process_logger.info(f"[SUCCESS] Send request {self.requestId} to JAVA")
        else:
            process_logger.info(
                f"[FAILED] Send request {self.requestId} to JAVA failed, code:{code}, response:{response_json}")
    except Exception as e:
        process_logger.info(f"[FAILED] get error when send result to JAVA: {e}")

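Purpose: report the result to the Java system.

Branches:

No faults → call the "normal" endpoint

Faults found → call the "fault" endpoint

A local JSON copy is saved here as well.

```
 def send_resutl_to_java(self):
```
The function that reports the inference result to the Java service.
Save the result locally.
```
with open(os.path.join(settings.LOG_SAVE_DIR, f"result_to_java_{self.inspectionStartTime}.json"), 'w',
          encoding='utf-8') as f:
    json.dump(self._java_result, f, ensure_ascii=False, indent=4)
```
What it does:
- Saves the raw JSON about to be sent to Java in the log directory
  - json.dump(obj, fp, **kwargs): converts the Python object obj to JSON and writes it to the file fp.
- The file name includes inspectionStartTime, which makes it easy to investigate a single inspection / train pass
- Even if the network request fails afterwards:
  - the complete data is still kept locally
  - which helps with tracing problems and resending

Build the request headers

headers = {"Content-Type": "application/json"}

Check whether there are any fault results
```
if not len(self._java_result["items"]):
```
self._java_result["items"]: the list of fault items
No faults: call the "normal" reporting endpoint
```
response = requests.post(
    settings.API_REPORT_TO_JAVA_NORMAL,
    headers=headers,
    json=self._java_result["taskid"],
    verify=False
)
```
Calls the no-fault endpoint
- Only taskid is reported, not the full result, which reduces the amount of data; the request body is therefore a bare JSON string
- verify=False: the server certificate is not verified (common on intranets / with self-signed certificates)
Faults found: call the fault reporting endpoint
```
response = requests.post(
    settings.API_REPORT_TO_JAVA,
    headers=headers,
    json=self._java_result,
    verify=False
)
```
Calls the full fault reporting endpoint
- Reports the whole self._java_result, including all fault information
Handle the response
```
code = response.status_code
response_json = response.json()
```
- response.status_code: the HTTP status code of the response.
- response.json(): parses the response body as JSON.

Log according to the response

if code == 200:
    process_logger.info(f"[SUCCESS] Send request {self.requestId} to JAVA")
else:
    process_logger.info(
        f"[FAILED] Send request {self.requestId} to JAVA failed, code:{code}, response:{response_json}")

Success: if the status code is 200, a success entry containing requestId is logged.
Failure: if the status code is not 200, a failure entry with the status code and response body is logged.

Exception handling
```
except Exception as e:
    process_logger.info(f"[FAILED] get error when send result to JAVA: {e}")
```
- If an exception occurs while sending the request (network error, SSL error, and so on), it is caught and logged with the exception message.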
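
The branch between the two Java endpoints can be expressed as a small pure function, which also shows what the request body looks like in each case. In the sketch below, `choose_java_report` is a made-up helper, the endpoint is returned by its settings name instead of a real URL, and the sample payloads are invented.

```python
import json


def choose_java_report(java_result):
    """Return (settings name of the endpoint, payload), mirroring the if/else above."""
    if not java_result["items"]:
        # No faults: only the task id is sent.
        return "API_REPORT_TO_JAVA_NORMAL", java_result["taskid"]
    # Faults found: the full result is sent.
    return "API_REPORT_TO_JAVA", java_result


no_fault = {"taskid": "req-001", "items": []}
with_fault = {"taskid": "req-002", "items": [{"check": [{"url": "img.jpg"}]}]}

endpoint_a, payload_a = choose_java_report(no_fault)
endpoint_b, payload_b = choose_java_report(with_fault)

# What json=payload puts on the wire in each case.
body_a = json.dumps(payload_a)
body_b = json.dumps(payload_b)
print(endpoint_a, body_a)
print(endpoint_b, body_b)
```

In the no-fault case the body is just the quoted task id, so the receiving endpoint has to accept a JSON string rather than an object.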

20. `infer`

# Inference flow
async def infer(self, image_path):
    try:
        # None means the OBS image download failed; the total image count is reduced by one
        if image_path is None:
            return False, None, None, None
        else:
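            # Because of the MA limit, inference now runs on the compressed image
            compress_img_base64, H, W = self.image_preprocess(image_path)
            # Note: resized_img_path is what gets passed in here, so resized_img_path is what comes back, and every path used in postprocessing is really resized_img_path
            # Because of the modelarts limit, small images are used for inference temporarily; if that issue is resolved, change this back to img
            ret, result = await self.get_infer_result_dict(image_path, compress_img_base64)
            # This step should not be necessary, but debugging showed the returned path is not always the original; it may be modified in the cloud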
            result["result"]["faults"][0]["path"] = image_path
            return ret, result, H, W
    except Exception:
        process_logger.exception("Get unexcept error when infer:")
        return False, None, None, None

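Purpose: the complete inference flow for a single image.

Steps:

Image path is None → fail immediately

Preprocess the image

Call model inference

Fix the image path in the returned result

Return: ret, result, H, W

```
 # Inference flow
 async def infer(self, image_path):
```
This is the asynchronous inference entry point for a single image. It takes one image all the way from its path to the inference result plus the original dimensions.
Check that the image is valid first
```
if image_path is None:
    return False, None, None, None
```
- image_path is None: the preceding OBS image download failed. The failure tuple False, None, None, None is returned directly.
- According to the code comment, the caller is expected to reduce the total image count by 1 in this case.
Preprocess the image
```
compress_img_base64, H, W = self.image_preprocess(image_path)
```
This line does three things:
1. Reads the image
2. Encodes it as Base64 (the variable name says "compress", but image_preprocess encodes the file at image_path as-is, so any compression has to happen before this call)
3. Gets the image dimensions H, W
Call the inference service (asynchronously)
```
ret, result = await self.get_infer_result_dict(image_path, compress_img_base64)
```
Calls the inference interface asynchronously

Inputs:
- the image path (used for logging / results)
- the Base64-encoded image
Returns:
- ret: whether inference succeeded (True / False)
- result: the JSON returned by the inference service, or the empty fallback result on failure
Force the returned path back to the original (important)
```
result["result"]["faults"][0]["path"] = image_path
```
This step matters in practice:
- The actual problem:
  - the path returned by the cloud inference service
  - may have been modified
- To keep postprocessing / reporting consistent:
  - all local logic uses the original image_path
- So the path returned by the cloud is simply overwritten
```
 return ret, result, H, W
```
Four values are returned:
1. ret: whether inference succeeded
2. result: the raw inference JSON
3. H: image height
4. W: image width

Handling exceptions

except Exception:
    process_logger.exception("Get unexcept error when infer:")
    return False, None, None, None

Any exception is logged with the message "Get unexcept error when infer:" and the uniform failure tuple (False, None, None, None) is returned.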

21. `image_process_pipline`

Purpose: the global inference instance.

image_process_pipline = ModelInfer()

The whole process uses a single instance, which avoids initializing the resources more than once.

posted @ 2026-01-05 08:51 做梦当财神
