Post archive for April 23, 2025 - 远航。

April 23, 2025

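Summary: Engineering (1) Aider-Polyglot Benchmark. Preface: for the OpenAI o3 release, only three coding benchmarks were selected, and Aider is one of them. It mainly evaluates a model's ability to fix code in real engineering work. Aider's Polyglot Benchmark is a benchmark framework designed specifically to evaluate AI models on multi-language programming tasks.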

posted @ 2025-04-23 19:29 远航。 Views (540) Comments (0) Recommendations (0)

By the fourth month, the blossoms of this world have faded

远航110's blog
