September 10, 2025

Can Group Relative Policy Optimization Improve Thai Legal Reasoning and Question Answering? Paper Reproduction

Summary: Reproducing "Can Group Relative Policy Optimization Improve Thai Legal Reasoning and Question Answering?" (hereafter the "GRPO Thai legal QA paper") requires centering the work on the "characteristics of the Thai legal QA task" and "GRP ...

posted @ 2025-09-10 11:20 limingqi
