Can Group Relative Policy Optimization Improve Thai Legal Reasoning and Question Answering? Paper Reproduction
Summary:
To reproduce "Can Group Relative Policy Optimization Improve Thai Legal Reasoning and Question Answering?" (hereafter the "GRPO Thai legal QA paper"), the work needs to center on the "characteristics of the Thai legal QA task" and "GRP…
posted @ 2025-09-10 11:20 limingqi