摘要: Helpful? Honest? Harmless? Make sure AI response in those 3 ways. If not, we need RLHF is reduce the toxicity of the LLM. Reinforcement learning: is a 阅读全文
posted @ 2024-03-15 12:15 MiraMira 阅读(51) 评论(0) 推荐(0)