Summary:
def inbatch_softmax_loss(user_pred_vector, item_pred_vector, item_id, labels):
    labels = tf.linalg.diag(tf.reshape(tf.ones_like(labels), [-1]))
    diff = t…
posted @ 2025-01-22 18:13
AI_Engineer
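The excerpt above is cut off after its first two statements, so the rest of the author's function is not shown here. As a rough illustration of the general idea only, here is a minimal NumPy sketch of an in-batch softmax loss. It is not the post's implementation: the function name `inbatch_softmax_loss_sketch` is hypothetical, and the sketch assumes that row i of the user matrix is paired with row i of the item matrix as its positive. It ignores `item_id`, because how the original uses that argument is not visible in the excerpt.

```python
import numpy as np

def inbatch_softmax_loss_sketch(user_vecs, item_vecs):
    """Hypothetical sketch: mean cross-entropy over a batch, where the
    positive for user i is item i and all other items in the batch act
    as negatives. Both inputs have shape (batch, dim)."""
    logits = user_vecs @ item_vecs.T                      # (B, B) user-item scores
    labels = np.eye(logits.shape[0])                      # diagonal marks the positives
    logits = logits - logits.max(axis=1, keepdims=True)   # shift for numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return float(-(labels * log_probs).sum(axis=1).mean())
```

With identical scores for every pair, each row's softmax is uniform, so the loss equals the log of the batch size. It approaches zero as each user's score for its own item dominates the others.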
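Summary:
Paper link: HoME: Hierarchy of Multi-Gate Experts for Multi-Task Learning at Kuaishou
Background
The paper argues that current multi-task models such as MMoE and PLE suffer from the following problems:
Expert collapse: the experts' output distributions differ significantly, and some experts use ReL…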
posted @ 2025-01-22 12:04
AI_Engineer