Essay archive for December 29, 2021 - 忘川酒

December 29, 2021

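Summary: Transformer structure. The Transformer model uses an encoder-decoder architecture. The encoder contains a self-attention layer and a feed-forward neural network; self-attention lets the current position attend to more than just the current word, so it can pick up the semantics of the surrounding context. Between these two layers, the decoder has one more layer, atte Read more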
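To make the self-attention idea in the summary concrete, here is a minimal sketch in plain Python. It assumes scaled dot-product attention and, to stay short, uses the token vectors directly as queries, keys, and values (no learned projection matrices), which the excerpt does not specify. The names `softmax`, `self_attention`, and `tokens` are hypothetical and chosen only for illustration.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention(x):
    """Scaled dot-product self-attention with Q = K = V = x.

    Assumption: no learned weight matrices, a simplification for illustration.
    x is a list of token vectors (equal-length lists of floats).
    Returns (outputs, weights); weights[i][j] is how much token i
    attends to token j.
    """
    d = len(x[0])
    outputs, weights = [], []
    for q in x:
        # Similarity of this token's query to every token's key, scaled by sqrt(d).
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in x]
        w = softmax(scores)
        # Each output is a weighted average of all token vectors.
        out = [sum(w[j] * x[j][c] for j in range(len(x))) for c in range(d)]
        weights.append(w)
        outputs.append(out)
    return outputs, weights

# Three toy 2-dimensional "word" vectors.
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
outputs, weights = self_attention(tokens)
```

Every row of `weights` is a probability distribution over all tokens, so each output vector mixes in information from the other words rather than only from the current one.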

posted @ 2021-12-29 20:20 忘川酒 Views (321) Comments (0) Recommendations (0)