Post archive for December 5, 2018 - 微笑sun

December 5, 2018

A Detailed Explanation of the Transformer Model (Attention Is All You Need)

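Summary: 1 Overview. Before introducing the Transformer model, let us first review attention in the Encoder-Decoder framework. In essence, it is a weighted sum of the Encoder's hidden-layer outputs, given by the following formula: Extracting the attention mechanism from the Encoder-Decoder framework and abstracting it further, its essence is shown in the following figure (image source: 张俊林's blog): Taking machine… Read more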
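The weighted-sum view of attention mentioned in the summary can be sketched roughly as follows. This is only an illustration and not the post's own formula, which is not reproduced in this excerpt. The dot-product scoring function and the names `softmax`, `attention_context`, `decoder_state`, and `encoder_states` are assumptions introduced here.

```python
import math


def softmax(scores):
    """Turn raw scores into weights that are positive and sum to 1."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_context(decoder_state, encoder_states):
    """Return (weights, context) for one decoder step.

    Assumption: each encoder hidden state is scored by a plain dot product
    with the decoder state; other scoring functions are equally possible.
    """
    scores = [
        sum(d * h for d, h in zip(decoder_state, h_j))
        for h_j in encoder_states
    ]
    weights = softmax(scores)
    dim = len(encoder_states[0])
    # The context vector is the weighted sum of the encoder hidden states.
    context = [
        sum(w * h_j[k] for w, h_j in zip(weights, encoder_states))
        for k in range(dim)
    ]
    return weights, context


if __name__ == "__main__":
    weights, context = attention_context(
        [1.0, 0.0], [[1.0, 2.0], [3.0, 4.0], [0.0, 1.0]]
    )
    print(weights, context)
```

The weights always sum to 1, so the context vector is a convex combination of the Encoder hidden states; an encoder state that scores higher against the decoder state contributes more to it.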

posted @ 2018-12-05 16:15 微笑sun Views (20278) Comments (5) Recommendations (3)
