Models: BERT & Transformer

1. BERT

BERT: Bidirectional Encoder Representations from Transformers

Paper [2019]: BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding

Corresponding GitHub code: github-bert

BERT training consists of two stages:

  • Pre-training: pre-train the model on unlabeled corpora
  • Fine-tuning: starting from the pre-trained model, train on labeled data for the specific downstream task (see the sketch below)
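
A minimal fine-tuning (stage 2) sketch for a sentence-classification task, assuming the Hugging Face `transformers` and `torch` packages are installed; the model name `bert-base-uncased`, the example sentence, and `num_labels=2` are illustrative assumptions, not part of the referenced github-bert code.

```python
# Hypothetical fine-tuning sketch: load a pre-trained BERT and train it
# on labeled data for a downstream classification task.
import torch
from transformers import BertTokenizer, BertForSequenceClassification

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=2  # 2 labels is an assumption for the example
)

# One labeled example; real fine-tuning iterates over a labeled dataset.
inputs = tokenizer("BERT is pre-trained on unlabeled text.", return_tensors="pt")
labels = torch.tensor([1])

outputs = model(**inputs, labels=labels)  # forward pass returns loss and logits
outputs.loss.backward()                   # back-propagate the task loss
print(outputs.logits)
```
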

https://harmonyhu.com/2021/04/21/BERT/

2. Transformer

Paper [2017]: Attention Is All You Need

Core operation: $\mathrm{Attention}(Q, K, V) = \mathrm{softmax}\!\left(\frac{QK^{T}}{\sqrt{d_k}}\right)V$

Description: attention maps a query (Query) to a set of key-value (Key-Value) pairs, producing a weighted sum of the values.
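
As a concrete illustration of the formula above, here is a minimal NumPy sketch of scaled dot-product attention; the function name and the toy shapes are assumptions for this example only.

```python
# Scaled dot-product attention: Attention(Q, K, V) = softmax(QK^T / sqrt(d_k)) V
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    d_k = Q.shape[-1]                                   # key/query dimension
    scores = Q @ K.T / np.sqrt(d_k)                     # similarity of each query to each key
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)      # row-wise softmax
    return weights @ V                                   # weighted sum of the values

# Toy example: 2 queries, 3 key-value pairs, d_k = d_v = 4.
rng = np.random.default_rng(0)
Q = rng.normal(size=(2, 4))
K = rng.normal(size=(3, 4))
V = rng.normal(size=(3, 4))
print(scaled_dot_product_attention(Q, K, V).shape)      # (2, 4)
```
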

https://harmonyhu.com/2021/04/10/transformer/
