Models: BERT & Transformer
1. BERT
BERT: Bidirectional Encoder Representations from Transformers
Paper [2019]: BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
Corresponding GitHub code: github-bert
BERT training consists of two stages (a minimal sketch follows below):
- Pre-training: pre-train the model on unlabeled corpora
- Fine-tuning: take the pre-trained model and train it on labeled data for the specific downstream task
See also: https://harmonyhu.com/2021/04/21/BERT/
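
The two stages can be illustrated with a short sketch. This is a minimal, illustrative example assuming the Hugging Face `transformers` and `torch` packages are installed; the model name `bert-base-uncased`, the toy sentence, and the label are placeholders, not part of the original post.

```python
import torch
from transformers import BertTokenizer, BertForMaskedLM, BertForSequenceClassification

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
inputs = tokenizer("The capital of France is [MASK].", return_tensors="pt")

# Stage 1: pre-training objective (masked language modeling) on unlabeled text.
# For illustration the labels are simply the input ids; real pre-training masks
# tokens and ignores the unmasked positions in the loss.
mlm_model = BertForMaskedLM.from_pretrained("bert-base-uncased")
mlm_out = mlm_model(**inputs, labels=inputs["input_ids"])
print("MLM loss:", mlm_out.loss.item())

# Stage 2: fine-tuning. Reuse the pre-trained encoder and train a task-specific
# head (here: binary sequence classification) on labeled data.
clf_model = BertForSequenceClassification.from_pretrained("bert-base-uncased", num_labels=2)
clf_out = clf_model(**inputs, labels=torch.tensor([1]))
print("Classification loss:", clf_out.loss.item())
```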
2. Transformer
Paper [2017]: Attention Is All You Need
Core operation: $\mathrm{Attention}(Q, K, V) = \mathrm{softmax}\!\left(\dfrac{QK^T}{\sqrt{d_k}}\right)V$
Description: a mapping from queries (Query) to key-value (Key-Value) pairs
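
As a sanity check of the formula, here is a small NumPy sketch of scaled dot-product attention; the tensor shapes and random inputs are illustrative assumptions, not from the original post.

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V"""
    d_k = Q.shape[-1]
    scores = Q @ K.transpose(0, 2, 1) / np.sqrt(d_k)          # (batch, len_q, len_k)
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights = weights / weights.sum(axis=-1, keepdims=True)    # softmax over the keys
    return weights @ V                                         # (batch, len_q, d_v)

# Toy example: 1 batch, 3 query positions, 4 key/value positions.
rng = np.random.default_rng(0)
Q = rng.standard_normal((1, 3, 8))
K = rng.standard_normal((1, 4, 8))
V = rng.standard_normal((1, 4, 16))
print(scaled_dot_product_attention(Q, K, V).shape)  # (1, 3, 16)
```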
