SciTech-BigDataAIML-LLM-PE (Positional Encoding): Absolute Position + Relative Position + Rotary Position

PE (Positional Encoding):

  • BOW (Bag of Words): discards word position information and only counts the co-occurrence probability between words.
  • RNN (Recurrent Neural Network): retains word position information, implicitly, because tokens are processed in sequence.
  • Transformer: Positional Encoding, which embeds absolute position information into the word embedding vector.
  • BERT: trainable position embedding.
  • GPT: ?
  • Latest: Rotary Position (rotary position encoding).

Applying the formulas:

  • "Dot-Product Similarity" of vectors (row-vector convention):
    $\large A_{ij} = \vec{Q} \cdot \vec{K}^{T} = \vec{Q_{i}} \vec{W_{Q}} \vec{W_{K}^{T}} \vec{K_{j}^{T}} $

    $\large \vec{Q} = \vec{Q_{i}} \vec{W_{Q}} $
    $\large \vec{K} = \vec{K_{j}} \vec{W_{K}} $

    \(\large \vec{Q_{i}} = \vec{e_{i}} + \vec{p_{i}}\)
    \(\large \vec{K_{j}} = \vec{e_{j}} + \vec{p_{j}}\)
    \(\large \vec{Q_{i}}\) : Word Vector with "i" as its index.
    \(\large \vec{e_{i}}\) : Word Embedding Vector with "i" as its index.
    \(\large \vec{p_{i}}\) : Positional Embedding Vector with "i" as its index.
    \(\large A_{ij}\) : Word Attention Score (Word Vector Similarity) between Word Vector \(\large \vec{Q_{i}}\) and $\large \vec{K_{j}} $

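The attention score defined above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the helper names (`vec_mat`, `bilinear`, `attention_score`), the toy sizes `d` and `d_k`, and the random values are all hypothetical and not part of the original notes. The sketch also shows that, because \(\vec{Q_{i}} = \vec{e_{i}} + \vec{p_{i}}\) and \(\vec{K_{j}} = \vec{e_{j}} + \vec{p_{j}}\), the score splits into four terms: content-content, content-position, position-content, and position-position.

```python
import random

def vec_add(a, b):
    """Element-wise sum of two equal-length vectors."""
    return [x + y for x, y in zip(a, b)]

def vec_mat(v, W):
    """Row vector v (length d) times matrix W (d rows, k columns)."""
    return [sum(v[r] * W[r][c] for r in range(len(v))) for c in range(len(W[0]))]

def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))

def bilinear(a, b, W_Q, W_K):
    """a W_Q W_K^T b^T, computed as (a W_Q) . (b W_K)."""
    return dot(vec_mat(a, W_Q), vec_mat(b, W_K))

def attention_score(e_i, p_i, e_j, p_j, W_Q, W_K):
    """A_ij for inputs Q_i = e_i + p_i and K_j = e_j + p_j."""
    Q_i = vec_add(e_i, p_i)
    K_j = vec_add(e_j, p_j)
    return bilinear(Q_i, K_j, W_Q, W_K)

# Toy sizes and random values, chosen only for illustration (assumptions).
rng = random.Random(0)
d, d_k = 4, 3

def rand_vec(n):
    return [rng.uniform(-1.0, 1.0) for _ in range(n)]

e_i, p_i, e_j, p_j = (rand_vec(d) for _ in range(4))
W_Q = [rand_vec(d_k) for _ in range(d)]
W_K = [rand_vec(d_k) for _ in range(d)]

A_ij = attention_score(e_i, p_i, e_j, p_j, W_Q, W_K)

# The same score, expanded into its four bilinear terms.
terms = [
    bilinear(e_i, e_j, W_Q, W_K),  # content  x content
    bilinear(e_i, p_j, W_Q, W_K),  # content  x position
    bilinear(p_i, e_j, W_Q, W_K),  # position x content
    bilinear(p_i, p_j, W_Q, W_K),  # position x position
]
print(A_ij, sum(terms))  # the two values agree up to rounding
```

Only the last three terms carry position information, which is why different positional encoding schemes can be compared by how they shape those terms.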


1 Absolute Position
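
As one possible illustration of an absolute position vector \(\vec{p_{i}}\), the sketch below uses the fixed sinusoidal scheme from the original Transformer, where dimension pair k of position pos holds sin and cos of pos / 10000^(2k/d). The notes do not say which absolute scheme is meant, so treat this choice, the function names `sinusoidal_position` and `add_position`, and the sample values as assumptions.

```python
import math

def sinusoidal_position(pos, d_model, base=10000.0):
    """Fixed sinusoidal position vector p_pos of length d_model (must be even)."""
    if d_model % 2 != 0:
        raise ValueError("d_model must be even")
    p = []
    for k in range(d_model // 2):
        angle = pos / (base ** (2 * k / d_model))
        p.extend([math.sin(angle), math.cos(angle)])  # dims 2k and 2k+1
    return p

def add_position(e, pos):
    """Input vector e_pos + p_pos: word embedding plus position vector."""
    p = sinusoidal_position(pos, len(e))
    return [x + y for x, y in zip(e, p)]

# Hypothetical 4-dimensional word embedding placed at position 3.
e_3 = [0.1, 0.2, 0.3, 0.4]
x_3 = add_position(e_3, 3)
print(sinusoidal_position(0, 4))  # position 0 gives sin(0)=0 and cos(0)=1 in each pair
print(x_3)
```

Because the vector depends only on the index pos, the same word receives a different input vector at each position, which is what makes the encoding "absolute".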


2 Relative Position


3 Rotary Position
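
A minimal sketch of the rotary idea, assuming the common formulation in which each consecutive pair of dimensions of a query or key vector is rotated by an angle proportional to its position. The function name `rope`, the interleaved pair layout, the base of 10000, and the sample vectors are assumptions for illustration; real implementations differ in layout details. The point it demonstrates is that the dot product of a rotated query and a rotated key depends only on the offset between their positions.

```python
import math

def rope(vec, pos, base=10000.0):
    """Rotate each (2k, 2k+1) pair of vec by the angle pos * base^(-2k/d)."""
    d = len(vec)
    if d % 2 != 0:
        raise ValueError("vector length must be even")
    out = []
    for k in range(d // 2):
        theta = pos * base ** (-2 * k / d)
        c, s = math.cos(theta), math.sin(theta)
        x, y = vec[2 * k], vec[2 * k + 1]
        out.extend([x * c - y * s, x * s + y * c])  # 2-D rotation of the pair
    return out

def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))

# Hypothetical query and key vectors (already projected by W_Q and W_K).
q = [1.0, 0.5, -0.3, 0.8]
k = [0.2, -0.7, 0.4, 0.1]

# Same offset (3) at two different absolute positions.
score_a = dot(rope(q, 5), rope(k, 2))
score_b = dot(rope(q, 13), rope(k, 10))
print(score_a, score_b)  # equal up to rounding: only the offset matters
```

In this sketch the position is applied to the query and key directly, instead of being added to the word embedding as in the absolute scheme.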

posted @ 2024-08-22 23:41  abaelhe