Summary:
1. Embedding layer. Input shape: batch_size * seq_len. Output shape after the embedding layer: batch_size * seq_len * dim, where dim is the embedding dimension.
2. Attention. import numpy as np def self_attention(X): …
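The excerpt cuts off before the body of `self_attention`, so the following is only a minimal sketch of the two steps named above, not the post's own code. It assumes standard scaled dot-product self-attention. The projection matrices `W_q`, `W_k`, `W_v`, the helper names `embed` and `softmax`, and all the sizes are illustrative assumptions; the source signature takes only `X`.

```python
import numpy as np

def embed(token_ids, table):
    # Pick one row of `table` per token id:
    # (batch_size, seq_len) -> (batch_size, seq_len, dim)
    return table[token_ids]

def softmax(x, axis=-1):
    # Subtract the max first for numerical stability.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(X, W_q, W_k, W_v):
    # X: (batch_size, seq_len, dim). The weight arguments are an assumption;
    # the original excerpt only shows `def self_attention(X):`.
    Q = X @ W_q
    K = X @ W_k
    V = X @ W_v
    d_k = Q.shape[-1]
    # Similarity of every position with every other position:
    # (batch_size, seq_len, seq_len)
    scores = Q @ K.transpose(0, 2, 1) / np.sqrt(d_k)
    weights = softmax(scores, axis=-1)
    # Weighted sum of the values: (batch_size, seq_len, dim)
    return weights @ V, weights

rng = np.random.default_rng(0)
batch_size, seq_len, vocab_size, dim = 2, 5, 100, 8  # hypothetical sizes

token_ids = rng.integers(0, vocab_size, size=(batch_size, seq_len))
table = rng.normal(size=(vocab_size, dim))  # random stand-in for learned embeddings

X = embed(token_ids, table)
W_q, W_k, W_v = (rng.normal(size=(dim, dim)) for _ in range(3))
out, weights = self_attention(X, W_q, W_k, W_v)

print(X.shape, out.shape, weights.shape)  # (2, 5, 8) (2, 5, 8) (2, 5, 5)
```

Each row of `weights` sums to 1, so every output position is a convex combination of the value vectors, and the output keeps the batch_size * seq_len * dim shape produced by the embedding layer.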
posted @ 2024-12-06 15:18 15375357604