Essay archive for July 28, 2025 - Luna-Evelyn

July 28, 2025

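Summary: Self-Attention. Scaled Dot-Product Attention: Self-Attention lets the model attend to the relationships among all elements of an input sequence while it processes that sequence. Each element serves as a query (Query) and also as a key (Key) and a value (Value), by computing how related it is to the other elements…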
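The idea in this summary can be sketched in a few lines of plain Python. This is a minimal illustration, not the post's own code: the helper names (`softmax`, `matmul`, `transpose`, `self_attention`), the projection matrices `w_q`, `w_k`, `w_v`, and the toy input are all assumptions made for the example.

```python
import math

def softmax(row):
    # Subtract the max for numerical stability before exponentiating.
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]

def matmul(a, b):
    # Nested-list matrix product: (n x k) @ (k x m) -> (n x m).
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def transpose(a):
    return [list(col) for col in zip(*a)]

def self_attention(x, w_q, w_k, w_v):
    """Scaled dot-product self-attention over a sequence x of shape (n x d_model)."""
    # Every element is projected into a query, a key, and a value.
    q = matmul(x, w_q)
    k = matmul(x, w_k)
    v = matmul(x, w_v)
    d_k = len(k[0])
    # Relevance of each element to every other element, scaled by sqrt(d_k).
    scores = matmul(q, transpose(k))
    weights = [softmax([s / math.sqrt(d_k) for s in row]) for row in scores]
    # Each output row is a weighted average of the value vectors.
    return matmul(weights, v), weights

# Hypothetical toy input: 3 elements, 2 features, identity projections.
x = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
identity = [[1.0, 0.0], [0.0, 1.0]]
output, weights = self_attention(x, identity, identity, identity)
```

Each row of `weights` sums to 1, so every output vector is a convex combination of the value vectors. Real implementations use learned projection matrices and batched tensor operations rather than nested lists.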

posted @ 2025-07-28 16:53 Luna-Evelyn Views (18) Comments (0) Recommendations (0)

Tokenizer

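Summary: Tokenization granularity falls into three levels: word, sub-word, and char. Word-level tokenization has the following problems: a very large vocabulary size (for example, the number of commonly used Chinese words can reach 200,000); a fairly severe OOV (out-of-vocabulary) problem in most cases; and many similar words in the vocabulary. Char-level tokenization has the following problems: text…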
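The OOV problem of word-level tokenization can be made concrete with a small sketch. The tokenizer functions, the toy corpus, and the test sentence below are all hypothetical and chosen only for illustration; whitespace splitting stands in for a real word segmenter.

```python
def word_tokenize(text):
    # Word level: split on whitespace (a stand-in for a real word segmenter).
    return text.lower().split()

def char_tokenize(text):
    # Char level: every non-space character is a token.
    return list(text.lower().replace(" ", ""))

def build_vocab(corpus, tokenize):
    # The vocabulary is the set of tokens seen in the training corpus.
    return {tok for sentence in corpus for tok in tokenize(sentence)}

def oov_tokens(text, vocab, tokenize):
    # Tokens of the new text that the vocabulary has never seen.
    return [tok for tok in tokenize(text) if tok not in vocab]

# Hypothetical toy data.
train = ["the model reads text", "the model learns words"]
test = "the models read texts"

word_vocab = build_vocab(train, word_tokenize)
char_vocab = build_vocab(train, char_tokenize)

word_oov = oov_tokens(test, word_vocab, word_tokenize)
char_oov = oov_tokens(test, char_vocab, char_tokenize)
```

Here `word_oov` contains "models", "read", and "texts", because the word-level vocabulary treats each surface form as an unrelated entry even though "reads" and "model" were seen in training. `char_oov` is empty, since every character of the test sentence appeared in the training corpus.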

posted @ 2025-07-28 03:31 Luna-Evelyn Views (10) Comments (0) Recommendations (0)

The Blog

Do not go gentle into that good night.
Old age should burn and rave at close of day.
Rage, rage against the dying light.
