365/24/60

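December 2, 2025

Summary: swift 3.10 environment initialization. Install swift with the normal installation procedure. # The default download currently installs version 3.10; it was previously 3.6 [the corresponding transformer_engine also installed without errors] pip install --no-build-isolation 'transformer_engine[pytorch]' # Read more

posted @ 2025-12-02 10:57 365/24/60 Views (5) Comments (0) Recommendations (0)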

October 10, 2025

MoE Models

Summary: MoE model Qwen3MoeForCausalLM( (model): Qwen3MoeModel( (embed_tokens): Embedding(151936, 2048, padding_idx=151643) (layers): ModuleList( (0-47): 48 x Qwe Read more

posted @ 2025-10-10 16:58 365/24/60 Views (13) Comments (0) Recommendations (0)

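September 30, 2025

Verl Experiments

Summary: Default model save location: checkpoints/<project_name>/<experiment_name>/. trainer.checkpoint_dir is the parameter dedicated to specifying the root directory where checkpoints are saved. trainer.checkpoint_dir has no effect (some verl versions may not expose Read more

posted @ 2025-09-30 10:45 365/24/60 Views (180) Comments (0) Recommendations (0)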

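June 30, 2025

Positional Encoding

Summary: Here S is Q*K with shape (bs, multi_head, seq_len, seq_len). Relative positional encoding only needs to consider the relative offset between the two positions i and j: S_rel_shift[..., i, j] = S_rel[..., i, j - i + seq_len - 1] import torch import tor Read more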
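The indexing rule in the summary above can be sketched in plain Python for a single attention head. This is a minimal illustration, not the post's torch code: it assumes `S_rel` has `seq_len` rows and `2 * seq_len - 1` columns (one column per relative offset from `-(seq_len - 1)` to `seq_len - 1`), and the function name `rel_shift` is hypothetical.

```python
def rel_shift(s_rel):
    """Apply S_rel_shift[i][j] = S_rel[i][j - i + seq_len - 1] to one head.

    Assumption: s_rel has seq_len rows and 2 * seq_len - 1 columns, where
    column k holds the score for relative offset k - (seq_len - 1).
    """
    seq_len = len(s_rel)
    width = 2 * seq_len - 1
    if any(len(row) != width for row in s_rel):
        raise ValueError("each row must have 2 * seq_len - 1 entries")
    return [
        [s_rel[i][j - i + seq_len - 1] for j in range(seq_len)]
        for i in range(seq_len)
    ]


seq_len = 3
# Toy scores: every column stores its own relative offset, so the shifted
# matrix should contain j - i at position (i, j).
s_rel = [[k - (seq_len - 1) for k in range(2 * seq_len - 1)] for _ in range(seq_len)]
shifted = rel_shift(s_rel)
for row in shifted:
    print(row)
```

With these toy scores, entry (i, j) of the result equals j - i, which makes it easy to check that each query row picks the column matching its offset to each key.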

posted @ 2025-06-30 18:15 365/24/60 Views (13) Comments (0) Recommendations (0)

March 19, 2025

torch and deepspeed Training Issues

Summary: 319: H20 training error. Issue 1: an NVIDIA H20 machine reports: Caught signal 8 (Floating point exception: integer divide by zero). Fix: pip3 install nvidia-cublas-cu12==12.3.4.1 expor Read more

posted @ 2025-03-19 09:50 365/24/60 Views (480) Comments (0) Recommendations (0)

February 26, 2025

Reproducing openr1

Summary: Create a virtual environment with virtualenv: virtualenv myenv --python=/usr/bin/python3.11. How GRPO works: https://huggingface.co/docs/trl/main/en/grpo_trainer (https://mp.weixin.qq.com/s? Read more

posted @ 2025-02-26 09:19 365/24/60 Views (139) Comments (0) Recommendations (0)

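February 10, 2025

Reward Model Techniques

Summary: Reward Hacking: the model exploits design flaws or loopholes in the reward system and takes unintended actions to obtain high reward, rather than actually achieving the goal the designer intended. Byte token: https://mp.weixin.qq.com/s/lsCshrnmtO-bYaszLFBSNw DeepSeek training illustrated: https://zhuan Read more

posted @ 2025-02-10 10:45 365/24/60 Views (64) Comments (0) Recommendations (0)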

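January 14, 2025

LLM Tokenization Techniques

Summary: LLM tokenization techniques: BPE (Byte Pair Encoding). The algorithm/model that performs the splitting: Tokenizer. The smallest unit produced by the split: Token. Goal of tokenization: make each token carry as much useful information as possible (1. contextual information; 2. use more frequent, richer characters and words as tokens). The whole process is called Tokeniza Read more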
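As a rough illustration of the BPE idea named above, the sketch below repeatedly merges the most frequent adjacent symbol pair in a toy corpus. The corpus, the number of merges, and the helper names `count_pairs` and `merge_pair` are assumptions made for this example; production tokenizers add byte-level handling, pre-tokenization, and explicit tie-breaking rules.

```python
from collections import Counter


def count_pairs(words):
    """Count adjacent symbol pairs, weighted by word frequency."""
    pairs = Counter()
    for symbols, freq in words.items():
        for left, right in zip(symbols, symbols[1:]):
            pairs[(left, right)] += freq
    return pairs


def merge_pair(words, pair):
    """Replace every occurrence of `pair` with a single merged symbol."""
    merged = {}
    for symbols, freq in words.items():
        out, i = [], 0
        while i < len(symbols):
            if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == pair:
                out.append(symbols[i] + symbols[i + 1])
                i += 2
            else:
                out.append(symbols[i])
                i += 1
        key = tuple(out)
        merged[key] = merged.get(key, 0) + freq
    return merged


# Toy corpus (hypothetical): word split into characters -> frequency.
words = {tuple("low"): 5, tuple("lower"): 2, tuple("newest"): 6, tuple("widest"): 3}

merges = []
for _ in range(3):
    pairs = count_pairs(words)
    best = max(pairs, key=pairs.get)  # most frequent adjacent pair
    merges.append(best)
    words = merge_pair(words, best)

print(merges)
print(words)
```

Frequent character sequences such as the suffix in "newest" and "widest" end up as single tokens, which matches the stated goal of using more frequent units as tokens.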

posted @ 2025-01-14 22:19 365/24/60 Views (82) Comments (0) Recommendations (0)

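Embedding: The Key to Computational Language Understanding

Summary: Definition: a powerful method for connecting human language to numbers. Evolution of embedding techniques: Word2Vec. CBOW (Continuous Bag of Words): predicts the target word from its context words (sentiment analysis, text classification, word similarity). Skip-Gram: predicts the surrounding words from the target word. When training a Word2Vec model, the training covers both the vocabulary and the word-vector model Read more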
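The difference between the two Word2Vec objectives described above shows up in how training examples are built from a sentence. The sketch below only constructs the examples and trains nothing; the window size, the sample sentence, and the function names are assumptions for illustration.

```python
def cbow_examples(tokens, window):
    """CBOW: the context words are the input, the center word is the target."""
    examples = []
    for i, target in enumerate(tokens):
        lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
        context = [tokens[j] for j in range(lo, hi) if j != i]
        examples.append((context, target))
    return examples


def skipgram_pairs(tokens, window):
    """Skip-Gram: the center word is the input, each context word is a target."""
    pairs = []
    for i, center in enumerate(tokens):
        lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                pairs.append((center, tokens[j]))
    return pairs


tokens = ["the", "cat", "sat"]
cbow = cbow_examples(tokens, window=1)
skip = skipgram_pairs(tokens, window=1)
print(cbow)
print(skip)
```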

posted @ 2025-01-14 18:16 365/24/60 Views (138) Comments (0) Recommendations (0)

May 22, 2024

LLM-Related Loss Functions

Summary: Information entropy: torch code for information entropy: event = {'a':2 , 'b':2, 'c':4} # entropy: 1.5 event2 = {'a':1 , 'b':1, 'c':1} # entropy: 1.585 p_e = [ v/sum(event.values()) for v in event.va Read more
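The two entropy values quoted in the summary can be checked with a short standard-library sketch. It uses `math.log2` in place of the post's torch code, and the function name `entropy` is an assumption; the event counts are the ones given above.

```python
import math


def entropy(counts):
    """Shannon entropy in bits of the distribution implied by event counts."""
    total = sum(counts.values())
    probs = [v / total for v in counts.values()]
    # Zero-probability events contribute nothing to the sum.
    return -sum(p * math.log2(p) for p in probs if p > 0)


event = {'a': 2, 'b': 2, 'c': 4}
event2 = {'a': 1, 'b': 1, 'c': 1}

print(round(entropy(event), 3))   # 1.5
print(round(entropy(event2), 3))  # 1.585
```

The uniform distribution in `event2` has the higher entropy, log2(3), because no outcome is more predictable than another.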

posted @ 2024-05-22 10:46 365/24/60 Views (418) Comments (0) Recommendations (0)

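May 21, 2024

Llama_factory Setup

Summary: Basic operations: Data processing: LLaMA-Factory-main\src\llamafactory\data\preprocess.py. Training examples: first read the example documentation: LLaMA-Factory-main\examples\README_zh.md. 2. Instruction fine-tuning: LLaMA-Factory Read more

posted @ 2024-05-21 11:08 365/24/60 Views (262) Comments (0) Recommendations (0)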

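May 20, 2024

vllm Serving Inference Parameters

Summary: stop: list of strings. [When generating text, generation stops on reaching this token, but the result does not include it.] stop_token_ids: list of int. [When generating ids, generation stops on reaching this id and the result includes it, e.g. tokenizer.eos_token_id [im_end].] Final judgment Read more

posted @ 2024-05-20 17:10 365/24/60 Views (1433) Comments (0) Recommendations (0)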

March 4, 2024

LLM Training Bugs

Summary: LLM encoding: tokenizer = AutoTokenizer.from_pretrained(modelpath) text="你好" tokenizer.tokenize(text) # encode directly chat_text = tokenizer.apply_chat_template(text, Read more

posted @ 2024-03-04 20:10 365/24/60 Views (96) Comments (0) Recommendations (0)

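December 22, 2023

Building and Using K2 sherpa

Summary: Build and install: uninstall cmake, torch, and k2 with pip; install cmake 3.22.3, k2, kaldi_feat [officially provided | install_dir], and torch==2.0.1 []. If CUDA is missing: export LD_LIBRARY_PATH=/usr/local/cuda11.7/lib64:$LD_L Read more

posted @ 2023-12-22 19:49 365/24/60 Views (105) Comments (0) Recommendations (0)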

December 21, 2023

Modifying PyTorch Model Structure

Summary: 1. How variance (var) differs in PyTorch: tlist = input.tolist() print(input) print(np.mean(tlist), np.var(tlist)) print(torch.mean(input), torch.var(input)) We can see that numpy and to Read more
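The discrepancy the summary points to comes from the default divisor: `np.var` divides by n by default (`ddof=0`), while `torch.var` applies Bessel's correction and divides by n - 1 by default. The sketch below reproduces both conventions with the standard-library `statistics` module as a stand-in, so it runs without numpy or torch; the sample data is an assumption.

```python
import statistics

data = [1.0, 2.0, 3.0, 4.0]  # hypothetical sample
n = len(data)
mean = sum(data) / n
squared_dev = sum((x - mean) ** 2 for x in data)

population_var = squared_dev / n       # same convention as np.var default (ddof=0)
sample_var = squared_dev / (n - 1)     # same convention as torch.var default (n - 1)

# The statistics module exposes both conventions directly.
print(population_var, statistics.pvariance(data))
print(sample_var, statistics.variance(data))
```

To make the two libraries agree, pass `ddof=1` to `np.var` or disable the correction in `torch.var`; the exact keyword for the latter depends on the torch version.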

posted @ 2023-12-21 17:14 365/24/60 Views (63) Comments (0) Recommendations (0)

December 10, 2023

Hand-Written Conformer Network

Summary: import torch from torch import nn x = torch.randint(0, 10, size=(5, 280,80)) length = torch.tensor([10,9,9,9,9]) x.size(),x.shape,x[0].shape,length # Read more

posted @ 2023-12-10 00:37 365/24/60 Views (188) Comments (0) Recommendations (0)

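October 13, 2023

Summary of Less Common C++ Syntax

Summary: A member initializer list sets initial values for class or struct members when an object is constructed. Syntax: Constructor(): member1(value1), member2(value2)... {} Purposes and advantages of the member initializer list: it can give initial values to non-static data members; the initialization order matches the order in which the members are declared in the class; it is more efficient than assigning inside the constructor body; for read-only const members and reference members it can provide Read more

posted @ 2023-10-13 15:02 365/24/60 Views (61) Comments (0) Recommendations (0)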

October 12, 2023

K2-lhotse Data Loading and Training Pipeline Analysis

Summary: class K2SpeechRecognitionDataset(torch.utils.data.Dataset): The PyTorch Dataset for the speech recognition task using k2 library. This dataset expects Read more

posted @ 2023-10-12 17:37 365/24/60 Views (370) Comments (0) Recommendations (0)

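March 30, 2023

Gumbel-Softmax

Summary: Gumbel-Softmax is a technique for sampling from discrete distributions, commonly used in generative models and reinforcement learning. An analysis of Gumbel-Softmax follows. Gumbel distribution: the Gumbel distribution is a continuous probability distribution whose probability density function can be written as: $$f(x)=\frac{1}{\beta}e^{-\frac{x Read more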
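A minimal standard-library sketch of the sampling idea follows. It assumes standard Gumbel noise (location 0, scale 1) generated as -log(-log(u)) from uniform u, and a temperature `tau` chosen for illustration; the function names, logits, and sample count are hypothetical, and this is not the post's own code.

```python
import math
import random


def sample_gumbel(rng):
    """Draw standard Gumbel(0, 1) noise via the inverse CDF: -log(-log(u))."""
    u = min(max(rng.random(), 1e-12), 1.0 - 1e-12)  # keep both logs finite
    return -math.log(-math.log(u))


def gumbel_softmax(logits, tau, rng):
    """Return a relaxed one-hot sample: softmax((logits + gumbel) / tau)."""
    noisy = [(logit + sample_gumbel(rng)) / tau for logit in logits]
    peak = max(noisy)  # subtract the max for numerical stability
    exps = [math.exp(v - peak) for v in noisy]
    total = sum(exps)
    return [e / total for e in exps]


rng = random.Random(0)
probs = [0.1, 0.3, 0.6]
logits = [math.log(p) for p in probs]

soft = gumbel_softmax(logits, tau=0.5, rng=rng)
print(soft)

# The argmax of each relaxed sample follows the underlying categorical
# distribution (the Gumbel-max trick), so empirical frequencies approach probs.
trials = 20000
counts = [0] * len(probs)
for _ in range(trials):
    y = gumbel_softmax(logits, tau=0.5, rng=rng)
    counts[y.index(max(y))] += 1
freqs = [c / trials for c in counts]
print(freqs)
```

Lower values of `tau` push each sample closer to a one-hot vector, while higher values make it smoother; the argmax is unaffected by `tau`.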

posted @ 2023-03-30 15:07 365/24/60 Views (1414) Comments (0) Recommendations (0)

January 5, 2023

Using pytest for Testing

Summary: Automated testing with pytest Read more

posted @ 2023-01-05 16:59 365/24/60 Views (64) Comments (0) Recommendations (0)

Coding Pioneer
