SHICENT

April 9, 2026

HAViT: Historical Attention Vision Transformer & Disentangling Recall and Reasoning in Transformer Models through Layer-wise Attention and Activation Analysis

Summary: Paper Daily 2026-03-25. Featured papers. 1. HAViT: Historical Attention Vision Transformer. Paper info: arXiv ID: 2603.18585. Authors: Swarnendu Banik, Manish Das, Shiv Ram Dubey, Satis...

posted @ 2026-04-09 01:37 SHICENT Views(8) Comments(0) Recommendations(0)

A Deep Dive into the Full FlashAttention Series: How IO-Aware Attention Computation Reshapes LLM Training and Inference

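Summary: Tech Daily 2026-03-25. 1. Technical background and motivation. 1.1 The fundamental bottleneck of standard attention. Since its introduction in 2017, the Transformer's attention mechanism (self-attention) has become a foundational component of large language models (LLMs), vision models, and multimodal models. However, as the sequence length $N$ grows, the time and space compl...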

posted @ 2026-04-09 01:18 SHICENT Views(29) Comments(0) Recommendations(0)

An Industrial-Grade Solution for Large-Scale MoE Training Systems & Real-Time Termination of "Overthinking" in Reasoning Models

Summary: Paper Daily 2026-03-24. Paper 1: An industrial-grade solution for large-scale MoE training systems. Title: Scalable Training of Mixture-of-Experts Models with Megatron Core. Authors: Zijie Yan, Hongxiao Bai, Xin Yao, De...

posted @ 2026-04-09 01:12 SHICENT Views(6) Comments(0) Recommendations(0)

Speculative Decoding & KV Cache

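Summary: Tech Daily 2026-03-24. Contents. Technique 1: Speculative decoding and its evolution, with a full walkthrough of the EAGLE series. Technique 2: The KV cache optimization stack, from principles to engineering practice. References. Technique 1: Speculative decoding and the evolution of the EAGLE series. Background and motivation...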

posted @ 2026-04-09 01:10 SHICENT Views(11) Comments(0) Recommendations(0)

DeepSeek mHC & FlashAttention 2

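Summary: Tech Daily 2026-03-23. Technique 1: DeepSeek mHC (Manifold-Constrained Hyper-Connections), addressing the training-stability problem of ultra-deep Transformers. 1. Technical background and motivation. As the parameter count and depth of large language models (LLMs) keep growing, ultra-deep Transformer networks face serious training-stability problems. When the number of layers grows to several hundred or even more than a thousand, ...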

posted @ 2026-04-09 01:05 SHICENT Views(2) Comments(0) Recommendations(0)

March 17, 2022

OpenCL Error Codes

Summary: /* Error Codes */ #define CL_SUCCESS 0 #define CL_DEVICE_NOT_FOUND -1 #define CL_DEVICE_NOT_AVAILABLE -2 #define CL_COMPILER_NOT_AVAILABLE -3 #define ...

posted @ 2022-03-17 19:32 SHICENT Views(1221) Comments(0) Recommendations(0)

March 15, 2020

CSS and Other Static Files Fail to Load When Integrating SSM in an IDEA Maven Project

Summary: Check the import path. I used the approach below, with the css folder in the same directory as the JSP files. <% String path = request.getContextPath(); String basePath = request.getScheme()+"://"+request.getServerName()+":"+requ...

posted @ 2020-03-15 21:27 SHICENT Views(1183) Comments(0) Recommendations(0)

Integrating Spring with Spring MVC

Summary: Modify web.xml so that the Spring and Spring MVC configuration files are loaded together. <init-param> <param-name>contextConfigLocation</param-name> <param-value>classpath:spring-mvc.xml,classpath:applic...

posted @ 2020-03-15 20:04 SHICENT Views(244) Comments(0) Recommendations(0)

Setting Up a Spring MVC Environment

Summary: Create the controller layer. UserController.java package com.imust.controller; import org.springframework.stereotype.Controller; import org.springframework.web.bind. ...

posted @ 2020-03-15 19:26 SHICENT Views(208) Comments(0) Recommendations(0)

Integrating Spring with MyBatis

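Summary: Integrating the configuration files essentially means merging the settings from the MyBatis configuration file, SqlSessionConfiguration.xml, into Spring's core configuration file, applicationContext.xml. applicationContext.xml <?xml version="1.0" encoding="U...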

posted @ 2020-03-15 17:37 SHICENT Views(229) Comments(0) Recommendations(0)

Never stop moving forward
