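Summary: 2022 Q3: discussion with wangbiao. Promotion milestones: Q3: how to break goals down, how to be an owner. Q4: act as an owner independently. 2023 Q1: learn to lead others on a project. Kernel optimization roadmap: CUDA C -> tensor core -> cutlass -> tvm. Q3 p …
posted @ 2024-03-27 13:09 ijpq Views(10) Comments(0) Recommends(0)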
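Summary: 03 Feb 2023: Over the past week I reworked the rrconv codegen code on dnn, and all rrconv fprop tests on dnn now pass. dnn rrconv dgrad does not pass: some cases compute wrong results. rrconv cutlass dgrad passes every test. Came in on Feb 2 and first checked the dgrad codege …
posted @ 2024-03-27 13:07 ijpq Views(41) Comments(0) Recommends(0)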
Summary: https://developer.nvidia.com/blog/cutlass-linear-algebra-cuda/ Efficient Matrix Multiplication on GPUs. Compute intensity = (time complexity / space complexity) = O(N^3)/O(N^2) = O(N) // …
posted @ 2024-03-26 13:47 ijpq Views(44) Comments(0) Recommends(0)
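The O(N^3)/O(N^2) = O(N) ratio in the summary above can be checked numerically. The following is a minimal sketch, not code from the post: the function name `gemm_arithmetic_intensity` is hypothetical, and it assumes square N x N matrices, 2 floating-point operations per multiply-add, and that each of the three matrices (A, B, C) crosses the memory bus exactly once.

```python
def gemm_arithmetic_intensity(n: int, bytes_per_element: int = 4) -> float:
    """FLOPs per byte for C = A @ B with square n x n matrices.

    Assumptions (illustrative, not from the original post):
    - each multiply-add counts as 2 floating-point operations
    - A, B and C are each moved through memory exactly once
    """
    flops = 2 * n**3                            # O(N^3) arithmetic
    bytes_moved = 3 * n**2 * bytes_per_element  # O(N^2) data
    return flops / bytes_moved


if __name__ == "__main__":
    # Doubling n doubles the intensity, which is the O(N) growth.
    for n in (256, 512, 1024, 2048):
        print(n, gemm_arithmetic_intensity(n))
```

Under these assumptions the intensity is N/6 for 4-byte elements, so larger matrices do proportionally more arithmetic per byte moved.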
Summary: #include <vector> #include <utility> // for std::move #include <type_traits> #include <iostream> using namespace std; template<typename T> struct A { A …
posted @ 2024-01-16 17:06 ijpq Views(17) Comments(0) Recommends(0)
Summary: Problem 1: How many bytes is the program? For the above x86 assembly code, how many bytes of instructions need to be fetched if x = 0x01020304 and n = 5 …
posted @ 2023-12-13 12:23 ijpq Views(29) Comments(0) Recommends(0)
Summary: the report finished in first time 3.4 Note how the mix of different types of instructions varies between benchmarks. R …
posted @ 2023-12-10 23:37 ijpq Views(78) Comments(0) Recommends(0)
Summary: ![image](https://img2023.cnblogs.com/blog/1481923/202311/1481923-20231129110844813-948831792.png)
posted @ 2023-11-29 11:08 ijpq Views(16) Comments(0) Recommends(0)
Summary: References: code: https://github.com/mit-pdos/xv6-riscv book: https://pdos.csail.mit.edu/6.828/2021/xv6/book-riscv-rev2.pdf note: https://mit-public-courses-c …
posted @ 2023-10-31 18:26 ijpq Views(69) Comments(0) Recommends(0)
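Summary: Traps and System calls. What is a trap? In the xv6 operating system, a "trap" is the mechanism by which the CPU temporarily leaves the normal execution flow and switches from user mode to kernel mode. In xv6 this switch happens in three situations: a system call, an exception, or an interrupt raised by an external device. A system call is one made with the ecall instruction; an earlier lab added tra …
posted @ 2023-10-18 13:48 ijpq Views(70) Comments(0) Recommends(0)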
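The three trap causes listed in the summary above can be told apart from the RISC-V `scause` register. The following is a simplified sketch, not xv6 code: xv6's real handler is written in C (`usertrap` in `kernel/trap.c`), and the name `classify_trap` is hypothetical. It assumes a 64-bit `scause`, where the top bit marks an interrupt and exception code 8 means an environment call (ecall) from user mode.

```python
INTERRUPT_BIT = 1 << 63   # top bit of a 64-bit scause marks an interrupt
ECALL_FROM_U_MODE = 8     # exception code for ecall executed in user mode


def classify_trap(scause: int) -> str:
    """Map a 64-bit scause value to one of the three trap kinds.

    Hypothetical helper for illustration only; it does not handle the trap.
    """
    if scause & INTERRUPT_BIT:
        return "interrupt"   # e.g. an external device interrupt
    if scause == ECALL_FROM_U_MODE:
        return "syscall"     # user code executed ecall
    return "exception"       # e.g. page fault, illegal instruction


if __name__ == "__main__":
    print(classify_trap(8))              # syscall
    print(classify_trap((1 << 63) | 9))  # interrupt
    print(classify_trap(13))             # exception
```

The interrupt check comes first because an interrupt's low bits can coincide with an exception code, so only the top bit distinguishes the two.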
Summary: # sync_cnblog
posted @ 2023-09-02 09:51 ijpq Views(9) Comments(0) Recommends(0)