Summary:
Environment: CUDA driver 11.6, CUDA toolkit 11.1, PyTorch version 1.11. Conda env: # conda package list # packages in environment at /home/tangke/anaconda3/envs/py39tor Read more
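Summary:
Prerequisite: virtual tables in C++. For each operator (opr), the dispatcher builds a vtable (a concept tied to C++ polymorphism). The dispatcher's job is to compute a dispatch key from the input tensors and some other meta-information, and then jump to the corresponding function through the vtable. c virtual t Read more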
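The table-lookup idea in this summary can be illustrated with a minimal sketch. It is a toy model, not PyTorch's actual dispatcher: the names `DispatchKey`, `Tensor`, `OperatorEntry`, `computeDispatchKey`, `call`, and `makeAddOperator` are all hypothetical, and the rule "pick the highest-priority key among the inputs" is an assumption made for illustration.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <initializer_list>
#include <string>

// Hypothetical, simplified dispatch keys; a larger value means higher priority.
enum class DispatchKey : std::size_t { CPU = 0, CUDA = 1, NumKeys = 2 };

constexpr std::size_t index(DispatchKey k) { return static_cast<std::size_t>(k); }

// Toy tensor: it only carries the meta-information used for dispatch.
struct Tensor {
    DispatchKey backend;
};

// Every kernel of the toy operator shares this signature.
using KernelFn = std::string (*)(const Tensor&, const Tensor&);

// One table per operator, indexed by dispatch key (the "vtable").
struct OperatorEntry {
    std::array<KernelFn, index(DispatchKey::NumKeys)> table{};  // all nullptr

    void registerKernel(DispatchKey key, KernelFn fn) { table[index(key)] = fn; }
};

// Assumed rule: the dispatch key is the highest-priority key among the inputs.
inline DispatchKey computeDispatchKey(std::initializer_list<const Tensor*> inputs) {
    DispatchKey key = DispatchKey::CPU;
    for (const Tensor* t : inputs) {
        if (index(t->backend) > index(key)) key = t->backend;
    }
    return key;
}

// Dispatch: compute the key, look up the kernel in the table, jump to it.
inline std::string call(const OperatorEntry& op, const Tensor& a, const Tensor& b) {
    const DispatchKey key = computeDispatchKey({&a, &b});
    const KernelFn fn = op.table[index(key)];
    assert(fn != nullptr && "no kernel registered for this dispatch key");
    return fn(a, b);
}

// Stand-in kernels that only report which backend was selected.
inline std::string addCpu(const Tensor&, const Tensor&) { return "add_cpu"; }
inline std::string addCuda(const Tensor&, const Tensor&) { return "add_cuda"; }

// Build the table for a toy "add" operator.
inline OperatorEntry makeAddOperator() {
    OperatorEntry op;
    op.registerKernel(DispatchKey::CPU, addCpu);
    for (DispatchKey k : {DispatchKey::CUDA}) op.registerKernel(k, addCuda);
    return op;
}
```

Calling `call(makeAddOperator(), a, b)` with two CPU tensors selects `addCpu`; if either input is tagged `CUDA`, the table lookup selects `addCuda`. How a real framework treats mixed-device inputs is outside the scope of this sketch.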
Summary:
Calling convention. Entry sequence (the function prologue): a few instructions at the beginning of a function, which prepare the stack and registers for Read more