Running a PyTorch model in multiple processes fails with: RuntimeError: Cowardly refusing to serialize non-leaf tensor which requires_grad, since autograd does not support crossing process boundaries.

Full error output:

RuntimeError: Cowardly refusing to serialize non-leaf tensor which requires_grad, since autograd does not support crossing process boundaries. If you just want to transfer the data, call detach() on the tensor before serializing (e.g., putting it on the queue).
[W412 09:35:34.731165326 CudaIPCTypes.cpp:16] Producer process has been terminated before all shared CUDA tensors released. See Note [Sharing CUDA tensors]
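The hint in the error message (call detach() before serializing) can be reproduced in a single process. The following is a minimal sketch, assuming a CPU-only setup; the tensors `x`, `y` and the helper `try_serialize` are made up for illustration. It runs the same pickler that multiprocessing uses and shows that a non-leaf tensor with requires_grad is refused, while its detached copy no longer carries autograd history.

```python
import torch
import torch.multiprocessing  # makes sure PyTorch's tensor reducers are registered
from multiprocessing.reduction import ForkingPickler

x = torch.ones(3, requires_grad=True)  # leaf tensor: created directly by the user
y = x * 2                              # non-leaf tensor: produced by an autograd op


def try_serialize(t):
    """Return None if pickling succeeds, else the RuntimeError message (hypothetical helper)."""
    try:
        ForkingPickler.dumps(t)
        return None
    except RuntimeError as e:
        return str(e)


err = try_serialize(y)  # refused: y is non-leaf and requires grad
safe = y.detach()       # same data, but cut off from the autograd graph

print(err)
print(safe.requires_grad, safe.is_leaf)
```

Detaching is appropriate when only the values need to reach the other process; gradients will not flow back across the process boundary.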



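Explanations of this error on English-language sites vary widely, and there is almost no material on it in Chinese, so I looked into it myself and found a fix.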

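The problem appears when a PyTorch model is started inside a Python multiprocessing child process. If the model is created in the `__init__` method of the process class, it is created in the parent process, because `__init__` runs before the child exists. When the process is started (with the spawn or forkserver start method, which is what CUDA in subprocesses needs), Python serializes the process object together with the model it holds, and the error is raised. The fix is to move the neural network initialization out of `__init__` and into a method that runs inside the child process, such as `run`.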



Specifically, the original code:

[image: code before the change]

Changed to:

[image: code after the change]
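The two images above are not available as text, so here is a minimal sketch of the change they describe. It is not the author's original code: the class name `Worker`, the helper `build_model`, the `torch.nn.Linear(4, 2)` placeholder network, and the result queue are all assumptions made for illustration.

```python
import torch
import torch.multiprocessing as mp


def build_model():
    """Placeholder for the real network (hypothetical)."""
    return torch.nn.Linear(4, 2)


class Worker(mp.Process):
    def __init__(self, result_queue):
        super().__init__()
        self.result_queue = result_queue
        # Before the fix, the model was built here, i.e. in the parent process,
        # so it had to be serialized when start() handed this object to the child:
        #     self.model = build_model()
        self.model = None  # after the fix: nothing model-related lives here

    def run(self):
        # run() executes inside the child process, so the model is created
        # there and never has to cross a process boundary.
        self.model = build_model()
        with torch.no_grad():
            out = self.model(torch.ones(1, 4))
        # Send back plain Python data rather than tensors attached to a graph.
        self.result_queue.put(tuple(out.shape))


def main():
    mp.set_start_method("spawn", force=True)
    queue = mp.Queue()
    worker = Worker(queue)
    worker.start()
    print(queue.get(timeout=60))
    worker.join()
```

To try it, save the sketch as a script and call `main()` under an `if __name__ == "__main__":` guard, which the spawn start method requires. Anything the child needs to build the model (paths, hyperparameters) can still be passed through `__init__`, as long as it is plain picklable data.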





posted on 2025-04-12 10:30  Angry_Panda
