PyTorch Study Notes

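1. Running model computation on the GPU

PyTorch makes this convenient: during the forward pass, call cuda() on a variable to get a copy of it on the GPU, turning a CPU variable into a GPU variable: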

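x = self.embedding(x)
# up to this point the computation runs on the CPU
x = x.cuda(0)
# from here on x lives on GPU 0, so the layers that consume it must be on GPU 0 as well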
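As a complement to the snippet above, here is a minimal sketch of the device-agnostic pattern, which moves both the model and its input with `.to(device)` and falls back to the CPU when no GPU is visible. `TinyModel` and its layer sizes are hypothetical and only serve the illustration; they are not taken from the notes.

```python
import torch
import torch.nn as nn

# Use GPU 0 when CUDA is available, otherwise stay on the CPU.
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")


class TinyModel(nn.Module):  # hypothetical model, for illustration only
    def __init__(self):
        super().__init__()
        self.embedding = nn.Embedding(10, 4)  # vocabulary of 10, 4-dim vectors
        self.linear = nn.Linear(4, 2)

    def forward(self, x):
        return self.linear(self.embedding(x))


model = TinyModel().to(device)                  # moves every parameter to `device`
tokens = torch.tensor([[1, 2, 3]]).to(device)   # inputs must be on the same device
out = model(tokens)
print(tuple(out.shape), out.device)
```

Moving the whole model once, rather than calling cuda() inside forward, keeps every layer and its input on the same device and lets the same script run on a machine without a GPU.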

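2. Multi-GPU computation

Multi-GPU computation can be set up with the functions provided by torch.nn.parallel. The idea and workflow break down into the following steps:

(1) replicate copies the model onto multiple devices;

(2) scatter splits input_data along its first dimension and distributes the chunks across the devices;

(3) parallel_apply runs each model replica from step (1) on its chunk of the input from step (2);

(4) gather collects the outputs from the devices and concatenates them on one device.

In code, the workflow looks like this: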

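import torch.nn as nn

def data_parallel(module, input, device_ids, output_device=None):
    # No GPUs requested: just run the module as-is.
    if not device_ids:
        return module(input)
    # By default, collect the results on the first device.
    if output_device is None:
        output_device = device_ids[0]

    replicas = nn.parallel.replicate(module, device_ids)   # (1) copy the model
    inputs = nn.parallel.scatter(input, device_ids)        # (2) split the input
    replicas = replicas[:len(inputs)]                      # small batches may use fewer devices
    outputs = nn.parallel.parallel_apply(replicas, inputs) # (3) run each replica on its chunk
    return nn.parallel.gather(outputs, output_device)      # (4) concatenate the outputs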
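In day-to-day code the same replicate / scatter / parallel_apply / gather workflow is usually reached through the `torch.nn.DataParallel` wrapper. The following is a minimal sketch, assuming a plain `nn.Linear` model and a random batch (both made up for illustration); it only wraps the model when more than one GPU is visible, so it also runs on a single GPU or on the CPU.

```python
import torch
import torch.nn as nn

model = nn.Linear(8, 2)  # hypothetical model, for illustration only

if torch.cuda.device_count() > 1:
    # DataParallel splits each batch along dim 0 across the visible GPUs
    # and gathers the outputs on the first one.
    model = nn.DataParallel(model.cuda())
elif torch.cuda.is_available():
    model = model.cuda()

# Put the batch on the same device as the model's parameters.
device = next(model.parameters()).device
batch = torch.randn(16, 8).to(device)

out = model(batch)
print(tuple(out.shape))
```

The output batch size matches the input batch size regardless of how many devices shared the work, because the per-device results are concatenated back along the first dimension.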

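3. PyTorch core

PyTorch has two main features:

1. n-dimensional tensors that can run on the GPU

2. automatic differentiation for building and training models

(To specify a data type: dtype = torch.LongTensor, x = torch.randn(m,n).type(dtype)  # dtype can be changed as needed)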
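To make the data type note concrete, here is a small sketch showing the `.type(...)` call from the notes next to two alternatives that exist in current PyTorch: `.to(torch.long)` for converting an existing tensor, and the `dtype=` keyword for choosing the type at creation time. The sizes `m` and `n` are arbitrary example values.

```python
import torch

m, n = 2, 3  # arbitrary example sizes

# Style used in the notes: convert through a tensor type.
dtype = torch.LongTensor
x_old = torch.randn(m, n).type(dtype)

# Equivalent conversion using a torch.dtype object.
x_new = torch.randn(m, n).to(torch.long)

# Choosing the dtype when the tensor is created.
x_direct = torch.zeros(m, n, dtype=torch.float64)

print(x_old.dtype, x_new.dtype, x_direct.dtype)
```

Converting random floats to a long tensor discards the fractional part, so this conversion is mainly useful for index-like data.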

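PyTorch implements automatic differentiation with a computational graph. Tensor and Variable are two very similar objects; the difference is that using a Variable defines a computational graph. The official documentation describes their relationship as follows: "We wrap our PyTorch Tensors in Variable objects; a Variable represents a node in a computational graph. If x is a Variable then x.data is a Tensor, and x.grad is another Variable holding the gradient of x with respect to some scalar value." When Autograd is used, the forward pass defines a computational graph.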
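A minimal sketch of autograd in action follows. Note that since PyTorch 0.4 the Variable wrapper has been merged into Tensor, so the example marks a plain tensor with `requires_grad=True` instead of wrapping it; the values are arbitrary and chosen so the gradient is easy to check by hand.

```python
import torch

# A leaf tensor that autograd should track.
x = torch.tensor([1.0, 2.0, 3.0], requires_grad=True)

# The forward pass builds the computational graph: y = sum(x_i ** 2).
y = (x ** 2).sum()

# The backward pass walks the graph and fills in x.grad with dy/dx = 2 * x.
y.backward()

print(x.grad)  # tensor([2., 4., 6.])
```

Here `y` is the scalar the quoted passage refers to, and `x.grad` holds the gradient of that scalar with respect to `x`.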

Posted 2018-03-15 14:56 by htyd
