【pytorch】torch.nn.DataParallel用法详解

参考博客:
https://blog.csdn.net/baidu_35120637/article/details/110785801
https://blog.csdn.net/zhjm07054115/article/details/104799661
https://blog.csdn.net/anshiquanshu/article/details/108186955

在多卡的GPU服务器,当我们在上面跑程序的时候,当迭代次数或者epoch足够大的时候,我们通常会使用nn.DataParallel函数来用多个GPU来加速训练。一般我们会在代码中加入以下这些:

model = model.cuda() 
device_ids = [0, 1] 	# id为01的两块显卡
model = torch.nn.DataParallel(model, device_ids=device_ids)

或者

device_ids = [0, 1]
model = torch.nn.DataParallel(model, device_ids=device_ids).cuda()

函数定义:

CLASS torch.nn.DataParallel(module, device_ids=None, output_device=None, dim=0)

Parameters 参数:
module即表示你定义的模型;
device_ids表示你训练的device;
output_device这个参数表示输出结果的device;
而这最后一个参数output_device一般情况下是省略不写的,那么默认就是在device_ids[0],也就是第一块卡上,也就解释了为什么第一块卡的显存会占用的比其他卡要更多一些

多GPU 负载不均衡参考解决方案:
https://blog.csdn.net/weixin_40087578/article/details/87186613

torch.nn.DataParallel源码解读:

class DataParallel(Module):
    
 
    def __init__(self, module, device_ids=None, output_device=None, dim=0):
        super(DataParallel, self).__init__()
 
        device_type = _get_available_device_type()
        if device_type is None:
            self.module = module
            self.device_ids = []
            return
 
        if device_ids is None:
            device_ids = _get_all_device_indices()
 
        if output_device is None:
            output_device = device_ids[0]
 
        self.dim = dim
        self.module = module
        self.device_ids = list(map(lambda x: _get_device_index(x, True), device_ids))
        self.output_device = _get_device_index(output_device, True)
        self.src_device_obj = torch.device(device_type, self.device_ids[0])
 
        _check_balance(self.device_ids)
 
        if len(self.device_ids) == 1:
            self.module.to(self.src_device_obj)
 
    def forward(self, *inputs, **kwargs):
        if not self.device_ids:
            return self.module(*inputs, **kwargs)
 
        for t in chain(self.module.parameters(), self.module.buffers()):
            if t.device != self.src_device_obj:
                raise RuntimeError("module must have its parameters and buffers "
                                   "on device {} (device_ids[0]) but found one of "
                                   "them on device: {}".format(self.src_device_obj, t.device))
 
        inputs, kwargs = self.scatter(inputs, kwargs, self.device_ids)
        if len(self.device_ids) == 1:
            return self.module(*inputs[0], **kwargs[0])
        replicas = self.replicate(self.module, self.device_ids[:len(inputs)])
        outputs = self.parallel_apply(replicas, inputs, kwargs)
        return self.gather(outputs, self.output_device)
 
    def replicate(self, module, device_ids):
        return replicate(module, device_ids, not torch.is_grad_enabled())
 
    def scatter(self, inputs, kwargs, device_ids):
        return scatter_kwargs(inputs, kwargs, device_ids, dim=self.dim)
 
    def parallel_apply(self, replicas, inputs, kwargs):
        return parallel_apply(replicas, inputs, kwargs, self.device_ids[:len(replicas)])
 
    def gather(self, outputs, output_device):
        return gather(outputs, output_device, dim=self.dim)
 
 
[docs]def data_parallel(module, inputs, device_ids=None, output_device=None, dim=0, module_kwargs=None):
    r"""Evaluates module(input) in parallel across the GPUs given in device_ids.
    This is the functional version of the DataParallel module.
    Args:
        module (Module): the module to evaluate in parallel
        inputs (Tensor): inputs to the module
        device_ids (list of int or torch.device): GPU ids on which to replicate module
        output_device (list of int or torch.device): GPU location of the output  Use -1 to indicate the CPU.
            (default: device_ids[0])
    Returns:
        a Tensor containing the result of module(input) located on
        output_device
    """
    if not isinstance(inputs, tuple):
        inputs = (inputs,)
 
    device_type = _get_available_device_type()
 
    if device_ids is None:
        device_ids = _get_all_device_indices()
 
    if output_device is None:
        output_device = device_ids[0]
 
    device_ids = list(map(lambda x: _get_device_index(x, True), device_ids))
    output_device = _get_device_index(output_device, True)
    src_device_obj = torch.device(device_type, device_ids[0])
 
    for t in chain(module.parameters(), module.buffers()):
        if t.device != src_device_obj:
            raise RuntimeError("module must have its parameters and buffers "
                               "on device {} (device_ids[0]) but found one of "
                               "them on device: {}".format(src_device_obj, t.device))
 
    inputs, module_kwargs = scatter_kwargs(inputs, module_kwargs, device_ids, dim)
    if len(device_ids) == 1:
        return module(*inputs[0], **module_kwargs[0])
    used_device_ids = device_ids[:len(inputs)]
    replicas = replicate(module, used_device_ids)
    outputs = parallel_apply(replicas, inputs, module_kwargs, used_device_ids)
    return gather(outputs, output_device, dim)
  • 55
    点赞
  • 197
    收藏
    觉得还不错? 一键收藏
  • 3
    评论
torch.nn.Linear是PyTorch中的一个模块,用于定义一个线性层。它接受两个参数,即输入和输出的维度。通过调用torch.nn.Linear(input_dim, output_dim),可以创建一个线性层,其中input_dim是输入的维度,output_dim是输出的维度。Linear模块的主要功能是执行线性变换,将输入数据乘以权重矩阵,并加上偏置向量。这个函数的具体实现可以参考PyTorch官方文档中的链接。 在引用中的示例中,linear1是一个Linear模块的实例。可以通过print(linear1.weight.data)来查看linear1的权重。示例中给出了权重的具体数值。 在引用中的示例中,x是一个Linear模块的实例,输入维度为5,输出维度为2。通过调用x(data)来计算线性变换的结果。在这个示例中,输入data的维度是(5,5),输出的维度是(5,2)。可以使用torch.nn.functional.linear函数来实现与torch.nn.Linear相同的功能,其中weight和bias分别表示权重矩阵和偏置向量。 以上是关于torch.nn.Linear的一些介绍和示例。如果需要更详细的信息,可以参考PyTorch官方文档中关于torch.nn.Linear的说明。 https://pytorch.org/docs/stable/_modules/torch/nn/modules/linear.html<span class="em">1</span><span class="em">2</span><span class="em">3</span> #### 引用[.reference_title] - *1* [torch.nn.Linear详解](https://blog.csdn.net/sazass/article/details/123568203)[target="_blank" data-report-click={"spm":"1018.2226.3001.9630","extra":{"utm_source":"vip_chatgpt_common_search_pc_result","utm_medium":"distribute.pc_search_result.none-task-cask-2~all~insert_cask~default-1-null.142^v92^chatsearchT0_1"}}] [.reference_item style="max-width: 33.333333333333336%"] - *2* [torch.nn.Linear](https://blog.csdn.net/weixin_41620490/article/details/127833324)[target="_blank" data-report-click={"spm":"1018.2226.3001.9630","extra":{"utm_source":"vip_chatgpt_common_search_pc_result","utm_medium":"distribute.pc_search_result.none-task-cask-2~all~insert_cask~default-1-null.142^v92^chatsearchT0_1"}}] [.reference_item style="max-width: 33.333333333333336%"] - *3* [pytorch 笔记:torch.nn.Linear() VS torch.nn.function.linear()](https://blog.csdn.net/qq_40206371/article/details/124473437)[target="_blank" data-report-click={"spm":"1018.2226.3001.9630","extra":{"utm_source":"vip_chatgpt_common_search_pc_result","utm_medium":"distribute.pc_search_result.none-task-cask-2~all~insert_cask~default-1-null.142^v92^chatsearchT0_1"}}] [.reference_item style="max-width: 33.333333333333336%"] [ .reference_list ]
评论 3
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值