[swin-trans] Debugging distributed training: ValueError: Error initializing torch.distributed using env:// rendezvous: en
When calling torch.distributed.init_process_group(backend='nccl', init_method='env://', world_size=world_size, rank=rank), the following errors occurred:
1. ValueError: Error initializing torch.distributed using env:// rendezvous: environment variable MASTER_ADDR expected, but not set
Solution
Add the following line before the call to init_process_group (it also needs import os):
os.environ['MASTER_ADDR'] = 'localhost'

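2. ValueError: Error initializing torch.distributed using env:// rendezvous: environment variable MASTER_PORT expected, but not set
Solution
Add the following line before the call to init_process_group, using a port that is free on the machine: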
os.environ['MASTER_PORT'] = '12345'
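
Putting the two fixes together, a minimal sketch might look like the code below. The helper names ensure_rendezvous_env and init_distributed are hypothetical (they are not part of PyTorch or the Swin Transformer code), and the sketch assumes a single-machine run in which 'localhost' and port 12345 are acceptable defaults. It uses os.environ.setdefault so that values already exported by a launcher are not overwritten.

```python
import os


def ensure_rendezvous_env(addr="localhost", port="12345"):
    """Hypothetical helper: set MASTER_ADDR / MASTER_PORT only if missing."""
    # setdefault keeps any value that was already exported by a launcher.
    os.environ.setdefault("MASTER_ADDR", addr)
    os.environ.setdefault("MASTER_PORT", str(port))
    return os.environ["MASTER_ADDR"], os.environ["MASTER_PORT"]


def init_distributed(rank, world_size, backend="nccl"):
    """Hypothetical wrapper around the call from the post."""
    # Imported lazily so the helper above works even without torch installed.
    import torch.distributed as dist

    # The variables must exist before init_process_group reads env://.
    ensure_rendezvous_env()
    dist.init_process_group(
        backend=backend,
        init_method="env://",
        world_size=world_size,
        rank=rank,
    )
```

In each process, init_distributed(rank, world_size) would then replace the direct call to init_process_group. When the script is started with torchrun, the launcher sets MASTER_ADDR and MASTER_PORT itself, so the defaults here would only take effect when the script is run directly.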

