Fix BN tests for >= 8 GPU test environments (#19049)

Summary:
DDP does not support replicating BN layers within a process. Existing BN tests fail if the test environment has more than 8 GPUs. This is fixed by explicitly setting each process to use a single replica.
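The fix boils down to giving DDP a single-element device list per process instead of all GPUs visible to that rank. A minimal sketch of that per-rank device selection (the helper name is hypothetical, not from the patch):

```python
def single_replica_devices(rank):
    """Map a DDP process (identified by its rank) to exactly one GPU index.

    SyncBatchNorm requires one module replica per process, so instead of
    passing DDP the full list of GPUs visible to the rank, we pass a
    single-element device list derived from the rank itself.
    """
    return [rank]  # suitable as device_ids for DistributedDataParallel
```

With this, a test environment with any number of GPUs still runs one replica per process, so the BN tests no longer depend on the host having at most 8 GPUs.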
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19049

Differential Revision: D14845286

Pulled By: mrshenli

fbshipit-source-id: 937dda5081d415ece48b21f2781b6b4e008dd42f
This commit is contained in:
Shen Li
2019-04-09 08:01:18 -07:00
committed by Facebook Github Bot
parent 17adce1b69
commit 544783fa1d


@@ -1516,7 +1516,9 @@ class _DistTestBase(object):
     def test_DistributedDataParallel_SyncBatchNorm(self):
         group, group_id, rank = self._init_global_test()
         rank_to_GPU = self._init_multigpu_helper()
-        gpus = list(rank_to_GPU[rank])
+        # DDP does not support replicating BN layers within a process, hence
+        # testing with one module replica per process
+        gpus = [rank]
         self._test_DistributedDataParallel_SyncBatchNorm(gpu_subset=gpus, rank=rank)
         # test output_device