[pytorch] The difference between DistributedDataParallel and DataParallel
study&grow
2021. 1. 4. 08:25
The difference between DistributedDataParallel and DataParallel is that DistributedDataParallel uses multiprocessing, creating one process per GPU, while DataParallel uses multithreading. With multiprocessing, each GPU gets its own dedicated process, which avoids the performance overhead caused by the GIL of the Python interpreter.
If you use DistributedDataParallel, you can use the torch.distributed.launch utility to launch your program (see the torch.distributed documentation).
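As a rough sketch of what the quoted passage describes, this is roughly how a training script is started with torch.distributed.launch and how it picks up the rank the launcher passes in. The script name train.py, the GPU count of 4, and the nccl backend are assumptions for illustration, not details from the original post.

```python
# Launched with something like (assumed single node with 4 GPUs):
#   python -m torch.distributed.launch --nproc_per_node=4 train.py
import argparse

import torch
import torch.distributed as dist

parser = argparse.ArgumentParser()
parser.add_argument("--local_rank", type=int, default=0)  # filled in by torch.distributed.launch
args = parser.parse_args()

torch.cuda.set_device(args.local_rank)   # bind this process to its own GPU
dist.init_process_group(backend="nccl")  # reads MASTER_ADDR/PORT, RANK, WORLD_SIZE set by the launcher
```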
DistributedDataParallel: uses multiprocessing
DataParallel: uses multithreading
As far as I know, Python has a hard time getting a speedup out of multithreading because of the GIL. So let's use DistributedDataParallel.
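Continuing the launch sketch above, here is a minimal example (assumed for illustration, not from the original post) of wrapping a model in DistributedDataParallel and running one training step; the model, data, and hyperparameters are placeholders.

```python
# Continues the sketch above: args.local_rank and the process group are already set up.
import torch
from torch.nn.parallel import DistributedDataParallel as DDP

device = torch.device("cuda", args.local_rank)

model = torch.nn.Linear(10, 1).to(device)            # placeholder model
ddp_model = DDP(model, device_ids=[args.local_rank])

optimizer = torch.optim.SGD(ddp_model.parameters(), lr=0.01)
loss_fn = torch.nn.MSELoss()

# In a real job each process would load its own shard of the data
# (e.g. via DistributedSampler); random tensors stand in for that here.
inputs = torch.randn(32, 10, device=device)
targets = torch.randn(32, 1, device=device)

optimizer.zero_grad()
loss = loss_fn(ddp_model(inputs), targets)
loss.backward()   # gradients are all-reduced (averaged) across processes during backward
optimizer.step()  # each process applies the same averaged gradients
```

Because each GPU runs in its own process, the Python-level work in one process never contends with the GIL of the others, which is exactly the point made above.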
pytorch.org/docs/stable/notes/cuda.html#cuda-nn-ddp-instead