Commit Graph

385 Commits

SHA1 Message Date
54107ae8cf convert output_device in data_parallel from torch.device to index (#10189)
Summary:
- fixes #9984
Pull Request resolved: https://github.com/pytorch/pytorch/pull/10189

Differential Revision: D9545390

Pulled By: weiyangfb

fbshipit-source-id: 3a6a705437553ba319e9fd4b7f676ff73857a27e
2018-09-11 20:27:07 -07:00
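
A hedged illustration of the change above (not part of the commit message): after this fix, `output_device` should be accepted either as an integer index or as a `torch.device`. A minimal sketch, assuming a machine with at least one CUDA GPU:

```python
# Minimal sketch: output_device can be given as an index or a torch.device;
# the torch.device form is converted to an index internally.
import torch
import torch.nn as nn

model = nn.Linear(10, 5).cuda()

dp_by_index = nn.DataParallel(model, output_device=0)
dp_by_device = nn.DataParallel(model, output_device=torch.device('cuda:0'))

inp = torch.randn(8, 10).cuda()
out = dp_by_device(inp)   # outputs are gathered onto cuda:0 in both cases
```
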
0988bbad2d C10d release to torch.distributed for PT1 (#11405)
Summary:
The old `torch.distributed` will go to `torch.distributed.deprecated`
The old DDP will go to `torch.nn.parallel.deprecated`

Now `torch.nn.parallel.DDP` will use the c10d DDP
Now `torch.distributed` will use the c10d frontend API
Pull Request resolved: https://github.com/pytorch/pytorch/pull/11405

Reviewed By: pietern

Differential Revision: D9733733

Pulled By: teng-li

fbshipit-source-id: d6a3f3e73f8d3a7fcb1f4baef53c78063b8cbb08
2018-09-10 23:27:22 -07:00
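
A hedged sketch of the API surface described in this commit, assuming the usual MASTER_ADDR, MASTER_PORT, RANK, and WORLD_SIZE environment variables are set by a per-process launcher:

```python
# Minimal sketch of the post-c10d usage: torch.distributed is the c10d
# frontend, and nn.parallel.DistributedDataParallel is the c10d-backed DDP.
import torch
import torch.distributed as dist
import torch.nn as nn

dist.init_process_group(backend='gloo')   # c10d frontend; use 'nccl' for GPU training

model = nn.Linear(10, 5)
ddp = nn.parallel.DistributedDataParallel(model)   # now backed by c10d

# The previous implementations remain importable under the deprecated
# namespaces: torch.distributed.deprecated and torch.nn.parallel.deprecated.
```
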
afd7477eaa Add `buffers(), named_buffers()` methods. (#10554)
Summary:
This commit adds the ``buffers()`` and ``named_buffers()`` methods as
analogues of ``parameters()`` and ``named_parameters()``.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/10554

Reviewed By: SsnL

Differential Revision: D9367762

Pulled By: jma127

fbshipit-source-id: f2042e46a7e833dce40cb41681dbd80d7885c74e
2018-08-16 16:26:48 -07:00
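
A minimal sketch of the new accessors, mirroring how ``parameters()`` and ``named_parameters()`` are used:

```python
# buffers() yields the buffer tensors; named_buffers() yields (name, tensor) pairs.
import torch.nn as nn

bn = nn.BatchNorm1d(4)

for name, buf in bn.named_buffers():
    print(name, tuple(buf.shape))    # e.g. running_mean, running_var, ...

print(sum(1 for _ in bn.buffers()))  # iterate over the buffer tensors themselves
```
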
a77b391de7 [SpectralNorm] don't register original weight as buffer (#8170)
* don't register original weight as buffer; fixes for buffers that require grad

* add test
2018-06-12 14:42:05 -04:00
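
A hedged sketch of the behavior this fix targets: the original weight survives as a parameter (`weight_orig`) rather than a buffer, so it can still receive gradients.

```python
# Minimal sketch: after the fix, 'weight_orig' is a parameter and gets gradients.
import torch
import torch.nn as nn

m = nn.utils.spectral_norm(nn.Linear(20, 40))

print('weight_orig' in dict(m.named_parameters()))   # True

m(torch.randn(8, 20)).sum().backward()
print(m.weight_orig.grad is not None)                # True: gradients flow to it
```
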
52e4d3c4a2 add error when backend is not supported by DDP (#8325) 2018-06-11 02:18:30 -04:00
537cb10525 improve DataParallel/DistributedDataParallel docs (#7407) 2018-05-09 10:30:42 +02:00
5463a4a319 Fix typo. (#6609) 2018-04-15 11:43:10 +02:00
1499a604cf fix assertion error when input size smaller than number of module_copies (#6252) 2018-04-04 12:05:34 +02:00
f5aa8d55ad fix in-place detach error in DDP (#5829)
* fix detach in DDP

* fix typo

* make lint happy
2018-03-16 09:22:04 -04:00
579de82bcf DDP: 10% NCCL backend perf improvement with mixed-precision support (#5064) 2018-02-21 23:59:52 +01:00
4b8f4fc259 Added mixed-precision support in distributed training (#4891) 2018-02-21 14:29:39 +01:00
cac3026b35 Fix typo in DataParallel docs (#5268) 2018-02-15 23:02:26 +01:00
d7b6a61a54 DDP: coalescing many little broadcasts to improve performance (#4978) 2018-02-12 16:41:33 +01:00
805639906a Broadcast output requires_grad only if the corresponding input requires_grad (#5061) 2018-02-05 23:38:35 -05:00
ae28411af8 Slightly improve DDP single-GPU multi-process distributed training performance 2018-01-27 12:15:44 +01:00
154038e318 Removing NCCL clear_group_cache workaround with one more check in new_group (#4766) 2018-01-23 11:03:52 +01:00
d605058212 Replace Variable.volatile with torch.no_grad() (#3970)
This removes volatile from Variable. The functionality is mostly
replaced by a global (thread-local) flag, which is controlled by
torch.set_grad_enabled() and the context manager torch.no_grad().

In C++, the flag is exposed through GradMode::is_enabled() and GradMode::set_enabled().

Fixes #3627
2017-12-18 15:46:13 -05:00
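
A minimal sketch of the replacement API described above: a thread-local grad mode instead of a per-Variable volatile flag.

```python
import torch

x = torch.randn(3, requires_grad=True)

with torch.no_grad():             # context-manager form
    y = x * 2
print(y.requires_grad)            # False: nothing is recorded inside the block

torch.set_grad_enabled(False)     # global (thread-local) switch
z = x * 2
print(z.requires_grad)            # False
torch.set_grad_enabled(True)      # restore the default
```
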
7f41149e14 handle requires_grad when creating buckets for distributed (#4044) 2017-12-18 02:13:53 -05:00
926ed2b280 Implemented NCCL Distributed Backend for PyTorch with new dist APIs (#3435)
* Implemented NCCL Distributed Backend for PyTorch with new dist APIs

* Let FindNCCL determine the NCCL version

* Let NCCL2 Backend use ATen instead of the deprecated THPP

* Let distributed parallel model use a single reduction thread for NCCL backend

* Caching the sockets, bug fix, refactoring, and addressed Adam's comments

* Make BcastNcclID take a single param and bug fix for all_gather

* Removed the barrier function, added a warning for users, and stopped exposing experimental functions to users

* Use the simplest single-bucket working solution for the distributed data parallel model, with rebase

* Cleanup, fixes and further addressed Adam's comments

* Used PySequence_Fast in distributed csrc

* Removed the limitation that each group is only bound to a given device sequence

* Used THPObjectPtr for PySequence_Fast
2017-11-29 15:57:02 -05:00
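
A hedged sketch of a NCCL-backed collective, assuming one process per GPU and MASTER_ADDR/MASTER_PORT/RANK/WORLD_SIZE provided by the launcher:

```python
# Minimal sketch: select the NCCL backend and run an all_reduce on GPU tensors.
import torch
import torch.distributed as dist

dist.init_process_group(backend='nccl')
rank = dist.get_rank()
torch.cuda.set_device(rank % torch.cuda.device_count())

t = torch.ones(4, device='cuda') * (rank + 1)
dist.all_reduce(t)                # sums the tensor across all ranks via NCCL
print(rank, t)
```
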
01be4d6b20 sparse broadcast_coalesced and reduce_add_coalesced 2017-10-28 18:52:35 -04:00
de1f4e69dd raw text (#3327) 2017-10-28 01:24:02 +05:30
6743d59513 Add missing import. Add return to __getstate__ 2017-10-08 11:07:10 -04:00
5f8bab47c8 bugfix for issue 2428 (#3000) 2017-10-06 09:20:12 -04:00
7aa6bc516f add "Basics" section to distributed docs (#2433) 2017-08-24 17:07:20 -04:00
5d09fcd028 Make DistributedDataParallel threads Daemon threads to allow clean process exit (#2524) 2017-08-24 06:32:29 -04:00
4c69697d2a Distributed bug fixes. (#2434) 2017-08-23 14:46:52 -04:00
5c43fcda8d Support params that don’t require grad in DistributedDataParallel (#2464) 2017-08-19 11:22:20 -04:00
9199c954f1 Fix typo in DistributedDataParallel (#2320) 2017-08-08 21:53:42 -04:00
dc17fb68e4 Fix minor bug in parallel_apply (#2193) 2017-07-25 03:45:00 +05:30
8ab3d214d5 Fixes for DistributedDataParallel (#2168) 2017-07-21 16:00:46 -04:00
4af40e3471 Let parallel_apply accept arbitrary inputs 2017-07-20 01:45:57 -04:00
10e23943b3 Fix missing _forward_pre_hooks in serialized modules (#2057) 2017-07-11 18:23:35 -04:00
46a868dab7 [Ready] Limit docs line length (#1900)
* some docs are ready

* docs

* docs

* fix some more

* fix some more
2017-07-10 10:24:54 -04:00
d9d50f80c7 Rename arguments to distributed collectives 2017-06-12 22:02:11 -04:00
12813b88f6 Add DistributedDataParallel 2017-06-12 22:00:22 -04:00