Mirror of https://github.com/pytorch/pytorch.git, synced 2025-10-20 21:14:14 +08:00
Implements #148689
Pull Request resolved: https://github.com/pytorch/pytorch/pull/149334
Approved by: https://github.com/d4l3k
Co-authored-by: Paul de Supinski <pdesupinski@gmail.com>
601 B
Torch Distributed Elastic
Makes distributed PyTorch fault-tolerant and elastic.
Get Started
:caption: Usage
:maxdepth: 1
elastic/quickstart
elastic/train_script
elastic/examples
Documentation
:caption: API
:maxdepth: 1
elastic/run
elastic/agent
elastic/multiprocessing
elastic/errors
elastic/rendezvous
elastic/timer
elastic/metrics
elastic/events
elastic/subprocess_handler
elastic/control_plane
elastic/numa
:caption: Advanced
:maxdepth: 1
elastic/customization
:caption: Plugins
:maxdepth: 1
elastic/kubernetes