Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/13109
The "right" strategy of creating a socket, binding it to an unspecified port, closing the socket, and reusing the port it was bound to was subject to a race condition. Another process could bind to that port before the tests did, causing an "Address already in use" failure when rank 0 tried to bind to it. The THD tests have always used a fixed port. Time will tell if this fixes #12876.
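For illustration, the race described above can be reproduced with a minimal sketch along these lines. It is not the test code from this PR: the helper names `find_free_port` and `port_is_stolen` are hypothetical, and the "other process" is simulated by a second socket in the same process.

```python
import socket


def find_free_port():
    # Hypothetical helper: bind to port 0 so the OS picks a free port,
    # read the port number back, then close the socket.
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.bind(("127.0.0.1", 0))
        return s.getsockname()[1]


def port_is_stolen():
    # Hypothetical demonstration of the race window.
    port = find_free_port()

    # Between closing the probe socket and rebinding, anything may take
    # the port. Here a second socket stands in for the other process.
    intruder = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        intruder.bind(("127.0.0.1", port))
        intruder.listen(1)

        # "Rank 0" now tries to bind to the port it was told was free.
        rank0 = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        try:
            rank0.bind(("127.0.0.1", port))
            return False  # bind succeeded, no conflict observed
        except OSError:
            return True  # typically "Address already in use"
        finally:
            rank0.close()
    finally:
        intruder.close()


if __name__ == "__main__":
    print("port stolen before rank 0 could bind:", port_is_stolen())
```

The sketch only shows that the port number returned by the probe carries no reservation once the probe socket is closed.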
Differential Revision: D10850614
fbshipit-source-id: c19f12bb4916141187ee8ddb52880f5f418310dc