pytorch/torch/lib/c10d/ProcessGroup.cpp
Dmytro Dzhulgakov c25e33789e Lightweight at-most-once logging for API usage (#20745)
Summary:
Resubmission of #20698, which got messed up.

The idea is that when PyTorch is used in a custom build environment (e.g. Facebook), it's useful to track usage of various APIs centrally. This PR introduces a simple, very lightweight mechanism to do so: only the first invocation of a trigger point is logged. This is significantly more lightweight than #18235, so we can afford to put logging in hot paths such as TensorImpl.
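
To make the "only the first invocation is logged" behaviour concrete, here is a minimal sketch of one way such a trigger point could work, assuming a function-local static whose initializer performs the logging. All names (`sketch::usageLogger`, `sketch::logUsage`, `SKETCH_LOG_API_USAGE_ONCE`, `Group`) are hypothetical and are not taken from c10.

```cpp
#include <functional>
#include <iostream>
#include <string>

// Hypothetical names throughout; this is an illustration, not the c10 code.
namespace sketch {

// Replaceable sink for usage events; the default writes to stderr.
inline std::function<void(const std::string&)>& usageLogger() {
  static std::function<void(const std::string&)> logger =
      [](const std::string& event) {
        std::cerr << "API usage: " << event << "\n";
      };
  return logger;
}

// Forwards the event to the sink and returns a dummy value so that the call
// can be used as the initializer of a static variable.
inline bool logUsage(const std::string& event) {
  usageLogger()(event);
  return true;
}

} // namespace sketch

// A function-local static is initialized the first time control passes
// through its declaration (thread-safe since C++11), so the logger runs at
// most once per trigger point. Nothing runs at program load time.
#define SKETCH_LOG_API_USAGE_ONCE(event)                        \
  do {                                                          \
    static bool sketch_logged_once = ::sketch::logUsage(event); \
    (void)sketch_logged_once;                                   \
  } while (0)

// Example trigger point: constructing any number of Groups logs one event.
struct Group {
  Group() {
    SKETCH_LOG_API_USAGE_ONCE("sketch.group");
  }
};
```

After the first call, each later pass through the trigger point costs only the guard check on the static. That low per-call cost is what would make such a macro acceptable in frequently executed code under this assumed design.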

This PR also adds an initial list of trigger points. They are placed so that no static initialization triggers them, i.e. just linking with libtorch.so will not cause any logging. Further suggestions of what to log are welcome.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20745

Differential Revision: D15429196

Pulled By: dzhulgakov

fbshipit-source-id: a5e41a709a65b7ebccc6b95f93854e583cf20aca
2019-05-23 23:17:59 -07:00

56 lines
1.3 KiB
C++

#include <c10d/ProcessGroup.hpp>

#include <c10/util/Logging.h>

namespace c10d {

ProcessGroup::Work::~Work() {}

// The accessors below take the mutex so that they observe a consistent
// view of the state written by finish().
bool ProcessGroup::Work::isCompleted() {
  std::lock_guard<std::mutex> lock(mutex_);
  return completed_;
}

bool ProcessGroup::Work::isSuccess() const {
  std::lock_guard<std::mutex> lock(mutex_);
  return !exception_;
}

std::exception_ptr ProcessGroup::Work::exception() const {
  std::lock_guard<std::mutex> lock(mutex_);
  return exception_;
}

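// Base implementation always throws; per the message, only work objects for
// recv or recv-from-any calls support this query.
int ProcessGroup::Work::sourceRank() const {
  throw std::runtime_error(
      "sourceRank() may only be called on work objects "
      "that correspond to a recv or recv-from-any call.");
}
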
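void ProcessGroup::Work::synchronize() {}

// Blocks until finish() marks the work as completed, rethrows any stored
// exception, and otherwise calls synchronize().
void ProcessGroup::Work::wait() {
  std::unique_lock<std::mutex> lock(mutex_);
  cv_.wait(lock, [&] { return completed_; });
  if (exception_) {
    std::rethrow_exception(exception_);
  }
  synchronize();
}
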
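// Records the outcome and wakes all waiters. The lock is released before
// notifying so that woken threads can acquire the mutex immediately.
void ProcessGroup::Work::finish(std::exception_ptr exception) {
  std::unique_lock<std::mutex> lock(mutex_);
  completed_ = true;
  exception_ = exception;
  lock.unlock();
  cv_.notify_all();
}
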
ProcessGroup::ProcessGroup(int rank, int size) : rank_(rank), size_(size) {
  // Trigger point: logged only on the first invocation.
  C10_LOG_API_USAGE_ONCE("c10d.process_group");
}

ProcessGroup::~ProcessGroup() {}

} // namespace c10d
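
The handshake between `finish()` and `wait()` above can be tried in isolation. The following is a minimal standalone sketch, assuming a hypothetical `MiniWork` class that mirrors the same mutex and condition-variable pattern. It is not c10d's `Work`, and `waitRethrowsWorkerError` is an illustrative helper only.

```cpp
#include <condition_variable>
#include <exception>
#include <mutex>
#include <stdexcept>
#include <thread>

// Hypothetical stand-in for the pattern used by ProcessGroup::Work.
class MiniWork {
 public:
  // Blocks until finish() has been called, then rethrows a stored exception.
  void wait() {
    std::unique_lock<std::mutex> lock(mutex_);
    cv_.wait(lock, [&] { return completed_; });
    if (exception_) {
      std::rethrow_exception(exception_);
    }
  }

  // Records the outcome under the lock, then notifies without holding it.
  void finish(std::exception_ptr exception = nullptr) {
    {
      std::lock_guard<std::mutex> lock(mutex_);
      completed_ = true;
      exception_ = exception;
    }
    cv_.notify_all();
  }

 private:
  std::mutex mutex_;
  std::condition_variable cv_;
  bool completed_ = false;
  std::exception_ptr exception_;
};

// Completes the work with an error on a worker thread and reports whether
// the waiting thread saw that error rethrown from wait().
inline bool waitRethrowsWorkerError() {
  MiniWork work;
  std::thread worker([&work] {
    work.finish(std::make_exception_ptr(std::runtime_error("boom")));
  });
  bool rethrown = false;
  try {
    work.wait();
  } catch (const std::runtime_error&) {
    rethrown = true;
  }
  worker.join();
  return rethrown;
}
```

Because `wait()` uses a predicate, it returns correctly whether `finish()` runs before or after the waiter blocks, so the sketch does not depend on thread scheduling.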