pytorch/torch/csrc/utils/cpp_stacktraces.cpp
zdevito 772ae6da1e Fast standalone symbolize for unwinding (#123966)
We've had issues using addr2line. Certain versions of CentOS ship an
addr2line with a performance regression that makes it very slow,
and even normally it is not that fast, taking several seconds even when parallelized
for a typical memory trace dump.

Folly Symbolize and LLVMSymbolize are fast, but using them requires PyTorch to take a dependency on those libraries, and given the number of environments we run in, we end up hitting cases where we fall back to the slow addr2line behavior.

This adds a standalone symbolizer to PyTorch. Like the unwinder, it has
no external dependencies, and it is ~20x faster than addr2line for unwinding PyTorch frames.

I've tested this on some memory profiling runs using all combinations of {gcc, clang} x {dwarf4, dwarf5} and it seems to do a good job of getting line numbers and function names right. It is also careful to route all reads of library data through the `CheckedLexer` object, which ensures it does not read outside the bounds of the section. Errors are routed through UnwindError so that those exceptions get caught and we produce a ?? frame rather than crashing. I also added a fuzz test which gives all our symbolizer options random addresses in the process to make sure they do not crash.
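
To illustrate the bounds-checking idea described above, here is a minimal sketch. It is not the code from this PR: `SketchCheckedLexer`, `SketchUnwindError`, and `read_u32_or_unknown` are hypothetical names invented for this example, and the real `CheckedLexer` and `UnwindError` may differ in interface and behavior. The sketch only shows the general pattern, in which every read is checked against the end of the section and a failed check turns into a `??` result instead of a crash.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical stand-in for UnwindError; the real type lives in PyTorch.
struct SketchUnwindError : std::runtime_error {
  using std::runtime_error::runtime_error;
};

// Hypothetical bounds-checked reader over the bytes of one section.
class SketchCheckedLexer {
 public:
  SketchCheckedLexer(const uint8_t* begin, const uint8_t* end)
      : next_(begin), end_(end) {}

  // Reads one fixed-size value, or throws if it would run past the end.
  template <typename T>
  T read() {
    if (static_cast<size_t>(end_ - next_) < sizeof(T)) {
      throw SketchUnwindError("read out of bounds of section");
    }
    T value;
    std::memcpy(&value, next_, sizeof(T)); // avoids unaligned access
    next_ += sizeof(T);
    return value;
  }

 private:
  const uint8_t* next_;
  const uint8_t* end_;
};

// Caller-side pattern: a failed read becomes a "??" result, not a crash.
std::string read_u32_or_unknown(const std::vector<uint8_t>& section) {
  try {
    SketchCheckedLexer lexer(section.data(), section.data() + section.size());
    return std::to_string(lexer.read<uint32_t>());
  } catch (const SketchUnwindError&) {
    return "??";
  }
}

// Usage: a truncated section degrades to "??"; a complete one is decoded.
void check_lexer_examples() {
  assert(read_u32_or_unknown(std::vector<uint8_t>{1, 0, 0}) == "??");
  assert(read_u32_or_unknown(std::vector<uint8_t>{0, 0, 0, 0}) == "0");
}
```

Keeping the bounds check in a single reader type means malformed or truncated debug data can only fail in one place, which is also what makes fuzzing with random inputs a useful test.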
Pull Request resolved: https://github.com/pytorch/pytorch/pull/123966
Approved by: https://github.com/ezyang
2024-04-23 15:27:18 +00:00

78 lines
1.8 KiB
C++

#include <torch/csrc/utils/cpp_stacktraces.h>
#include <cstdlib>
#include <cstring>
#include <string>
#include <c10/util/Exception.h>
namespace torch {
namespace {
// Reads TORCH_SHOW_CPP_STACKTRACES. Only "1" enables C++ stack traces;
// "0", an unset variable, or an invalid value (with a warning) disables them.
bool compute_cpp_stack_traces_enabled() {
  auto envar = std::getenv("TORCH_SHOW_CPP_STACKTRACES");
  if (envar) {
    if (std::strcmp(envar, "0") == 0) {
      return false;
    }
    if (std::strcmp(envar, "1") == 0) {
      return true;
    }
    TORCH_WARN(
        "ignoring invalid value for TORCH_SHOW_CPP_STACKTRACES: ",
        envar,
        " valid values are 0 or 1.");
  }
  return false;
}
// Reads TORCH_DISABLE_ADDR2LINE with the same 0/1 parsing rules as above.
bool compute_disable_addr2line() {
  auto envar = std::getenv("TORCH_DISABLE_ADDR2LINE");
  if (envar) {
    if (std::strcmp(envar, "0") == 0) {
      return false;
    }
    if (std::strcmp(envar, "1") == 0) {
      return true;
    }
    TORCH_WARN(
        "ignoring invalid value for TORCH_DISABLE_ADDR2LINE: ",
        envar,
        " valid values are 0 or 1.");
  }
  return false;
}
} // namespace
// The environment is read once, on first call; the result is then cached.
bool get_cpp_stacktraces_enabled() {
  static bool enabled = compute_cpp_stack_traces_enabled();
  return enabled;
}
// Resolves the symbolizer backend. TORCH_SYMBOLIZE_MODE takes precedence;
// when it is unset, TORCH_DISABLE_ADDR2LINE chooses between dladdr and
// addr2line.
static torch::unwind::Mode compute_symbolize_mode() {
  auto envar_c = std::getenv("TORCH_SYMBOLIZE_MODE");
  if (envar_c) {
    std::string envar = envar_c;
    if (envar == "dladdr") {
      return unwind::Mode::dladdr;
    } else if (envar == "addr2line") {
      return unwind::Mode::addr2line;
    } else if (envar == "fast") {
      return unwind::Mode::fast;
    } else {
      // Unlike the 0/1 flags above, an unrecognized mode is a hard error.
      TORCH_CHECK(
          false,
          "expected {dladdr, addr2line, fast} for TORCH_SYMBOLIZE_MODE, got ",
          envar);
    }
  } else {
    return compute_disable_addr2line() ? unwind::Mode::dladdr
                                       : unwind::Mode::addr2line;
  }
}
// The mode is computed once, on first call, and cached for the process.
unwind::Mode get_symbolize_mode() {
  static unwind::Mode mode = compute_symbolize_mode();
  return mode;
}
} // namespace torch
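
The precedence between `TORCH_SYMBOLIZE_MODE` and `TORCH_DISABLE_ADDR2LINE` in the file above can be sketched without any PyTorch dependency. The block below is an illustration only: `SketchMode` and `select_mode` are hypothetical names, the two parameters stand in for the `std::getenv` results (with `nullptr` meaning unset), and the warning that the real code emits for an invalid `TORCH_DISABLE_ADDR2LINE` value is omitted. A `std::invalid_argument` stands in for the `TORCH_CHECK` failure.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Hypothetical local enum mirroring the three modes named in the source.
enum class SketchMode { dladdr, addr2line, fast };

// mode_env and disable_env model getenv() results; nullptr means "unset".
SketchMode select_mode(const char* mode_env, const char* disable_env) {
  if (mode_env != nullptr) {
    const std::string mode = mode_env;
    if (mode == "dladdr") {
      return SketchMode::dladdr;
    }
    if (mode == "addr2line") {
      return SketchMode::addr2line;
    }
    if (mode == "fast") {
      return SketchMode::fast;
    }
    // An explicit but unrecognized mode is rejected rather than ignored.
    throw std::invalid_argument(
        "expected {dladdr, addr2line, fast}, got " + mode);
  }
  // Only "1" disables addr2line; "0", unset, or any other value keeps it.
  const bool disable =
      disable_env != nullptr && std::string(disable_env) == "1";
  return disable ? SketchMode::dladdr : SketchMode::addr2line;
}

// Usage: the explicit mode wins over the disable flag when both are set.
void check_mode_examples() {
  assert(select_mode(nullptr, nullptr) == SketchMode::addr2line);
  assert(select_mode(nullptr, "1") == SketchMode::dladdr);
  assert(select_mode("fast", "1") == SketchMode::fast);
}
```

Note that, as the source reads, the `fast` symbolizer is only selected when `TORCH_SYMBOLIZE_MODE=fast` is set explicitly; with both variables unset the result is `addr2line`.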