[kineto] Optimize getStepCallbacks for common case of no active callbacks

Pull Request resolved: https://github.com/pytorch/pytorch/pull/77804

IIUC, the result of this function will be empty and unused if there are no sampled callbacks, which is the common case. We can accelerate this case by adding getStepCallbacksUnlessEmpty, which wraps the result in an optional so that callers skip initializing an empty SmallVector.
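The idea described above can be sketched in isolation. This is a minimal illustration only, not the ATen implementation: `StepCallbacksSketch`, `activeCallbacksSketch`, `getStepCallbacksUnlessEmptySketch`, and `runOnceSketch` are hypothetical stand-ins, `std::vector` substitutes for SmallVector, and `std::optional` (C++17) substitutes for whatever optional type the real code uses.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for the real step-callbacks type (assumption).
struct StepCallbacksSketch {
  std::vector<std::string> callbacks;  // stands in for a SmallVector of callbacks
  bool empty() const { return callbacks.empty(); }
};

// Hypothetical registry of callbacks sampled for the current step (assumption).
inline std::vector<std::string>& activeCallbacksSketch() {
  static std::vector<std::string> active;
  return active;
}

// Returns std::nullopt on the common fast path, so no result object
// (and no empty container) is constructed when nothing is active.
inline std::optional<StepCallbacksSketch> getStepCallbacksUnlessEmptySketch() {
  const auto& active = activeCallbacksSketch();
  if (active.empty()) {
    return std::nullopt;
  }
  StepCallbacksSketch result{active};
  assert(!result.empty());  // invariant: an engaged optional is never empty
  return result;
}

// Call-site pattern mirroring the benchmark change: test has_value(),
// then move the payload out. Returns how many callbacks would have run.
inline std::size_t runOnceSketch() {
  auto step_callbacks = getStepCallbacksUnlessEmptySketch();
  if (step_callbacks.has_value()) {
    StepCallbacksSketch owned = std::move(*step_callbacks);
    return owned.callbacks.size();
  }
  return 0;
}
```

With no callbacks registered, `runOnceSketch()` takes the `nullopt` branch and never builds a `StepCallbacksSketch`. The caller also no longer needs a separate `empty()` check, because an engaged optional already implies a non-empty result.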

Differential Revision: [D36497279](https://our.internmc.facebook.com/intern/diff/D36497279/)

**NOTE FOR REVIEWERS**: This PR has internal Facebook specific changes or comments, please review them on [Phabricator](https://our.internmc.facebook.com/intern/diff/D36497279/)!

Approved by: https://github.com/robieta
Scott Wolchok
2022-05-19 15:26:22 -07:00
committed by PyTorch MergeBot
parent 02c4d877b4
commit c083489f46
7 changed files with 59 additions and 25 deletions


@@ -1,3 +1,4 @@
#include <torch/torch.h>
#include <ATen/record_function.h>
@@ -49,9 +50,9 @@ float runPureRecordFunctionBench(int iter) {
   typedef std::chrono::microseconds us;
   std::chrono::time_point<clock> start_time = clock::now();
   for (auto idx = 0; idx < iter; ++idx) {
-    auto step_callbacks = at::getStepCallbacks(at::RecordScope::USER_SCOPE);
-    if (!step_callbacks.empty()) {
-      at::RecordFunction guard(std::move(step_callbacks));
+    auto step_callbacks = at::getStepCallbacksUnlessEmpty(at::RecordScope::USER_SCOPE);
+    if (step_callbacks.has_value()) {
+      at::RecordFunction guard(std::move(*step_callbacks));
       guard.before("Test", -1);
     }
   }