mirror of
https://github.com/pytorch/pytorch.git
synced 2025-10-21 05:34:18 +08:00
Flight recorder data as JSON (#129505)
Summary:
Provide a new API to retrieve flight recorder data as JSON. The one minor difference between the Pickle and JSON variants is that the JSON API does not currently retrieve stack traces, because including them produces far too much data.

Test Plan: unit test

Differential Revision: [D59536460](https://our.internmc.facebook.com/intern/diff/D59536460)

Pull Request resolved: https://github.com/pytorch/pytorch/pull/129505
Approved by: https://github.com/wconstab, https://github.com/d4l3k
committed by PyTorch MergeBot
parent 86bca69c5f
commit 83c95c48f7
@@ -3194,6 +3194,23 @@ such as `dist.all_reduce(tensor, async_op=True)`.
       Arguments:
         tensors(List[torch.Tensor]): List of tensors we want to hash.
       )");
+  module.def(
+      "_dump_nccl_trace_json",
+      [](std::optional<bool> includeCollectives,
+         std::optional<bool> onlyActive) {
+        return py::bytes(::c10d::dump_nccl_trace_json(
+            includeCollectives.value_or(true), onlyActive.value_or(false)));
+      },
+      py::arg("includeCollectives") = std::optional<bool>(),
+      py::arg("onlyActive") = std::optional<bool>(),
+      R"(
+      Arguments:
+            includeCollectives(bool, optional): Whether to include collective work traces. Default is True.
+            onlyActive (bool, optional): Whether to only include active collective work traces. Default is False.
+      Returns:
+            Stringified json work traces.
+            Default settings return everything - i.e. contains NCCL comm dumps and collective traces.
+      )");
   module.def(
       "_dump_nccl_trace",
       [](std::optional<bool> includeCollectives,
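The binding in this hunk returns `py::bytes`, so a Python caller has to decode the payload before using it. The following is a minimal sketch of how that might look, not the project's own helper. It assumes the binding is exposed as `torch._C._distributed_c10d._dump_nccl_trace_json` (the module path is not shown in this diff) and that PyTorch was built with NCCL support. The names `dump_flight_recorder_json` and `decode_trace` are hypothetical, and the sample payload is a placeholder rather than the real trace schema.

```python
import json
from typing import Any, Optional


def dump_flight_recorder_json(
    include_collectives: Optional[bool] = None,
    only_active: Optional[bool] = None,
) -> bytes:
    """Call the binding added in this commit and return its raw bytes.

    Assumption: the binding lives at
    torch._C._distributed_c10d._dump_nccl_trace_json and needs an NCCL build.
    Passing None lets the C++ side apply its defaults via value_or
    (includeCollectives=True, onlyActive=False).
    """
    # Imported lazily so this file can be loaded without an NCCL-enabled build.
    from torch._C._distributed_c10d import _dump_nccl_trace_json

    return _dump_nccl_trace_json(
        includeCollectives=include_collectives,
        onlyActive=only_active,
    )


def decode_trace(raw: bytes) -> Any:
    """Decode the bytes payload into plain Python objects."""
    return json.loads(raw.decode("utf-8"))


if __name__ == "__main__":
    # Placeholder payload only; the real schema is defined by the C++ side.
    sample = b'{"placeholder": "not the real schema"}'
    print(decode_trace(sample))
```

Keeping the decode step separate from the binding call means the parsing logic can be exercised on saved dumps, without a live process group.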