13 Commits

Author SHA1 Message Date
d8c8ba2440 Fix unused Python variables in test/[e-z]* (#136964)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/136964
Approved by: https://github.com/justinchuby, https://github.com/albanD
2024-12-18 23:02:30 +00:00
9459952175 [ci, 3.13] update tensorboard version for 3.13 to fix broken tests (#141572)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/141572
Approved by: https://github.com/StrongerXi, https://github.com/atalman
ghstack dependencies: #141409, #142003
2024-12-05 00:24:07 +00:00
c93dd531d3 format test_monitor.py and test_tensorboard.py (#142003)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/142003
Approved by: https://github.com/StrongerXi, https://github.com/atalman
ghstack dependencies: #141409
2024-12-05 00:23:54 +00:00
fca2dba7ca [pytorch][counters] Pybind for WaitCounter (#132357)
Summary:
Basic pybind integration for WaitCounter providing a guard API.
Also fixes broken copy/move constructor in WaitGuard (it wasn't really used with the macro-based C++ API).

Test Plan: unit test

Differential Revision: D60557660

Pull Request resolved: https://github.com/pytorch/pytorch/pull/132357
Approved by: https://github.com/jamesperng, https://github.com/asiab4
2024-08-02 16:08:10 +00:00
dc38646c58 Revert "[pytorch][counters] Pybind for WaitCounter (#132167)"
This reverts commit 2c7bd61afa4b762e00b26bbde43685de080af32a.

Reverted https://github.com/pytorch/pytorch/pull/132167 on behalf of https://github.com/clee2000 due to broke test_public_bindings.py::TestPublicBindings::test_correct_module_names [GH job link](https://github.com/pytorch/pytorch/actions/runs/10183687967/job/28172929836) [HUD commit link](2c7bd61afa) not tested on PR due to bad TD ([comment](https://github.com/pytorch/pytorch/pull/132167#issuecomment-2261328275))
2024-07-31 19:51:56 +00:00
2c7bd61afa [pytorch][counters] Pybind for WaitCounter (#132167)
Summary:
Basic pybind integration for WaitCounter providing a guard API.
Also fixes broken copy/move constructor in WaitGuard (it wasn't really used with the macro-based C++ API).

Test Plan: unit test

Reviewed By: asiab4

Differential Revision: D60463979

Pull Request resolved: https://github.com/pytorch/pytorch/pull/132167
Approved by: https://github.com/asiab4
2024-07-31 16:04:40 +00:00
fb06ed36d1 Change dynamo_test_failures.py to silently run skipped tests (#117401)
- We silently run skipped tests and then raise a skip message with the
  error message (if any)
- Instead of raising expectedFailure, we raise a skip message with the
  error message (if any)

We log the skip messages in CI, so this will let us read the logs and do
some basic triaging of the failure messages.

Test Plan:
- existing tests. I hope that there are no tests that cause each other
  to fail.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/117401
Approved by: https://github.com/voznesenskym
ghstack dependencies: #117391, #117400
2024-01-17 02:48:19 +00:00
6208c2800e torch/monitor: merge Interval and FixedCount stats (#72009)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/72009

This simplifies the Stats interface by merging IntervalStat and FixedCountStat into a single Stat w/ a specific window size duration and an optional max samples per window. This allows for the original intention of having comparably sized windows (for statistical purposes) while also having a consistent output bandwidth.

Test Plan:
```
buck test //caffe2/test:monitor //caffe2/test/cpp/monitor:monitor
```

Reviewed By: kiukchung

Differential Revision: D33822956

fbshipit-source-id: a74782492421be613a1a8b14341b6fb2e8eeb8b4
(cherry picked from commit 293b94e0b4646521ffe047e5222c4bba7e688464)
2022-01-30 23:21:59 +00:00
7aa4a1f63e torch/monitor: TensorboardEventHandler (#71658)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71658

This adds the beginnings of a TensorboardEventHandler which will log stats to Tensorboard.

Test Plan: buck test //caffe2/test:monitor

Reviewed By: edward-io

Differential Revision: D33719954

fbshipit-source-id: e9847c1319255ce0d9cf2d85d8b54b7a3c681bd2
(cherry picked from commit 5c8520a6baea51db02e4e29d0210b3ced60fa18d)
2022-01-27 08:33:55 +00:00
26d54b4076 monitor: add docstrings to pybind interface (#71481)
Summary:
This adds argument names and docstrings so the docs are a lot more understandable.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/71481

Test Plan:
docs/tests CI should suffice

![Screenshot 2022-01-19 at 16-35-10 torch monitor — PyTorch master documentation](https://user-images.githubusercontent.com/909104/150240882-e69cfa17-e2be-4569-8ced-71979a89b369.png)

Reviewed By: edward-io

Differential Revision: D33661255

Pulled By: d4l3k

fbshipit-source-id: 686835dfe331b92a51f4409ec37f8ee6211e49d3
(cherry picked from commit 0a6accda1bec839bbc9387d80caa51194e81d828)
2022-01-21 23:04:33 +00:00
b40dbdc49f Fix test ownership lint (#71554)
Summary:
I noticed after creating https://github.com/pytorch/pytorch/issues/71553 that the test ownership lint was not working properly.

This fixes my egregious mistake and fixes the broken lints.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/71554

Reviewed By: malfet

Differential Revision: D33690732

Pulled By: janeyx99

fbshipit-source-id: ba4dfbcd98038e4afd63e326832ae40935d2501e
(cherry picked from commit 1bbc3d343ac143f10b3d4052496812fccfd9e853)
2022-01-21 18:24:42 +00:00
2eb4b05b94 torch/monitor: make tests more robust on windows (#71581)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71581

Fixes https://github.com/pytorch/pytorch/issues/71553

Test Plan:
add ciflow/windows to CI

  buck test //caffe2/test:monitor -- --stress-runs 100 test_interval_sec

I don't have a windows machine so need to rely on CI to test

Reviewed By: edward-io

Differential Revision: D33691540

fbshipit-source-id: 69f28f1dfa7243e4eeda642f9bef6d5d168381d2
(cherry picked from commit 5d24dc7c2f5e8e0f48fdd602b1eaa3a8e6929715)
2022-01-20 14:33:41 -08:00
bfe1abd3b5 torch/monitor: add pybind (#69567)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/69567

This exposes torch.monitor events and stats via pybind11 to the underlying C++ implementation.

* The registration interface is a tad different since it takes a lambda function in Python where as in C++ it's a full class.
* This has a small amount of changes to the counter interfaces since there's no way to create an initializer list at runtime so they now also take a vector.
* Only double based stats are provided in Python since it's intended more for high level stats where float imprecision shouldn't be an issue. This can be changed down the line if need arises.

```
events = []

def handler(event):
    events.append(event)

handle = register_event_handler(handler)

log_event(Event(type="torch.monitor.TestEvent", timestamp=datetime.now(), metadata={"foo": 1.0}))
```

D32969391 is now included in this diff.
This cleans up the naming for events. type is now name, message is gone, and metadata is renamed data.

Test Plan: buck test //caffe2/test:monitor //caffe2/test/cpp/monitor:monitor

Reviewed By: kiukchung

Differential Revision: D32924141

fbshipit-source-id: 563304c2e3261a4754e40cca39fc64c5a04b43e8
2022-01-12 13:35:11 -08:00