Summary:
add_histogram fails for this data type. Updating conversion code to handle it.
Stack trace for the failure -
`
[trainer0]Traceback (most recent call last):
[trainer0] File "<torch_package_0>.tensorboard/logging/summary_v2.py", line 203, in unscriptable_record_summary
[trainer0] unscriptable_histogram(name, t, step, ranks)
[trainer0] File "<torch_package_0>.tensorboard/logging/fx_v1.py", line 146, in unscriptable_histogram
[trainer0] Adhoc.writer().add_histogram(tag, x, step.int())
[trainer0] File "/tmp/aienv/images/aienv_image_09slg3j1/torch/utils/tensorboard/writer.py", line 40, in wrapper
[trainer0] resp = super_method(*args, **kwargs)
[trainer0] File "/tmp/aienv/images/aienv_image_09slg3j1/torch/utils/tensorboard/writer_oss.py", line 526, in add_histogram
[trainer0] histogram(tag, values, bins, max_bins=max_bins), global_step, walltime
[trainer0] File "/tmp/aienv/images/aienv_image_09slg3j1/torch/utils/tensorboard/summary.py", line 482, in histogram
[trainer0] values = make_np(values)
[trainer0] File "/tmp/aienv/images/aienv_image_09slg3j1/torch/utils/tensorboard/_convert_np.py", line 23, in make_np
[trainer0] return _prepare_pytorch(x)
[trainer0] File "/tmp/aienv/images/aienv_image_09slg3j1/torch/utils/tensorboard/_convert_np.py", line 30, in _prepare_pytorch
[trainer0] x = x.detach().cpu().numpy()
[trainer0]TypeError: Got unsupported ScalarType BFloat16
`
Test Plan: Updated unit test that was failing before but passes after this change.
Reviewed By: hamzajzmati, jcarreiro
Differential Revision: D53841197
Pull Request resolved: https://github.com/pytorch/pytorch/pull/120087
Approved by: https://github.com/jcarreiro, https://github.com/yanboliang
https://github.com/pytorch/pytorch/pull/108351/files is failing on mac and windows because we don't have the dependency
It is available on linux because it is included in .ci/docker/requirements-docs.txt
Adding skips to make it green.
Here are some outputs for future debugging
https://github.com/pytorch/pytorch/actions/runs/6192933622/job/16813841625https://ossci-raw-job-status.s3.amazonaws.com/log/16813841625
```
2023-09-15T02:09:43.2397460Z =================================== FAILURES ===================================
2023-09-15T02:09:43.2397650Z ______________________ TestTensorBoardSummary.test_audio _______________________
2023-09-15T02:09:43.2397830Z Traceback (most recent call last):
2023-09-15T02:09:43.2398090Z File "/Users/ec2-user/runner/_work/pytorch/pytorch/test/test_tensorboard.py", line 417, in test_audio
2023-09-15T02:09:43.2398390Z self.assertTrue(compare_proto(summary.audio('dummy', tensor_N(shape=(42,))), self))
2023-09-15T02:09:43.2398720Z File "/Users/ec2-user/runner/_work/_temp/conda_environment_6192933622/lib/python3.9/unittest/case.py", line 688, in assertTrue
2023-09-15T02:09:43.2399100Z ##[endgroup]
2023-09-15T02:09:43.2399240Z raise self.failureException(msg)
2023-09-15T02:09:43.2399400Z AssertionError: False is not true
2023-09-15T02:09:43.2399490Z
2023-09-15T02:09:43.2399590Z To execute this test, run the following from the base repo dir:
2023-09-15T02:09:43.2399820Z python test/test_tensorboard.py -k test_audio
2023-09-15T02:09:43.2399930Z
```
https://github.com/pytorch/pytorch/actions/runs/6192933622/job/16814065258https://ossci-raw-job-status.s3.amazonaws.com/log/16814065258
```
2023-09-15T02:38:44.6284979Z ================================== FAILURES ===================================
2023-09-15T02:38:44.6285295Z ______________________ TestTensorBoardNumpy.test_scalar _______________________
2023-09-15T02:38:44.6285556Z Traceback (most recent call last):
2023-09-15T02:38:44.6285915Z File "C:\actions-runner\_work\pytorch\pytorch\test\test_tensorboard.py", line 794, in test_scalar
2023-09-15T02:38:44.6286325Z res = make_np(np.float128(1.00008 + 9))
2023-09-15T02:38:44.6286705Z File "C:\Jenkins\Miniconda3\lib\site-packages\numpy\__init__.py", line 315, in __getattr__
2023-09-15T02:38:44.6287700Z raise AttributeError("module {!r} has no attribute "
2023-09-15T02:38:44.6288060Z AttributeError: module 'numpy' has no attribute 'float128'
2023-09-15T02:38:44.6288241Z
2023-09-15T02:38:44.6288390Z To execute this test, run the following from the base repo dir:
2023-09-15T02:38:44.6288679Z python test\test_tensorboard.py -k test_scalar
2023-09-15T02:38:44.6288846Z
```
https://github.com/pytorch/pytorch/actions/runs/6193449301/job/16815113985https://ossci-raw-job-status.s3.amazonaws.com/log/16815113985
```
2023-09-15T03:25:53.7797550Z =================================== FAILURES ===================================
2023-09-15T03:25:53.7797790Z __________________ TestTensorBoardSummary.test_histogram_auto __________________
2023-09-15T03:25:53.7798000Z Traceback (most recent call last):
2023-09-15T03:25:53.7798310Z File "/Users/ec2-user/runner/_work/pytorch/pytorch/test/test_tensorboard.py", line 426, in test_histogram_auto
2023-09-15T03:25:53.7798690Z self.assertTrue(compare_proto(summary.histogram('dummy', tensor_N(shape=(1024,)), bins='auto', max_bins=5), self))
2023-09-15T03:25:53.7799090Z File "/Users/ec2-user/runner/_work/_temp/conda_environment_6193449301/lib/python3.9/unittest/case.py", line 688, in assertTrue
2023-09-15T03:25:53.7799430Z raise self.failureException(msg)
2023-09-15T03:25:53.7799610Z AssertionError: False is not true
2023-09-15T03:25:53.7799720Z
2023-09-15T03:25:53.7799840Z To execute this test, run the following from the base repo dir:
2023-09-15T03:25:53.7800170Z python test/test_tensorboard.py -k test_histogram_auto
2023-09-15T03:25:53.7800310Z
2023-09-15T03:25:53.7800430Z This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0
2023-09-15T03:25:53.7800870Z - generated xml file: /Users/ec2-user/runner/_work/pytorch/pytorch/test/test-reports/python-pytest/test_tensorboard/test_tensorboard-aef95b5e2d69c061.xml -
2023-09-15T03:25:53.7801200Z =========================== short test summary info ============================
```
https://github.com/pytorch/pytorch/actions/runs/6193576371/job/16815396352https://ossci-raw-job-status.s3.amazonaws.com/log/16815396352
```
2023-09-15T03:47:02.9430070Z _________________ TestTensorBoardSummary.test_histogram_doane __________________
2023-09-15T03:47:02.9430250Z Traceback (most recent call last):
2023-09-15T03:47:02.9430520Z File "/Users/ec2-user/runner/_work/pytorch/pytorch/test/test_tensorboard.py", line 433, in test_histogram_doane
2023-09-15T03:47:02.9430850Z self.assertTrue(compare_proto(summary.histogram('dummy', tensor_N(shape=(1024,)), bins='doane', max_bins=5), self))
2023-09-15T03:47:02.9431180Z File "/Users/ec2-user/runner/_work/_temp/conda_environment_6193576371/lib/python3.9/unittest/case.py", line 688, in assertTrue
2023-09-15T03:47:02.9431390Z raise self.failureException(msg)
2023-09-15T03:47:02.9431550Z AssertionError: False is not true
2023-09-15T03:47:02.9431640Z
2023-09-15T03:47:02.9431730Z To execute this test, run the following from the base repo dir:
2023-09-15T03:47:02.9432000Z python test/test_tensorboard.py -k test_histogram_doane
2023-09-15T03:47:02.9432120Z
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/109349
Approved by: https://github.com/huydhn
Summary:
The `tensor_proto` function in the TensorBoard summary writer code doesn't correctly encode `torch.bfloat16` tensors; it tries to use a data type of `DT_BFLOAT` when creating the protobuf, but `DT_BFLOAT` is not a valid enum value (see `types.proto`). The correct value to use when encoding tensors of this type is `DT_BLOAT16`. This diff updates the type map in the summary code to use the correct type.
While fixing this error, I also noticed the wrong field of the protobuf was being used when encoding tensors of this type; per the docs in the proto file, the DT_HALF and DT_BFLOAT16 types should use the `half_val` field, not `float_val`. Since this might confuse folks trying to read this data from storage in the future, I've updated the code to correctly use to `half_val` field for these cases. Note that there's no real size advantage from doing this, since both the `half_val` and `float_val` fields are 32 bits long.
Test Plan:
Added a parameterized unit test that tests encoding tensors with `torch.half`, `torch.float16`, and `torch.bfloat16` data types.
# Before this change
The test fails with an `ValueError` due to the incorrect enum label:
```
======================================================================
ERROR: test_bfloat16_tensor_proto (test_tensorboard.TestTensorProtoSummary)
----------------------------------------------------------------------
Traceback (most recent call last):
File "/data/users/jcarreiro/fbsource/buck-out/v2/gen/fbcode/f88b3f368c9334db/caffe2/test/__tensorboard__/tensorboard#link-tree/torch/testing/_internal/common_utils.py", line 2382, in wrapper
method(*args, **kwargs)
File "/data/users/jcarreiro/fbsource/buck-out/v2/gen/fbcode/f88b3f368c9334db/caffe2/test/__tensorboard__/tensorboard#link-tree/test_tensorboard.py", line 871, in test_bfloat16_tensor_proto
tensor_proto(
File "/data/users/jcarreiro/fbsource/buck-out/v2/gen/fbcode/f88b3f368c9334db/caffe2/test/__tensorboard__/tensorboard#link-tree/torch/utils/tensorboard/summary.py", line 400, in tensor_proto
tensor_proto = TensorProto(**tensor_proto_args)
ValueError: unknown enum label "DT_BFLOAT"
To execute this test, run the following from the base repo dir:
python test/__tensorboard__/tensorboard#link-tree/test_tensorboard.py -k test_bfloat16_tensor_proto
This message can be suppressed by setting PYTORCH_PRINT_REPRO_ON_FAILURE=0
----------------------------------------------------------------------
```
# After this change
The test passes.
Reviewed By: tanvigupta17
Differential Revision: D48828958
Pull Request resolved: https://github.com/pytorch/pytorch/pull/108351
Approved by: https://github.com/hamzajzmati, https://github.com/XilunWu
The old `temp_dir` is created under `PWD`. But `PWD` may not be writable and in general is not a good place to create temporary directories. Use the standard `tempfile` instead.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/89826
Approved by: https://github.com/soumith
crossref is a new strategy for performing tests when you want
to run a normal PyTorch API call, separately run some variation of
the API call (e.g., same thing but all the arguments are meta tensors)
and then cross-reference the results to see that they are consistent.
Any logic you add to CrossRefMode will get run on *every* PyTorch API
call that is called in the course of PyTorch's test suite. This can
be a good choice for correctness testing if OpInfo testing is not
exhaustive enough.
For now, the crossref test doesn't do anything except verify that
we can validly push a mode onto the torch function mode stack for all
functions.
Signed-off-by: Edward Z. Yang <ezyangfb.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/75988
Approved by: https://github.com/seemethere
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/71708
In Python 3.2, a number of asserts were deprecated.
In Python 3.11, these asserts are deleted completely. The files in this change still use the deprecated asserts.
Switch over to the supported syntax for 3.2 onwards.
Test Plan: Tested on the internal test suite runner.
Reviewed By: ajtulloch
Differential Revision: D33503694
fbshipit-source-id: a150f296033260acf8365d77b837ce0679f57361
(cherry picked from commit abf60ed97409265222915d8265aaabedd625fd93)
Summary:
CAFFE2 has been deprecated for a while, but still included in every PyTorch build.
We should stop building it by default, although CI should still validate that caffe2 code is buildable.
Build even fewer dependencies when compiling mobile builds without Caffe2
Introduce `TEST_CAFFE2` in torch.common.utils
Skip `TestQuantizedEmbeddingOps` and `TestJit.test_old_models_bc` is code is compiled without Caffe2
Should be landed after https://github.com/pytorch/builder/pull/864
Pull Request resolved: https://github.com/pytorch/pytorch/pull/66658
Reviewed By: driazati, seemethere, janeyx99
Differential Revision: D31669156
Pulled By: malfet
fbshipit-source-id: 1cc45e2d402daf913a4685eb9f841cc3863e458d
Summary:
As this diff shows, currently there are a couple hundred instances of raw `noqa` in the codebase, which just ignore all errors on a given line. That isn't great, so this PR changes all existing instances of that antipattern to qualify the `noqa` with respect to a specific error code, and adds a lint to prevent more of this from happening in the future.
Interestingly, some of the examples the `noqa` lint catches are genuine attempts to qualify the `noqa` with a specific error code, such as these two:
```
test/jit/test_misc.py:27: print(f"{hello + ' ' + test}, I'm a {test}") # noqa E999
test/jit/test_misc.py:28: print(f"format blank") # noqa F541
```
However, those are still wrong because they are [missing a colon](https://flake8.pycqa.org/en/3.9.1/user/violations.html#in-line-ignoring-errors), which actually causes the error code to be completely ignored:
- If you change them to anything else, the warnings will still be suppressed.
- If you add the necessary colons then it is revealed that `E261` was also being suppressed, unintentionally:
```
test/jit/test_misc.py:27:57: E261 at least two spaces before inline comment
test/jit/test_misc.py:28:35: E261 at least two spaces before inline comment
```
I did try using [flake8-noqa](https://pypi.org/project/flake8-noqa/) instead of a custom `git grep` lint, but it didn't seem to work. This PR is definitely missing some of the functionality that flake8-noqa is supposed to provide, though, so if someone can figure out how to use it, we should do that instead.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/56272
Test Plan:
CI should pass on the tip of this PR, and we know that the lint works because the following CI run (before this PR was finished) failed:
- https://github.com/pytorch/pytorch/runs/2365189927
Reviewed By: janeyx99
Differential Revision: D27830127
Pulled By: samestep
fbshipit-source-id: d6dcf4f945ebd18cd76c46a07f3b408296864fcb
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/40720
Add support for populating domain_discrete field in TensorBoard add_hparams API
Test Plan: Unit test test_hparams_domain_discrete
Reviewed By: edward-io
Differential Revision: D22291347
fbshipit-source-id: 78db9f62661c9fe36cd08d563db0e7021c01428d
Summary: Add support for populating domain_discrete field in TensorBoard add_hparams API
Test Plan: Unit test test_hparams_domain_discrete
Reviewed By: edward-io
Differential Revision: D22227939
fbshipit-source-id: d2f0cd8e5632cbcc578466ff3cd587ee74f847af
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/39483
I fixed all of the new errors that occurred because of the upgrade.
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
Test Plan: Imported from OSS
Differential Revision: D21884575
Pulled By: ezyang
fbshipit-source-id: 45c8e1f1ecb410c8d7c46dd3922ad70e982a0685
Summary:
This updates assertEqual and assertEqual-like functions to either require both or neither of atol and rtol be specified. This should improve clarity around handling precision in the test suite, and it allows us to remove the legacy positional atol argument from assertEqual. In addition, the "message" kwarg is replace with a kwarg-only "msg" argument whose name is consistent with unittest's assertEqual argument.
In the future we could make "msg" an optional third positional argument to be more consistent with unittest's assertEqual, but requiring it be specified should be clear, and we can easily update the signature to make "msg" an optional positional argument in the future, too.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/38872
Differential Revision: D21740237
Pulled By: mruberry
fbshipit-source-id: acbc027aa1d7877a49664d94db9a5fff91a07042
Summary:
This updates assertEqual and assertEqual-like functions to either require both or neither of atol and rtol be specified. This should improve clarity around handling precision in the test suite, and it allows us to remove the legacy positional atol argument from assertEqual. In addition, the "message" kwarg is replace with a kwarg-only "msg" argument whose name is consistent with unittest's assertEqual argument.
In the future we could make "msg" an optional third positional argument to be more consistent with unittest's assertEqual, but requiring it be specified should be clear, and we can easily update the signature to make "msg" an optional positional argument in the future, too.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/38872
Differential Revision: D21717199
Pulled By: mruberry
fbshipit-source-id: 9feb856f94eee911b44f6c7140a1d07c1b026d3a
Summary:
The root cause of incorrect rendering is that numbers are treated as a string if the data type is not specified. Therefore the data is sort based on the first digit.
closes https://github.com/pytorch/pytorch/issues/29906
cc orionr sanekmelnikov
Pull Request resolved: https://github.com/pytorch/pytorch/pull/31544
Differential Revision: D21105403
Pulled By: natalialunova
fbshipit-source-id: a676ff5ab94c5bdb653615d43219604e54747e56
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/35615
Python 2 has reached end-of-life and is no longer supported by PyTorch.
Now we can clean up a lot of cruft that we put in place to support it.
These changes were all done manually, and I skipped anything that seemed
like it would take more than a few seconds, so I think it makes sense to
review it manually as well (though using side-by-side view and ignoring
whitespace change might be helpful).
Test Plan: CI
Differential Revision: D20842886
Pulled By: dreiss
fbshipit-source-id: 8cad4e87c45895e7ce3938a88e61157a79504aed
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/30445
Create distributed and rpc directories under caffe/test for better management
of unit tests.
Differential Revision: D18702786
fbshipit-source-id: e9daeed0cfb846ef68806f6decfcb57c0e0e3606
Summary:
When calling the add_images() method on the tensorboard SummaryWriter with a uint8 NCHW tensor, the tensor is incorrectly scaled, resulting in overflow behavior. This leads to incorrect images being displayed in tensorboard.
Issue: https://github.com/pytorch/pytorch/issues/31459
Local Testing (ran this code with and without the PR changes and printed scale_factor):
import torch
import torchvision
from torch.utils.tensorboard import SummaryWriter
writer = SummaryWriter()
x=torch.tensor([[[[1, 2, 3], [4, 5, 6]]]], dtype=torch.uint8)
writer.add_images("images", x)
Before- scale_factor: 255, After- scale_factor: 1
Pull Request resolved: https://github.com/pytorch/pytorch/pull/31778
Differential Revision: D19289189
Pulled By: anjali411
fbshipit-source-id: 350a1650337244deae4fd8f8b7fb0e354ae6986b
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/30298
This diff fixes test_tensorboard for python2:
- proto serialization is different in py2 vs py3 (e.g. for bytes) -> simple string comparison will fail for test_pytorch_graph. Modified to make graph comparison field by field
Reviewed By: J0Nreynolds
Differential Revision: D18654691
fbshipit-source-id: fdbca32e9a7fc2ea70a040bb825eab8a48d0dfe4
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/30286
add_hparams() in torch.utils.tensorboard.writer produced the following error
python3.7/site-packages/torch/utils/tensorboard/writer.py", line 294, in add_hparams
with SummaryWriter(log_dir=os.path.join(self.file_writer.get_logdir(), str(time.time()))) as w_hp:
AttributeError: 'NoneType' object has no attribute 'get_logdir'
Other methods such as add_scalar() and add_histogram() use self._get_file_writer() instead of self.file_writer directly.
Test Plan:
```
writer = summary_writer()
writer.add_hparams({"a": 0, "b": 0}, {"hparam/test_accuracy": 0.5}))
writer.flush()
writer.close()
```
Reviewed By: J0Nreynolds, sanekmelnikov
Differential Revision: D18650610
fbshipit-source-id: 1039dd2067d37913a8a131c8b372491a63154899
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/30244
This makes several small changes to the tensorboard graph parsing methods to address the recent changes to the PyTorch JIT trace/graph.
- Inline graph to get information for all nodes
- Assign and propagate scope names to GetAttr nodes
- Prune all useless GetAttr nodes (any with a ClassType output type - tensors and primitives are kept)
- Create output nodes so output tensor shape can be examined
Reviewed By: sanekmelnikov
Differential Revision: D18556323
fbshipit-source-id: b73a809bacfa554c3fe9c4ae3563525f57539874
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/26395
This diff makes each SummaryWriter write into its own unique directory.
Reviewed By: orionr
Differential Revision: D17441500
fbshipit-source-id: d284fcf0e7e7a7214e644349e345f1de0e1a1aba
Summary:
To give better signal to the user, we will now always create the TensorBoard tests classes and just disable tests if TensorBoard is not installed.
cc lanpa sanekmelnikov natalialunova pietern
[test macos]
Pull Request resolved: https://github.com/pytorch/pytorch/pull/26005
Reviewed By: sanekmelnikov
Differential Revision: D17352430
Pulled By: orionr
fbshipit-source-id: 87a592064f4768ffded76a3d666a8e508a1ef164
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/25993
These imports fail the test suite if they're not installed, even if we
don't end up testing tensorboard.
[test macos]
Test Plan: Imported from OSS
Differential Revision: D17318588
Pulled By: pietern
fbshipit-source-id: febad497ecb3fd292317f68fc2439acd893ccd67
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/24040
This diff fixes failed test in test_tensorboard.py:
- fixed test_image_with_boxes: tests compares serialized protobuf Summary object with image against expected serialized protobuf in file. Turns out that comparing images string by string might not work (e.g. if they were serialized with different versions of image library) - images can be equal, though due to differences in metadata or compression methods actual strings might differ. Changed to compare images using == from PIL.Image
Reviewed By: orionr
Differential Revision: D16715831
fbshipit-source-id: 7dd4a7cfc8e63767ed727656f1891edd273d95da