This PR applies clang-tidy readability checks to jit sources and all headers in the code base.
`readability-redundant-inline-specifier` is suppressed because it would incur too many changes: it detects redundant `inline` specifiers on function and variable declarations, and the code base has many in-class method definitions that are marked `inline`.
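For illustration, the kind of declaration this check flags (a hypothetical class, not taken from the code base):
```cpp
struct Widget {
  // A member function defined inside the class body is implicitly
  // inline, so the explicit specifier below is redundant and would
  // be flagged by readability-redundant-inline-specifier.
  inline int size() const {
    return 42;
  }
};
```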
Pull Request resolved: https://github.com/pytorch/pytorch/pull/164652
Approved by: https://github.com/Skylion007
This commit fixes a memory leak caused by creating a new `PyListObject` using `PyDict_Items()` and never releasing that list. This often prevented the entire model from being deallocated even after all Python references to it had gone out of scope.
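A minimal sketch of the leak pattern and its fix, assuming a caller of the CPython API (the surrounding function is illustrative, not the actual PyTorch call site):
```cpp
#include <Python.h>

void iterate_dict_items(PyObject* dict) {
  // PyDict_Items() returns a NEW reference to a freshly built list of
  // (key, value) pairs; the caller owns that list.
  PyObject* items = PyDict_Items(dict);
  if (items == nullptr) {
    return;  // propagate the Python error
  }
  // ... use the list ...
  // Without this release, the list -- and everything it keeps alive,
  // such as an exported model's tensors -- leaks.
  Py_DECREF(items);
}
```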
Here is a repro script:
```python
import psutil, torch, transformers, gc, os, sys
import math

# Size in MB
model_size = 512
kB = 1024
MB = kB * kB
precision_size = 4  # bytes per float
activation_size = math.floor(math.sqrt(model_size * MB / precision_size))

class Net(torch.nn.Module):
    def __init__(self, activation_size):
        super(Net, self).__init__()
        self.linear = torch.nn.Linear(activation_size, activation_size)

    def forward(self, x):
        return {"result": self.linear(x)}

def collect_and_report(s):
    # Force a collection first so the numbers reflect live objects only.
    gc.collect()
    print(s)
    #print("psutil: ", psutil.virtual_memory().percent)
    print("CPU MB used by this process: ", psutil.Process(os.getpid()).memory_info().rss / 1024 ** 2)
    print("GPU MB allocated by pytorch: ", torch.cuda.memory_allocated(0) / 1024 ** 2)
    print()

def run_test(device_str):
    device = torch.device(device_str)
    dummy_input = torch.zeros(activation_size, requires_grad=True).to(device)
    collect_and_report("Before loading model: ")
    model = Net(activation_size).to(device)
    collect_and_report("After loading model: ")
    torch.onnx.export(model, dummy_input, "dummy.onnx")
    collect_and_report("After exporting model: ")
    del model
    collect_and_report("After deleting model:")

print("Running CPU test: ")
run_test("cpu")
print("Running GPU test: ")
run_test("cuda")
```
Results without this commit (note that memory is not freed after the model is deleted):
```
Running CPU test:
Before loading model:
CPU MB used by this process: 346.5
GPU MB allocated by pytorch: 0.0
After loading model:
CPU MB used by this process: 861.078125
GPU MB allocated by pytorch: 0.0
After exporting model:
CPU MB used by this process: 880.12890625
GPU MB allocated by pytorch: 0.0
After deleting model:
CPU MB used by this process: 880.12890625
GPU MB allocated by pytorch: 0.0
Running GPU test:
Before loading model:
CPU MB used by this process: 991.9375
GPU MB allocated by pytorch: 0.04443359375
After loading model:
CPU MB used by this process: 992.19140625
GPU MB allocated by pytorch: 512.0888671875
After exporting model:
CPU MB used by this process: 1026.64453125
GPU MB allocated by pytorch: 520.25830078125
After deleting model:
CPU MB used by this process: 1026.64453125
GPU MB allocated by pytorch: 520.25830078125
```
With this commit:
```
Running CPU test:
Before loading model:
CPU MB used by this process: 372.7734375
GPU MB allocated by pytorch: 0.0
After loading model:
CPU MB used by this process: 887.18359375
GPU MB allocated by pytorch: 0.0
After exporting model:
CPU MB used by this process: 918.96875
GPU MB allocated by pytorch: 0.0
After deleting model:
CPU MB used by this process: 407.3671875
GPU MB allocated by pytorch: 0.0
Running GPU test:
Before loading model:
CPU MB used by this process: 516.6875
GPU MB allocated by pytorch: 0.04443359375
After loading model:
CPU MB used by this process: 516.75390625
GPU MB allocated by pytorch: 512.0888671875
After exporting model:
CPU MB used by this process: 554.25390625
GPU MB allocated by pytorch: 520.2138671875
After deleting model:
CPU MB used by this process: 554.25390625
GPU MB allocated by pytorch: 8.16943359375
```
Fixes #106976
Pull Request resolved: https://github.com/pytorch/pytorch/pull/107244
Approved by: https://github.com/BowenBao, https://github.com/kit1980
As we live in a C++17 world, this is a functional no-op, just:
- `s/namespace at { namespace native {/namespace at::native {/`
- `s/namespace torch { namespace jit {/namespace torch::jit {/`
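For illustration, the before/after shape of the rewrite (`foo` is a hypothetical declaration):
```cpp
// Before: C++14-style nested namespaces
namespace at { namespace native {
void foo();
}} // namespace at::native

// After: C++17 nested namespace definition
namespace at::native {
void foo();
} // namespace at::native
```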
Pull Request resolved: https://github.com/pytorch/pytorch/pull/92100
Approved by: https://github.com/izaitsevfb
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/73284
Some important ops won't support the Optional type until opset 16,
so we can't fully test things end-to-end, but I believe this should
be all that's needed. Once ONNX Runtime supports opset 16,
we can do more testing and fix any remaining bugs.
Test Plan: Imported from OSS
Reviewed By: albanD
Differential Revision: D34625646
Pulled By: malfet
fbshipit-source-id: 537fcbc1e9d87686cc61f5bd66a997e99cec287b
Co-authored-by: BowenBao <bowbao@microsoft.com>
Co-authored-by: neginraoof <neginmr@utexas.edu>
Co-authored-by: Nikita Shulga <nshulga@fb.com>
(cherry picked from commit 822e79f31ae54d73407f34f166b654f4ba115ea5)
Summary:
* Minor: spelling, grammar.
* Add calls to `GRAPH_DUMP()` where they were missing (see the usage sketch after this list).
* Add or expand a few comments.
* Move a few comments to seemingly more appropriate spots.
* In canonicalize_graph_fuser_ops.cpp, inline `runnableInputs()`, since it
  was only called in one place and had a misleading comment and a
  confusing name.
* In `PeepholeOptimizeImpl::optimizeBlock()`, set `changed = true;` when
removing `aten::is_complex`. Pretty sure its absence was a bug.
* Delete unused `_jit_pass_remove_inplace_ops` and its
  implementation `RemoveInplaceOps()`.
* In `preprocessCaffe2Ops()`, remove redundant check for nested optional
types. It was already checked in `checkONNXCompatibility()`.
* In `EncoderBase::AddAttribute`, log the unexpected attribute kind.
I don't remember the repro case now but I did hit this error at some
point and this additional logging made it easier to understand.
* In `fuseConvBatchNorm()` in eval_peephole.cpp, consistently use
camelCase instead of snake_case for local variables.
* Add curly braces around the bodies of `if` statements and loops.
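A minimal usage sketch for the `GRAPH_DUMP()` item above, assuming a hypothetical pass named `runMyPass` (the macro logs a labeled dump of the graph when JIT logging is enabled for the file):
```cpp
#include <torch/csrc/jit/ir/ir.h>
#include <torch/csrc/jit/jit_log.h>

void runMyPass(std::shared_ptr<torch::jit::Graph>& graph) {
  // ... mutate the graph ...
  // Logs the full graph under this label when logging is enabled,
  // e.g. via the PYTORCH_JIT_LOG_LEVEL environment variable.
  GRAPH_DUMP("After runMyPass", graph);
}
```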
Pull Request resolved: https://github.com/pytorch/pytorch/pull/60390
Reviewed By: Krovatkin
Differential Revision: D29523283
Pulled By: SplitInfinity
fbshipit-source-id: 4e16c5648616f53da07d68dab7fdf252e06a0752
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/55799
I'm going to change the implementation of cdata soon, so I need to
abstract over cdata access with a function. Additionally, many
users are manually casting to THPVariable to access the member,
so I can remove these unsafe casts from the client code (the
implementation, of course, still does an unsafe cast).
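A hedged sketch of what such an accessor might look like (the helper name is illustrative, not necessarily the function this PR adds):
```cpp
#include <torch/csrc/autograd/python_variable.h>  // defines THPVariable

// Client code calls this helper instead of doing the cast itself, so
// the representation of cdata can change without touching call sites.
inline const at::Tensor& unpackVariableCData(PyObject* obj) {
  // The unsafe cast still happens, but only in this one place.
  return reinterpret_cast<THPVariable*>(obj)->cdata;
}
```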
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
Test Plan: Imported from OSS
Reviewed By: albanD
Differential Revision: D27712130
Pulled By: ezyang
fbshipit-source-id: 95fcc013bf3913d67f2c634068eb5b3aab144cb3
Summary:
Enables the use of NoneType arguments in the inputs tuple of the export API
Pull Request resolved: https://github.com/pytorch/pytorch/pull/45792
Reviewed By: heitorschueroff
Differential Revision: D24312784
Pulled By: bzinodev
fbshipit-source-id: 1717e856b56062add371af7dc09cdd9c7b5646da
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/35115
This commit runs the newly added tools/clang_format.py on the JIT
codebase and includes all of the formatting changes thus produced.
Testing:
Ran the script, CI.
Test Plan: Imported from OSS
Reviewed By: eellison
Differential Revision: D20568523
Pulled By: SplitInfinity
fbshipit-source-id: e09bdb982ccf090eecfb7c7b461b8d0681eef82b