This PR enables all PIE rules on ruff, there are already some enabled rules from this family, the new added rules are
```
PIE796 Enum contains duplicate value: {value}
PIE808 Unnecessary start argument in range
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/165814
Approved by: https://github.com/ezyang
This PR enables all PIE rules on ruff, there are already some enabled rules from this family, the new added rules are
```
PIE796 Enum contains duplicate value: {value}
PIE808 Unnecessary start argument in range
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/165814
Approved by: https://github.com/ezyang
This PR is to enable the following tests rocm.
test/test_jit.py::TestBackends::test_save_load
test/test_jit.py::TestBackends::test_execution
test/test_jit.py::TestBackends::test_errors
test/test_jit.py::TestCUDA::test_current_stream
Verified that the tests pass on AMD gfx90a and gfx942 arch.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/164582
Approved by: https://github.com/jeffdaily
`grad_dtype` is a new attribute on Tensor to control gradient dtype:
- Access/setting is leaf-only.
- grad_dtype is respected when (1) when assigning to .grad, and (2) in the engine after the previous node produces incoming gradients for AccumulateGrad. (See table below for details)
- Not setting grad_dtype preserves the current behavior. Accessing it returns `t.dtype`
- `grad_dtype` cannot be set when there is already a `.grad` present and the dtypes conflict.
| `grad_dtype` setting | Setting `.grad` manually | Incoming gradient from autograd engine |
|-----------------------|--------------------------|-----------------------------------------|
| **Default (tensor’s dtype)** | `.grad` must match tensor’s dtype | Engine casts incoming grad to tensor’s dtype |
| **Set to specific dtype** | `.grad` must match that dtype | Engine casts incoming grad to the specified dtype |
| **Set to `None`** | `.grad` may be any dtype | Engine does not cast; accepts incoming grad dtype as-is |
Pull Request resolved: https://github.com/pytorch/pytorch/pull/162815
Approved by: https://github.com/albanD
Context: During jit.script, the TorchScript frontend maintains a callstack of Python frames, which is used to present the corresponding user code in case TorchScript errors. The callstack is maintained via ErrorReport::CallStack RAII guards. Before recursing into a function, an ErrorReport::CallStack guard is created and the CallStack guard pushes the frame information onto a thread_local callstack (a list of calls); and after exiting, the frame information is popped off the callstack. Note that the CallStack guards are also sometimes used in python via pybindings.
The problem is that sometimes another thread can obtain a reference to the CallStack guard (if it's a Python CallStack guard). **This means that the destructor for a CallStack guard can be called from a different thread than the constructor was called**. When this happens, it causes a segfault.
This PR makes the callstack vector thread-safe to access, and each CallStack guard will store a reference to the callstack vector onto which it pushed. When the CallStack guard is destructed, it pops off the appropriate callstack vector. Although this could potentially lead to mangled callstacks, it should prevent segfaults.
Added a test `test_thread_safe_error_stacks` which segfaults prior to these changes, and no longer segfaults.
Differential Revision: [D80054972](https://our.internmc.facebook.com/intern/diff/D80054972)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/160386
Approved by: https://github.com/eellison
Summary:
Add support for torch._check() in TorchScript jit.script frontend.
* It will be special cased to behave like torch._assert, turned into an if + raise exception.
Test Plan:
Unit tests
Rollback Plan:
Differential Revision: D79744604
Pull Request resolved: https://github.com/pytorch/pytorch/pull/159988
Approved by: https://github.com/davidberard98
This PR corrects a small spelling error in `test/jit/test_alias_analysis.py`.
- "initalized" → "initialized"
This is a minor comment correction and does not affect functionality or logic.
Thank you for maintaining this amazing codebase.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/157628
Approved by: https://github.com/Skylion007
This PR fixes a minor typo in `test/jit/test_modules.py`:
- Before: `intialized`
- After: `initialized`
There are no functional code changes — this is a comment-only fix to improve clarity and consistency.
Thank you to the PyTorch team for maintaining this outstanding project.
Please let me know if anything else is needed.
With appreciation,
Abhishek Nandy
[@abhitorch81](https://github.com/abhitorch81)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/157226
Approved by: https://github.com/Skylion007
This PR is part of a series attempting to re-submit https://github.com/pytorch/pytorch/pull/134592 as smaller PRs.
In jit tests:
- Add and use a common raise_on_run_directly method for when a user runs a test file directly which should not be run this way. Print the file which the user should have run.
- Raise a RuntimeError on tests which have been disabled (not run)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/154725
Approved by: https://github.com/clee2000
Summary:
**TL;DR**: make DCE faster by replacing a Set<Value*> with a MemoryLocations sparse bitset (representing all the memory locations stored by the collection of all values in the set).
**Details**
The goal of this PR is to optimize this function from AliasDb:
```
bool AliasDb::writesToAlias(Node* n, const ValueSet& vs) const {
const auto writtenTo = getWrites(n);
if (writtenTo.empty()) {
return false;
}
MemoryLocations locs;
for (const auto v : vs) {
auto it = elementMap_.find(v);
if (it != elementMap_.end()) {
const auto& vlocs = memoryDAG_->getMemoryLocations(it->second);
if (writtenTo.intersects(vlocs)) {
return true;
}
}
}
return false;
}
```
In the DCE use case, we have a ValueSet of live values, into which we insert `Value*`s; and sometimes need to check whether a node mutates any of the live values using `writesToAlias`.
Looping through all the values in the ValueSet and indexing into the elementMap_ is slow; so if we can pre-compute the MemoryLocations set, this speeds up the function. In some large model examples, I see ~15-25x speedups from this change.
**Implementation**: To avoid exposing too many details of AliasDb, I introduce a friend class `ValueAndMemoryLocationSet`, which is an insert-only set of Values, which also maintains the corresponding MemoryLocations.
Then in AliasDb, I use `ValueAndMemoryLocationSet` if we're using AliasDb for analysis, and otherwise use a `Set<Value*>` if we don't have AliasDb.
Test Plan: Rely on unit tests.
Differential Revision: D74827086
Pull Request resolved: https://github.com/pytorch/pytorch/pull/153645
Approved by: https://github.com/eellison
The same code is repeated multiple times with slightly different implementations.
Use the existing function for brevity and consistency.
In the function the code from `test_export` is used which does a single `load_library` with cleaner conditions
Pull Request resolved: https://github.com/pytorch/pytorch/pull/148082
Approved by: https://github.com/angelayi
Summary:
`c10::AttributeError` is not automatically converted to Python AttributeError, it needs some special macros (e.g. `HANDLE_TH_ERRORS`).
Some Python functions like `hasattr` rely on the type of the throw exception to be correct.
We don't need the fully generality of those macros, so just do a targeted error type conversion here.
Test Plan: added unit test
Differential Revision: D69197217
Pull Request resolved: https://github.com/pytorch/pytorch/pull/146516
Approved by: https://github.com/zdevito
Changes by apply order:
1. Replace all `".."` and `os.pardir` usage with `os.path.dirname(...)`.
2. Replace nested `os.path.dirname(os.path.dirname(...))` call with `str(Path(...).parent.parent)`.
3. Reorder `.absolute()` ~/ `.resolve()`~ and `.parent`: always resolve the path first.
`.parent{...}.absolute()` -> `.absolute().parent{...}`
4. Replace chained `.parent x N` with `.parents[${N - 1}]`: the code is easier to read (see 5.)
`.parent.parent.parent.parent` -> `.parents[3]`
5. ~Replace `.parents[${N - 1}]` with `.parents[${N} - 1]`: the code is easier to read and does not introduce any runtime overhead.~
~`.parents[3]` -> `.parents[4 - 1]`~
6. ~Replace `.parents[2 - 1]` with `.parent.parent`: because the code is shorter and easier to read.~
Pull Request resolved: https://github.com/pytorch/pytorch/pull/129374
Approved by: https://github.com/justinchuby, https://github.com/malfet
Changes by apply order:
1. Replace all `".."` and `os.pardir` usage with `os.path.dirname(...)`.
2. Replace nested `os.path.dirname(os.path.dirname(...))` call with `str(Path(...).parent.parent)`.
3. Reorder `.absolute()` ~/ `.resolve()`~ and `.parent`: always resolve the path first.
`.parent{...}.absolute()` -> `.absolute().parent{...}`
4. Replace chained `.parent x N` with `.parents[${N - 1}]`: the code is easier to read (see 5.)
`.parent.parent.parent.parent` -> `.parents[3]`
5. ~Replace `.parents[${N - 1}]` with `.parents[${N} - 1]`: the code is easier to read and does not introduce any runtime overhead.~
~`.parents[3]` -> `.parents[4 - 1]`~
6. ~Replace `.parents[2 - 1]` with `.parent.parent`: because the code is shorter and easier to read.~
Pull Request resolved: https://github.com/pytorch/pytorch/pull/129374
Approved by: https://github.com/justinchuby, https://github.com/malfet
Add similar semantics for creating a buffer object similar to creating a parameter. This is done by introducing a new Buffer class that can be used for type disambiguation. The underlying functionality of registering a buffer remains the same as the register_buffer method has not been changed. The persistent parameter in the Buffer type is to indicate whether a buffer object should be persistent or not. Other non-test changes have to do with getting the new Buffer type recognized by inductor and dynamo. Remaining changes are test changes to make sure that the Buffer type can be used as a drop in replacement for register_buffer as it just leads to register_buffer being called. The addition of this new functionality still allows for normal tensors to be used as buffers so these changes are intended to be backwards compatible.
Fixes#35735
Co-authored-by: Mikayla Gawarecki <mikaylagawarecki@gmail.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/125971
Approved by: https://github.com/albanD, https://github.com/anijain2305, https://github.com/mlazos