https://github.com/pytorch/pytorch/pull/126586 tried to reset dynamo before each unit test. That PR get reverted a couple of times because we see post-land test failures that we don't see before merge. This PR only reset dynamo before each tests in `test_ops_gradients.py` to make it easier to land.
Eventually after we reset dynamo in each individual test files, we can move the change to the base class (TestCase) and remove the change in individual test files.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/131397
Approved by: https://github.com/zou3519
ghstack dependencies: #131551, #131388, #131372
https://github.com/pytorch/pytorch/pull/126586 tried to reset dynamo before each unit test. That PR get reverted a couple of times because we see post-land test failures that we don't see before merge. This PR only reset dynamo before each tests in `test_module.py` to make it easier to land.
Eventually after we reset dynamo in each individual test files, we can move the change to the base class (TestCase) and remove the change in individual test files.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/131372
Approved by: https://github.com/zou3519
ghstack dependencies: #131551, #131388
https://github.com/pytorch/pytorch/pull/126586 tried to reset dynamo before each unit test. That PR get reverted a couple of times because we see post-land test failures that we don't see before merge. This PR only reset dynamo before each tests in `test_torch.py` to make it easier to land.
Eventually after we reset dynamo in each individual test files, we can move the change to the base class (TestCase) and remove the change in individual test files.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/131388
Approved by: https://github.com/zou3519
ghstack dependencies: #131551
The search code to find the dynamo skip files wasn't working properly when used with pytest and multiple files:
```
pytest a.py b.py
```
because pytest would point `__main__` at itself instead of the individual file. (This worked fine when only running a single file test)
Change the scanning code to look for the skip directory relative to its own file first.
While in there add/update some comments and log a warning when the directory wasn't found (instead of a hard crash).
Pull Request resolved: https://github.com/pytorch/pytorch/pull/120521
Approved by: https://github.com/oulgen
This PR removes and adds some failures and successes that were hidden in the past week (ish).
https://github.com/pytorch/pytorch/pull/119408 (47182a8f4b5e36e280ca3595ba134f53499d2dc9) accidentally removed environment variables on rerun (see PR body of https://github.com/pytorch/pytorch/pull/120251 for slightly more details).
Enabling testing with dynamo is set using an env var, so if a test failed with dynamo, it would rerun without the dynamo env var set, making it pass on retry. Normally, the flaky test bot would catch this and make an issue for the test, but the CI env var controls whether or not xml test reports get made, and that also got removed on rerun, so the xmls weren't made either.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/120271
Approved by: https://github.com/DanilBaibak, https://github.com/zou3519
Fix: https://github.com/pytorch/pytorch/issues/119779 by properly graph breaking a proper fix is to handle quantized tensors for full complete solution.
if when generating a fake tensor, UnsupportedFakeTensorException is thrown, then its handled and converted into a
Unimplemented in inside wrap_fake_exception which is then translated to a graph break.
However run_node used to convert UnsupportedFakeTensorException into a runtime error, creating runtime
errors instead of graph breaks whenever generating a fake tensor for a quantized tensor fails.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/120026
Approved by: https://github.com/jansel
partially address https://github.com/pytorch/pytorch/issues/118785
This diff fixes three things:
1. add get_function to FunctoolsPartialVariable note that it will be available only if all args constant otherwise,
it would throw unimplemented in the call to asPythonConstant.
2. NamedTupleVariable takes args dispatched not as list ex: NamedTuple(a, b, c) vs NamedTuple([a, b, c]),
hence fix that by specializing asProxy.
3. A call to create_arg from within create_proxy, changes a python NamedTuple to a function call node without
associating an example value! Updated get_fake_values_from_nodes to handle such case.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/119435
Approved by: https://github.com/jansel, https://github.com/anijain2305
ghstack dependencies: #119314
Finally we have this PR to merge allow_in_graph/inline/skip trace rules into ```trace_rules.lookup_inner```, where we can define and lookup trace rules at both function level and file level. Going forward, this is the central place that we define and consulte Dynamo trace rule for any function.
* ```trace_rules.looup``` is the API can return allow_in_graph, inline or skip.
* ```skipfiles.check``` is the API can return inline or skip, since we have multiple places that only do inline/skip check.
* I'll move ```skipfiles.check``` to ```trace_rules.check``` as one of the follow-ups.
* Both functions consulte ```trace_rules.lookup_inner``` to get the tracing rule.
To avoid a single big PR, I left a few items as the follow-ups:
* Remove ```skipfiles.py``` and merge the code into ```trace_rules.py```.
* We do double check in ```symbolic_convert.check_inlineable```, will refactor and simplify it. We should only do inline/skip check before generating ```SkipFilesVariable``` and ```UserFunctionVariable```.
* Rename ```SkipFilesVariable``` as ```SkipFunctionVariable```, since we only handle functions.
* The inline/skip reasons are not logged for some cases, since the new lookup framework doesn't always return inline/skip reasons. I'll refactor loggings to record the inline/skip reason in next step.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/118971
Approved by: https://github.com/jansel
Fix https://github.com/pytorch/pytorch/issues/118787
In the compiled function, calls to random() are replaced with a single function call
to a function that generates all the random variables .
The random calls encountered during compilation used to be tracked inside a variable
stored inside the instruction translator. And when there are nested translators, the tracked
calls used to get lost when the inner instructions translator popped out.
This diff fixes that by moving the tracked calla to the output graph which is shared across translators that are generating the same function.
More details about the issue and why this solution is picked are in the github issue above.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/119218
Approved by: https://github.com/jansel, https://github.com/anijain2305
The torch "fake" ndarray had some mismatches vs numpy.ndarray which caused test_sparse_to_sparse_compressed to fail under dynamo.
This also fixes (because the test now hits it) a problem where unpacking a sequence with the incorrect number of args would assert in dynamo instead of graph breaking (because it would throw an exception). Added a unit test for this condition.
Fixed:
- torch._numpy._ndarray.astype() (actually used by the test)
- torch._numpy._ndarray.put() (drive-by discovery)
- torch._numpy._ndarray.view() (drive-by discovery)
(burndown item 7)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/117952
Approved by: https://github.com/yanboliang
ghstack dependencies: #117951
test_compressed_layout_conversions_coverage verifies torch's conversions between different memory layouts using numpy as a reference. Since numpy doesn't support BSC format it just skipped that. Instead fake it by using a transposed BSR format.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/117951
Approved by: https://github.com/zou3519
Make variables in dict lazy and remove DICT_KEYS guard.
We build the keys of a dict depth-first and we rely on the guards of
each element in the dict to create the correct guards. This allows us to
remove the rather buggy DICT_KEYS guard and make the guard lazy.
The guards are not completely lazy yet, as we instantiate them in
`_HashableTracker._eq_impl` but it should be possible to make them
truly lazy.
Also, adding new types to the supported types within keys should be less
error prone.
This is marginally less efficient when we graph break, but in turn we
should graph break much less. It also makes the dicts code easier to maintain
(removes `is_hashable_python_var`).
Pull Request resolved: https://github.com/pytorch/pytorch/pull/117625
Approved by: https://github.com/jansel, https://github.com/peterbell10, https://github.com/anijain2305
ghstack dependencies: #117982, #118098, #117983
```
def f():
def g():
return ()
print(g.__name__)
f()
```
The following script should print `g` (with or without torch.compile),
but prints `f.<locals>.g` with torch.compile.
The problem looks like we use the co_qualname when reconstructing the
NestedUserFunctionVariable. I switched this over to use the co_name.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/118768
Approved by: https://github.com/yanboliang, https://github.com/jansel
Before the pr, we have an error for the following code
```python
def k(x):
with torch.inference_mode():
x = x + 1
return x
torch.compile(k, backend="eager", fullgraph=True)(x)
```
error message:
```
Traceback (most recent call last):
....
return InferenceModeVariable.create(tx, args[0].as_python_constant())
torch._dynamo.exc.InternalTorchDynamoError: list index out of range
```
This pr supports the case when torch.inference_mode is not provided any argument (i.e. default to True).
Pull Request resolved: https://github.com/pytorch/pytorch/pull/118427
Approved by: https://github.com/yanboliang, https://github.com/jansel
Dynamo creates Tensors when tracing through numpy ufuncs like np.sin, np.minimum etc. When running, np functions generally return Tensors when run with `torch.compile`. However, we currently require when normalizing `out` arguments that the input is an ndarray. This creates assertion errors when running torch.compile on any numpy function with an out argument:
```
def test_numpy_ufunc_out(self):
@torch.compile(backend="eager")
def foo():
x = np.arange(5)
out = np.empty((x.shape[0], x.shape[0]))
res_out = np.sin(x, out=out)
assert res_out is out
foo()
```
Failure with stack trace: https://gist.github.com/jamesjwu/68e217638d735678b3de968584dba23f
Instead, we can wrap tensors in an ndarray in normalize_outarray to handle the case correctly. Fixing this resolves ~220 tests under dynamo_test_failures, but also exposes a followup bug.
In the presence of a graph break, ndarrays don't preserve their id, which can affect assertions and `is` checks between numpy arrays:
```
def test_x_and_out_broadcast(self, ufunc):
x = self.get_x(ufunc)
out = np.empty((x.shape[0], x.shape[0]))
x_b = np.broadcast_to(x, out.shape)
# ufunc is just np.sin here
res_out = ufunc(x, out=out)
res_bcast = ufunc(x_b)
# passes
assert res_out is out
graph_break()
# fails
assert res_out is out
```
Regular tensors preserve their id because Dynamo caches their example tensor values across a graph break. However, with ndarrays, we only store their converted tensor values, and construct new ndarrays around those values:
eebe7e1d37/torch/_dynamo/variables/builder.py (L1083)
Added a test with expected failure to showcase this — we can then fix that issue separately.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/118248
Approved by: https://github.com/lezcano
This is a lot of files changed! Don't panic! Here's how it works:
* Previously, we set `follow_imports = silent` for our mypy.ini configuration. Per https://mypy.readthedocs.io/en/stable/running_mypy.html#follow-imports, what this does is whenever we have an import to a module which is not listed as a file to be typechecked in mypy, we typecheck it as normal but suppress all errors that occurred in that file.
* When mypy is run inside lintrunner, the list of files is precisely the files covered by the glob in lintrunner.toml, but with files in excludes excluded.
* The top-level directive `# mypy: ignore-errors` instructs mypy to typecheck the file as normal, but ignore all errors.
* Therefore, it should be equivalent to set `follow_imports = normal`, if we put `# mypy: ignore-errors` on all files that were previously excluded from the file list.
* Having done this, we can remove the exclude list from .lintrunner.toml, since excluding a file from typechecking is baked into the files themselves.
* torch/_dynamo and torch/_inductor were previously in the exclude list, because they were covered by MYPYINDUCTOR. It is not OK to mark these as `# mypy: ignore-errors` as this will impede typechecking on the alternate configuration. So they are temporarily being checked twice, but I am suppressing the errors in these files as the configurations are not quite the same. I plan to unify the configurations so this is only a temporary state.
* There were some straggler type errors after these changes somehow, so I fixed them as needed. There weren't that many.
In the future, to start type checking a file, just remove the ignore-errors directive from the top of the file.
The codemod was done with this script authored by GPT-4:
```
import glob
exclude_patterns = [
...
]
for pattern in exclude_patterns:
for filepath in glob.glob(pattern, recursive=True):
if filepath.endswith('.py'):
with open(filepath, 'r+') as f:
content = f.read()
f.seek(0, 0)
f.write('# mypy: ignore-errors\n\n' + content)
```
Signed-off-by: Edward Z. Yang <ezyang@meta.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/118414
Approved by: https://github.com/thiagocrepaldi, https://github.com/albanD
Relying on object lifetimes in Python is a bad idea due to reference
cycles. Previously, when a torch.library.Library object gets destroyed,
it clears all the registrations associated with it, but it's unclear
when it actually gets destroyed due to the existence of refcycles.
This PR:
- adds torch::Library::clear(), which deterministically releases all of
the RAII registration handles of the torch::Library object
- adds a new `torch.library._scoped_library` context manager, which creates
a library and cleans it up at the end of the scope using the previous item.
All tests (unless they already handle library lifetimes) should use
this new API
- Rewrites some flaky tests to use `_scoped_library`.
In the future we'll probably migrate all of our torch.library tests to
use `_scoped_library`, but that's kind of annoying because we have
multiple thousands of LOC
I'm hoping this will deflake those tests; we'll see.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/118318
Approved by: https://github.com/albanD