Twice this week I have had people confuse "operator defined with Python
operator registration aka torch.library" and "PyOperator which is used
to define control flow operators and other operators that cannot be
represented in JIT schema." Renaming PyOperator for clarity.
Signed-off-by: Edward Z. Yang <ezyang@meta.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/97493
Approved by: https://github.com/SherlockNoMad
Preferring dash over underscore in command-line options. Add `--command-arg-name` to the argument parser. The old arguments with underscores `--command_arg_name` are kept for backward compatibility.
Both dashes and underscores are used in the PyTorch codebase. Some argument parsers use only dashes or only underscores in their arguments. For example, the `torchrun` utility for distributed training only accepts underscore arguments (e.g., `--master_port`). Dashes are more common in other command-line tools, and they appear to be the default choice in the Python standard library:
`argparse.BooleanOptionalAction`: 4a9dff0e5a/Lib/argparse.py (L893-L895)
```python
class BooleanOptionalAction(Action):
    def __init__(...):
            # excerpt: runs for each option_string passed to the action
            if option_string.startswith('--'):
                option_string = '--no-' + option_string[2:]
                _option_strings.append(option_string)
```
It adds `--no-argname`, not `--no_argname`. Also, typing `_` requires pressing the shift or caps-lock key, whereas typing `-` does not.
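A minimal sketch of the pattern described above, assuming a hypothetical integer option named `--command-arg-name` (the name, type, and default are placeholders, not taken from any specific PyTorch script). `argparse` accepts several option strings for one argument, so the dashed spelling can be listed first while the underscored spelling stays as an alias:

```python
import argparse

# Hypothetical parser; the option name and type are placeholders.
parser = argparse.ArgumentParser(prog="example")
parser.add_argument(
    "--command-arg-name",  # preferred spelling, listed first in --help
    "--command_arg_name",  # old spelling, kept for backward compatibility
    dest="command_arg_name",
    type=int,
    default=0,
)

new_style = parser.parse_args(["--command-arg-name", "1"])
old_style = parser.parse_args(["--command_arg_name", "2"])
print(new_style.command_arg_name, old_style.command_arg_name)
```

Both spellings write to the same `dest`, so downstream code that reads `args.command_arg_name` does not need to change.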
Pull Request resolved: https://github.com/pytorch/pytorch/pull/94505
Approved by: https://github.com/ezyang, https://github.com/seemethere
Optimize unnecessary collection cast calls, unnecessary calls to list, tuple, and dict, and simplify calls to the sorted builtin. This should strictly improve speed and improve readability.
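The commit message only summarizes the kinds of rewrites, so the snippet below is a set of representative before/after pairs rather than the exact patterns touched by the PR:

```python
values = [3, 1, 2]

# Redundant cast: sorted() already returns a new list.
before_sorted = list(sorted(values))
after_sorted = sorted(values)

# Reversing a sorted copy vs. asking sorted() for descending order.
# The results match here because the values are distinct; for elements
# that compare equal, the relative order of ties can differ.
before_desc = sorted(values)[::-1]
after_desc = sorted(values, reverse=True)

# Calling the constructor with no arguments vs. using a literal.
before_empty = dict()
after_empty = {}

# Materializing a list only to pass it to a function that takes any iterable.
before_total = sum(list(v * 2 for v in values))
after_total = sum(v * 2 for v in values)
```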
Pull Request resolved: https://github.com/pytorch/pytorch/pull/94323
Approved by: https://github.com/albanD
Not only is this change usually shorter and more readable, it also can yield better performance. size() is not always a constant time operation (such as on LinkedLists), but empty() always is.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/93236
Approved by: https://github.com/malfet
This PR:
- adds deprecation warnings when calling the functorch APIs
- adds documentation saying that those APIs are deprecated
It does this by creating thin wrappers around the original APIs that (1)
raise deprecation warnings and (2) have an additional line in their
documentation that they are deprecated.
NB:
- Python suppresses DeprecationWarning by default, so we use UserWarning instead.
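A minimal sketch of the thin-wrapper approach described above. The helper name `deprecated_alias`, the stand-in function, and the package names in the message are hypothetical; this is not the code from the PR:

```python
import functools
import warnings


def deprecated_alias(func, old_name, new_name):
    """Return a thin wrapper around ``func`` that warns on every call."""
    message = f"{old_name} is deprecated; use {new_name} instead."

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        # (1) Warn with UserWarning, which is shown by default.
        warnings.warn(message, UserWarning, stacklevel=2)
        return func(*args, **kwargs)

    # (2) Append a line to the documentation saying the API is deprecated.
    wrapper.__doc__ = f"{func.__doc__ or ''}\n\n.. warning::\n    {message}\n"
    return wrapper


def _double(x):
    """Stand-in for a real API; doubles its input."""
    return 2 * x


# Hypothetical old and new names, for illustration only.
old_api = deprecated_alias(_double, "old_pkg.double", "new_pkg.double")
```

Because the wrapper forwards to the original function, tests that call the old name still exercise the real implementation.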
Test Plan:
- New tests
- the functorch.* APIs are still tested for correctness because that's
what test/functorch/* use (as opposed to directly calling the
torch.func.* APIs)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/92279
Approved by: https://github.com/albanD, https://github.com/soulitzer
Adds a PyInstDecoder object that handles the differences in bytecode
added in 3.11. Basically some instructions have inline caches which
change the size of the instruction, so calculating the next instruction
is slightly different.
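The offset arithmetic can be sketched in Python as follows. This is not the PR's C++ code: the function name is made up, and the opcode numbers and cache counts in the table are hypothetical placeholders. It relies only on the fact that CPython bytecode is made of 2-byte code units and that, from 3.11 on, some instructions are followed by inline cache entries that each occupy one code unit:

```python
CODE_UNIT_SIZE = 2  # bytes per code unit: one opcode byte plus one oparg byte


def next_instruction_offset(offset, opcode, cache_entries):
    """Return the byte offset of the instruction following ``offset``.

    ``cache_entries`` maps an opcode to the number of inline cache code
    units that trail it; opcodes missing from the map have none.
    """
    n_caches = cache_entries.get(opcode, 0)
    return offset + CODE_UNIT_SIZE * (1 + n_caches)


# Hypothetical opcode numbers and cache counts, for illustration only.
HYPOTHETICAL_CACHES = {10: 1, 20: 4}

print(next_instruction_offset(0, 99, HYPOTHETICAL_CACHES))
print(next_instruction_offset(0, 10, HYPOTHETICAL_CACHES))
```

With an empty table the function reduces to the pre-3.11 rule of advancing by one code unit per instruction.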
Fixes #91246
Pull Request resolved: https://github.com/pytorch/pytorch/pull/91290
Approved by: https://github.com/albanD
In control_flow.cond(), we unwrap each argument's proxy with a
get_proxy_slot() call, which ends by calling a lambda to get the stored
proxy. For SymInt and SymFloat we hide the proxy under a thunk instead
of storing it directly on the .proxy attribute, so we need to
special-case SymInt when unwrapping here.
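A toy illustration of the difference between a directly stored proxy and one hidden behind a thunk. All classes and the `unwrap_proxy` helper are hypothetical stand-ins and do not use the real tracing API:

```python
class FakeProxy:
    """Stand-in for a tracer proxy object."""

    def __init__(self, name):
        self.name = name


class TensorLike:
    """Value whose proxy is stored directly on an attribute."""

    def __init__(self, proxy):
        self.proxy = proxy


class SymIntLike:
    """Value whose proxy sits behind a zero-argument callable (a thunk)."""

    def __init__(self, make_proxy):
        self.proxy_thunk = make_proxy


def unwrap_proxy(value):
    if isinstance(value, SymIntLike):
        # Special case: the thunk must be called to obtain the proxy.
        return value.proxy_thunk()
    if isinstance(value, TensorLike):
        return value.proxy
    # Plain Python values pass through unchanged.
    return value


tensor_arg = TensorLike(FakeProxy("x"))
symint_arg = SymIntLike(lambda: FakeProxy("s0"))
print(unwrap_proxy(tensor_arg).name, unwrap_proxy(symint_arg).name)
```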
Pull Request resolved: https://github.com/pytorch/pytorch/pull/91907
Approved by: https://github.com/ezyang
3 fixes made to control_flow.map:
1. argument list won't accept torch.nn.Module anymore, only Tensors.
2. during tracing we call new_empty from the returned sample output
instead of xs to correctly inherit tensor metadata.
3. for FakeTensorMode we implement map() using new_empty() as well
instead of torch.stack() to preserve symbolic shape output.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/91906
Approved by: https://github.com/tugsbayasgalan
The main changes are:
1. Remove outdated checks for old compiler versions because they can't support C++17.
2. Remove outdated CMake checks because it now requires 3.18.
3. Remove outdated CUDA checks because we are moving to CUDA 11.
Almost all changes are in CMake files for easy auditing.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/90599
Approved by: https://github.com/soumith
This PR adds a functionalization path for torch.cond. As this is the first pass, we only functionalize very restrictive use cases. We explicitly restrict the following:
- Output of each branch aliasing input
- In-place mutation on inputs given to each branch
Pull Request resolved: https://github.com/pytorch/pytorch/pull/89966
Approved by: https://github.com/zou3519
Variable length arguments can overflow the arena being used to keep overhead
low for torch dims. If we hit this case, we know the amount of work being done
is already relatively big, so we just fall back to standard memory allocation.
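The idea can be sketched in Python as a fixed-size bump allocator that hands out a fresh buffer when a request does not fit. The real arena lives in C++; the class and attribute names here are hypothetical:

```python
class Arena:
    """Fixed-size bump allocator with a fallback for requests that do not fit."""

    def __init__(self, capacity):
        self._buffer = bytearray(capacity)
        self._used = 0
        self.fallback_allocations = 0

    def allocate(self, nbytes):
        end = self._used + nbytes
        if end <= len(self._buffer):
            # Fast path: carve the request out of the preallocated buffer.
            view = memoryview(self._buffer)[self._used:end]
            self._used = end
            return view
        # Slow path: the arena would overflow, so use an ordinary allocation.
        self.fallback_allocations += 1
        return memoryview(bytearray(nbytes))


arena = Arena(16)
small = arena.allocate(8)    # served from the arena
large = arena.allocate(100)  # too big, served by the fallback
print(len(small), len(large), arena.fallback_allocations)
```

The fallback leaves the arena's remaining space untouched, so later small requests still take the fast path.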
Fixes #88586
Pull Request resolved: https://github.com/pytorch/pytorch/pull/88596
Approved by: https://github.com/ezyang
This will be the last disruptive functorch internals change.
Why are we moving these files?
- As a part of rationalizing functorch we are moving the code in
functorch/_src to torch/_functorch
- This is so that we can offer the functorch APIs as native PyTorch APIs
(coming soon) and resolve some internal build issues.
Why are we moving all of these files at once?
- It's better to break developers all at once rather than many times
Test Plan:
- wait for tests
Pull Request resolved: https://github.com/pytorch/pytorch/pull/90091
Approved by: https://github.com/anijain2305, https://github.com/ezyang
See code comment for details. I also had to do some extra fixes:
* `run_functionalized_fw_and_collect_metadata` can now handle duplicated arguments
* `aot_wrapper_dedupe` now always returns boxed compiled functions
* `aot_wrapper_dedupe` is now applied to the inference compiler as well as the autograd compiler (preexisting)
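As a rough picture of what deduplicating arguments means, the sketch below collapses arguments that are the same object and re-expands them before calling the wrapped function. It is a simplified illustration with hypothetical helper names, not the logic of `aot_wrapper_dedupe` itself:

```python
def dedupe_args(args):
    """Collapse arguments that are the same object.

    Returns the unique arguments and, for each original position, the
    index of its representative in the unique list.
    """
    seen = {}  # id(arg) -> index in `unique`
    unique = []
    index_map = []
    for arg in args:
        key = id(arg)
        if key not in seen:
            seen[key] = len(unique)
            unique.append(arg)
        index_map.append(seen[key])
    return unique, index_map


def wrap_with_dedupe(fn):
    """Call ``fn`` through a layer that only passes unique arguments along."""

    def wrapper(*args):
        unique, index_map = dedupe_args(args)

        def deduped_fn(*unique_args):
            # Re-expand so that fn sees its original signature.
            return fn(*(unique_args[i] for i in index_map))

        return deduped_fn(*unique)

    return wrapper


x, y = [1], [2]
unique, index_map = dedupe_args((x, x, y))
print(len(unique), index_map)
```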
Fixes https://github.com/pytorch/torchdynamo/issues/1939
Fixes DebertaV2ForQuestionAnswering, DebertaForMaskedLM, DebertaForQuestionAnswering, DebertaV2ForMaskedLM
Repro command:
```
python benchmarks/dynamo/huggingface.py --performance --float32 -dcuda --training --inductor --no-skip --dashboard --only DebertaForQuestionAnswering --cold_start_latency
```
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/89896
Approved by: https://github.com/bdhirsh
This will be the last disruptive functorch internals change.
Why are we moving these files?
- As a part of rationalizing functorch we are moving the code in
functorch/_src to torch/_functorch
- This is so that we can offer the functorch APIs as native PyTorch APIs
(coming soon) and resolve some internal build issues.
Why are we moving all of these files at once?
- It's better to break developers all at once rather than many times
Test Plan:
- wait for tests
Pull Request resolved: https://github.com/pytorch/pytorch/pull/88756
Approved by: https://github.com/ezyang