Commit Graph

50 Commits

5ec9c0bc4a Fix linearize(grad(...)) call (#133364)
Fixes #124550

Also moves the `graph.eliminate_dead_code()` call to a few lines after
`_inline_module(...)` in `const_fold.py`.

Test Plan:

Add a new test in `test_eager_transforms.py` to ensure the reported
issue is indeed fixed.
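
A minimal repro of the composition this commit fixes (a sketch; the function and shapes are illustrative, not from the PR):

```python
import torch
from torch.func import grad, linearize

def f(x):
    return (x ** 2).sum()

x = torch.randn(3)
# linearize(grad(f), x) previously failed (#124550); it returns the
# output of grad(f) at x plus a jvp function linearized at x
grad_out, jvp_fn = linearize(grad(f), x)
tangent = torch.randn(3)
print(jvp_fn(tangent))
```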

Pull Request resolved: https://github.com/pytorch/pytorch/pull/133364
Approved by: https://github.com/zou3519
2024-08-15 17:55:36 +00:00
cbee9c1fd2 Revert "Deprecate torch._utils.is_compiling() and torch._dynamo.external_utils.is_compiling() (#127690)"
This reverts commit 0e7e61f7cec82a43f2de52b83eff152d703be7a3.

Reverted https://github.com/pytorch/pytorch/pull/127690 on behalf of https://github.com/kit1980 due to breaking internal builds ([comment](https://github.com/pytorch/pytorch/pull/127690#issuecomment-2272370386))
2024-08-07 00:05:20 +00:00
0e7e61f7ce Deprecate torch._utils.is_compiling() and torch._dynamo.external_utils.is_compiling() (#127690)
This PR is split from PR #126898.

- #126898

------
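
For callers, the migration looks roughly like this (assuming `torch.compiler.is_compiling()` as the public replacement the deprecation points to):

```python
import torch

# Deprecated private helpers:
#   torch._utils.is_compiling()
#   torch._dynamo.external_utils.is_compiling()

# Public replacement:
if torch.compiler.is_compiling():
    pass  # running under torch.compile / dynamo tracing
```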

Pull Request resolved: https://github.com/pytorch/pytorch/pull/127690
Approved by: https://github.com/Skylion007, https://github.com/malfet
2024-08-03 09:43:38 +00:00
e7eeee473c [BE][Easy][14/19] enforce style for empty lines in import segments in torch/_[a-c]*/ and torch/_[e-h]*/ and torch/_[j-z]*/ (#129765)
See https://github.com/pytorch/pytorch/pull/129751#issue-2380881501. Most changes are auto-generated by the linter.

You can review these PRs via:

```bash
git diff --ignore-all-space --ignore-blank-lines HEAD~1
```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/129765
Approved by: https://github.com/ezyang
2024-07-31 10:42:50 +00:00
207fb96155 [functorch] saved tensor hooks error should only apply to grad, vjp transforms. (#131191)
There's no reason to ban them for vmap or jvp: without the {grad, vjp}
transforms, those just act above PyTorch autograd, which ends up saving
regular Tensors.
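
A sketch of what this unbans (identity pack/unpack hooks for illustration):

```python
import torch
from torch.func import vmap

def f(x):
    return x.sin().sum()

x = torch.randn(4, 3)
# vmap alone sits above autograd, so saved-tensor hooks only ever see
# regular Tensors and should no longer be rejected
with torch.autograd.graph.saved_tensors_hooks(lambda t: t, lambda t: t):
    out = vmap(f)(x)
```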

Test Plan:
- some tests

Pull Request resolved: https://github.com/pytorch/pytorch/pull/131191
Approved by: https://github.com/drisspg
2024-07-19 23:16:27 +00:00
9818283da1 re-enable jacrev/jacfwd/hessian after #128028 landed (#128622)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/128622
Approved by: https://github.com/zou3519
2024-06-18 17:08:58 +00:00
4460e481bc Disable jacrev/jacfwd/hessian if compiling with dynamo (#128255)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/128255
Approved by: https://github.com/zou3519
2024-06-10 20:47:53 +00:00
90bb510ece Revert "Deprecate torch._utils.is_compiling() and torch._dynamo.external_utils.is_compiling() (#127690)"
This reverts commit 348b181a97abc2e636a6c18e5880a78e5d1dab94.

Reverted https://github.com/pytorch/pytorch/pull/127690 on behalf of https://github.com/clee2000 due to sorry I think https://github.com/pytorch/pytorch/pull/126898#issuecomment-2142884456 is still relevant, I will reach out to them to see what needs to be done in internal to get this remerged ([comment](https://github.com/pytorch/pytorch/pull/127690#issuecomment-2159248859))
2024-06-10 20:44:42 +00:00
348b181a97 Deprecate torch._utils.is_compiling() and torch._dynamo.external_utils.is_compiling() (#127690)
This PR is split from PR #126898.

- #126898

------

Pull Request resolved: https://github.com/pytorch/pytorch/pull/127690
Approved by: https://github.com/Skylion007
2024-06-08 15:25:03 +00:00
033e733021 Revert "[BE] wrap deprecated function/class with typing_extensions.deprecated (#126898)"
This reverts commit 749a132fb0a8325cbad4734a563aa459ca611991.

Reverted https://github.com/pytorch/pytorch/pull/126898 on behalf of https://github.com/fbgheith due to switching typing-extensions=4.3.0 to 4.9.0 causes internal failure ([comment](https://github.com/pytorch/pytorch/pull/126898#issuecomment-2142884456))
2024-05-31 19:47:24 +00:00
749a132fb0 [BE] wrap deprecated function/class with typing_extensions.deprecated (#126898)
Use `typing_extensions.deprecated` for deprecation annotations where possible. Otherwise, add `category=FutureWarning` to `warnings.warn("message")` calls that are missing a category.

Note that only warnings whose messages contain `[Dd]eprecat(ed|ion)` are updated in this PR.

UPDATE: Use `FutureWarning` instead of `DeprecationWarning`.
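
The annotation pattern looks like this (`old_helper`/`new_helper` are hypothetical names, not from the PR):

```python
from typing_extensions import deprecated

@deprecated(
    "`old_helper` is deprecated, please use `new_helper` instead",
    category=FutureWarning,
)
def old_helper():
    ...
```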

Resolves #126888

- #126888

Pull Request resolved: https://github.com/pytorch/pytorch/pull/126898
Approved by: https://github.com/albanD
2024-05-29 12:09:27 +00:00
763dc26e59 [Dynamo] Add dynamo support to torch.func.linearize (#123118)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/123118
Approved by: https://github.com/zou3519
2024-04-23 21:31:49 +00:00
73f0ecc1ac [BE] UFMT directory torch/_functorch (#123723)
Part of #123062

- #123062

Pull Request resolved: https://github.com/pytorch/pytorch/pull/123723
Approved by: https://github.com/Skylion007
2024-04-12 08:04:51 +00:00
933d3a7829 Allow dynamo to inline through "hessian" (#121410)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/121410
Approved by: https://github.com/zou3519
2024-03-27 21:39:37 +00:00
4eaa000acc Teach dynamo about torch.func.jvp (#119926)
List of changes:
- Replace JVP_NESTING with torch._C._functorch.maybe_current_level()
- Remove all increment-nesting functions from wrap_fx_proxy_cls
- fwAD.make_dual receives the dual_level as a keyword argument
- Add jvp_increment_nesting, set_fwd_grad_enabled, and dual_level context managers to dynamo
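
With these pieces in place, something like the following should trace through dynamo (a sketch; the backend and shapes are arbitrary):

```python
import torch
from torch.func import jvp

def f(x):
    return x.sin()

@torch.compile(backend="eager", fullgraph=True)
def g(x, t):
    # dynamo now traces the jvp transform instead of graph-breaking
    return jvp(f, (x,), (t,))

x, t = torch.randn(3), torch.randn(3)
primal_out, tangent_out = g(x, t)
```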

Pull Request resolved: https://github.com/pytorch/pytorch/pull/119926
Approved by: https://github.com/zou3519
2024-03-22 20:25:47 +00:00
0696db8202 Revert "Teach dynamo about torch.func.jvp (#119926)"
This reverts commit 17489784b635187316c6c856c5fe6b6a28d8a15a.

Reverted https://github.com/pytorch/pytorch/pull/119926 on behalf of https://github.com/peterbell10 due to broken mac jobs on main ([comment](https://github.com/pytorch/pytorch/pull/119926#issuecomment-2010327997))
2024-03-20 18:34:43 +00:00
17489784b6 Teach dynamo about torch.func.jvp (#119926)
List of changes:
- Replace JVP_NESTING with torch._C._functorch.maybe_current_level()
- Remove all increment-nesting functions from wrap_fx_proxy_cls
- fwAD.make_dual receives the dual_level as a keyword argument
- Add jvp_increment_nesting, set_fwd_grad_enabled, and dual_level context managers to dynamo

Pull Request resolved: https://github.com/pytorch/pytorch/pull/119926
Approved by: https://github.com/zou3519
2024-03-20 13:09:19 +00:00
36e5c1dcab Revert "Teach dynamo about torch.func.jvp (#119926)"
This reverts commit edd04b7c16cc6715411119bb7db234a9df59065f.

Reverted https://github.com/pytorch/pytorch/pull/119926 on behalf of https://github.com/jeanschmidt due to lots of breakages in pull jobs, checking if reverting this one will help ([comment](https://github.com/pytorch/pytorch/pull/119926#issuecomment-2007915919))
2024-03-19 18:59:46 +00:00
edd04b7c16 Teach dynamo about torch.func.jvp (#119926)
List of changes:
- Replace JVP_NESTING with torch._C._functorch.maybe_current_level()
- Remove all increment-nesting functions from wrap_fx_proxy_cls
- fwAD.make_dual receives the dual_level as a keyword argument
- Add jvp_increment_nesting, set_fwd_grad_enabled, and dual_level context managers to dynamo

Pull Request resolved: https://github.com/pytorch/pytorch/pull/119926
Approved by: https://github.com/zou3519
2024-03-19 13:06:42 +00:00
fd35aafc26 Teach dynamo about vjp (#119405)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/119405
Approved by: https://github.com/zou3519
ghstack dependencies: #118407
2024-03-01 00:21:10 +00:00
491c2b4665 Let torch dynamo inline torch.func.grad (#118407)
When dynamo sees torch.func.grad, it tries to inline all frames related
to it.
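
A sketch of the kind of program this lets dynamo capture in one graph (the backend and function are illustrative):

```python
import torch
from torch.func import grad

def f(x):
    return (x ** 2).sum()

@torch.compile(backend="eager", fullgraph=True)
def g(x):
    # dynamo inlines through grad(f) instead of falling back to eager
    return grad(f)(x)

print(g(torch.randn(3)))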

Pull Request resolved: https://github.com/pytorch/pytorch/pull/118407
Approved by: https://github.com/zou3519
2024-02-28 20:05:00 +00:00
9bce208dfb Replace follow_imports = silent with normal (#118414)
This is a lot of files changed! Don't panic! Here's how it works:

* Previously, we set `follow_imports = silent` for our mypy.ini configuration. Per https://mypy.readthedocs.io/en/stable/running_mypy.html#follow-imports, what this does is whenever we have an import to a module which is not listed as a file to be typechecked in mypy, we typecheck it as normal but suppress all errors that occurred in that file.
* When mypy is run inside lintrunner, the list of files is precisely the files covered by the glob in lintrunner.toml, but with files in excludes excluded.
* The top-level directive `# mypy: ignore-errors` instructs mypy to typecheck the file as normal, but ignore all errors.
* Therefore, it should be equivalent to set `follow_imports = normal`, if we put `# mypy: ignore-errors` on all files that were previously excluded from the file list.
* Having done this, we can remove the exclude list from .lintrunner.toml, since excluding a file from typechecking is baked into the files themselves.
* torch/_dynamo and torch/_inductor were previously in the exclude list, because they were covered by MYPYINDUCTOR. It is not OK to mark these as `# mypy: ignore-errors` as this will impede typechecking on the alternate configuration. So they are temporarily being checked twice, but I am suppressing the errors in these files as the configurations are not quite the same. I plan to unify the configurations so this is only a temporary state.
* There were some straggler type errors after these changes somehow, so I fixed them as needed. There weren't that many.

In the future, to start type checking a file, just remove the ignore-errors directive from the top of the file.

The codemod was done with this script authored by GPT-4:

```python
import glob

exclude_patterns = [
    ...
]

for pattern in exclude_patterns:
    for filepath in glob.glob(pattern, recursive=True):
        if filepath.endswith('.py'):
            with open(filepath, 'r+') as f:
                content = f.read()
                f.seek(0, 0)
                f.write('# mypy: ignore-errors\n\n' + content)
```

Signed-off-by: Edward Z. Yang <ezyang@meta.com>

Pull Request resolved: https://github.com/pytorch/pytorch/pull/118414
Approved by: https://github.com/thiagocrepaldi, https://github.com/albanD
2024-01-27 02:44:11 +00:00
66c32d099a Use pytree.arg_tree_leaves everywhere (#112394)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/112394
Approved by: https://github.com/lezcano
ghstack dependencies: #112391, #112392, #112393
2023-10-31 15:57:06 +00:00
bbd5b935e4 Use pytree.tree_leaves everywhere (#112324)
This changes all the instances I could find of `tree_flatten(...)[0]` or
`x, _ = tree_flatten` to use `tree_leaves`.
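
The pattern being replaced, roughly (using the private `torch.utils._pytree` module the codebase uses):

```python
import torch
import torch.utils._pytree as pytree

tree = {"a": torch.randn(2), "b": [torch.randn(3), 1.0]}

# before: flatten and discard the spec
leaves, _ = pytree.tree_flatten(tree)
# after: ask for the leaves directly
leaves = pytree.tree_leaves(tree)
```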

Pull Request resolved: https://github.com/pytorch/pytorch/pull/112324
Approved by: https://github.com/lezcano
ghstack dependencies: #112327, #112323
2023-10-30 03:39:04 +00:00
a7a0955790 [pytree][BE] reorganize imports and format code style and update type hints (#112268)
Reland PR:

- #112109

Pull Request resolved: https://github.com/pytorch/pytorch/pull/112268
Approved by: https://github.com/Skylion007
2023-10-28 16:30:24 +00:00
6d7744ca46 Fix typo under torch/_functorch directory (#111067)
This PR fixes typos in comments and exception messages in files under the `torch/_functorch` directory.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/111067
Approved by: https://github.com/Skylion007
2023-10-11 23:09:36 +00:00
238fb66085 python functionalization: support higher order ops (#108656)
We now have two types of functionalization, C++ Functionalization (through the `Functionalize` dispatch key), and python functionalization (through the `FunctionalTensorMode` torch_dispatch mode).

This means that all higher order ops need custom functionalization rules for the python variant too. I added them here, as well as a helper function, `dispatch_functionalize()`, which is equivalent to `torch.func.functionalize()` except that it uses `FunctionalTensorMode`.

In theory we could have secretly switched `torch.func.functionalize` to use `FunctionalTensorMode`. This would be BC-breaking, though, since `FunctionalTensorMode` isn't composable with the other functorch transforms (the functorch layer-mode stack doesn't know how to re-order torch_dispatch modes arbitrarily).
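
`dispatch_functionalize()` itself is internal, but its public analogue shows the semantics (a sketch using the documented `make_fx(functionalize(f))` pattern):

```python
import torch
from torch.func import functionalize
from torch.fx.experimental.proxy_tensor import make_fx

def f(x):
    y = x.clone()
    y.add_(1)  # in-place mutation
    return y

# functionalization rewrites the mutation into out-of-place ops
g = make_fx(functionalize(f))(torch.randn(3))
print(g.code)
```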

Pull Request resolved: https://github.com/pytorch/pytorch/pull/108656
Approved by: https://github.com/zou3519
ghstack dependencies: #109024, #109248
2023-09-20 04:37:31 +00:00
cce2c52b0b [pt2] support vmap (#101707)
Teach dynamo about `vmap`

Pull Request resolved: https://github.com/pytorch/pytorch/pull/101707
Approved by: https://github.com/zou3519
2023-08-09 03:39:33 +00:00
8a688277a2 [BE] Enable ruff's UP rules and autoformat dynamo / functorch and refs (#105432)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/105432
Approved by: https://github.com/ezyang
2023-07-19 13:48:44 +00:00
d552c271db [pt2] grad support (#102264)
Teach dynamo about grad

Pull Request resolved: https://github.com/pytorch/pytorch/pull/102264
Approved by: https://github.com/zou3519
2023-06-21 10:13:09 +00:00
e737a8486f Revert "[pt2] grad support (#102264)"
This reverts commit 85b83954c8820fc7473d8e7b68325fa8ed5753dc.

Reverted https://github.com/pytorch/pytorch/pull/102264 on behalf of https://github.com/huydhn due to This is failing in trunk 85b83954c8 and looks like a landrace ([comment](https://github.com/pytorch/pytorch/pull/102264#issuecomment-1600001309))
2023-06-21 03:02:55 +00:00
85b83954c8 [pt2] grad support (#102264)
Teach dynamo about grad

Pull Request resolved: https://github.com/pytorch/pytorch/pull/102264
Approved by: https://github.com/zou3519
2023-06-21 01:37:08 +00:00
47dca20d80 [BE] Enable flake8-comprehension rule C417 (#97880)
Enables flake8-comprehensions rule C417. Ruff autogenerated these fixes to the codebase.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/97880
Approved by: https://github.com/ezyang, https://github.com/kit1980, https://github.com/albanD
2023-03-30 14:34:24 +00:00
2b369eb3c2 [fix] jacrev and jacfwd : support non-tensor args again (#97746)
Fixes https://github.com/pytorch/pytorch/issues/97636

The code that checks whether argument tensors are complex assumed that all arguments are tensors (which is not the case), which led to the error.
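
A minimal repro of the regression (sketch):

```python
import torch
from torch.func import jacrev

def f(x, scale):
    return x * scale  # `scale` is a plain float, not a Tensor

x = torch.randn(3)
# the complex-input check previously assumed every argument was a Tensor
jac = jacrev(f, argnums=0)(x, 2.0)
```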

Pull Request resolved: https://github.com/pytorch/pytorch/pull/97746
Approved by: https://github.com/zou3519
2023-03-28 16:37:33 +00:00
3fc4bc115f [functorch] jacrev, jacfwd error for complex input or output (#94805)
Related: https://github.com/pytorch/pytorch/issues/94397, https://github.com/pytorch/pytorch/issues/94397#issuecomment-1428452756
Pull Request resolved: https://github.com/pytorch/pytorch/pull/94805
Approved by: https://github.com/lezcano
2023-02-14 16:13:37 +00:00
4f3858c6d8 [functorch] linearize (#94173)
Fixes https://github.com/pytorch/functorch/issues/724

TODO:
* [x] Docs

NOTE: `const_fold` pass raises UserWarning -> https://github.com/pytorch/pytorch/issues/94374

Pull Request resolved: https://github.com/pytorch/pytorch/pull/94173
Approved by: https://github.com/Chillee
2023-02-09 15:45:08 +00:00
e0e4f1a890 Revert "[functorch] linearize (#94173)"
This reverts commit b6b9e1e6e043ae4b9f41fbbee4f2a9e9a7e7d3d7.

Reverted https://github.com/pytorch/pytorch/pull/94173 on behalf of https://github.com/kshitij12345 due to Broke lint runner
2023-02-09 09:22:39 +00:00
b6b9e1e6e0 [functorch] linearize (#94173)
Fixes https://github.com/pytorch/functorch/issues/724

TODO:
* [x] Docs

NOTE: `const_fold` pass raises UserWarning -> https://github.com/pytorch/pytorch/issues/94374

Pull Request resolved: https://github.com/pytorch/pytorch/pull/94173
Approved by: https://github.com/Chillee
2023-02-09 08:57:05 +00:00
ad782ff7df Enable xdoctest runner in CI for real this time (#83816)
Builds on #83317 and enables running the doctests. Just need to figure out what is causing the failures.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/83816
Approved by: https://github.com/ezyang, https://github.com/malfet
2022-12-29 05:32:42 +00:00
3fdbf824ae [functorch] jacrev: chunk_size=1 without vmap (#91326)
As discussed at https://github.com/pytorch/pytorch/pull/91157#discussion_r1053679272

Pull Request resolved: https://github.com/pytorch/pytorch/pull/91326
Approved by: https://github.com/zou3519
2022-12-28 04:56:25 +00:00
4437d0d161 [functorch] vmap: chunk_size support (#91157)
Ref: https://github.com/pytorch/functorch/issues/680

We introduce a kwarg `chunk_size` in vmap.

Also, we leverage most of the code from `chunk_vmap` (except for chunking the input based on `chunk_size`)
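
Usage is a one-kwarg change (sketch; the sizes are arbitrary):

```python
import torch
from torch.func import vmap

def f(x):
    return x.sin().sum(-1)

x = torch.randn(1000, 64)
# apply f over the batch 100 rows at a time to bound peak memory
out = vmap(f, chunk_size=100)(x)
```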

Benchmarks from https://github.com/pytorch/functorch/pull/774 apply.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/91157
Approved by: https://github.com/zou3519
2022-12-22 19:45:45 +00:00
c47bdd7522 *_scatter ops should preserve input stride/storage_offset (#91029)
It turns out that we *do* need to update *_scatter ops to return the exact same strides as their inputs. I added a test to `test/test_functionalization.py`, which now trips thanks to Ed's functionalization stride-debugging check. The bug only manifests as silent incorrectness when you call .backward() on such a function.
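
The invariant, sketched with `slice_scatter` on a non-contiguous input (assuming the post-fix behavior described above):

```python
import torch

base = torch.randn(4, 4).t()  # transposed, hence non-contiguous
src = torch.randn(4, 2)
out = torch.slice_scatter(base, src, dim=1, start=0, end=2)
# post-fix, the scatter output preserves the input's strides
assert out.stride() == base.stride()
```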

Pull Request resolved: https://github.com/pytorch/pytorch/pull/91029
Approved by: https://github.com/ezyang
2022-12-22 19:41:53 +00:00
fb2e1878cb [torch.func] alias torch.func.vmap as torch.vmap (#91026)
This PR also redirects torch.vmap to torch.func.vmap instead of the old
vmap prototype.
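
So, assuming the alias is a straight re-export:

```python
import torch

assert torch.vmap is torch.func.vmap  # one function, two names
```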

Test Plan:
- tests
- view docs preview
Pull Request resolved: https://github.com/pytorch/pytorch/pull/91026
Approved by: https://github.com/albanD, https://github.com/samdow
2022-12-21 20:51:49 +00:00
41846e205e [torch.func] Setup torch.func, populate it with all transforms (#91016)
This PR sets up torch.func and populates it with the following APIs:
- grad
- grad_and_value
- vjp
- jvp
- jacrev
- jacfwd
- hessian
- functionalize
- vmap

It also renames all instances of `functorch` to `torch.func` in the docs
for those APIs.

We rewrite the `__module__` fields on some of the above APIs so that the
APIs fit PyTorch's public api definition.
- For an API to be public, it must have a `__module__` that points to a
  public PyTorch submodule. However, `torch._functorch.eager_transforms`
  is not public due to the leading underscore.
- The solution is to rewrite `__module__` to point to where the API is
  exposed (torch.func). This is what both Numpy and JAX do for their
  APIs.
- h/t pmeier in
  https://github.com/pytorch/pytorch/issues/90284#issuecomment-1348595246
  for idea and code
- The helper function, `exposed_in`, is confined to
  torch._functorch/utils for now because we're not completely sure if
  this should be the long-term solution.
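
A sketch of what `exposed_in` amounts to (illustrative; the real helper lives under torch._functorch/utils):

```python
def exposed_in(module):
    """Mark an API as publicly exposed under `module` by rewriting its
    __module__ (so e.g. a function defined in
    torch._functorch.eager_transforms reports itself as torch.func.*)."""
    def wrapper(fn):
        fn.__module__ = module
        return fn
    return wrapper

@exposed_in("torch.func")
def grad(func, argnums=0, has_aux=False):
    ...
```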

Implication for functorch.* APIs:
- functorch.grad is the same object as torch.func.grad
- this means that the functorch.grad docstring is actually the
  torch.func.grad docstring and will refer to torch.func instead of
  functorch.
- This isn't really a problem since the plan on record is to deprecate
  functorch in favor of torch.func. We can fix these if we really want,
  but I'm not sure if a solution is worth maintaining.

Test Plan:
- view docs preview

Future:
- vmap should actually just be torch.vmap. This requires an extra step
  where I need to test internal callsites, so, I'm separating it into a
  different PR.
- make_fx should be in torch.func to be consistent with `import
  functorch`. This one is a bit more of a headache to deal with w.r.t.
  public api, so going to deal with it separately.
- beef up func.rst with everything else currently on the functorch
  documention website. func.rst is currently just an empty shell.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/91016
Approved by: https://github.com/samdow
2022-12-20 00:00:52 +00:00
cad1ce6158 Stop using :attr: in functorch docs (#91015)
We're using :attr: wrong. :attr: refers to an attribute of a Python
object, not a parameter of a function:
- https://www.sphinx-doc.org/en/master/usage/restructuredtext/domains.html#role-py-attr

This leads to some weird things when moving to torch.func: Sphinx
decides to link torch.func for :attr:`func`.

Test Plan:
- docs preview.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/91015
Approved by: https://github.com/samdow
2022-12-20 00:00:52 +00:00
f02e93b584 jacrev : Support chunked computation (#89376)
Ref: https://github.com/pytorch/functorch/issues/680

We introduce a kwarg `chunk_size` in `jacrev` to control whether the Jacobian computation should be chunked; if so, `chunk_size` dictates the maximum size of the chunks used.

We try two approaches,
* Stacked Approach: Append the intermediate computation to a list and then stack those results.
* Pre-allocation Approach: Pre-allocate a zeros tensor and copy chunked computation into it.
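
Call-site sketch (the `_preallocate_and_copy` flag used in the benchmark script below selects the second approach):

```python
import torch
from torch.func import jacrev

def f(x, y):
    return x + y, x.sum(0)

x = torch.zeros(64, 64)
y = x.sum()
# compute the Jacobian at most `chunk_size` rows at a time
jac = jacrev(f, argnums=(0, 1), chunk_size=512)(x, y)
```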

For Memory Benchmark, see https://github.com/pytorch/pytorch/pull/89376#issuecomment-1348479098

Benchmark CPU: performs better with more chunks / smaller chunk_size.

NOTE: There seems to be a lot of noise for shape `(64, 64)`.

<details>

```
[----------------------------------------------- jacrev : device cpu : chunks 2 -----------------------------------------------]
                                     |  with chunk_size and stacked  |  without chunk_size  |  with chunk_size and pre-allocated
1 threads: ---------------------------------------------------------------------------------------------------------------------
      (64, 64) : chunk_size 2080     |               76.2            |          50.9        |                  80.1
      (128, 128) : chunk_size 8256   |             1172.8            |         783.3        |                1225.5
      (128, 144) : chunk_size 9288   |             1475.1            |         990.4        |                1548.3
      (144, 144) : chunk_size 10440  |             1871.3            |        1254.4        |                1971.2

Times are in milliseconds (ms).

[----------------------------------------------- jacrev : device cpu : chunks 3 ----------------------------------------------]
                                    |  with chunk_size and stacked  |  without chunk_size  |  with chunk_size and pre-allocated
1 threads: --------------------------------------------------------------------------------------------------------------------
      (64, 64) : chunk_size 1386    |               39.9            |          25.8        |                  58.8
      (128, 128) : chunk_size 5504  |             1182.6            |         782.2        |                1229.7
      (128, 144) : chunk_size 6192  |             1483.6            |         995.4        |                1550.6
      (144, 144) : chunk_size 6960  |             1879.1            |        1257.7        |                1960.5

Times are in milliseconds (ms).

[----------------------------------------------- jacrev : device cpu : chunks 4 ----------------------------------------------]
                                    |  with chunk_size and stacked  |  without chunk_size  |  with chunk_size and pre-allocated
1 threads: --------------------------------------------------------------------------------------------------------------------
      (64, 64) : chunk_size 1040    |               41.7            |          50.6        |                  29.1
      (128, 128) : chunk_size 4128  |             1171.6            |         782.3        |                1226.7
      (128, 144) : chunk_size 4644  |             1482.2            |         994.6        |                1550.9
      (144, 144) : chunk_size 5220  |             1870.2            |        1254.5        |                1961.4

Times are in milliseconds (ms).

[--------------------------------------------- jacrev : device cpu : chunks 100 ---------------------------------------------]
                                   |  with chunk_size and stacked  |  without chunk_size  |  with chunk_size and pre-allocated
1 threads: -------------------------------------------------------------------------------------------------------------------
      (64, 64) : chunk_size 41     |               46.8            |          50.5        |                  46.4
      (128, 128) : chunk_size 165  |              622.2            |         775.2        |                 656.0
      (128, 144) : chunk_size 185  |              803.9            |         987.3        |                 866.9
      (144, 144) : chunk_size 208  |             1021.1            |        1251.2        |                1088.2

Times are in milliseconds (ms).

[--------------------------------------------- jacrev : device cpu : chunks 200 ---------------------------------------------]
                                   |  with chunk_size and stacked  |  without chunk_size  |  with chunk_size and pre-allocated
1 threads: -------------------------------------------------------------------------------------------------------------------
      (64, 64) : chunk_size 20     |               60.9            |          50.2        |                  62.3
      (128, 128) : chunk_size 82   |              583.1            |         779.4        |                 634.3
      (128, 144) : chunk_size 92   |              834.1            |        1005.8        |                 472.3
      (144, 144) : chunk_size 104  |             1053.6            |        1277.0        |                1033.9

Times are in milliseconds (ms).

[--------------------------------------------- jacrev : device cpu : chunks 300 --------------------------------------------]
                                  |  with chunk_size and stacked  |  without chunk_size  |  with chunk_size and pre-allocated
1 threads: ------------------------------------------------------------------------------------------------------------------
      (64, 64) : chunk_size 13    |              77.7             |          50.4        |                  79.6
      (128, 128) : chunk_size 55  |             578.9             |         782.3        |                 626.9
      (128, 144) : chunk_size 61  |             718.2             |        1024.9        |                 800.4
      (144, 144) : chunk_size 69  |             919.7             |        1313.7        |                1023.0

Times are in milliseconds (ms).
```

</details>

Benchmark CUDA: performs better with fewer chunks / bigger chunk_size.

<details>

```
[--------------------------------------------- jacrev : device cuda:1 : chunks 2 ----------------------------------------------]
                                     |  with chunk_size and stacked  |  without chunk_size  |  with chunk_size and pre-allocated
1 threads: ---------------------------------------------------------------------------------------------------------------------
      (64, 64) : chunk_size 2080     |             1485.7            |         923.8        |                1632.3
      (128, 128) : chunk_size 8256   |            25390.2            |       14103.2        |               33557.4
      (128, 144) : chunk_size 9288   |              801.7            |       16854.1        |               42894.6
      (144, 144) : chunk_size 10440  |             1003.5            |       21386.5        |               59648.5

Times are in microseconds (us).

3 / 3 : Shape (144, 144) : Device cuda:1 : chunks: 3
[--------------------------------------------- jacrev : device cuda:1 : chunks 3 ---------------------------------------------]
                                    |  with chunk_size and stacked  |  without chunk_size  |  with chunk_size and pre-allocated
1 threads: --------------------------------------------------------------------------------------------------------------------
      (64, 64) : chunk_size 1386    |             1474.5            |         924.5        |                1655.5
      (128, 128) : chunk_size 5504  |            25368.9            |       10156.0        |               34022.1
      (128, 144) : chunk_size 6192  |            25223.0            |       12933.7        |               56418.5
      (144, 144) : chunk_size 6960  |            24729.3            |       16367.4        |               68744.7

Times are in microseconds (us).

3 / 3 : Shape (144, 144) : Device cuda:1 : chunks: 4
[--------------------------------------------- jacrev : device cuda:1 : chunks 4 ---------------------------------------------]
                                    |  with chunk_size and stacked  |  without chunk_size  |  with chunk_size and pre-allocated
1 threads: --------------------------------------------------------------------------------------------------------------------
      (64, 64) : chunk_size 1040    |             1489.2            |         924.4        |                 1679.6
      (128, 128) : chunk_size 4128  |            25370.4            |        8987.4        |                57201.3
      (128, 144) : chunk_size 4644  |            32239.1            |       10136.2        |                72406.5
      (144, 144) : chunk_size 5220  |            40994.3            |       12867.8        |               108653.4

Times are in microseconds (us).

3 / 3 : Shape (144, 144) : Device cuda:1 : chunks: 100
[------------------------------------------- jacrev : device cuda:1 : chunks 100 --------------------------------------------]
                                   |  with chunk_size and stacked  |  without chunk_size  |  with chunk_size and pre-allocated
1 threads: -------------------------------------------------------------------------------------------------------------------
      (64, 64) : chunk_size 41     |            21121.8            |         924.2        |               22753.5
      (128, 128) : chunk_size 165  |            23679.7            |       14284.4        |               26758.2
      (128, 144) : chunk_size 185  |            30082.3            |       18063.3        |               33553.5
      (144, 144) : chunk_size 208  |            38175.6            |       22839.5        |               42030.0

Times are in microseconds (us).
```

</details>

Benchmark Script

<details>

```python
import functorch
import torch
import itertools
import time
from torch.utils.benchmark import Timer
from torch.utils.benchmark import Compare
import sys
import pickle
from torch import profiler

import math

def prod(l):
    # product of the iterable's elements
    result = 1
    for el in l:
        result *= el
    return result

def fn(x, y):
    return x + y, x.sum(0)

shapes = ((64, 64), (128, 128), (128, 144), (144, 144))

for device in ('cpu', 'cuda:1'):
    if device == 'cuda:1':
        chunks = (2, 3, 4, 100,)
    else:
        chunks = (2, 3, 4, 100, 200, 300)
    for chunk in chunks:
        results = []
        for shape in shapes:
            x = torch.zeros(*shape, dtype=torch.float, device=device)
            y = x.sum()
            chunk_size = (prod(shape) + prod(shape[1:])) // chunk
            jacrev_fn_chunked = functorch.jacrev(fn, (0, 1), chunk_size=chunk_size)
            jacrev_fn_chunked_pre = functorch.jacrev(fn, (0, 1), chunk_size=chunk_size, _preallocate_and_copy=True)
            jacrev_fn = functorch.jacrev(fn, (0, 1), chunk_size=None)

            tasks = [("jacrev_fn_chunked(x, y)", "with chunk_size and stacked"),
                     ("jacrev_fn(x, y)", "without chunk_size"),
                     ("jacrev_fn_chunked_pre(x, y)", "with chunk_size and pre-allocated"),]
            timers = [Timer(stmt=stmt, label=f"jacrev : device {device} : chunks {chunk}", sub_label=f"{(shape)} : chunk_size {chunk_size}", description=desc, globals=globals()) for stmt, desc in tasks]

            for i, timer in enumerate(timers):
                results.append(
                    timer.blocked_autorange(min_run_time=2.)
                )
                print(f"\r{i + 1} / {len(timers)} : Shape {shape} : Device {device} : chunks: {chunk}", end="")
                sys.stdout.flush()

        print()
        comparison = Compare(results)
        comparison.print()
```

</details>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/89376
Approved by: https://github.com/zou3519
2022-12-19 20:04:21 +00:00
24c3ad7851 Move private forward grad mode helpers to torch.autograd.forward_ad (#90240)
Motivation
- These were previously defined in functorch. They are not
functorch-specific, so I'm moving them to torch.autograd.forward_ad and
the autograd python bindings.
- I need this to avoid some of my cyclic import problems.

Should these be public APIs? Probably. Though this needs discussion, so
punting it to the future.

Test Plan:
- moved the tests of these from test/functorch/test_eager_transforms.py
to test/test_autograd.py
Pull Request resolved: https://github.com/pytorch/pytorch/pull/90240
Approved by: https://github.com/soulitzer
2022-12-13 14:14:02 +00:00
4068c5467d [Reland] Move functorch/_src to torch/_functorch (#88756) (#90091)
This will be the last disruptive functorch internals change.

Why are we moving these files?
- As a part of rationalizing functorch we are moving the code in
functorch/_src to torch/_functorch
- This is so that we can offer the functorch APIs as native PyTorch APIs
(coming soon) and resolve some internal build issues.

Why are we moving all of these files at once?
- It's better to break developers all at once rather than many times
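
For imports, the move looks like this (sketch):

```python
# before the move:
# from functorch._src.eager_transforms import grad
# after the move:
from torch._functorch.eager_transforms import grad
```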

Test Plan:
- wait for tests

Pull Request resolved: https://github.com/pytorch/pytorch/pull/90091
Approved by: https://github.com/anijain2305, https://github.com/ezyang
2022-12-03 14:17:15 +00:00
218d9c6e09 Revert "Move functorch/_src to torch/_functorch (#88756)"
This reverts commit 52bc5c1cfe098fd4b4b13902b4fea83b455b9773.

Reverted https://github.com/pytorch/pytorch/pull/88756 on behalf of https://github.com/clee2000 due to broke imports in tests 52bc5c1cfe https://github.com/pytorch/pytorch/actions/runs/3574742513/jobs/6010814968 probably a landrace
2022-11-29 17:17:11 +00:00
52bc5c1cfe Move functorch/_src to torch/_functorch (#88756)
This will be the last disruptive functorch internals change.

Why are we moving these files?
- As a part of rationalizing functorch we are moving the code in
functorch/_src to torch/_functorch
- This is so that we can offer the functorch APIs as native PyTorch APIs
(coming soon) and resolve some internal build issues.

Why are we moving all of these files at once?
- It's better to break developers all at once rather than many times

Test Plan:
- wait for tests

Pull Request resolved: https://github.com/pytorch/pytorch/pull/88756
Approved by: https://github.com/ezyang
2022-11-29 13:55:42 +00:00