Commit Graph

20 Commits

Author SHA1 Message Date
72d2dba992 Add None return type to init (#132335)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/132335
Approved by: https://github.com/albanD
2024-08-01 15:26:45 +00:00
afe15d2d2f Flip default value for mypy disallow_untyped_defs [3/11] (#127840)
See #127836 for details.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/127840
Approved by: https://github.com/oulgen
2024-06-08 18:28:01 +00:00
79c5e33349 [BE] Enable ruff's UP rules and autoformat nn/ mps/ and torch/ (#105436)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/105436
Approved by: https://github.com/malfet, https://github.com/albanD
2023-07-21 07:38:46 +00:00
cbe5469e88 [PolishComment] Polish code comment, revelant->relevant (#85238)
Polish code comment, `revelant`->`relevant`
Pull Request resolved: https://github.com/pytorch/pytorch/pull/85238
Approved by: https://github.com/kit1980
2022-09-19 19:43:14 +00:00
12cb26509a Apply ufmt to torch internal (#81643)
This is a big bang PR, merge conflicts are probably expected and will be addressed at merge.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/81643
Approved by: https://github.com/ezyang
2022-07-22 02:19:50 +00:00
ad8386c132 Lint fix
Pull Request resolved: https://github.com/pytorch/pytorch/pull/76876

Approved by: https://github.com/ngimel
2022-05-05 05:52:40 +00:00
54c75e1e8f Add "mps" device to PyTorch framework.
Remove the "mlc" device for Mac platforms.

This commit will be followed up with:

* adding MPS runtime components
* PyTorch ops for MPS device

Fixes #ISSUE_NUMBER

Pull Request resolved: https://github.com/pytorch/pytorch/pull/76291
Approved by: https://github.com/albanD
2022-04-27 19:21:57 +00:00
1b7d7d9327 Reland: "free up dispatch key space (in C++)" (#74963)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/74963

This is a re-land of D35192346 (9872a06d77) and D35192317 (a9216cde6c), which together are a diff that changes the internal representation of `DispatchKeySet` in pytorch core to free up the number of dispatch keys that we have available. See a more detailed description of the design in the original PR: https://github.com/pytorch/pytorch/pull/69633.

The original PR broke Milan workflows, which use a pytorch mobile build, and manifested as a memory corruption bug inside of `liboacrmerged.so`.

**Background: Existing Mobile Optimization**
Pytorch mobile builds have an existing optimization (here cc23725e89/c10/core/DispatchKey.h (L382) and here cc23725e89/aten/src/ATen/core/dispatch/OperatorEntry.h (L214)), which works as follows:

Every operator in pytorch has a "dispatch table" of function pointers, corresponding to all of the (up to 64) different kernels that we might dispatch to when we run an operator in pytorch (autograd, cpu, cuda, complex number support, etc).

In mobile builds, the size of that table is shrunk from 64 to 8 to save a bunch of space, because mobile doesn't end up using the functionality associated with most dispatch keys.

The dispatcher also has a notion of "fallback kernels", which are kernels that you can register to a particular dispatch key, but should be able to work for "any operator". The array of fallback kernels is defined here: cc23725e89/aten/src/ATen/core/dispatch/Dispatcher.h (L294).

The mobile-optimization currently does **not** extend to this array (it wouldn't be that useful anyway because there is only one array of fallback kernels globally - vs. there is a separate dispatch table of function pointers per operator). So the per-operator tables on mobile are size 8, while the fallback table is size 64.

**The Bug**
This PR actually makes it difficult to enable that optimization separately for the per-operator arrays vs. the fallback array, and incidentally shrunk the size of the fallback array from 64 to 8 for mobile (that happened on this line: https://github.com/pytorch/pytorch/pull/69633/files#diff-f735cd7aa68f15b624100cbc4bb3b5ea76ffc7c9d3bec3b0ccabaa09609e5319R294).

That isn't a problem by itself (since mobile doesn't actually use any of the fallbacks that can no longer be stored). However, pytorch core will still register all of those fallback kernels on startup in mobile builds, even if they aren't used. When we tried to register one of those fallbacks on startup, it would try to dump the kernel somewhere in memory past the bounds of the (now smaller) array inside of the `Dispatcher` object, `backendFallbackKernels_`.

**Why didn't this problem show up in OSS CI? Why didn't it break other internal mobile workflows aside from Milan?**

Ideally, this failure would show up as part of the OSS signal on GitHub, since we already have mobile OSS builds. Given that it was another memory corruption issue that only affected Milan (subset of mobile), I'm not sure what's specific about Milan's builds that caused it only to manifest there. dreiss I wonder if there's another flavor of mobile builds we could run in OSS CI that could potentially help catch this?

**The debugging experience was pretty difficult**

Debugging the Milan-specific failure was made difficult by the following:

(1) lack of CI
- the original Milan failure didn't surface on my original diff, because the Milan job(s) that failed weren't triggered to run on pytorch changes. There's probably a balance to strike here, since those jobs will only be useful if they aren't flaky, and if they can produce reliable failure logs for debugging.

(2) It's difficult to get a repro.
- my work laptop doesn't have the right specs to run the Milan development workflow (not enough disk space)
- There is an existing OnDemand workflow for Milan, but it appears to be relatively new, and after a bunch of help from MarcioPorto, we ran into issues forwarding the log output from Milan tests on the emulator back to the terminal (see the original discussion here: https://fb.workplace.com/groups/OnDemandFRL/permalink/1424937774645433/)

(3) Lack of stack-traces.
- Most Milan failures didn't include actionable stack traces. phding generously helped me debug by running my suggested patches locally, and reporting back if there were any failures. The failing test didn't include a stack trace though (just the line where the crash appeared), so I ended up making some educated guesses about what the issue was based on the area of the crash.
ghstack-source-id: 152688542

Test Plan: Confirmed with phding that the broken Milan workflow from the previous version of this diff is now passing.

Reviewed By: phding, albanD

Differential Revision: D35222806

fbshipit-source-id: 0ad115a0f768bc8ea5d4c203b2990254c7092d30
(cherry picked from commit 002b91966f11fd55ab3fa3801b636fa39a6dd12c)
2022-03-31 21:52:38 +00:00
9872a06d77 Back out "free up dispatch key space (in C++)" (#74859)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/74859

Original commit changeset: 6d1dd0fd8144

Original Phabricator Diff: D34227616 (2cbddc0e9b)
ghstack-source-id: 152381077

(Note: this ignores all push blocking failures!)

Test Plan:
Test on Milan with "get weather utterance"
buck build fbsourcefbandroid/mode/opt fbsourcefbandroid/mode/milan_build_rdk  //fbandroid/apps/wearable/system/speechservice:speechservice_target30_xhdpi_armv7_release_debug_keystore -c  pt.has_backtaces=1

Reviewed By: phding

Differential Revision: D35192346

fbshipit-source-id: b962de5d5effaf23f9aa8afd3ef36f8c6383de5b
(cherry picked from commit 913e3027a11457aaa2d97a9d89ebc6133b14213c)
2022-03-29 15:39:17 +00:00
2cbddc0e9b free up dispatch key space (in C++) (#72827)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/72827

Reland of D34034848 (6690256021)
ghstack-source-id: 152161452

Test Plan: Confirm that Milan tests are passing

Reviewed By: ezyang

Differential Revision: D34227616

fbshipit-source-id: 6d1dd0fd8144dfbd9e194cd7564cce017e7db968
(cherry picked from commit e5c1b29fedd5c2a0bad810cedc94aa784136b6aa)
2022-03-25 17:04:51 +00:00
22ccf448e8 Revert D34034848: free up dispatch key space (in C++)
Test Plan: revert-hammer

Differential Revision:
D34034848 (6690256021)

Original commit changeset: 9677ee2c0a1a

Original Phabricator Diff: D34034848 (6690256021)

fbshipit-source-id: fd50943d915ef813bb9f9ab278fb582429eea3b1
(cherry picked from commit 3acefee1cdb89bc051d1ef2e9deb5698d2bd85c3)
2022-02-14 23:29:00 +00:00
6690256021 free up dispatch key space (in C++) (#72402)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/72402

The original PR had an array-out-of-bounds access in `DispatchKeyExtractor.cpp`, that wasn't caught by ASAN and appeared to only manifest in a subset of android internal tests. After fixing the OOB access (and adding more asserts), I confirmed that the android internal test passes.

Reland of D33255193 (20b8653dfa)
ghstack-source-id: 148830728

Test Plan:
Steps to test:

(1) connect to a mobile OD

(2) run `one_world android emulator android-29` in a terminal to start the android emulator

(3) In a separate terminal, run the test: `buck test //fbandroid/instrumentation_tests/com/facebook/pytorch/bi_xray:instrumentation_test -c test.external_runner=tpx -- --regex 'testBIXRayModel.*PyTorchBIXRayInstrumentationTest' --force-remote-execution --run-disabled`

I also ran `buck test fbandroid/mode/dbg //fbandroid/instrumentation_tests/com/facebook/pytorch/bi_xray:instrumentation_test`, which failed before and passed after the PR.

Reviewed By: albanD

Differential Revision: D34034848

fbshipit-source-id: 9677ee2c0a1afd1183896f7055009445712523c5
(cherry picked from commit 9ab9b12d355540ad0923c6869ed088ff6c21490c)
2022-02-14 16:02:29 +00:00
791e7df7d9 Back out "free up dispatch key space (in C++)"
Summary: I think this diff stack broke all the related tasks below.

Test Plan:
For our failing tests:

buck test //fbandroid/instrumentation_tests/com/facebook/pytorch/bi_xray:instrumentation_test -c test.external_runner=tpx -- --regex 'testBIXRayModel.*PyTorchBIXRayInstrumentationTest' --force-remote-execution --run-disabled

For the ubn:

Not really sure what to do, trying to build the app and see if I can use an effect?

Reviewed By: shoumikhin

Differential Revision: D34018849

fbshipit-source-id: 3571718cb6621931af931b494e0a70d6e0164e65
(cherry picked from commit 3cc63cb2ea2664dd1063b190614f2034cce5f2d0)
2022-02-05 01:25:42 +00:00
20b8653dfa free up dispatch key space (in C++) (#69633)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/69633

Test Plan: Imported from OSS

Reviewed By: albanD

Differential Revision: D33255193

Pulled By: bdhirsh

fbshipit-source-id: 79773e9c15bf4f2f27675121a49ff5ffd1375238
(cherry picked from commit eac0b1300569e035f3de28a1f0fdce03f60bd270)
2022-02-04 17:57:38 +00:00
1022443168 Revert D30279364: [codemod][lint][fbcode/c*] Enable BLACK by default
Test Plan: revert-hammer

Differential Revision:
D30279364 (b004307252)

Original commit changeset: c1ed77dfe43a

fbshipit-source-id: eab50857675c51e0088391af06ec0ecb14e2347e
2021-08-12 11:45:01 -07:00
b004307252 [codemod][lint][fbcode/c*] Enable BLACK by default
Test Plan: manual inspection & sandcastle

Reviewed By: zertosh

Differential Revision: D30279364

fbshipit-source-id: c1ed77dfe43a3bde358f92737cd5535ae5d13c9a
2021-08-12 10:58:35 -07:00
b176feec1e Add device and key for lazy tensors (#61621)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/61621

Test Plan: CI

Reviewed By: mruberry

Differential Revision: D29912934

Pulled By: asuhan

fbshipit-source-id: 493c32063a3e756d93cbf1d876563a35eaafb537
2021-07-26 23:00:22 -07:00
13b1ca9466 Rename DefaultBackend to CompositeExplicitAutograd (#54470)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/54470

```
git grep -l 'DefaultBackend' | xargs sed -i 's/DefaultBackend/CompositeExplicitAutograd/g'
```

Plus a quick fixup in native/README.md

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

Test Plan: Imported from OSS

Reviewed By: bdhirsh

Differential Revision: D27253240

Pulled By: ezyang

fbshipit-source-id: 964df951ea8b52fa72937f3cc66aeaf49a702e6f
2021-03-26 10:53:30 -07:00
145bc5cd51 Rename Math to CompositeImplicitAutograd (#54466)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/54466

I had to very carefully audit all the use sites since there are a lot
of other uses of the string Math; I did most of the conversion by
grepping for all occurrences of Math and then doing a search
replace.

I also updated documentation for clarity.

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

Test Plan: Imported from OSS

Reviewed By: ngimel

Differential Revision: D27253239

Pulled By: ezyang

fbshipit-source-id: afb485d07ff39575742a4f0e1e205179b60bc953
2021-03-24 13:49:24 -07:00
a51b9a823c Improve docs around Math/DefaultBackend & add PythonDispatcher class. (#50854)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/50854

Test Plan: Imported from OSS

Reviewed By: bhosmer

Differential Revision: D26008542

Pulled By: ailzhang

fbshipit-source-id: e9c0aa97ac2537ff612f5faf348fcb613da09479
2021-01-25 23:10:36 -08:00