If SymInt::maybe_as_int() returns non-empty, then we get an inline
fast path. The philosophy here (as with the previous PR) is to
preserve performance in the "plain old ints" case.
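As a hedged sketch of this pattern (a simplified stand-in type, not the actual `c10::SymInt` implementation): branch inline on the plain-int case and fall back to the out-of-line symbolic path only when needed.

```cpp
#include <cstdint>
#include <optional>

// Simplified stand-in for c10::SymInt; the real class encodes the
// plain-vs-symbolic distinction in the bit pattern of data_.
struct SymIntLike {
  std::optional<int64_t> plain;  // engaged for "plain old ints"

  SymIntLike operator*(const SymIntLike& other) const {
    if (plain && other.plain) {
      // Inline fast path: plain integer arithmetic, no SymNode calls.
      return {*plain * *other.plain};
    }
    // Slow path: a real implementation would dispatch to the symbolic
    // SymNode machinery here (omitted in this sketch).
    return {std::nullopt};
  }
};
```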
After this change, time spent in SymInt functions in computeStorageNBytes was observed to drop (without the cost shifting elsewhere in the function), based on profiling detach() with Linux perf and code similar to the benchmark from #160580.
Differential Revision: [D81530107](https://our.internmc.facebook.com/intern/diff/D81530107)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/161586
Approved by: https://github.com/ezyang
ghstack dependencies: #161466
# Motivation
This PR aims to generalize `AllocatorConfig` to be device-agnostic. It introduces the class `AcceleratorAllocatorConfig` to clarify its scope as a configuration manager for accelerator backends (e.g., CUDA, XPU). The name `AllocatorConfig` is now reserved for a potential future base class that can unify configuration handling for both CPU and accelerator allocators, should similar requirements arise for the CPU path.
# Design Rule
## Overall
This class configures memory allocation for both device and host memory. A single `AcceleratorAllocatorConfig` instance is shared across all accelerator backends, such as CUDA and XPU, under the assumption that relevant environment variables apply uniformly to all accelerators. Device-specific configuration extensions are supported via hooks (see `registerDeviceConfigParserHook`).
It also introduces a new class, `ConfigTokenizer`, to help parse the key-value pairs in the environment variable configuration.
## Naming Convention:
- Public API names in `AcceleratorAllocatorConfig` should be device-generic.
- Members prefixed with `pinned_` are specific to the host/pinned allocator.
- Environment variable names should be generic across backends.
- Configuration entries are comma-separated key-value pairs in the format `key:value`. Use square brackets `[]` for list values. Example: `key1:123, key2:[val1,val2]` (see the parsing sketch below).
## Environment Variables:
- The default environment variable for configuration is `PYTORCH_ALLOC_CONF`.
- For backward compatibility, `PYTORCH_CUDA_ALLOC_CONF` and `PYTORCH_HIP_ALLOC_CONF` are also supported with lower priority.
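As a minimal sketch of how such a config string could be tokenized (a hypothetical helper, not the actual `ConfigTokenizer` API), splitting `key:value` pairs at top-level commas while keeping bracketed list values intact:

```cpp
#include <string>
#include <utility>
#include <vector>

// Parses e.g. "key1:123, key2:[val1,val2]" into
// {{"key1", "123"}, {"key2", "[val1,val2]"}}.
std::vector<std::pair<std::string, std::string>> parseConfig(
    const std::string& s) {
  std::vector<std::pair<std::string, std::string>> out;
  std::string key, val;
  bool in_key = true;
  int depth = 0;  // bracket nesting, so commas inside lists don't split
  for (char c : s) {
    if (c == ' ') continue;  // tolerate spaces after commas
    if (in_key && c == ':') { in_key = false; continue; }
    if (c == '[') ++depth;
    if (c == ']') --depth;
    if (c == ',' && depth == 0) {  // top-level comma: end of pair
      out.emplace_back(std::move(key), std::move(val));
      key.clear(); val.clear(); in_key = true;
      continue;
    }
    (in_key ? key : val) += c;
  }
  if (!key.empty()) out.emplace_back(std::move(key), std::move(val));
  return out;
}
```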
Differential Revision: [D79011786](https://our.internmc.facebook.com/intern/diff/D79011786)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/149601
Approved by: https://github.com/albanD
Enables the clang-tidy rule [`misc-use-internal-linkage`](https://clang.llvm.org/extra/clang-tidy/checks/misc/use-internal-linkage.html). This check was introduced in Clang-Tidy 18 and is available thanks to the recent update to Clang-Tidy 19.
The check marks functions and variables that are used only in their translation unit as `static`. Undesired symbols are therefore not leaked into other units, more link-time optimizations become possible, and the resulting binaries may be smaller.
Most detected violations were fixed by adding `static`. In other cases, where the symbols were indeed consumed by other files, their declaring headers were included instead. Some declarations were simply wrong and have been fixed.
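A hypothetical before/after illustrating the kind of fix the check drives:

```cpp
// Before: int helper(int x) { return x * 2; }
// helper() had external linkage and leaked a symbol even though it is
// only used in this translation unit.

// After: internal linkage; the symbol stays local to the unit,
// enabling more link-time optimization and smaller binaries.
static int helper(int x) {
  return x * 2;
}

// public_api() is declared in a header and consumed by other files,
// so it keeps external linkage.
int public_api(int x) {
  return helper(x) + 1;
}
```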
Pull Request resolved: https://github.com/pytorch/pytorch/pull/148948
Approved by: https://github.com/Skylion007
Clang-Tidy 19 has more powerful clang-analyzer checks that can detect subtle bugs. New checks such as `misc-use-internal-linkage` help identify functions and variables that can be made static, thus reducing binary sizes.
Some new checks are disabled temporarily, to be enabled later. Additional warnings have been fixed or suppressed.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/148648
Approved by: https://github.com/Skylion007
Adds a `C10_UBSAN_ENABLED` macro and uses it to disable `SymIntTest::Overflows` (which fails under the `signed-integer-overflow` UBSAN check).
Also cleans up the UBSAN guard in `jit/test_misc.cpp` to use `C10_UBSAN_ENABLED` and the existing `C10_ASAN_ENABLED` instead of locally defining `HAS_ASANUBSAN`.
> NOTE: This should fix `SymIntTest::Overflows` failing under ubsan in fbcode too...
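A hedged sketch of how such a detection macro can be written; the actual definition in `c10/macros/Macros.h` may differ:

```cpp
// Hypothetical sketch: Clang exposes __has_feature probes for
// sanitizers; other toolchains would need a build-system define.
#ifndef C10_UBSAN_ENABLED
#if defined(__has_feature)
#if __has_feature(undefined_behavior_sanitizer)
#define C10_UBSAN_ENABLED 1
#endif
#endif
#ifndef C10_UBSAN_ENABLED
#define C10_UBSAN_ENABLED 0
#endif
#endif

// Usage: compile out a test that intentionally overflows signed ints.
#if !C10_UBSAN_ENABLED
// TEST(SymIntTest, Overflows) { /* overflow-provoking assertions */ }
#endif
```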
Pull Request resolved: https://github.com/pytorch/pytorch/pull/127967
Approved by: https://github.com/atalman, https://github.com/d4l3k, https://github.com/malfet
On Linux and Mac, `int64_t` is an alias for either `long` (Linux) or `long long` (Mac).
Because of that, an attempt to construct a `c10::Scalar` from the other type fails with `conversion from ‘long long int’ to ‘c10::Scalar’ is ambiguous`.
I.e. an attempt to compile:
```cpp
#include <c10/core/Scalar.h>

int main() {
  c10::Scalar s = 1L;
}
```
on MacOS failed with:
```
foo.cpp:3:15: error: conversion from 'long' to 'c10::Scalar' is ambiguous
c10::Scalar s = 1L;
^ ~~
/Users/nshulga/git/pytorch/pytorch/torch/include/c10/core/Scalar.h:59:7: note: candidate constructor
DEFINE_IMPLICIT_CTOR)
^
/Users/nshulga/git/pytorch/pytorch/torch/include/c10/core/Scalar.h:59:7: note: candidate constructor
/Users/nshulga/git/pytorch/pytorch/torch/include/c10/core/Scalar.h:59:7: note: candidate constructor
/Users/nshulga/git/pytorch/pytorch/torch/include/c10/core/Scalar.h:59:7: note: candidate constructor
/Users/nshulga/git/pytorch/pytorch/torch/include/c10/core/Scalar.h:59:7: note: candidate constructor
/Users/nshulga/git/pytorch/pytorch/torch/include/c10/core/Scalar.h:59:7: note: candidate constructor
/Users/nshulga/git/pytorch/pytorch/torch/include/c10/core/Scalar.h:59:7: note: candidate constructor
/Users/nshulga/git/pytorch/pytorch/torch/include/c10/core/Scalar.h:62:3: note: candidate constructor
Scalar(uint16_t vv) : Scalar(vv, true) {}
^
/Users/nshulga/git/pytorch/pytorch/torch/include/c10/core/Scalar.h:63:3: note: candidate constructor
Scalar(uint32_t vv) : Scalar(vv, true) {}
^
/Users/nshulga/git/pytorch/pytorch/torch/include/c10/core/Scalar.h:64:3: note: candidate constructor
Scalar(uint64_t vv) {
^
```
Prevent this by providing the missing constructors when needed. Alas, one cannot use SFINAE, as templated constructors on Scalar mess up a lot of implicit conversions, so `static_assert`s are used instead to detect early on whether the premise for constructing this class holds.
Adds `ScalarTest::LongsAndLongLongs`, which is essentially a compile-time test.
Discovered while trying to enable AOTI on MacOS.
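A simplified sketch of the idea (a hypothetical class, not the real `c10::Scalar` code): provide a constructor for whichever of `long`/`long long` is not already `int64_t`, with `static_assert`s guarding the premise:

```cpp
#include <cstdint>
#include <type_traits>

class ScalarLike {
 public:
  /*implicit*/ ScalarLike(int64_t v) : v_(v) {}
#if defined(__APPLE__)
  // On macOS int64_t aliases long long, so long needs its own ctor.
  static_assert(std::is_same_v<int64_t, long long>, "premise violated");
  /*implicit*/ ScalarLike(long v) : ScalarLike(static_cast<int64_t>(v)) {}
#else
  // This branch assumes LP64 Linux, where int64_t aliases long, so
  // long long needs its own ctor; other ABIs would need more cases.
  static_assert(std::is_same_v<int64_t, long>, "premise violated");
  /*implicit*/ ScalarLike(long long v) : ScalarLike(static_cast<int64_t>(v)) {}
#endif
 private:
  int64_t v_;
};

// Essentially a compile-time test: both literals must convert.
int main() {
  ScalarLike a = 1L;
  ScalarLike b = 1LL;
  (void)a;
  (void)b;
}
```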
Pull Request resolved: https://github.com/pytorch/pytorch/pull/118149
Approved by: https://github.com/ezyang, https://github.com/albanD
ghstack dependencies: #118077, #118076
Here's the problem: if we support unsigned integer types, and in particular if we support uint64_t, we need a way to represent these integers in Scalar. However, Scalar currently stores all integral values inside int64_t, which is not wide enough to accommodate all possible uint64_t values. So we need to do something to Scalar to support it.
The obvious thing to do is add a uint64_t field to the union and use it in some situations. But when should we use it? The proposal is that we use it if and only if the integer in question is not representable in int64_t. The historical precedent for this is our handling of uint8_t. Because this type is representable inside int64_t, we have historically stored it inside Scalar as an int64_t. In general, the concept behind Scalar is that it doesn't know the signedness/unsignedness/bitwidth of its input; in particular, we typically construct Scalar from a Python int, which doesn't have any concept of how wide the integer is! So it doesn't make sense to allow a small integer like 255 to be representable under both the HAS_i tag and the HAS_u tag; we forbid the latter case.
Although I have proposed this, the PR as currently written just chokes when you pass it a uint64_t that's too big. There's some more logic that would have to be written out for this. I'm putting this out to start to get some agreement that this is the way to do it.
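A hedged sketch of the proposed rule (a hypothetical, much-simplified payload, not the actual Scalar internals):

```cpp
#include <cstdint>
#include <limits>

enum class Tag { HAS_i, HAS_u };

struct Payload {
  Tag tag;
  union {
    int64_t i;
    uint64_t u;
  };
};

// A uint64_t is stored under HAS_u if and only if it does not fit in
// int64_t; small values like 255 canonicalize to HAS_i, matching the
// historical handling of uint8_t.
inline Payload from_uint64(uint64_t v) {
  Payload p;
  if (v <= static_cast<uint64_t>(std::numeric_limits<int64_t>::max())) {
    p.tag = Tag::HAS_i;
    p.i = static_cast<int64_t>(v);
  } else {
    p.tag = Tag::HAS_u;
    p.u = v;
  }
  return p;
}
```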
Signed-off-by: Edward Z. Yang <ezyang@meta.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/116595
Approved by: https://github.com/albanD
Before this change, `-SymInt(std::numeric_limits<int64_t>::min()) == 0` would reliably crash with a null pointer dereference, as the `data_` of the SymInt returned by `operator-` would be `0x8000000000000000` because of the carry/overflow flags set by `negq`.
Before the change, the x86_64 assembly generated for
4f02cc0670/c10/core/SymInt.cpp (L137)
looked as follows:
```
0x7ffff7f2f490 <+115>: movq %rax, %rdx
0x7ffff7f2f493 <+118>: negq %rdx
0x7ffff7f2f496 <+121>: movq %rdx, (%rbp)
0x7ffff7f2f49a <+125>: movabsq $0x4000000000000000, %rdx ; imm = 0x4000000000000000
0x7ffff7f2f4a4 <+135>: cmpq %rdx, %rax
0x7ffff7f2f4a7 <+138>: jle 0x7ffff7f2f520 ; <+259> at SymInt.cpp:141:1
```
`negq %rdx` corresponds to the unary minus, and the `cmpq %rdx, %rax` against `0x4000000000000000` is the inverted `check_range`
b6d0d0819a/c10/core/SymInt.h (L247-L249)
Flags raised by `negq` affect the result of `cmpq`, and as a result the value would not be allocated on the heap but rather preserved as `nullptr`.
Not sure if it's worth benchmarking, but perhaps using `__builtin_sub_overflow` would be faster, as it does not require an extra comparison and just guarantees that the overflow flag is cleared after the op.
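For reference, a hedged sketch of that alternative (the builtin exists in GCC/Clang; the wrapper name here is mine, not actual SymInt code):

```cpp
#include <cstdint>

// Negate x as 0 - x. __builtin_sub_overflow returns true on overflow
// (i.e. when x == INT64_MIN), so the overflow check is part of the
// operation itself rather than a separate comparison.
inline bool checked_negate(int64_t x, int64_t* out) {
  return !__builtin_sub_overflow(int64_t{0}, x, out);
}
```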
Pull Request resolved: https://github.com/pytorch/pytorch/pull/116160
Approved by: https://github.com/Skylion007, https://github.com/colesbury
This PR relands #110022 but accounts for the changes in #110191. Also, the function for creating COW storages is called `lazy_clone_storage` in this PR instead of `try_ensure`.
NOTE: COW storages do not actually copy on write yet; they just have the COW deleter and deleter context applied to them.
Part of #109833
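A hedged, much-simplified sketch of the mechanism (hypothetical types; the real `COWDeleterContext` and `cow_deleter` live under `c10/core/impl/cow/`):

```cpp
#include <atomic>
#include <cstdint>
#include <cstdlib>

// The original storage and its lazy clone point at the same bytes and
// share one refcounted context as their deleter context.
struct COWContextSketch {
  void* data;
  std::atomic<int64_t> refcount;
};

// Installed as the storage deleter: the last release frees the bytes.
// Actual copy-on-write would materialize a private copy on first
// write, which this PR does not implement yet.
void cowDeleterSketch(void* ctx_raw) {
  auto* ctx = static_cast<COWContextSketch*>(ctx_raw);
  if (ctx->refcount.fetch_sub(1) == 1) {
    std::free(ctx->data);
    delete ctx;
  }
}
```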
Pull Request resolved: https://github.com/pytorch/pytorch/pull/110192
Approved by: https://github.com/ezyang
We want to be able to use SingletonSymNode to represent strides for Jagged layout tensors. The following is for 3D, but it generalizes easily to higher dimensions.
Constraints:
- [B, x, D] (where x represents the "variable-length" dim) can be strided in two ways: [x, 1, sum(x)] and [dx, d, 1]. We need two different placeholder values depending on how the jagged tensor is strided.
- When doing operations, we need the strides of output tensors to be expressible in terms of the strides and sizes of the inner tensors. Given [B, x, D] @ [D, D'], the output strides are [x * D', D', 1] rather than some opaque [x2, D', 1]. This constraint exists because if I'm tracing, I need a symint to represent the output stride. This symint needs to come from somewhere; there are several ways to get one: (1) create a constant, (2) allocate an unbacked symint, (3) create a new input using a source, (4) produce it as the output of an operation on an existing symint. It is clear that (4) is what we want here, which brings us to the design below.
Design:
Given the two constraints, the most straightforward way to implement this is actually to update SingletonSymNode to include a scalar factor; i.e., morally, SingletonSymNode represents `factor * [s_0, s_1, …, s_n]`. This enables us to symbolically compute strides from sizes.
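A hedged sketch of the resulting representation (a hypothetical struct, not the real SingletonSymNode API):

```cpp
#include <cstdint>

// Morally represents factor * [s_0, s_1, ..., s_n]; `id` identifies
// which ragged sequence of lengths is meant.
struct SingletonSymSketch {
  int64_t id;
  int64_t factor;
};

// Multiplying by a constant (e.g. deriving the stride x * D' from the
// size x) just scales the factor; the ragged identity is preserved.
inline SingletonSymSketch mul(SingletonSymSketch s, int64_t c) {
  return {s.id, s.factor * c};
}
```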
Pull Request resolved: https://github.com/pytorch/pytorch/pull/110369
Approved by: https://github.com/ezyang
ghstack dependencies: #110044
Previously, something like j0 >= 3 would return False. In sympy, however, it is not possible to make both j0 >= 3 and j0 < 3 return False: you only get to dispatch on Ge, and the remaining comparisons are derived from it, e.g. defining Ge(j0, 3) to be False would force Lt(j0, 3) to be True, which is not what we want.
In this PR, we make it so that both j0 >= 3 and j0 < 3 error, so that in a future PR, when we create the symbolic counterpart of this singleton, the behaviors can be the same.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/110044
Approved by: https://github.com/ezyang
This PR does the following:
* Combine `cow/context.<h/cpp>` and `cow/deleter.<h/cpp>` into `cow/COWDeleter.<h/cpp>`
* Rename `Context` to `COWDeleterContext`
* Rename `delete_context` to `cow_deleter`
* Remove the separate `impl_cow_context` bazel library, combining it with the base c10 core library
* Rename `context_test.cpp` to `cow_test.cpp`
Pull Request resolved: https://github.com/pytorch/pytorch/pull/110191
Approved by: https://github.com/ezyang
In this PR:
- {in,}equality between singleton and plain ints returns false instead of erroring
- Morally defines the semantics of j0 > c as if j0 represented an array [s_0, s_1, ..., s_n] and s_k > c held for all k
- Just like for equality, we don't actually want to do the comparison element by element; instead, j0 is constrained to some range [min, max]. By default this range is [2, int64_t::max] so that it acts like a size and passes 0/1 specialization checks (a sketch of these semantics follows this list).
- In the future, we can define some API to allow users to constrain the range of their singletons
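A hedged sketch of these semantics (hypothetical types; error handling simplified to an exception):

```cpp
#include <cstdint>
#include <limits>
#include <stdexcept>

// j0 stands for an array [s_0, ..., s_n] whose elements are known to
// lie in [min, max]; the default [2, int64_t::max] lets it act like a
// size and pass 0/1 specialization checks.
struct SingletonRange {
  int64_t min = 2;
  int64_t max = std::numeric_limits<int64_t>::max();
};

inline bool gt(const SingletonRange& j, int64_t c) {
  if (j.min > c) return true;   // s_k > c holds for every possible s_k
  if (j.max <= c) return false; // s_k > c holds for no possible s_k
  throw std::runtime_error("indeterminate: range straddles c");
}
```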
Pull Request resolved: https://github.com/pytorch/pytorch/pull/108315
Approved by: https://github.com/ezyang