pytorch

mirror of https://github.com/pytorch/pytorch.git synced 2025-10-21 21:49:24 +08:00

Author	SHA1	Message	Date
Yanbo Liang	d719f0276d	[Dynamo] Fix nested function resume execution (#100426 ) Fixes #99665 Let me explain the root cause using the unit test I added: * This bug is triggered when: * ```wrapped``` is a nested function. * ```wrapped``` is in another module which is different from the main function ```fn```. * There is a graph break inside of ```wrapped```. * The root cause is when resuming nested function, actually we are using the outermost function(```fn``` in my example)'s global variables, but ```wrapped``` calls ```inner_func``` which is not part of ```fn```'s globals, so we have to set correct globals when nested function resume execution. Pull Request resolved: https://github.com/pytorch/pytorch/pull/100426 Approved by: https://github.com/jansel	2023-05-06 05:04:50 +00:00
mingfeima	a8429342df	fix mul/div overflow issue on CPU float16 (#98820 ) Fix https://github.com/pytorch/pytorch/issues/63482 and https://github.com/pytorch/pytorch/issues/98691 The above two issues have the same root cause: binary_ops will create TensorIterator with the flag `promote_inputs_to_common_dtype` on, which will convert both input tensors to the common_dtype_ (the logic is bypassed on CUDA), which might overflow on Half. If one of the inputs is a scalar with abs value larger than ~65000, it will overflow. This patch will try to fetch the scalar value from the `original_tensor_base` which records the original scalar input value, then in the `cpu_kernel_vec` the TensorIterator is treated as an unary Op. So previously, CPU and CUDA would have different behaviors for such scenario. This is aligned with this patch, test cases added for both CPU and CUDA device. The following is the results: #### before: ``` >>> torch.tensor([3388.], dtype=torch.half).div(524288.0) tensor([0.], dtype=torch.float16) >>> torch.tensor([0.01], dtype=torch.float16) * torch.tensor(65536, dtype=torch.float32) tensor([inf], dtype=torch.float16) ``` #### after: ``` >>> torch.tensor([3388.], dtype=torch.half).div(524288.0) tensor([0.0065], dtype=torch.float16) >>> torch.tensor([0.01], dtype=torch.float16) * torch.tensor(65536, dtype=torch.float32) tensor([655.5000], dtype=torch.float16) ``` Also need to update `RRelu` implementation, to use float to store the intermediate results, otherwise the following test case would fail: ``` . build/bin/test_api --gtest_filter=ModulesTest.RReLU ``` Pull Request resolved: https://github.com/pytorch/pytorch/pull/98820 Approved by: https://github.com/jgong5, https://github.com/ngimel	2023-04-17 07:12:53 +00:00
Aaron Gokaslan	47dca20d80	[BE] Enable flake8-comprehension rule C417 (#97880 ) Enables flake8-comprehension rule C417. Ruff autogenerated these fixes to the codebase. Pull Request resolved: https://github.com/pytorch/pytorch/pull/97880 Approved by: https://github.com/ezyang, https://github.com/kit1980, https://github.com/albanD	2023-03-30 14:34:24 +00:00
BJ Hargrave	b59a60ddff	Fix CPU bitwise shifts for out-of-limit shift values (#96659 ) Negative shift values and positive shift values greater than the bit size of the dtype (limit `0..bits`) now yield expected results which are consistent with numpy. Left shift with an out-of-limit shift value result in a value of `0`. Right shift with an out-of-limit shift value results in a value of `-1` for negative inputs and `0` for non-negative inputs (sign preserving). Fixes https://github.com/pytorch/pytorch/issues/70904 Pull Request resolved: https://github.com/pytorch/pytorch/pull/96659 Approved by: https://github.com/ngimel, https://github.com/albanD, https://github.com/zou3519, https://github.com/jgong5, https://github.com/malfet	2023-03-17 21:35:34 +00:00
mfkasim1	975333d80c	Logaddexp for complex in CPU (#95717 ) Continuation of PR #93153 where I implemented logaddexp for complex, but didn't expose it to `torch.logaddexp`. So this PR is to expose the complex logaddexp to `torch.logaddexp`. Pull Request resolved: https://github.com/pytorch/pytorch/pull/95717 Approved by: https://github.com/lezcano	2023-03-01 20:37:46 +00:00
Xuehai Pan	b005ec62b9	[BE] Remove dependency on `six` and `future` (#94709 ) Remove the Python 2 and 3 compatibility library [six](https://pypi.org/project/six) and [future](https://pypi.org/project/future) and `torch._six`. We only support Python 3.8+ now. It's time to retire them. Pull Request resolved: https://github.com/pytorch/pytorch/pull/94709 Approved by: https://github.com/malfet, https://github.com/Skylion007	2023-02-14 09:14:14 +00:00
Aaron Gokaslan	67d9790985	[BE] Apply almost all remaining flake8-comprehension checks (#94676 ) Applies the remaining flake8-comprehension fixes and checks. This changes replace all remaining unnecessary generator expressions with list/dict/set comprehensions which are more succinct, performant, and better supported by our torch.jit compiler. It also removes useless generators such as 'set(a for a in b)`, resolving it into just the set call. Pull Request resolved: https://github.com/pytorch/pytorch/pull/94676 Approved by: https://github.com/ezyang	2023-02-12 01:01:25 +00:00
Muhammad Firmansyah Kasim	aaa27a6b6d	Vectorized more stable complex division (#93277 ) Fixes #92043 and completing #92539 by implementing the vectorized more stable complex division. I implement this using the internal `abs_` function to avoid branching. I also re-implement the internal `abs_` to make it more stable. Pull Request resolved: https://github.com/pytorch/pytorch/pull/93277 Approved by: https://github.com/peterbell10, https://github.com/lezcano	2023-02-03 11:48:20 +00:00
mfkasim1	1f55f3b0de	Solving the under/overflow for complex division (#92539 ) Fixes #92043. I'm following numpy's implementation as suggested by @min-jean-cho. I found out that this implementation still produces overflow if we're working with numbers greater than `finfo.max / 2`, but this is still much better than the previous implementation where it gets overflow with numbers greater than `finfo.max ** 0.5`. Pull Request resolved: https://github.com/pytorch/pytorch/pull/92539 Approved by: https://github.com/lezcano	2023-01-26 01:14:06 +00:00
Natalia Gimelshein	f1b78224ca	Fix type promotion for 2 wrapped scalar args (#87845 ) Fixes #76801 Pull Request resolved: https://github.com/pytorch/pytorch/pull/87845 Approved by: https://github.com/SherlockNoMad, https://github.com/mruberry	2022-10-27 15:53:11 +00:00
CaoE	cca909645f	Add bfloat16 support for lerp on CPU (#84327 ) ### Description Add bfloat16 support for lerp on CPU ### Testing single core: <html> <body> <!--StartFragment--> op \| shape \|fp32 forward/ms\|bf16 forward/s\|fb32 backward/s\| bf16 backward/s -- \| -- \| -- \| -- \| -- \| -- lerp (tensor) \| [10, 128, 10, 124] \| 0.005489 \| 0.000613 \| 0.006658 \| 0.003385 \| [10, 128, 20, 124] \| 0.011057 \| 0.001204 \| 0.016032 \| 0.007869 \| [10, 128, 30, 124] \| 0.016691 \| 0.001954 \| 0.025549 \| 0.012823 \| \| \| \| \| lerp (scalar) \| [10, 128, 10, 124] \| 0.001096 \| 0.000507 \| 0.002024 \| 0.001479 \| [10, 128, 20, 124] \| 0.00247 \| 0.000997 \| 0.005468 \| 0.002907 \| [10, 128, 30, 124] \| 0.004178 \| 0.001513 \| 0.009775 \| 0.004859 <!--EndFragment--> </body> </html> single socket (28cores): <html> <body> <!--StartFragment--> op \| shape \| fp32 forward/s\| bf16 forward/s\| fb32backward/s\| bf16 backward/s -- \| -- \| -- \| -- \| -- \| -- lerp (tensor) \| [10, 128, 10, 124] \| 0.000236 \| 3.93E-05 \| 0.000494 \| 0.000235 \| [10, 128, 20, 124] \| 0.000525 \| 7.39E-05 \| 0.002485 \| 0.000638 \| [10, 128, 30, 124] \| 0.000801 \| 0.000121 \| 0.004235 \| 0.001529 \| \| \| \| \| lerp (scalar) \| [10, 128, 10, 124] \| 5.90E-05 \| 3.32E-05 \| 0.000129 \| 0.000116 \| [10, 128, 20, 124] \| 0.000155 \| 5.87E-05 \| 0.000368 \| 0.000206 \| [10, 128, 30, 124] \| 0.000324 \| 9.04E-05 \| 0.001322 \| 0.000313 <!--EndFragment--> </body> </html> Pull Request resolved: https://github.com/pytorch/pytorch/pull/84327 Approved by: https://github.com/frank-wei	2022-09-29 01:16:16 +00:00
Natalia Gimelshein	a998a8eb10	Fix segfault for `out` with a large number of dims (#85294 ) Fixes #85166, #85167, #79218, #85251 Pull Request resolved: https://github.com/pytorch/pytorch/pull/85294 Approved by: https://github.com/malfet	2022-09-19 22:00:43 +00:00
Mario Lezcano	88d3acd6b1	Fix and improve the efficiency of the backward of xlog* functions. (#82713 ) That is `xlogy`, `special.xlogy`, `special.xlog1py`. Fixes https://github.com/pytorch/pytorch/issues/80770 Fixes https://github.com/pytorch/pytorch/issues/74279 Pull Request resolved: https://github.com/pytorch/pytorch/pull/82713 Approved by: https://github.com/albanD	2022-08-18 21:55:42 +00:00
Sergii Dymchenko	a0b3854548	Change seperate -> separate (#83056 ) One instance was caught by Meta-internal "exact-word-misspell" linter in D38505529. Pull Request resolved: https://github.com/pytorch/pytorch/pull/83056 Approved by: https://github.com/huydhn, https://github.com/seemethere	2022-08-09 23:11:34 +00:00
Peter Bell	5e3d1ef49f	Allow ufunc OpInfos to have no reference (#82348 ) The `ref` property was moved down from `{Unary,Binary}UfuncInfo` into `OpInfo` quite some time ago, but `OpInfo` uses `None` to signal no reference is available while the others use `_NOTHING`. This makes everything consistently use `None`. Pull Request resolved: https://github.com/pytorch/pytorch/pull/82348 Approved by: https://github.com/ngimel	2022-08-09 04:38:17 +00:00
PyTorch MergeBot	814c19b266	Revert "Allow ufunc OpInfos to have no reference (#82348 )" This reverts commit 566d734396eb77e1781b5cfa14390e02ed7b2409. Reverted https://github.com/pytorch/pytorch/pull/82348 on behalf of https://github.com/peterbell10 due to This stack broke macos tests on trunk	2022-08-06 21:09:09 +00:00
Peter Bell	566d734396	Allow ufunc OpInfos to have no reference (#82348 ) The `ref` property was moved down from `{Unary,Binary}UfuncInfo` into `OpInfo` quite some time ago, but `OpInfo` uses `None` to signal no reference is available while the others use `_NOTHING`. This makes everything consistently use `None`. Pull Request resolved: https://github.com/pytorch/pytorch/pull/82348 Approved by: https://github.com/ngimel	2022-08-06 20:01:39 +00:00
Natalia Gimelshein	ac8c6d09d1	Don't check overflow for scalar arg in comparison ops (#78881 ) That check was potentially synchronizing (people were running into synchronization in real workloads) and mostly unneeded. Type promotion takes care of comparing to values that cannot be safely converted to the type of other argument. This now works for `comp(int_tensor, float('inf'))` as expected. For `comp(uint8_tensor, large_int)` it silently wraps integer to uint8 whereas it would error out before, but this also happens with other arithmetic ops. Also adds reference np implementation to comparison op OpInfos, and additional reference inputs to test newly enabled behavior. Fixes #76805 Pull Request resolved: https://github.com/pytorch/pytorch/pull/78881 Approved by: https://github.com/mruberry	2022-06-05 01:43:51 +00:00
Kshiteej K	849b08f14b	[reland][chalf] where(cpu and cuda), pow(cuda) (#78665 ) Reland: https://github.com/pytorch/pytorch/pull/77640 Ref: #74537 Pull Request resolved: https://github.com/pytorch/pytorch/pull/78665 Approved by: https://github.com/ngimel	2022-06-02 18:04:06 +00:00
PyTorch MergeBot	4bb8db85e9	Revert "[chalf] where(cpu and cuda), pow(cuda) (#77640 )" This reverts commit 3697cf7f76fcad845a1f38643d8b92febf5bc5a3. Reverted https://github.com/pytorch/pytorch/pull/77640 on behalf of https://github.com/mruberry due to as it broke ROCM on trunk	2022-06-01 19:39:38 +00:00
kshitij12345	3697cf7f76	[chalf] where(cpu and cuda), pow(cuda) (#77640 ) Ref: #74537 Pull Request resolved: https://github.com/pytorch/pytorch/pull/77640 Approved by: https://github.com/anjali411, https://github.com/ngimel	2022-06-01 18:35:53 +00:00
Mike Ruberry	089203f8bc	Updates floor_divide to perform floor division (#78411 ) Fixes https://github.com/pytorch/pytorch/issues/43874 This PR changes floor_divide to perform floor division instead of truncation division. This is a BC-breaking change, but it's a "bug fix," and we've already warned users for several releases this behavior would change. Pull Request resolved: https://github.com/pytorch/pytorch/pull/78411 Approved by: https://github.com/ngimel	2022-05-29 21:28:45 +00:00
kshitij12345	efb2c093fc	[fix] complex type promotion (#77524 ) Fixes https://github.com/pytorch/pytorch/issues/76803 Before Fix: ```python >> a = torch.randn((2, 2), dtype=torch.float) >> b = torch.tensor(1, dtype=torch.cdouble) >> (a + b).dtype torch.complex128 ``` After Fix: ```python >> a = torch.randn((2, 2), dtype=torch.float) >> b = torch.tensor(1, dtype=torch.cdouble) >> (a + b).dtype torch.complex64 ``` Note: This is a BC Breaking change. Pull Request resolved: https://github.com/pytorch/pytorch/pull/77524 Approved by: https://github.com/anjali411, https://github.com/mruberry	2022-05-20 10:23:56 +00:00
Xiang Gao	f274558018	Bitwise ops improvements (#77621 ) - Bitwise shift remove floating point support - Bitwise and, or, xor add (scalar, tensor) overload - Use `test_ops.py` to test these ops, including error cases Pull Request resolved: https://github.com/pytorch/pytorch/pull/77621 Approved by: https://github.com/ngimel	2022-05-17 21:16:42 +00:00
Andrew M. James	39d3a7ffe5	Connect Tensor.__ipow__ to pow_ method The `pow_` method should be connected to `Tensor.__ipow__` so that the operator `**=` works correctly. Pull Request resolved: https://github.com/pytorch/pytorch/pull/76900 Approved by: https://github.com/mruberry	2022-05-15 15:06:30 +00:00
Jiayi Sun	cdf9572c52	add BFloat16 operators on CPU: histc, atan2 (#72695 ) Fixes #ISSUE_NUMBER Pull Request resolved: https://github.com/pytorch/pytorch/pull/72695 Approved by: https://github.com/VitalyFedyunin, https://github.com/frank-wei	2022-05-12 17:45:08 +00:00
Xiang Gao	cc9d0f309e	lshift and rshift stop support floating types (#77146 ) Fixes #74358 Pull Request resolved: https://github.com/pytorch/pytorch/pull/77146 Approved by: https://github.com/ngimel	2022-05-11 22:29:30 +00:00
Mike Ruberry	c031643e39	Adds decorators for Python References and extends Python Reference testing (#76945 ) This PR does the following... Tests: - fixes test_type_promotion in test_binary_ufuncs to correctly generate scalar cpu tensors - fixes test_python_reference_consistency to use the Python Reference's reference inputs - extends Python reference testing to test_conj_view, test_neg_view, and test_neg_conj_view - adds a NaN propagation sample input for elementwise unary and binary operations - fixes the UnaryUfuncInfo class to properly register its reference inputs - Updates the Python Reference OpInfos to skip error inputs when their behavior on scalar inputs is inconsistent with their reference operators Code organization: - moves elementwise type promotion functionality to prims.utils Prims & Refs: - fixes scalar cpu tensor handling by having them pass through broadcasting and device and shape checks - adds two decorators, `elementwise_type_promotion_wrapper` and `out_wrapper`, the former allows for elementwise type promotion to be automated and the latter automatically adds the out kwarg and handles it properly cc @ezyang who also had some thoughts on cpu scalar tensor handling cc @chillee -- might want to use this new decorator as we converge decompositions and references Pull Request resolved: https://github.com/pytorch/pytorch/pull/76945 Approved by: https://github.com/ngimel	2022-05-07 03:42:24 +00:00
Mike Ruberry	b557e102d8	Fixes prim type promotion and updates type promotion testing This PR fixes prim elementwise type promotion, tests elementwise binary references using `test_type_promotion` in the elementwise binary test suite, and updates that test with additional cases for float x complex and scalar type promotion. The following issues were discovered while working on this PR: - https://github.com/pytorch/pytorch/issues/76806 - https://github.com/pytorch/pytorch/issues/76805 - https://github.com/pytorch/pytorch/issues/76804 - https://github.com/pytorch/pytorch/issues/76803 - https://github.com/pytorch/pytorch/issues/76801 Pull Request resolved: https://github.com/pytorch/pytorch/pull/76809 Approved by: https://github.com/ngimel	2022-05-04 17:58:10 +00:00
Mike Ruberry	f6bbecf8b5	Adds python ref consistency test, elementwise unary reference inputs, and formats test files Per title. Pull Request resolved: https://github.com/pytorch/pytorch/pull/76626 Approved by: https://github.com/ngimel	2022-05-01 22:42:46 +00:00
kshitij12345	1f1d0b30ce	[fix] mul chalf: cpu scalar and cuda tensor case https://github.com/pytorch/pytorch/pull/76158 added `chalf` support for `mul` on CUDA incorrectly. Following sample ```python torch.mul(torch.zeros(3, device='cuda'), 2.5) # CUDA Tensor and CPU Scalar ``` fails with ``` RuntimeError: iter.device(arg).is_cuda() INTERNAL ASSERT FAILED at "../aten/src/ATen/native/cuda/JitLoops.cuh":83, please report a bug to PyTorch. argument 2: expected a CUDA device but found cpu ``` Pull Request resolved: https://github.com/pytorch/pytorch/pull/76364 Approved by: https://github.com/mruberry	2022-04-30 06:31:52 +00:00
Mike Ruberry	4048d4cdd2	[primTorch] Prototype tracer and elementwise unary reference opinfo class Adds a prototype tracer with no caching support and the `ElementwiseUnaryPythonRefInfo` class. A reference for `floor` is added to test the latter, and the elementwise binary reference inputs are extended to also return noncontiguous inputs. The SampleInput transform operation has been updated to return an actual SampleInput instead of a tuple to facilitate uniform handling of (transformed) SampleInputs. Pull Request resolved: https://github.com/pytorch/pytorch/pull/76388 Approved by: https://github.com/ngimel	2022-04-27 14:40:21 +00:00
Jeff Daily	24d06182f6	[ROCm] unskip test_fmod_remainder_by_zero_integral The test is updated to check for ROCm-specific undefined behavior for integral fmod division by zero. Fixes #48130. Pull Request resolved: https://github.com/pytorch/pytorch/pull/76119 Approved by: https://github.com/mruberry	2022-04-22 17:34:36 +00:00
Natalia Gimelshein	2f9a239ae9	do intermediate math in opmath_t in lerp Per title Pull Request resolved: https://github.com/pytorch/pytorch/pull/76062 Approved by: https://github.com/mruberry	2022-04-20 04:05:55 +00:00
Mike Ruberry	de949a0e59	Various OpInfo architecture improvements This PR makes the following improvements: - moves the custom skip list for test_normalize_operator_exhaustive in test_fx_experimental to use the typical OpInfo skip architecture. The skips were updated to xfails, and that identified some operators which were no longer failing the test - redundant tests with OpInfo-based testing in test_jit.py were removed - test_dtypes was improved so its error messages are clear and it makes test_nondifferentiable redundant; the latter test has been removed - OpInfo.supports_complex_autograd() is removed in favor of a more accurate and general test for whether the particular dtype is in the backward dtypes of the operator - gradchecks have been improved to verify that an operator doesn't support grad if it claims not to - gradchecks have been improved to test the gradient of all input tensors that require gradient - the concept of "default test dtypes" has been removed - excessive and mostly redundant out testing for elementwise unary operators has been removed - metadata for whether an op supports nuanced "safe casting" to out behavior has been removed from OpInfos - numerous skips have been converted to xfails - numerous OpInfos have had their metadata fixed based on the new checks - jit-specific utilities in common_methods_invocations.py have been moved to jit_programming_utils.py Pull Request resolved: https://github.com/pytorch/pytorch/pull/75951 Approved by: https://github.com/ngimel	2022-04-18 21:55:32 +00:00
Mike Ruberry	b09769992f	Improves the OpInfo out= tests Edit: OpInfos separated into their own PRs to debug an ASAN failure that doesn't identify the failing test properly. This PR now just updates the out tests. Adds OpInfos for: - nn.functional.smooth_l1_loss - nn.functional.l1_loss - nn.functional.pdist - nn.functional.binary_cross_entropy - nn.functional.triplet_margin_loss - nn.functional.triplet_margin_with_distance_loss - nn.functional.max_unpool{1, 2, 3}D - nn.functional.alpha_dropout - nn.functional.soft_margin_loss - nn.functional.multilabel_soft_margin_loss - nn.functional.multilabel_margin_loss - nn.functional.multi_margin_loss - nn.functional.margin_ranking_loss These OpInfos were taken from https://github.com/pytorch/pytorch/pull/67560, https://github.com/pytorch/pytorch/pull/67823, https://github.com/pytorch/pytorch/pull/68625, and https://github.com/pytorch/pytorch/pull/67079. The sample input update from https://github.com/pytorch/pytorch/pull/67017 is also rolled into this PR. cc @zou3519 @nikitaved @pmeier @vfdev-5 @dagitses Pull Request resolved: https://github.com/pytorch/pytorch/pull/75782 Approved by: https://github.com/ngimel	2022-04-15 06:16:01 +00:00
Richard Zou	791321e44d	Fix minimum, maximum forward-ad formula for float32 fixes https://github.com/pytorch/pytorch/issues/69034 Test Plan: - new tests Fixes #ISSUE_NUMBER Pull Request resolved: https://github.com/pytorch/pytorch/pull/75277 Approved by: https://github.com/soulitzer	2022-04-07 13:58:39 +00:00
Nikita Shulga	bfac65dfe5	[testing] Update dispatch macros (#74977 ) This PR is reland of #74289 Co-authored-by: Khushi Agrawal <khushiagrawal411@gmail.com>	2022-03-30 14:13:21 -07:00
PyTorch MergeBot	2e4152b118	Revert "[testing] Update dispatch macros" This reverts commit eed19a0f38a81015ca50dd25e997b1c6e223d46b. Reverted https://github.com/pytorch/pytorch/pull/74289 on behalf of https://github.com/malfet	2022-03-30 19:52:37 +00:00
Khushi Agrawal	eed19a0f38	[testing] Update dispatch macros Hi, This PR is the follow-up PR of #71561. (the previous PR had a couple of merge conflicts and was reverted, this PR resolves that). Please take a look. Thanks! cc: @pmeier @mruberry @kshitij12345 Pull Request resolved: https://github.com/pytorch/pytorch/pull/74289 Approved by: https://github.com/pmeier, https://github.com/mruberry	2022-03-30 16:10:16 +00:00
Mike Ruberry	0aa3c39e5f	Extends OpInfo architecture with reference inputs, adds them for elementwise binary operators This PR extends our OpInfo test architecture with "reference inputs," an optional expansion of typical sample inputs that allows for more thorough testing. Currently only the elementwise binary operations implement an extended set of reference inputs. This PR also cleans up some smaller OpInfo-related issues, including several bugs, and it identified https://github.com/pytorch/pytorch/issues/74279. A reference inputs function can be specified for an OpInfo by filling in its "reference_inputs_func" metadata. If this is done it's recommended that the reference inputs function first call the sample inputs function, then produce additional sample inputs. See `reference_inputs_elementwise_binary` for an example of this pattern. In addition to implementing reference inputs for the elementwise binary operations, this PR improves their consistency and simplifies how their metadata is represented. The great majority now use a generic sample input function, and those that want extensions start by calling the generic sample input function and then adding additional samples. This removes many older sample input functions. The BinaryUfuncInfo subclass also now allows specifying scalar support more precisely, and reference inputs and error inputs are generated based on this metadata to ensure it's correct. cc @kshitij12345 @pmeier @zou3519 @Chillee Pull Request resolved: https://github.com/pytorch/pytorch/pull/74280 Approved by: https://github.com/ngimel	2022-03-21 03:24:16 +00:00
Nikita Shulga	ef066f0832	Revert D34856571: [pytorch][PR] Replace `get_all_` type macros with the ATen dispatch macros. Test Plan: revert-hammer Differential Revision: D34856571 (`3ded7b1da3`) Original commit changeset: 0dca038bcad5 Original Phabricator Diff: D34856571 (`3ded7b1da3`) fbshipit-source-id: 594553fa0b710d78beba59d5d2b646f1f1270386 (cherry picked from commit 8090eb9b12dcf452a9e7dc01792a66fb91b563b6)	2022-03-15 22:07:11 +00:00
Khushi Agrawal	3ded7b1da3	Replace `get_all_` type macros with the ATen dispatch macros. (#71561 ) Summary: Hi, Team! The PR is motivated from https://github.com/pytorch/pytorch/pull/71153#discussion_r782446738. It aims to replace `get_all` type macros with the ATen dispatch macros. The files it iterates over are: (Thanks, Lezcano, for the idea!!) <details> <summary> `test/test_autograd.py`</summary> <p> ```python 43:from torch.testing._internal.common_dtype import get_all_dtypes 8506: floating_dt = [dt for dt in get_all_dtypes() if dt.is_floating_point] ``` </p> </details> <details> <summary> `test/test_binary_ufuncs.py`</summary> <p> ```python 26: all_types_and_complex_and, integral_types_and, get_all_dtypes, get_all_int_dtypes, get_all_math_dtypes, 27: get_all_complex_dtypes, get_all_fp_dtypes, 935: dtypes(get_all_dtypes(include_bool=False, include_complex=False)) 1035: dtypes(get_all_dtypes( 1488: dtypes((get_all_dtypes(include_bool=False, include_bfloat16=False))) 1879: dtypes(product(get_all_dtypes(include_complex=False), get_all_dtypes(include_complex=False))) 1887: dtypes((get_all_int_dtypes() + [torch.bool])) 1913: dtypes((get_all_fp_dtypes())) 1941: dtypes((get_all_fp_dtypes())) 1977: dtypes(product(get_all_complex_dtypes(), get_all_dtypes())) 2019: dtypes(product(get_all_fp_dtypes(), get_all_fp_dtypes())) 2048: dtypes(get_all_dtypes()) 2110: dtypes(product(get_all_dtypes(include_complex=False), 2111: get_all_dtypes(include_complex=False))) 2128: types = [torch.bool, torch.bfloat16] + get_all_int_dtypes() 2173: if dtypes[1] in get_all_fp_dtypes(): 2178: dtypes(product(get_all_fp_dtypes(), 2179: get_all_fp_dtypes())) 2260: dtypesIfCUDA(set(get_all_math_dtypes('cuda')) - {torch.complex64, torch.complex128}) 2261: dtypes(set(get_all_math_dtypes('cpu')) - {torch.complex64, torch.complex128}) 2273: dtypesIfCUDA(set(get_all_math_dtypes('cuda')) - {torch.complex64, torch.complex128}) 2274: dtypes(set(get_all_math_dtypes('cpu')) - {torch.complex64, torch.complex128}) 2307: dtypes(get_all_math_dtypes('cpu')) 2319: dtypes(get_all_fp_dtypes(include_bfloat16=False)) 2331: dtypes(get_all_int_dtypes()) 2356: dtypes(get_all_dtypes(include_bfloat16=False, include_bool=False, include_complex=False)) 2393: if dtype in get_all_int_dtypes(): 2614: dtypes(get_all_dtypes()) 2624: dtypes(tuple(itertools.combinations_with_replacement(get_all_dtypes(), 2))) 2806: dtypes(list(product(get_all_dtypes(include_complex=False), 2807: get_all_dtypes(include_complex=False)))) 2866: dtypes(list(product(get_all_complex_dtypes(), 2867: get_all_complex_dtypes()))) 2902: dtypes(product(get_all_dtypes(), get_all_dtypes())) 2906: dtypes(product(get_all_dtypes(), get_all_dtypes())) 2910: dtypes(product(get_all_dtypes(), get_all_dtypes())) 3019: dtypes = [torch.float, torch.double] + get_all_complex_dtypes() 3221: dtypes(get_all_dtypes(include_complex=False)) 3407: dtypes(list(product(get_all_dtypes(include_bool=False), 3408: get_all_dtypes(include_bool=False)))) 3504: dtypes(product(get_all_dtypes(include_complex=False, include_bfloat16=False), 3505: get_all_dtypes(include_complex=False, include_bfloat16=False))) 3516: if x.dtype in get_all_int_dtypes() + [torch.bool]: 3643: dtypes(product(get_all_dtypes(include_complex=False, 3645: get_all_dtypes(include_complex=False, ``` </p> </details> <details> <summary> `test/test_complex.py`</summary> <p> ```python 6:from torch.testing._internal.common_dtype import get_all_complex_dtypes 11: dtypes(get_all_complex_dtypes()) ``` </p> </details> <details> <summary> `test/test_foreach.py`</summary> <p> ```python 18: get_all_dtypes, get_all_int_dtypes, get_all_complex_dtypes, get_all_fp_dtypes, 142: if dtype in get_all_int_dtypes(): 179: disable_fastpath = op.ref == torch.div and dtype in get_all_int_dtypes() + [torch.bool] 201: disable_fastpath = op.ref == torch.div and dtype in get_all_int_dtypes() + [torch.bool] 205: disable_fastpath \|= dtype in get_all_int_dtypes() + [torch.bool] 211: disable_fastpath \|= dtype not in get_all_complex_dtypes() 241: bool_int_div = op.ref == torch.div and dtype in get_all_int_dtypes() + [torch.bool] 246: disable_fastpath \|= dtype in get_all_int_dtypes() + [torch.bool] 248: disable_fastpath \|= dtype not in get_all_complex_dtypes() 250: disable_fastpath \|= True and dtype not in get_all_complex_dtypes() 307: disable_fastpath = dtype in get_all_int_dtypes() + [torch.bool] 365: if opinfo.name == "_foreach_abs" and dtype in get_all_complex_dtypes(): 376: ops(foreach_unary_op_db, dtypes=get_all_dtypes()) 393: dtypes=get_all_dtypes(include_half=True, include_bfloat16=True, include_complex=False)) 401: ops(foreach_minmax_op_db, dtypes=get_all_fp_dtypes(include_bfloat16=True, include_half=True)) 426: if ord in (1, 2) and dtype in torch.testing.get_all_fp_dtypes(): 439: dtypes(get_all_dtypes()) 449: ops(foreach_binary_op_db, dtypes=get_all_dtypes()) 481: ops(foreach_binary_op_db, dtypes=get_all_dtypes()) 536: if dtype in get_all_int_dtypes() + [torch.bool] and foreach_op == torch._foreach_div: 545: ops(foreach_binary_op_db, dtypes=get_all_dtypes()) 637: ops(foreach_pointwise_op_db, allowed_dtypes=get_all_fp_dtypes(include_half=False, include_bfloat16=False)) ``` </p> </details> <details> <summary> `test/test_linalg.py`</summary> <p> ```python 29: all_types, floating_types, floating_and_complex_types, get_all_dtypes, get_all_int_dtypes, get_all_complex_dtypes, 30: get_all_fp_dtypes, 111: dtypes((get_all_dtypes())) 794: float_and_complex_dtypes = get_all_fp_dtypes() + get_all_complex_dtypes() 807: dtypes((get_all_int_dtypes())) 828: dtypes((get_all_fp_dtypes() + get_all_complex_dtypes())) 841: if dtype in get_all_complex_dtypes(): 844: dtypes(itertools.product(get_all_dtypes(), 845: get_all_dtypes())) 855: for dtypes0, dtypes1, dtypes2 in product(get_all_dtypes(), repeat=3): 5607: get_all_fp_dtypes(include_half=not CUDA9, include_bfloat16=(CUDA11OrLater and SM53OrLater))) 5608: dtypes((set(get_all_dtypes()) - {torch.half, torch.bool})) 5644: dtypes((get_all_complex_dtypes() + get_all_fp_dtypes())) 6255: dtypesIfCUDA(get_all_complex_dtypes(), 6256: get_all_fp_dtypes(include_bfloat16=(TEST_WITH_ROCM or (CUDA11OrLater and SM53OrLater)), 6292: dtypesIfCUDA(get_all_fp_dtypes(include_bfloat16=(TEST_WITH_ROCM or (CUDA11OrLater and SM53OrLater)))) 6323: dtypesIfCUDA(get_all_complex_dtypes(), 6324: get_all_fp_dtypes(include_bfloat16=(TEST_WITH_ROCM or (CUDA11OrLater and SM53OrLater)))) 6325: dtypes(get_all_complex_dtypes(), get_all_fp_dtypes()) 6358: dtypesIfCUDA(([torch.float, torch.double] + get_all_complex_dtypes())) 6556: dtypes(get_all_fp_dtypes(), get_all_complex_dtypes()) 6668: dtypes(get_all_fp_dtypes(), get_all_complex_dtypes()) 6741: dtypes(get_all_fp_dtypes(), get_all_complex_dtypes()) ``` </p> </details> <details> <summary> `test/test_nn.py`</summary> <p> ```python 37:from torch.testing._internal.common_dtype import integral_types, get_all_fp_dtypes, get_all_math_dtypes 50: onlyNativeDeviceTypes, deviceCountAtLeast, largeTensorTest, expectedFailureMeta, skipMeta, get_all_device_types, \ 8862: for device in get_all_device_types(): 9629: for dt1 in get_all_math_dtypes(device): 9630: for dt2 in get_all_math_dtypes(device): 9631: for dt3 in get_all_math_dtypes(device): 9648: for input_dtype in get_all_math_dtypes(device): 9664: for input_dtype in get_all_math_dtypes(device): 13015: dtypes(get_all_fp_dtypes(include_bfloat16=AMPERE_OR_ROCM)) 13034: dtypes(get_all_fp_dtypes(include_bfloat16=AMPERE_OR_ROCM)) 13159: dtypes(get_all_fp_dtypes(include_bfloat16=AMPERE_OR_ROCM)) 17400: dtypesIfCUDA(get_all_fp_dtypes(include_bfloat16=AMPERE_OR_ROCM)) 17768: dtypesIfCUDA(get_all_fp_dtypes()) 17773: dtypesIfCUDA(get_all_fp_dtypes()) 17778: dtypesIfCUDA(get_all_fp_dtypes()) 17783: dtypesIfCUDA(get_all_fp_dtypes()) 17788: dtypesIfCUDA(get_all_fp_dtypes()) 17793: dtypesIfCUDA(get_all_fp_dtypes()) 17798: dtypesIfCUDA(get_all_fp_dtypes()) 17963: dtypesIfCUDA(get_all_fp_dtypes()) 17977: dtypesIfCUDA(get_all_fp_dtypes()) 18684: def test_cross_entropy_loss_prob_target_all_reductions(self, device): ``` </p> </details> <details> <summary> `test/test_numpy_interop.py`</summary> <p> ```python 12:from torch.testing._internal.common_dtype import get_all_dtypes 399: dtypes(get_all_dtypes()) ``` </p> </details> <details> <summary> `test/test_ops.py`</summary> <p> ```python 12:from torch.testing._internal.common_dtype import floating_and_complex_types_and, get_all_dtypes 86: for dtype in get_all_dtypes(): ``` </p> </details> <details> <summary> `test/test_reductions.py`</summary> <p> ```python 16: get_all_dtypes, get_all_math_dtypes, get_all_int_dtypes, get_all_complex_dtypes, get_all_fp_dtypes, 360: allowed_dtypes=get_all_dtypes(include_bfloat16=False)) 366: allowed_dtypes=get_all_dtypes(include_bfloat16=False)) 394: allowed_dtypes=get_all_dtypes(include_bfloat16=False)) 750: for dtype in [dtype for dtype in get_all_math_dtypes('cpu') if dtype != torch.float16]: 1404: dtypes(get_all_dtypes(include_bool=False, include_complex=False)) 1457: dtypes((get_all_int_dtypes() + get_all_fp_dtypes(include_bfloat16=False) + 1458: get_all_complex_dtypes())) 1465: return dtype in get_all_int_dtypes() 1494: dtypes((get_all_int_dtypes() + get_all_fp_dtypes(include_bfloat16=False))) 1501: dtypes((get_all_int_dtypes() + get_all_fp_dtypes(include_bfloat16=False))) 1507: dtypes((get_all_complex_dtypes())) 1514: dtypes = list(get_all_int_dtypes() + get_all_fp_dtypes(include_bfloat16=False)) 1523: dtypes((get_all_int_dtypes() + get_all_fp_dtypes(include_bfloat16=False))) 1531: if dtype in get_all_fp_dtypes(): 1608: dtypes((get_all_dtypes(include_half=True, include_bfloat16=False, 1837: dtypes(get_all_dtypes(include_bool=False, include_complex=False)) 1855: dtypes((set(get_all_dtypes(include_bool=False, include_complex=False)) - {torch.uint8})) 3219: for dtype in get_all_dtypes(include_half=True, include_bfloat16=False, ``` </p> </details> <details> <summary> `test/test_serialization.py`</summary> <p> ```python 26:from torch.testing._internal.common_dtype import get_all_dtypes 586: for device, dtype in product(devices, get_all_dtypes()): 589: for other_dtype in get_all_dtypes(): ``` </p> </details> <details> <summary> `test/test_shape_ops.py`</summary> <p> ```python 18:from torch.testing._internal.common_dtype import get_all_dtypes 230: dtypes(get_all_dtypes(include_complex=False, include_bool=False, include_half=False, 232: dtypesIfCUDA(get_all_dtypes(include_complex=False, include_bool=False, include_bfloat16=False)) 344: dtypes(get_all_dtypes()) 443: dtypes(get_all_dtypes()) 461: dtypes(get_all_dtypes()) 570: dtypes(get_all_dtypes(include_complex=False)) ``` </p> </details> <details> <summary> `test/test_sort_and_select.py`</summary> <p> ```python 12: all_types, all_types_and, floating_types_and, get_all_dtypes, get_all_int_dtypes, get_all_fp_dtypes, 136: dtypes(set(get_all_dtypes()) - {torch.bool, torch.complex64, torch.complex128}) 231: dtypes(set(get_all_dtypes()) - {torch.bool, torch.complex64, torch.complex128}) 296: dtypes((get_all_int_dtypes() + get_all_fp_dtypes())) 647: dtypesIfCUDA(get_all_fp_dtypes()) 678: dtypesIfCUDA((get_all_dtypes(include_complex=False, 682: dtypes((get_all_dtypes(include_complex=False, include_bool=False, include_half=False, include_bfloat16=False))) 739: dtypesIfCPU(set(get_all_dtypes()) - {torch.complex64, torch.complex128}) 740: dtypes(set(get_all_dtypes()) - {torch.bfloat16, torch.complex64, torch.complex128}) 799: dtypesIfCPU(set(get_all_dtypes()) - {torch.complex64, torch.complex128}) 800: dtypes(set(get_all_dtypes()) - {torch.bfloat16, torch.complex64, torch.complex128}) ``` </p> </details> <details> <summary> `test/test_sparse.py`</summary> <p> ```python 20:from torch.testing import get_all_complex_dtypes, get_all_fp_dtypes 29: floating_and_complex_types, floating_and_complex_types_and, get_all_dtypes, get_all_int_dtypes, 1963: return dtype in get_all_int_dtypes() 1994: dtypes(get_all_dtypes(include_bool=False, include_half=False, 2103: return dtype in get_all_int_dtypes() 2138: dtypes(get_all_dtypes(include_bool=False, include_half=False, 2626: all_sparse_dtypes = get_all_dtypes(include_complex=True) 2633: all_sparse_dtypes = get_all_dtypes(include_complex=True) 3230: dtypes(get_all_complex_dtypes(), 3231: get_all_fp_dtypes(include_half=False, include_bfloat16=False)) 3234: get_all_fp_dtypes( ``` </p> </details> <details> <summary> `test/test_sparse_csr.py`</summary> <p> ```python 7:from torch.testing import get_all_complex_dtypes, get_all_fp_dtypes, floating_and_complex_types, make_tensor 17:from torch.testing._internal.common_dtype import floating_types, get_all_dtypes 120: dtypes(get_all_dtypes()) 133: dtypes(get_all_dtypes()) 150: dtypes(get_all_dtypes()) 180: dtypes(get_all_dtypes()) 201: dtypes(get_all_dtypes()) 210: dtypes(get_all_dtypes()) 225: dtypes(get_all_dtypes()) 244: dtypes(get_all_dtypes()) 263: dtypes(get_all_dtypes()) 285: dtypes(get_all_dtypes()) 411: dtypes(get_all_dtypes()) 482: dtypes(get_all_dtypes()) 502: dtypes(get_all_dtypes()) 562: dtypes(get_all_dtypes()) 588: dtypesIfCUDA(get_all_complex_dtypes(), 589: get_all_fp_dtypes(include_half=SM53OrLater, include_bfloat16=SM80OrLater)) 745: dtypesIfCUDA(get_all_complex_dtypes(), 746: get_all_fp_dtypes(include_half=SM53OrLater and TEST_CUSPARSE_GENERIC, 765: dtypesIfCUDA(get_all_complex_dtypes(), 766: get_all_fp_dtypes(include_half=SM53OrLater and TEST_CUSPARSE_GENERIC, 801: torch.testing.get_all_fp_dtypes(include_bfloat16=SM80OrLater, 841: torch.testing.get_all_fp_dtypes(include_bfloat16=SM80OrLater, 1182: dtypes(get_all_dtypes()) 1276: dtypes(get_all_dtypes(include_bool=False, include_half=False, include_bfloat16=False)) 1286: dtypes(get_all_dtypes()) ``` </p> </details> <details> <summary> `test/test_tensor_creation_ops.py`</summary> <p> ```python 21: onlyCUDA, skipCPUIf, dtypesIfCUDA, skipMeta, get_all_device_types) 23: get_all_dtypes, get_all_math_dtypes, get_all_int_dtypes, get_all_fp_dtypes, get_all_complex_dtypes 150: for dt in get_all_dtypes(): 160: for dt in get_all_dtypes(): 314: dtypes = [dtype for dtype in get_all_dtypes() if dtype != torch.bfloat16] 1012: dtypes((get_all_int_dtypes() + get_all_fp_dtypes(include_bfloat16=False) + 1013: get_all_complex_dtypes())) 1032: dtypes((get_all_int_dtypes() + get_all_fp_dtypes(include_bfloat16=False) + 1033: get_all_complex_dtypes())) 1050: dtypes((get_all_int_dtypes() + get_all_fp_dtypes(include_bfloat16=False) + 1051: get_all_complex_dtypes())) 1745: dtypes((get_all_int_dtypes() + get_all_fp_dtypes())) 1779: dtypes((get_all_int_dtypes() + get_all_fp_dtypes())) 1868: dtypes((get_all_int_dtypes() + get_all_fp_dtypes())) 1926: dtypes((get_all_int_dtypes() + get_all_fp_dtypes())) 1954: do_test_empty_full(self, get_all_math_dtypes('cpu'), torch.strided, torch_device) 1956: do_test_empty_full(self, get_all_math_dtypes('cpu'), torch.strided, None) 1957: do_test_empty_full(self, get_all_math_dtypes('cpu'), torch.strided, torch_device) 2538: for device in get_all_device_types(): 2645: for dtype in get_all_dtypes(): 2678: dtypes((get_all_fp_dtypes(include_half=False, include_bfloat16=False) + 2679: get_all_complex_dtypes())) 2716: dtypes(get_all_fp_dtypes(include_half=False, include_bfloat16=False)) 2827: for dt in get_all_dtypes(): 2913: dtypes(get_all_dtypes(include_bool=False, include_half=False)) 2914: dtypesIfCUDA(get_all_dtypes(include_bool=False, include_half=True)) 3028: dtypes((get_all_fp_dtypes() + get_all_complex_dtypes())) 3033: dtypes((get_all_fp_dtypes() + get_all_complex_dtypes())) 3074: dtypes(get_all_dtypes(include_bool=False, include_half=False, include_complex=False)) 3075: dtypesIfCUDA(((get_all_int_dtypes() + [torch.float32, torch.float16, torch.bfloat16]) 3077: else get_all_dtypes(include_bool=False, include_half=True, include_complex=False))) 3873: dtypes(get_all_dtypes()) 3884: dtypes(get_all_dtypes(include_bool=False)) 3916: for other in get_all_dtypes(): 3922: dtypes(get_all_dtypes()) 3932: dtypes(get_all_dtypes(include_bool=False)) 3955: dtypes(get_all_dtypes(include_bool=False)) 3961: dtypes(get_all_dtypes(include_bool=False)) 3965: dtypes(get_all_dtypes()) ``` </p> </details> <details> <summary> `test/test_testing.py`</summary> <p> ```python 25:from torch.testing._internal.common_dtype import get_all_dtypes 31: dtypes((get_all_dtypes(include_half=True, include_bfloat16=False, ``` </p> </details> <details> <summary> `test/test_torch.py`</summary> <p> ```python 51: expectedAlertNondeterministic, get_all_device_types, skipXLA) 57: get_all_fp_dtypes, get_all_int_dtypes, get_all_math_dtypes, get_all_dtypes, get_all_complex_dtypes 296: for d in get_all_device_types(): 323: for device in get_all_device_types(): 324: for dt1 in get_all_dtypes(): 325: for dt2 in get_all_dtypes(): 343: all_dtypes = get_all_dtypes() 350: all_dtypes = get_all_dtypes() 781: for dtype in get_all_dtypes(): 986: for device in get_all_device_types(): 1017: for device in get_all_device_types(): 1018: for dtype in get_all_math_dtypes(device): 2792: for device in get_all_device_types(): 3186: dtypes(get_all_dtypes()) 3195: for error_dtype in get_all_dtypes(): 3203: dtypes(get_all_dtypes()) 3212: for error_dtype in get_all_dtypes(): 4539: dtypes(get_all_fp_dtypes()) 4545: dtypes((get_all_int_dtypes() + get_all_fp_dtypes())) 4577: dtypes(get_all_fp_dtypes(include_half=False, include_bfloat16=False)) 4578: dtypesIfCPU((get_all_fp_dtypes(include_half=False, include_bfloat16=True))) 4579: dtypesIfCUDA((get_all_fp_dtypes(include_bfloat16=False))) 4599: dtypes((get_all_fp_dtypes(include_half=False, include_bfloat16=False))) 4600: dtypesIfCPU((get_all_dtypes(include_half=False, include_bfloat16=False, include_complex=False))) 4601: dtypesIfCUDA((get_all_dtypes(include_bfloat16=False, include_complex=False))) 4613: for p_dtype in get_all_fp_dtypes(include_half=device.startswith('cuda'), include_bfloat16=False): 4628: dtypes((get_all_fp_dtypes(include_half=False, include_bfloat16=False))) 4629: dtypesIfCUDA((get_all_fp_dtypes(include_bfloat16=False))) 4640: dtypes(get_all_fp_dtypes()) 4723: dtypes(get_all_fp_dtypes()) 4735: dtypes(get_all_fp_dtypes(include_bfloat16=False)) 4736: dtypesIfCUDA(get_all_fp_dtypes()) 4747: dtypes(get_all_fp_dtypes()) 4761: dtypes(get_all_fp_dtypes()) 4771: dtypes(get_all_fp_dtypes()) 4792: dtypes((get_all_int_dtypes() + get_all_fp_dtypes())) 5302: dtypes(get_all_dtypes(include_bfloat16=False)) 5322: dtypes(get_all_dtypes(include_half=False, include_bfloat16=False)) 5323: dtypesIfCPU(get_all_dtypes(include_bfloat16=False)) 5324: dtypesIfCUDA(get_all_dtypes(include_bfloat16=False)) 5591: for dt in get_all_dtypes(): 5611: for dt in get_all_dtypes(): 5678: for dt in get_all_dtypes(): 5696: dtypesIfCUDA(set(get_all_math_dtypes('cuda'))) 5697: dtypes(set(get_all_math_dtypes('cpu'))) 5746: dtypes(get_all_dtypes()) 5780: dtypes(get_all_dtypes()) 5885: dtypes(get_all_dtypes()) 5902: dtypes(get_all_dtypes()) 5945: dtypes(get_all_dtypes()) 5979: dtypes(get_all_dtypes(include_bool=False)) 6049: dtypes(get_all_dtypes(include_bool=False)) 6092: dtypes((get_all_fp_dtypes(include_bfloat16=False, include_half=False) + 6093: get_all_complex_dtypes())) 6094: dtypesIfCPU(get_all_dtypes()) 6095: dtypesIfCUDA(get_all_dtypes()) 6122: dtypes((get_all_fp_dtypes(include_bfloat16=False, include_half=False) + 6123: get_all_complex_dtypes())) 6124: dtypesIfCPU(get_all_dtypes()) 6125: dtypesIfCUDA(get_all_dtypes()) 6163: dtypes((get_all_fp_dtypes(include_bfloat16=False, include_half=False) + 6164: get_all_complex_dtypes())) 6165: dtypesIfCPU(get_all_dtypes()) 6166: dtypesIfCUDA(get_all_dtypes()) 6190: dtypes((get_all_complex_dtypes() + 6191: get_all_int_dtypes())) 6238: dtypes(get_all_dtypes()) 6323: dtypes(get_all_dtypes()) 6389: dtypes(product(get_all_dtypes(), (torch.uint8, torch.bool))) 6699: dtypesIfCUDA(set(get_all_math_dtypes('cuda'))) 6700: dtypes(set(get_all_math_dtypes('cpu'))) 7452: dtypes(get_all_dtypes(include_bool=False)) 7461: dtypes(get_all_dtypes(include_bool=False)) 7477: dtypes(get_all_dtypes(include_bool=False)) 7496: dtypes(get_all_dtypes(include_bool=False)) 7538: dtypes(get_all_dtypes(include_bool=False)) 8162: dtypes((get_all_int_dtypes() + get_all_fp_dtypes() + 8163: get_all_complex_dtypes())) 8175: dtypes((get_all_int_dtypes() + get_all_fp_dtypes() + 8176: get_all_complex_dtypes())) ``` </p> </details> <details> <summary> `test/test_type_promotion.py`</summary> <p> ```python 14: get_all_dtypes, get_all_math_dtypes, get_all_int_dtypes, get_all_fp_dtypes 187: for dtype in get_all_dtypes(): 262: dtypes1 = get_all_math_dtypes('cuda') 263: dtypes2 = get_all_math_dtypes(device) 339: dtypes(itertools.product(get_all_dtypes(), get_all_dtypes())) 468: for dt1 in get_all_math_dtypes(device): 469: for dt2 in get_all_math_dtypes(device): 519: for dt1 in get_all_math_dtypes(device): 520: for dt2 in get_all_math_dtypes(device): 528: for dt in get_all_math_dtypes(device): 561: for dtype in get_all_dtypes(): 766: dtypes=get_all_math_dtypes(device)) 771: dtypes=get_all_math_dtypes(device)) 782: dtypes=get_all_math_dtypes(device)) 879: dtypes = get_all_dtypes(include_bfloat16=False) 898: dtypes = get_all_dtypes(include_bfloat16=False, include_bool=False) 965: dtypesIfCUDA(itertools.product(get_all_dtypes(include_bfloat16=False, include_complex=False), 966: get_all_dtypes(include_bfloat16=False, include_complex=False))) 967: dtypes(itertools.product(get_all_dtypes(include_half=False, include_bfloat16=False, 969: get_all_dtypes(include_half=False, include_bfloat16=False, 976: return dtype in get_all_int_dtypes() + [torch.bool] 979: return dtype in get_all_fp_dtypes(include_half=True, include_bfloat16=False) ``` </p> </details> <details> <summary> `test/test_unary_ufuncs.py`</summary> <p> ```python 24: floating_types_and, all_types_and_complex_and, floating_and_complex_types_and, get_all_dtypes, get_all_math_dtypes, 25: get_all_int_dtypes, get_all_fp_dtypes, get_all_complex_dtypes 517: dtypes((get_all_int_dtypes() + [torch.bool] + 518: get_all_fp_dtypes(include_bfloat16=False))) 596: dtypes(get_all_fp_dtypes(include_half=True, include_bfloat16=False)) 611: invalid_input_dtypes = get_all_int_dtypes() + \ 612: get_all_complex_dtypes() + \ 619: for dtype in get_all_fp_dtypes(include_half=True, include_bfloat16=False): 1048: dtypes(get_all_math_dtypes('cpu')) 1182: dtypesIfCUDA(get_all_fp_dtypes()) 1190: dtypesIfCUDA(get_all_fp_dtypes()) 1205: dtypesIfCUDA(get_all_fp_dtypes()) 1215: dtypesIfCUDA(get_all_fp_dtypes()) 1307: dtypes((get_all_dtypes(include_bool=False))) 1349: dtypes((get_all_fp_dtypes(include_half=False) + 1350: get_all_complex_dtypes())) 1351: dtypesIfCUDA((get_all_fp_dtypes(include_half=True) + 1352: get_all_complex_dtypes())) ``` </p> </details> <details> <summary> `test/test_view_ops.py`</summary> <p> ```python 19: get_all_dtypes, get_all_int_dtypes, get_all_fp_dtypes, get_all_complex_dtypes 124: dtypes((get_all_int_dtypes() + get_all_fp_dtypes())) 131: dtypes(get_all_dtypes(include_bfloat16=False)) 213: for view_dtype in [get_all_fp_dtypes(), get_all_complex_dtypes()]: 220: dtypes(get_all_dtypes()) 224: for view_dtype in get_all_dtypes(): 305: dtypes(get_all_complex_dtypes(include_complex32=True)) 343: dtypes(get_all_dtypes()) 354: dtypes(get_all_dtypes()) 364: dtypes(get_all_dtypes()) 374: dtypes(get_all_dtypes()) 384: dtypes((get_all_int_dtypes() + get_all_fp_dtypes())) 395: dtypes(get_all_complex_dtypes()) 426: dtypes(get_all_complex_dtypes()) 451: dtypes(product(get_all_complex_dtypes(), get_all_dtypes())) 1263: dtypes((torch.testing.get_all_dtypes())) 1279: dtypes((torch.testing.get_all_dtypes())) 1405: dtypes((get_all_int_dtypes() + get_all_fp_dtypes(include_bfloat16=False) + 1406: get_all_complex_dtypes())) 1471: dtypes(get_all_dtypes(include_bfloat16=False)) 1574: dtypes(get_all_dtypes()) 1601: dtypes(get_all_dtypes(include_bfloat16=False)) 1632: dtypes(*get_all_dtypes(include_bfloat16=False)) 1711: for dt in get_all_dtypes(): 1717: for dt in get_all_dtypes(): 1724: for dt in get_all_dtypes(): ``` </p> </details> I'm looking forward to your viewpoints. Thanks :) cc: mruberry kshitij12345 anjali411 Pull Request resolved: https://github.com/pytorch/pytorch/pull/71561 Reviewed By: samdow Differential Revision: D34856571 Pulled By: mruberry fbshipit-source-id: 0dca038bcad5cf69906245c496d2e61ac3876335 (cherry picked from commit b058f67b4313143efa714ab105f36e74083131b9)	2022-03-15 20:31:41 +00:00
Sergii Dymchenko	794f813522	Fix test_binary_ufuncs.py for Python 3.10 This is a partial fix for https://github.com/pytorch/pytorch/issues/66424 Before this PR, test_binary_ufuncs.py gives DeprecationWarning in Python 3.8 and "TypeError: 'float' object cannot be interpreted as an integer" in Python 3.10. With this PR, `python run_test.py -i test_binary_ufuncs.py` succeeds with Python 3.10. Pull Request resolved: https://github.com/pytorch/pytorch/pull/74013 Approved by: https://github.com/ezyang	2022-03-11 02:13:39 +00:00
Philip Meier	0973c5a1cc	align signature of make_tensor with other creation ops (#72702 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/72702 Test Plan: Imported from OSS Reviewed By: mrshenli Differential Revision: D34457729 Pulled By: mruberry fbshipit-source-id: 83d580c4201eef946dc9cf4b9e28a3d36be55609 (cherry picked from commit aa4cf20fbeb4b795595729b8ac2e6ba7707d8283)	2022-02-25 06:30:31 +00:00
Philip Meier	1f74e082e2	only compare attributes for meta tensors (#72508 ) Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/72508 Todo: - [x] document this behavior - [x] add tests Test Plan: Imported from OSS Reviewed By: zou3519 Differential Revision: D34262452 Pulled By: ezyang fbshipit-source-id: bc5c9653d5c3ad5c6efccc9c8e0efc0d28e15104 (cherry picked from commit 233142c88e4cff02825c7e233aba9411a6df3e9f)	2022-02-17 02:33:08 +00:00
Natalia Gimelshein	06d0536dad	Low precision support for jiterator (#70157 ) Summary: This adds support for bfloat16 and fp16 types for jiterator by adding at::Half and at::BFloat16 classes to the jiterator code template. The only methods defined in those classes are construction from float and implicit conversion to float. Mathematical operations on them never need to be defined, because jiterator is written in a way to implicitly upcast the inputs to the functor, so all math has to be performed on float only (e.g. compute part of the kernel would always be written as ``` out[j] = i0<float>(arg0[j]); ``` It also adds support for casting to complex outputs, by adding a similar templated class c10::complex<T>. Originally I planned to only support float -> complex complex for it, but to compile fetch_and_cast function we also need complex -> float conversion. We can avoid it by compiling fetch_and_cast for a different subset of types, but I'm not doing it in this PR. Thus, technically, we can compile a kernel that would accept complex inputs and produce wrong results, but we are guarding against it by static asserting that none of the functor datatype are complex, and runtime-checking that none of the inputs are complex. Adding bfloat16, half and complex support allows us to remove special handling for type promotion tests for gcd. i0 (that supports half and bfloat16 inputs) is moved to use jiterator. Pull Request resolved: https://github.com/pytorch/pytorch/pull/70157 Reviewed By: mruberry Differential Revision: D33221645 Pulled By: ngimel fbshipit-source-id: 9cfe8aba3498a0604c4ea62c217292ea06c826b1	2021-12-19 11:56:57 -08:00
Natalia Gimelshein	9ff8c49ed9	Enable cpu scalar arguments for jiterator (#69861 ) Summary: Creates analog of `gpu_kernel_with_scalars` for jiterator kernels Pull Request resolved: https://github.com/pytorch/pytorch/pull/69861 Reviewed By: mruberry Differential Revision: D33134013 Pulled By: ngimel fbshipit-source-id: fd2412e8d6432e15d5721e95a194d29fa70ad92c	2021-12-16 10:58:59 -08:00
Mike Ruberry	a974699633	Skips failing ROCm test (#69456 ) Summary: ROCm and CUDA type promotion are slightly divergent and need to be updated. cc jeffdaily sunway513 jithunnair-amd ROCmSupport KyleCZH Pull Request resolved: https://github.com/pytorch/pytorch/pull/69456 Reviewed By: anjali411, janeyx99 Differential Revision: D32883895 Pulled By: mruberry fbshipit-source-id: 3b0ba8a9d092c2d7ff20d78da42d4a147b1db12d	2021-12-06 09:12:31 -08:00
Mike Ruberry	b6f41bb848	The Jiterator (#69439 ) Summary: This PR: - creates the "jiterator" pattern, allowing elementwise unary and binary kernels that don't accept scalars to be jit compiled when called - ports the gcd and i1 CUDA kernels to use the jiterator - extends elementwise binary systemic testing to be comparable to elementwise unary systemic testing - separates one test case from test_out in test_ops.py - updates more OpInfos to use expected failures instead of skips The jiterator currently does not support half, bfloat16 or complex dtypes. It also (as mentioned above) doesn't support scalar inputs. In the future we expect to add support for those datatypes and scalars. Pull Request resolved: https://github.com/pytorch/pytorch/pull/69439 Reviewed By: ngimel Differential Revision: D32874968 Pulled By: mruberry fbshipit-source-id: d44bb9cde4f602703e75400ec5a0b209f085e9b3	2021-12-06 07:32:48 -08:00

1 2 3 4

162 Commits