pytorch

mirror of https://github.com/pytorch/pytorch.git synced 2025-10-21 13:44:15 +08:00

Author	SHA1	Message	Date
Masaki Kozuki	26b5986297	`ReflectionPad` supports `BFloat16` (#84949 ) Just by looking at some commits, I didn't find why BFloat16 isn't there. Pull Request resolved: https://github.com/pytorch/pytorch/pull/84949 Approved by: https://github.com/ngimel	2022-09-14 00:01:06 +00:00
PyTorch MergeBot	36d79143ce	Revert "[reland] Call jit decomposition in VariableType to increase forward AD coverage (#84151 ) (#84675 )" This reverts commit bb4e96c9644a034e593085026b781ee78a4d6a77. Reverted https://github.com/pytorch/pytorch/pull/84675 on behalf of https://github.com/osalpekar due to causing asan xplat link-time errors like ld.lld: error: undefined symbol: torch::jit::has_jit_decomposition(c10::FunctionSchema const&)	2022-09-13 22:54:54 +00:00
Ryan Spring	d09e8b23bf	[primTorch] Add repeat and unfold_copy references (#81374 ) Add References: - repeat - unfold - expand_as Pull Request resolved: https://github.com/pytorch/pytorch/pull/81374 Approved by: https://github.com/mruberry, https://github.com/ngimel	2022-09-12 22:19:06 +00:00
soulitzer	bb4e96c964	[reland] Call jit decomposition in VariableType to increase forward AD coverage (#84151 ) (#84675 ) This reverts commit acb4a09628284201281e262aaee58e3dc6be9c2b. In addition, we also fix a memory leak in layer norm. Pull Request resolved: https://github.com/pytorch/pytorch/pull/84675 Approved by: https://github.com/zou3519	2022-09-12 20:33:14 +00:00
kshitij12345	4f6027b78a	[opinfo] narrow: add new sample for Tensor overload (#84785 ) `narrow` accepts `start` argument to be a Tensor. We add a sample to test this overload. NOTE: This leads to a bunch of failed tests and hence the skips and xfails Pull Request resolved: https://github.com/pytorch/pytorch/pull/84785 Approved by: https://github.com/zou3519	2022-09-12 16:59:08 +00:00
Jeff Daily	8cdc0679b9	[ROCm][jiterator] unskip additional tests (#84371 ) Follow-up to #77982. Unskip additional jiterator tests for ROCm. Pull Request resolved: https://github.com/pytorch/pytorch/pull/84371 Approved by: https://github.com/ngimel, https://github.com/SherlockNoMad	2022-09-12 15:20:51 +00:00
Ivan Yashchuk	01c54ad6de	Remove deprecated torch.eig (#70982 ) The time has come to remove deprecated linear algebra related functions. This PR removes `torch.eig`. cc @jianyuh @nikitaved @pearu @mruberry @walterddr @IvanYashchuk @xwang233 @Lezcano Pull Request resolved: https://github.com/pytorch/pytorch/pull/70982 Approved by: https://github.com/Lezcano, https://github.com/malfet	2022-09-09 21:31:57 +00:00
Peter Bell	2614079f89	OpInfo: Prevent clamp sample inputs from sharing tensors (#84696 ) As per the comment, re-using tensors between sample inputs is strongly discouraged. Pull Request resolved: https://github.com/pytorch/pytorch/pull/84696 Approved by: https://github.com/ngimel	2022-09-09 19:58:08 +00:00
Sean Ross-Ross	e8b9501861	test: adding uniform (#84292 ) Adding OpInfo for uniform Pull Request resolved: https://github.com/pytorch/pytorch/pull/84292 Approved by: https://github.com/amjames, https://github.com/ngimel	2022-09-09 18:54:49 +00:00
PyTorch MergeBot	acb4a09628	Revert "Call jit decomposition in VariableType to increase forward AD coverage (#84151 )" This reverts commit 42d99e6f196233627a28b8e9efb26a0a166fa370. Reverted https://github.com/pytorch/pytorch/pull/84151 on behalf of https://github.com/malfet due to Regressed test_jvpvjp_nn_functional_layer_norm_cuda_float32, see `42d99e6f19`	2022-09-07 18:02:27 +00:00
soulitzer	42d99e6f19	Call jit decomposition in VariableType to increase forward AD coverage (#84151 ) This PR: - updates forward AD codegen in core to generate code that tries calling into decompositions registered to jit when - (1) the function is not in-place or out variant - AND (2) the function is differentiable (requires_derivative=True) - AND (3) there are no forward AD formulas registered - To simplify things we always generate the if/else (as long as (1) is true), but generate 'false' when either (2) or (3) is false. - removes the mechanism from functorch - (follow up) some functorch tests should be updated here so they no longer have to compute the Jacobian with vjp - factors out some logic to generate the any_has_forward_grad condition - (bc-breaking) when TensorList inputs unexpectedly have forward grad, the error will no longer contain the name See https://github.com/pytorch/pytorch/pull/84151#issuecomment-1238519247 for codegen output and more discussion. Pull Request resolved: https://github.com/pytorch/pytorch/pull/84151 Approved by: https://github.com/samdow, https://github.com/albanD, https://github.com/zou3519	2022-09-07 15:31:46 +00:00
PyTorch MergeBot	166dec74b5	Revert "Dispatch torch.norm to linalg.vector_norm and linalg.matrix_norm (#81761 )" This reverts commit 65beff5acb0d7c0c484bd0558bcaf8ddc9c96aab. Reverted https://github.com/pytorch/pytorch/pull/81761 on behalf of https://github.com/mehtanirav due to Breakages in pytorch/glow	2022-09-06 22:31:14 +00:00
Ivan Yashchuk	752c3bcb47	Enable nvfuser tests for refs.broadcast_to and refs.broadcast_tensors (#84337 ) Previously these tests were failing because they required some other op alongside prims.broadcast_in_dim to be executed. Now it works standalone. Pull Request resolved: https://github.com/pytorch/pytorch/pull/84337 Approved by: https://github.com/mruberry, https://github.com/ngimel	2022-09-06 22:08:13 +00:00
Fabio Rocha	88b1cc885c	Removed tri[lu]* tests, superseded by OpInfos (#84256 ) triu, tril, triu_indices and tril_indices had some tests in test_tensor_creation_ops.py and test_cuda.py that are redundant with the ones done by OpInfos for those ops. Pull Request resolved: https://github.com/pytorch/pytorch/pull/84256 Approved by: https://github.com/Lezcano, https://github.com/ngimel	2022-09-06 18:54:10 +00:00
Richard Zou	c771d73461	[composite compliance] fix max_pool1d (#84127 ) max_pool1d has a fast path for CPU tensors that do not require grad that directly accesses the data_ptr. This PR makes the change that if the input Tensor is a Tensor Subclass, then we want to walk through the "slow path" of calling max_pool1d_with_indices. Test Plan: - wait for tests Pull Request resolved: https://github.com/pytorch/pytorch/pull/84127 Approved by: https://github.com/kshitij12345, https://github.com/samdow, https://github.com/malfet	2022-09-06 17:13:09 +00:00
Richard Zou	139599ba95	Contiguify bias in slow_conv_transpose3d kernel (#84125 ) Users never run into this because PyTorch now comes with cudnn by default and cudnn has a better conv_transpose implementation. However we seem to test without cudnn in our CI; and also, ROCM goes down this path. The .contiguous() call does not regress anything because previously it was a runtime error. Because this kernel is the "slow conv transpose3d kernel", we don't care much for its performance. Test Plan: - wait for tests Pull Request resolved: https://github.com/pytorch/pytorch/pull/84125 Approved by: https://github.com/ngimel	2022-09-06 17:13:09 +00:00
Fabio Rocha	91a5f52f51	Decomp for nn.functional.grid_sampler_2d (#84350 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/84350 Approved by: https://github.com/jansel, https://github.com/Lezcano	2022-09-05 21:33:26 +00:00
lezcano	65beff5acb	Dispatch torch.norm to linalg.vector_norm and linalg.matrix_norm (#81761 ) `torch.norm` is very odd. Some notable issues are: - The default value of `"fro"` in `torch.norm` has an odd behaviour when `dim=None`. This is handled in the new dispatch - The treatment of the `dtype` argument in `torch.norm` was completely wrong. This should fix it - Some `out=` variants in the previous implementation were also wrong. This should fix those. - This new dispatch should make some paths much faster. For example, `torch.norm(x)` where `x` is complex. I'll try to make the changes in these PRs as incremental as possible as this is a tricky one. Pull Request resolved: https://github.com/pytorch/pytorch/pull/81761 Approved by: https://github.com/ngimel	2022-09-02 19:12:25 +00:00
Nikita Karetnikov	5cfe769387	[primTorch] Add refs for `reshape_as`, `view_as`, unify tests (#84222 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/84222 Approved by: https://github.com/Lezcano, https://github.com/ngimel	2022-09-01 16:14:34 +00:00
Ivan Yashchuk	65e887c041	Remove unnecessary copy from torch._refs.to, add OpInfo for torch.Tensor.to (#84270 ) This PR removes unnecessary copy from `torch._refs.to`, adds OpInfo for `torch.Tensor.to`. Pull Request resolved: https://github.com/pytorch/pytorch/pull/84270 Approved by: https://github.com/ngimel	2022-09-01 07:18:42 +00:00
Nikita Karetnikov	85b889fa5f	[primTorch] Add ref for `poisson_nll_loss` (#83805 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/83805 Approved by: https://github.com/Lezcano, https://github.com/ngimel	2022-08-31 17:39:34 +00:00
Nikita Karetnikov	71ce9cd072	[primTorch] Add decomp for `soft_margin_loss` (#83804 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/83804 Approved by: https://github.com/Lezcano, https://github.com/ngimel	2022-08-31 17:39:34 +00:00
Nikita Karetnikov	305af90d0f	[primTorch] Add docstring and promotion for `l1_loss` ref (#83803 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/83803 Approved by: https://github.com/Lezcano, https://github.com/ngimel	2022-08-31 17:39:31 +00:00
Elias Ellison	9c452abcf1	Use reentrant mode when invoking prims, delete global prim_fake_mode (#84090 ) Maybe I should be using the meta_impl instead of the prim_impl, but it's not terribly clear why, since the prim impl will be better tested and should work under the re-entrant FakeTensorMode. Fixes https://github.com/pytorch/pytorch/issues/78613 in the process Pull Request resolved: https://github.com/pytorch/pytorch/pull/84090 Approved by: https://github.com/ezyang, https://github.com/samdow	2022-08-31 01:58:44 +00:00
Mike Iovine	db7784e722	[Static Runtime] Schema checks for index_put (#84152 ) Summary: `index_put` can take a list of tensors, but Static Runtime always tries to convert its argument to a list of optional tensors. This was causing crashes for some users. Add some schema checks to prevent this, and add a new overload for the new case. Also, I found a clear bug in the JIT interpreter (mutating the argument when it's not supposed to), so I fixed that too. Test Plan: New unit test Differential Revision: D39072214 Pull Request resolved: https://github.com/pytorch/pytorch/pull/84152 Approved by: https://github.com/tenpercent	2022-08-31 01:20:14 +00:00
Jeff Daily	d09486ab23	[ROCm] enable nvfuser (#82498 ) ### Description The nvfuser is enabled for ROCm. ### Testing CI label ciflow/trunk covers the newly enabled ROCm functionality as well as any CUDA regressions caused by these changes. Pull Request resolved: https://github.com/pytorch/pytorch/pull/82498 Approved by: https://github.com/jjsjann123, https://github.com/davidberard98	2022-08-30 21:50:39 +00:00
Ivan Yashchuk	90161c23cf	Add nvfuser support for squeeze (#84117 ) "_refs.squeeze" and "refs.unsqueeze" now work with nvfuser executor tests. Similarly to `_refs.reshape` we need to explicitly save the concrete shape on the trace to pass that info to nvfuser, as it gets lost in translation (https://github.com/pytorch/pytorch/pull/83739#discussion_r950352124). Pull Request resolved: https://github.com/pytorch/pytorch/pull/84117 Approved by: https://github.com/ngimel	2022-08-30 20:36:11 +00:00
lezcano	b106a04d76	Fix the edge case when y = 0 in kl_div (#82714 ) Brought up in https://github.com/pytorch/pytorch/pull/80334#issuecomment-1193600883 We also prepare its opinfo to fix https://github.com/pytorch/pytorch/issues/80488 Pull Request resolved: https://github.com/pytorch/pytorch/pull/82714 Approved by: https://github.com/albanD	2022-08-30 18:18:25 +00:00
soulitzer	7088a98fba	conv2d: require bias to have the same dtype as input and weight on cpu (#83686 ) Fixes https://github.com/pytorch/pytorch/issues/83505 BC-breaking message: - Previously we only required input and weight to have the same dtype on cpu (when input is non-complex). After this change, bias is now also expected to have the same dtype. This change was necessary to improve the error message for certain combinations of inputs. This behavior now also matches that of convolution on cuda. <details> <summary> Old plan </summary> Previously convolution (at least for slow_conv2d) did not perform type promotion, i.e. the output of `conv(int, int, float)` is an int, and that leads to the autograd assert. This PR adds type promotion handling at the `at::native::conv2d` (this is a composite) level. We also need to correct or remove many tests that assume that conv errors when input types are mixed Pros: - Doing type promotion at this level avoids the complex path from having any special handling for mixed dtypes, and can potentially speed up mixed dtype inputs to now dispatch to faster kernels which are only capable of handling floats. Cons: - Doing type promotion at this level has the risk of introducing extra overhead when we would've dispatched to a kernel capable of handling mixed types anyway. I don't know if any of these exist at all though - it is possible that inputs with any non-float arguments are dispatched to the slow path. If this approach is OK, we can proceed with the other convolutions as well: </details> Pull Request resolved: https://github.com/pytorch/pytorch/pull/83686 Approved by: https://github.com/ngimel	2022-08-29 16:41:17 +00:00
Ivan Yashchuk	3aae6ff1e1	Add nvprims.var_mean (#83508 ) This PR adds nvfuser-specific primitive - `var_mean`. Interpretation `torch.var_mean` -> `torch.ops.nvprims.var_mean` is handled by `TorchRefsNvfuserCapabilityMode` context manager. I moved some helper code from `_prims/__init__.py` to `_prims_common`. Correctness is tested with OpInfo tests (see `PythonRefInfo("ops.nvprims.var_mean"`). Layer norm reference now uses `torch.var_mean` instead of `torch._refs.var_mean` to allow interception. Here's a simple comparison of performance with this PR and master (on 3080ti): ```py import torch from torch._prims.context import TorchRefsNvfuserCapabilityMode from torch.fx.experimental.proxy_tensor import make_fx from torch._prims.executor import execute def func(a): return torch.native_layer_norm(a, (1024,), None, None, 1e-6) a = torch.randn(10, 512, 1024, dtype=torch.float16, device="cuda") with TorchRefsNvfuserCapabilityMode(): gm = make_fx(func)(a) for _ in range(10): execute(gm, a, executor="strictly_nvfuser"); ``` run with `PYTORCH_NVFUSER_DUMP=dump_eff_bandwidth python script.py` ```py # WITH THIS PR # kernel1 run in 0.032768 ms, achieved: 641.25 GB/s # kernel1 run in 0.033792 ms, achieved: 621.818 GB/s # kernel1 run in 0.032768 ms, achieved: 641.25 GB/s # kernel1 run in 0.032608 ms, achieved: 644.396 GB/s # kernel1 run in 0.031744 ms, achieved: 661.935 GB/s # kernel1 run in 0.031744 ms, achieved: 661.935 GB/s # kernel1 run in 0.032768 ms, achieved: 641.25 GB/s # kernel1 run in 0.03072 ms, achieved: 684 GB/s # kernel1 run in 0.031744 ms, achieved: 661.935 GB/s # kernel1 run in 0.031744 ms, achieved: 661.935 GB/s # ON MASTER # kernel1 run in 0.05632 ms, achieved: 373.091 GB/s # kernel1 run in 0.044032 ms, achieved: 477.209 GB/s # kernel1 run in 0.044032 ms, achieved: 477.209 GB/s # kernel1 run in 0.044032 ms, achieved: 477.209 GB/s # kernel1 run in 0.043808 ms, achieved: 479.649 GB/s # kernel1 run in 0.043008 ms, achieved: 488.571 GB/s # kernel1 run in 0.044032 ms, achieved: 477.209 GB/s # kernel1 run in 0.043008 ms, achieved: 488.571 GB/s # kernel1 run in 0.043008 ms, achieved: 488.571 GB/s # kernel1 run in 0.043008 ms, achieved: 488.571 GB/s ``` So this PR gives about 35% improvement in performance using nvfuser executor with this specific normalized shape. Also this PR fixes https://github.com/pytorch/pytorch/issues/83506 (see the change in `torch/csrc/jit/python/pybind_utils.cpp`). Ref. https://github.com/pytorch/pytorch/issues/80187 Pull Request resolved: https://github.com/pytorch/pytorch/pull/83508 Approved by: https://github.com/ngimel	2022-08-28 18:45:25 +00:00
PyTorch MergeBot	b159a5230f	Revert "Add nvprims.var_mean (#83508 )" This reverts commit 7e7694b6615fbf46abfab234615fa891c2819eb7. Reverted https://github.com/pytorch/pytorch/pull/83508 on behalf of https://github.com/facebook-github-bot due to Diff reverted internally	2022-08-28 11:30:27 +00:00
kuttire42	c9b144ff47	Replace assertEqualIgnoreTypes from common_methods_invocations.py (#84076 ) This addresses TODO:38095 . More details at https://github.com/pytorch/pytorch/issues/38095 Fixes #ISSUE_NUMBER Pull Request resolved: https://github.com/pytorch/pytorch/pull/84076 Approved by: https://github.com/kit1980	2022-08-28 01:25:07 +00:00
Ivan Yashchuk	7e7694b661	Add nvprims.var_mean (#83508 ) This PR adds nvfuser-specific primitive - `var_mean`. Interpretation `torch.var_mean` -> `torch.ops.nvprims.var_mean` is handled by `TorchRefsNvfuserCapabilityMode` context manager. I moved some helper code from `_prims/__init__.py` to `_prims_common`. Correctness is tested with OpInfo tests (see `PythonRefInfo("ops.nvprims.var_mean"`). Layer norm reference now uses `torch.var_mean` instead of `torch._refs.var_mean` to allow interception. Here's a simple comparison of performance with this PR and master (on 3080ti): ```py import torch from torch._prims.context import TorchRefsNvfuserCapabilityMode from torch.fx.experimental.proxy_tensor import make_fx from torch._prims.executor import execute def func(a): return torch.native_layer_norm(a, (1024,), None, None, 1e-6) a = torch.randn(10, 512, 1024, dtype=torch.float16, device="cuda") with TorchRefsNvfuserCapabilityMode(): gm = make_fx(func)(a) for _ in range(10): execute(gm, a, executor="strictly_nvfuser"); ``` run with `PYTORCH_NVFUSER_DUMP=dump_eff_bandwidth python script.py` ```py # WITH THIS PR # kernel1 run in 0.032768 ms, achieved: 641.25 GB/s # kernel1 run in 0.033792 ms, achieved: 621.818 GB/s # kernel1 run in 0.032768 ms, achieved: 641.25 GB/s # kernel1 run in 0.032608 ms, achieved: 644.396 GB/s # kernel1 run in 0.031744 ms, achieved: 661.935 GB/s # kernel1 run in 0.031744 ms, achieved: 661.935 GB/s # kernel1 run in 0.032768 ms, achieved: 641.25 GB/s # kernel1 run in 0.03072 ms, achieved: 684 GB/s # kernel1 run in 0.031744 ms, achieved: 661.935 GB/s # kernel1 run in 0.031744 ms, achieved: 661.935 GB/s # ON MASTER # kernel1 run in 0.05632 ms, achieved: 373.091 GB/s # kernel1 run in 0.044032 ms, achieved: 477.209 GB/s # kernel1 run in 0.044032 ms, achieved: 477.209 GB/s # kernel1 run in 0.044032 ms, achieved: 477.209 GB/s # kernel1 run in 0.043808 ms, achieved: 479.649 GB/s # kernel1 run in 0.043008 ms, achieved: 488.571 GB/s # kernel1 run in 0.044032 ms, achieved: 477.209 GB/s # kernel1 run in 0.043008 ms, achieved: 488.571 GB/s # kernel1 run in 0.043008 ms, achieved: 488.571 GB/s # kernel1 run in 0.043008 ms, achieved: 488.571 GB/s ``` So this PR gives about 35% improvement in performance using nvfuser executor with this specific normalized shape. Also this PR fixes https://github.com/pytorch/pytorch/issues/83506 (see the change in `torch/csrc/jit/python/pybind_utils.cpp`). Ref. https://github.com/pytorch/pytorch/issues/80187 Pull Request resolved: https://github.com/pytorch/pytorch/pull/83508 Approved by: https://github.com/ngimel	2022-08-27 09:05:20 +00:00
kshitij12345	65ea3d0621	[composite compliance] cov, corrcoef (#82954 ) Ref: #69991 Pull Request resolved: https://github.com/pytorch/pytorch/pull/82954 Approved by: https://github.com/zou3519	2022-08-26 15:14:37 +00:00
Mario Lezcano	f5a3515083	Make linalg.inv composite of linalg.solve (#80074 ) The `getri` kernel calls inside `getrs` so we can do so explicitly ourselves and save ourselves from having to maintain an extra kernel. This way we just need to optimise `lu_factor` and `lu_solve` and `inv` will be as efficient as it can be, as it'll be choosing the best backend to perform the factorisation and the best backend (not necessarily the same) to perform the solve. Fixes https://github.com/pytorch/pytorch/issues/77498 The benchmarks: https://github.com/pytorch/pytorch/pull/80074#issuecomment-1164309071 Pull Request resolved: https://github.com/pytorch/pytorch/pull/80074 Approved by: https://github.com/IvanYashchuk, https://github.com/albanD, https://github.com/malfet	2022-08-25 09:28:55 +00:00
PyTorch MergeBot	5321bf52f2	Revert "Make linalg.inv composite of linalg.solve (#80074 )" This reverts commit 4737b3361479f4104efaa3bfa2ea517eaacb60fb. Reverted https://github.com/pytorch/pytorch/pull/80074 on behalf of https://github.com/malfet due to Depends on the changes from https://github.com/pytorch/pytorch/pull/83628	2022-08-25 00:43:00 +00:00
Mario Lezcano	4737b33614	Make linalg.inv composite of linalg.solve (#80074 ) The `getri` kernel calls inside `getrs` so we can do so explicitly ourselves and save ourselves from having to maintain an extra kernel. This way we just need to optimise `lu_factor` and `lu_solve` and `inv` will be as efficient as it can be, as it'll be choosing the best backend to perform the factorisation and the best backend (not necessarily the same) to perform the solve. Fixes https://github.com/pytorch/pytorch/issues/77498 The benchmarks: https://github.com/pytorch/pytorch/pull/80074#issuecomment-1164309071 Pull Request resolved: https://github.com/pytorch/pytorch/pull/80074 Approved by: https://github.com/IvanYashchuk, https://github.com/albanD, https://github.com/malfet	2022-08-24 15:18:56 +00:00
Sergii Dymchenko	591222f5d9	Fix use-dict-literal lint (#83718 ) Fix use-dict-literal pylint suggestions by changing `dict()` to `{}`. This PR should do the change for every Python file except test/jit/test_list_dict.py, where I think the intent is to test the constructor. Pull Request resolved: https://github.com/pytorch/pytorch/pull/83718 Approved by: https://github.com/albanD	2022-08-24 00:26:46 +00:00
kshitij12345	a802603ef7	[complex] conv_transpose1d (#79694 ) Reference: https://github.com/pytorch/pytorch/issues/71108 Pull Request resolved: https://github.com/pytorch/pytorch/pull/79694 Approved by: https://github.com/ngimel	2022-08-23 19:31:22 +00:00
Khushi Agrawal	9095030239	[fix] edge case in `MaxPool1d` and add ErrorInputs (#83553 ) Fixes #83224 cc @kshitij12345 @albanD! Pull Request resolved: https://github.com/pytorch/pytorch/pull/83553 Approved by: https://github.com/albanD	2022-08-23 19:23:39 +00:00
Ivan Yashchuk	cb488e6d2f	Allow None arguments for elementwise type promotion wrapper and fix clamp with None arguments (#83586 ) Fixes https://github.com/pytorch/torchdynamo/issues/759 Pull Request resolved: https://github.com/pytorch/pytorch/pull/83586 Approved by: https://github.com/ezyang, https://github.com/ngimel	2022-08-23 17:47:10 +00:00
Peter Bell	91eb1b9bb9	Move _masked opinfos to opinfo/definitions/_masked.py (#83763 ) Ref #82518 Pull Request resolved: https://github.com/pytorch/pytorch/pull/83763 Approved by: https://github.com/albanD	2022-08-22 19:08:41 +00:00
Peter Bell	7656ef73f1	Move `torch.special` OpInfos into opinfo/definitions/special.py (#83762 ) Ref #82518 As with `linalg` this doesn't include ops with an alias in special, only the ones where `special.foo` is the actual name of the opinfo. Pull Request resolved: https://github.com/pytorch/pytorch/pull/83762 Approved by: https://github.com/albanD	2022-08-22 19:08:41 +00:00
Nikita Karetnikov	1f38225b56	[primTorch] Add ref for `new_empty_strided` (#82466 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/82466 Approved by: https://github.com/ezyang, https://github.com/ngimel	2022-08-19 18:51:57 +00:00
jjsjann123	1407e6728c	Nvfuser python api patch take 2 (#83684 ) landing #83645 again. Previously we were breaking on codegen bf16 kernel for cuda TK 10.2. Added a short-cut to disable bf tests on pre cuda 11 build. Pull Request resolved: https://github.com/pytorch/pytorch/pull/83684 Approved by: https://github.com/ngimel	2022-08-19 16:05:39 +00:00
Peter Bell	8788e92f0f	Move `torch.linalg` opinfos to opinfo.definitions (2/2) (#83554 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/83554 Approved by: https://github.com/albanD	2022-08-19 12:26:01 +00:00
Peter Bell	8dbb0990bc	Move `torch.linalg` opinfos to opinfo.definitions (1/2) (#83547 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/83547 Approved by: https://github.com/albanD	2022-08-19 12:26:01 +00:00
Peter Bell	4aeb98dee9	Move RefInfo classes into opinfo.refs (#83563 ) Given that there is already a clear `op_db`, `python_ref_db` split I think it makes sense to have the `RefInfo` classes be defined in a different file. Pull Request resolved: https://github.com/pytorch/pytorch/pull/83563 Approved by: https://github.com/albanD	2022-08-19 12:25:59 +00:00
Peter Bell	f4caeb25e9	Move gradcheck_wrapper and clone_sample funcs into opinfo.core (#83560 ) The linalg OpInfos need these, so moving them into core to prevent circular dependencies. Pull Request resolved: https://github.com/pytorch/pytorch/pull/83560 Approved by: https://github.com/albanD	2022-08-19 12:25:58 +00:00
Kshiteej K	b4bc0d249f	[composite compliance] batch_norm (#79990 ) Fixes https://github.com/pytorch/pytorch/issues/76283 Ref: #69991 Pull Request resolved: https://github.com/pytorch/pytorch/pull/79990 Approved by: https://github.com/zou3519	2022-08-19 11:59:31 +00:00

1506 Commits