pytorch

mirror of https://github.com/pytorch/pytorch.git synced 2025-10-20 21:14:14 +08:00

Author	SHA1	Message	Date
Jeff Daily	7357eb66c5	[ROCm][CI] unskip some test_memory_format tests (#162766 ) Fixes #70125. Much of the work was done by #161687. This PR is additional test cleanup. Pull Request resolved: https://github.com/pytorch/pytorch/pull/162766 Approved by: https://github.com/jeffdaily Co-authored-by: Jeff Daily <jeff.daily@amd.com>	2025-09-12 15:02:40 +00:00
PyTorch MergeBot	641ee74781	Revert "Add `label_smoothing` param in `nn.BCELoss` and `nn.BCEWithLogitsLoss` (#150282 )" This reverts commit f990490a23815ea6ee27e487c70ba2cf513ba43d. Reverted https://github.com/pytorch/pytorch/pull/150282 on behalf of https://github.com/facebook-github-bot due to Diff reverted internally ([comment](https://github.com/pytorch/pytorch/pull/150282#issuecomment-3182844949))	2025-08-13 09:01:52 +00:00
zeshengzong	f990490a23	Add `label_smoothing` param in `nn.BCELoss` and `nn.BCEWithLogitsLoss` (#150282 ) Fixes #91545 ## Changes - Add `label_smoothing` param and docs - Add test case for `label_smoothing` - Remove duplicate description in `nn.BCELoss` and `nn.BCEWithLogitsLoss` ## Test Result ```bash pytest -s test/test_nn.py -k test_bce ``` ![image](https://github.com/user-attachments/assets/30c0b7fe-fe49-4aa0-9b05-4d70403a7b05) ![image](https://github.com/user-attachments/assets/4fe3fd1c-54b8-4012-afd9-133ce9fb4964) ![image](https://github.com/user-attachments/assets/5cad019a-3a4c-475a-9fde-9c1acad5792d) Pull Request resolved: https://github.com/pytorch/pytorch/pull/150282 Approved by: https://github.com/cyyever, https://github.com/mikaylagawarecki	2025-08-12 09:37:03 +00:00
Kurt Mohler	b59b61a099	Add `avg_pool3d` backward pass for MPS (#159089 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/159089 Approved by: https://github.com/malfet	2025-08-05 01:55:38 +00:00
Kurt Mohler	52b9af163c	Add `avg_pool3d` for MPS (#158877 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/158877 Approved by: https://github.com/malfet	2025-07-29 15:22:22 +00:00
Kurt Mohler	510c398a4f	Add `max_pool3d` backward pass for MPS (#157498 ) Note on backward precision over fp16: A float16 number has 10 bits of mantissa, 5 bits of exponent, and 1 bit for the sign. If the sign bit is positive, then with a mantissa $m$ and exponent $e$ represented in base 10, the number that the float16 format represents is $(1 + m / 1024) \exp2(e)$. ([source](https://en.wikipedia.org/wiki/Half-precision_floating-point_format)) Consider adding two numbers $a$ and $b$ which have arbitrary mantissas, and say their exponents are $e_a = 1$ (so $2 \le a \lt 4$) and $e_b=-3$ (so $0.125 \le b \lt 0.25$). Assume that the result has the same exponent as $a$. Since the exponents differ by 4, we'll effectively need to truncate the 4 rightmost bits of $b$'s mantissa, which would introduce a maximum error on the order of $(2^4 / 1024) \exp2(-3) \approx 0.002$. The error is nearly the same if $e_b = -2$ (so $0.25 \le b \lt 0.5$), where the 3 rightmost bits are truncated, giving a maximum error on the order of $(2^3 / 1024) \exp2(-2) \approx 0.002$. Same for $e_b=-1$. So if we're adding up nine different numbers that all have exponents -3, -2, or -1, and they sum to a number with exponent 1, then we would expect a maximum error of several times greater than 0.002. In my comments above, summing those particular nine numbers in different ways gave results that ranged between 3.1816 and 3.1758, a difference of $0.0058 \approx 2.9 * 0.002$. That's within the acceptable bounds, and we can safely just increase the error tolerance used in test_output_grad_match for the case of max_pool3d_backward with float16. Pull Request resolved: https://github.com/pytorch/pytorch/pull/157498 Approved by: https://github.com/malfet	2025-07-07 19:46:44 +00:00
Manuel Candales	f8d93b3783	[MPS] Migrate hardswish (forward and backward) to Metal kernel (#155479 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/155479 Approved by: https://github.com/kulinseth, https://github.com/malfet ghstack dependencies: #155304, #155316, #155462	2025-06-11 20:58:46 +00:00
atalman	7a03b0d2ca	[BE] Remove CUDA 11 artifacts. Fix Check Binary workflow (#155555 ) Please see: https://github.com/pytorch/pytorch/issues/147383 1. Remove CUDA 11 build and test artifacts. One place CUDA 12.4 2. Fix Check Binary Workflow to use Stable Cuda version variable rather then hardcoded one Pull Request resolved: https://github.com/pytorch/pytorch/pull/155555 Approved by: https://github.com/malfet, https://github.com/Skylion007	2025-06-10 21:32:08 +00:00
Manuel Candales	0f47e76937	[MPS] Implement hardshrink metal kernel (#155304 ) Implements the forward and backward hardshrink operators as Metal kernels. In order to support the lambda parameter, we extend the `exec_unary_kernel` and `exec_binary_kernel` methods. Now they take an optional Scalar and an optional ScalarType argument. When the optional ScalarType is provided, it overrides the type of the Scalar. We add a new `REGISTER_UNARY_ALPHA_OP` macro, and modify the existing `REGISTER_BINARY_ALPHA_OP` to support the new feature. Pull Request resolved: https://github.com/pytorch/pytorch/pull/155304 Approved by: https://github.com/malfet	2025-06-10 18:20:27 +00:00
Kurt Mohler	e7698ff5cf	[MPS] Move abs op to Metal (#155474 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/155474 Approved by: https://github.com/Skylion007, https://github.com/malfet	2025-06-10 00:23:59 +00:00
cyy	970fefcc53	Remove outdated skipCUDAIfCudnnVersionLessThan decoration (#148940 ) Test conditions for CUDNN 7 and 8 were removed because we have moved to CUDNN 9. Pull Request resolved: https://github.com/pytorch/pytorch/pull/148940 Approved by: https://github.com/mikaylagawarecki	2025-03-13 18:02:50 +00:00
riccardofelluga	8f71d4563e	Fix rms_norm in fp16/bf16 (#147203 ) Fixes #134106. This PR moves the `upcasted_result` down-casting after all computation is done. Since the multiplication with the weight_opt input is not done in half precision, the current code path is doing the following: fp16 -> fp32 -> fp16 -> fp32 -> fp16. What we want tho is to avoid down-casting and this PR proposes: fp16 -> fp32 -> fp16. This results in better accuracy as it avoids truncating. Pull Request resolved: https://github.com/pytorch/pytorch/pull/147203 Approved by: https://github.com/eqy	2025-03-08 04:43:18 +00:00
Aaron Gokaslan	edd640a95a	[BE][Ez]: Use itertools.chain.from_iterable when possible (#148190 ) Often makes the code more readable, more efficient, and adds support for infinite iterables. Pull Request resolved: https://github.com/pytorch/pytorch/pull/148190 Approved by: https://github.com/jansel, https://github.com/malfet	2025-03-06 20:37:06 +00:00
Afanti	c219c5ca38	Fix code descriptions in the test package. (#148145 ) The parameter and function description have something wrong and make them correct. Pull Request resolved: https://github.com/pytorch/pytorch/pull/148145 Approved by: https://github.com/janeyx99	2025-03-04 19:14:41 +00:00
soulitzer	c7ca1df37e	Disable slow gradcheck for nn.Transformer ModuleInfo (#145531 ) Fixes https://github.com/pytorch/pytorch/issues/117140 Pull Request resolved: https://github.com/pytorch/pytorch/pull/145531 Approved by: https://github.com/mikaylagawarecki ghstack dependencies: #145520	2025-01-25 00:58:03 +00:00
Aaron Orenstein	dea7ad3371	PEP585 update - torch/testing (#145200 ) See #145101 for details. Pull Request resolved: https://github.com/pytorch/pytorch/pull/145200 Approved by: https://github.com/bobrenjc93	2025-01-20 22:42:42 +00:00
bobrenjc93	3b6b306b71	Migrate from Tuple -> tuple in torch/testing (#144256 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/144256 Approved by: https://github.com/aorenste	2025-01-10 06:37:55 +00:00
Nikita Shulga	2d52f7946b	[BE] Use `torch.log1p(x)` instead of `torch.log(1+x)` (#141167 ) To fix TOR107 linter violations Found while trying to migrate PyTorch to latest torchfix Pull Request resolved: https://github.com/pytorch/pytorch/pull/141167 Approved by: https://github.com/kit1980, https://github.com/Skylion007	2024-11-21 00:36:20 +00:00
Nikita Shulga	9c88b08ac9	[BE] Replace `skipIfMPS` with `expectedFailureMPS` (#139940 ) Functionally two decorators are very similar, but one should rely on expectedFailure as much as possible to get signal when something is fixed. - Move `product_version` variable from `test_mps` to common_utils, but call it `MACOS_VERSION` - Introduce `skipIfMPSOnMacOS13` to decorate the hard crashes that happens only on MacOS13 (which at this point will not get any fixes and will be deprecated soon) - Add `device_type='mps'` to all `skipIfMPS` per https://github.com/pytorch/pytorch/issues/140560 Pull Request resolved: https://github.com/pytorch/pytorch/pull/139940 Approved by: https://github.com/janeyx99, https://github.com/huydhn	2024-11-15 03:48:37 +00:00
Nikita Shulga	0f739b8f66	[Codemod] `skipIfMps`->`skipIfMPS` (#140562 ) As `MPS` is an acronym that stands for Metal Performance Shaders Also to closer align with `skipCUDAIf` not `skipCudaIf` Pull Request resolved: https://github.com/pytorch/pytorch/pull/140562 Approved by: https://github.com/ZainRizvi, https://github.com/r-barnes	2024-11-13 19:45:08 +00:00
Nikita Shulga	68ef445c33	[MPS][Perf] Dispatch to SDP-math-mps for non-contig Tensors (#139791 ) As MacOS-15 or newer supports those out of the box. This significantly reduces memory requirements and improves performance for some stable diffusion networks. Test plan: Run ```python from diffusers import StableDiffusionXLPipeline, AutoencoderKL, EulerAncestralDiscreteScheduler import torch import time vae = AutoencoderKL.from_pretrained("stabilityai/stable-diffusion-xl-base-1.0", subfolder='vae', torch_dtype=torch.bfloat16, force_upcast=False).to('mps') pipe = StableDiffusionXLPipeline.from_pretrained("stabilityai/stable-diffusion-xl-base-1.0", vae=vae, torch_dtype=torch.bfloat16, variant="fp16").to('mps') pipe.scheduler = EulerAncestralDiscreteScheduler.from_config(pipe.scheduler.config) start_time = time.time() start_mps_mem = torch.mps.driver_allocated_memory() image = pipe(prompt="Spherical cow in vacuum", num_inference_steps=10, guidance_scale=8, generator=torch.Generator("mps").manual_seed(42), ).images[0] end_mps_mem = torch.mps.driver_allocated_memory() run_time = time.time() - start_time print(f"run time in {run_time:.2f} sec, end_mps_mem {end_mps_mem/1024.0**2:.2f} Mb mem increase {(end_mps_mem-start_mps_mem)/1024.0**2:.2f} Mb") image.save(f'bfloat16.png') ``` Before the change total memory use was 16Gb and needed 65 sec to complete, after it drops down to 14Gb and takes 50 sec to finish on M2Pro, though generated image remains the same: ![image](https://github.com/user-attachments/assets/1a35efef-9f80-4cd0-ac9c-30203eab6bb1) Fixes https://github.com/pytorch/pytorch/issues/139389 Pull Request resolved: https://github.com/pytorch/pytorch/pull/139791 Approved by: https://github.com/drisspg, https://github.com/Skylion007 ghstack dependencies: #139788, #139784, #139763	2024-11-06 16:25:39 +00:00
Tom Ritchford	c0582fd0f8	Remove unused Python variables in torch/[b-z]* (#136963 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/136963 Approved by: https://github.com/ezyang	2024-10-19 16:45:22 +00:00
PyTorch MergeBot	b6d6aa49b8	Revert "Validate input types for `torch.nn.Linear` and `torch.nn.Bilinear` (#135596 )" This reverts commit e157ce3ebbb3f30d008c15914e82eb74217562f0. Reverted https://github.com/pytorch/pytorch/pull/135596 on behalf of https://github.com/malfet due to It's too restrictive, should allow other int-like types, such as `numpy.int64` ([comment](https://github.com/pytorch/pytorch/pull/135596#issuecomment-2349714104))	2024-09-13 18:06:56 +00:00
Sanskar Modi	e157ce3ebb	Validate input types for `torch.nn.Linear` and `torch.nn.Bilinear` (#135596 ) Adding validation checks to check the input types and display better error messages for the same. Fixes https://github.com/pytorch/pytorch/issues/135463 Pull Request resolved: https://github.com/pytorch/pytorch/pull/135596 Approved by: https://github.com/malfet	2024-09-12 21:28:37 +00:00
Mayank Mishra	9a04cfbeff	fix for fp16 (#134106 ) This PR is a replacement for https://github.com/pytorch/pytorch/pull/133085 for pushing a quick fix for RMSNorm. The original author is @kkontny Previous PR summary: Since FP16 has quite small dynamic range it is very easy to overflow while computing `at::pow(input, 2)` , and it happens in real world computation. I've tried to use `nn.RMSNorm` fused implementation instead of `LlamaRMSNorm` inside `transformers` implementation of Llama (`src/transformers/models/llama/modeling_llama.py`). It started to give wrong answers in Fp16 while still giving good in FP32. I figured out happens due to overflow while computing square of the input tensor. Original `LLamaRMSNorm` implementation upcasts input to fp32 to prevent this and give better numerical stability. ``` class LlamaRMSNorm(nn.Module): def __init__(self, hidden_size, eps=1e-6): """ LlamaRMSNorm is equivalent to T5LayerNorm """ super().__init__() self.weight = nn.Parameter(torch.ones(hidden_size)) self.variance_epsilon = eps def forward(self, hidden_states): input_dtype = hidden_states.dtype hidden_states = hidden_states.to(torch.float32) variance = hidden_states.pow(2).mean(-1, keepdim=True) hidden_states = hidden_states * torch.rsqrt(variance + self.variance_epsilon) return self.weight * hidden_states.to(input_dtype) ``` Proposed commit fixed the issue. FP16 in RMSNorm has to be treated in special way, to be usable in real world implementations. Pull Request resolved: https://github.com/pytorch/pytorch/pull/134106 Approved by: https://github.com/mikaylagawarecki, https://github.com/eqy	2024-09-11 22:02:07 +00:00
Nikita Shulga	71383dd3da	[MPS] Fix bachnorm_2d for channels last (#134618 ) By skipping gather of input tensor if memory_layout is channels_last, which is a first step towards fixing https://github.com/pytorch/pytorch/issues/134580 Though underlying problem is much more interesting, i.e. MPS does not have a generic support for channels last, but `c10::is_contiguoius()` is true for channels last layout. Pull Request resolved: https://github.com/pytorch/pytorch/pull/134618 Approved by: https://github.com/albanD	2024-09-03 19:20:11 +00:00
Nikita Shulga	f95085fd91	[BE][MPS] Prefer xfail to skip (#134858 ) This essentially undoes large skips on everything but MacOS Sequoia to nn.modules made by https://github.com/pytorch/pytorch/pull/128393 Instead it uses existing `xfail`, but guards it on `_macos15_or_newer` boolean Before the change if run on MacOS 14: ``` % python3 ../test/test_modules.py -v -k Hardswish 2>&1|tail -n3 Ran 57 tests in 0.053s OK (skipped=32) ``` After ``` % python3 ../test/test_modules.py -v -k Hardswish 2>&1|tail -n3 Ran 57 tests in 0.229s OK (skipped=10, expected failures=2) ``` Pull Request resolved: https://github.com/pytorch/pytorch/pull/134858 Approved by: https://github.com/janeyx99	2024-08-31 00:29:48 +00:00
Nikita Shulga	8de0d7690c	Use newer `toAccumulateType` signature in `Normalization.cpp` (#134540 ) Which fixes BatchNorm behavior for if called with empty tensors on MPS backed. Removed `expectedFailureMPS` in test_nn.py, deleted expected failure in `test_mps.py` and adjusted `skipIfMPS` to `expectedFailureMPS` in BatchNorm2d OpInfo decorator, but restrict it only to the memory format tests Test Plan: CI + `python3 -c "import torch; print(torch.nn.BatchNorm2d(3, device='mps')(torch.rand(0, 3, 2, 2, device='mps')))"` Fixes https://github.com/pytorch/pytorch/issues/134423 Pull Request resolved: https://github.com/pytorch/pytorch/pull/134540 Approved by: https://github.com/Skylion007, https://github.com/albanD	2024-08-27 18:09:20 +00:00
Mikayla Gawarecki	d028b810fe	Fix flaky GroupNorm ModuleInfo test (#133899 ) Fixes https://github.com/pytorch/pytorch/issues/98677 Pull Request resolved: https://github.com/pytorch/pytorch/pull/133899 Approved by: https://github.com/albanD	2024-08-27 14:45:51 +00:00
Denis Vieriu	861bdf96f4	[MPS] Add native strided API for MPSNDArray starting with macOS 15 (#128393 ) Add support for native strides in MPS starting with macOS Sequoia. This will get rid of the additional gather and scatter operations needed to solve the strides or storage offsets of the tensors. Summary of changes (starting with macOS 15): - Add support for MPS strided API (strides/storage offsets etc): - [initWithBuffer:offset:descriptor:](https://developer.apple.com/documentation/metalperformanceshaders/mpsndarray/4391636-initwithbuffer?language=objc) - [arrayViewWithCommandBuffer:descriptor:aliasing:](https://developer.apple.com/documentation/metalperformanceshaders/mpsndarray/3114040-arrayviewwithcommandbuffer?language=objc) - [arrayViewWithShape:strides:](https://developer.apple.com/documentation/metalperformanceshaders/mpsndarray/4408694-arrayviewwithshape?language=objc) - [reshapeWithCommandBuffer:sourceArray:shape:destinationArray:](https://developer.apple.com/documentation/metalperformanceshaders/mpsndarrayidentity/4438557-reshapewithcommandbuffer?language=objc) - Add native support for NHWC convolutions (without incurring any extra copy from NCHW -> NHWC -> NCHW). - Add support for strided output buffers (previously we would create a contiguous buffer). OSes older than macOS 15 will run the old gather/scatter code path to solve strides/storage offsets. --- Couple performance stats collected from torchbench comparing macOS 15 vs macOS 14: ``` - test_train[functorch_maml_omniglot-mps]: 27% faster - test_train[timm_vision_transformer-mps]: 12% faster - test_train[hf_T5-mps]: 9.46% faster ``` Pull Request resolved: https://github.com/pytorch/pytorch/pull/128393 Approved by: https://github.com/albanD Co-authored-by: Siddharth Kotapati <skotapati@apple.com>	2024-08-16 21:07:50 +00:00
ankurneog	ebc012ace6	Add hooks for execution on intel gaudi devices - 1 (#128584 ) ## Motivation This is follow up to PR:https://github.com/pytorch/pytorch/pull/126970 to support Gaudi devices for Pytorch UT execution. ## Changes We are adding additional hooks to: 1. Add dtype exceptions for Gaudi/HPU 2. Extend onlyNativeDevices decorator functionality to add additional devices Pull Request resolved: https://github.com/pytorch/pytorch/pull/128584 Approved by: https://github.com/albanD	2024-07-20 05:03:36 +00:00
cyy	d44daebdbc	[Submodule] Remove deprecated USE_TBB option and TBB submodule (#127051 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/127051 Approved by: https://github.com/cpuhrsch, https://github.com/malfet	2024-05-31 01:20:45 +00:00
PyTorch MergeBot	67739d8c6f	Revert "[Submodule] Remove deprecated USE_TBB option and TBB submodule (#127051 )" This reverts commit 699db7988d84d163ebb6919f78885e4630182a7a. Reverted https://github.com/pytorch/pytorch/pull/127051 on behalf of https://github.com/PaliC due to This PR needs to be synced using the import button as there is a bug in our diff train ([comment](https://github.com/pytorch/pytorch/pull/127051#issuecomment-2138496995))	2024-05-30 01:16:57 +00:00
cyy	699db7988d	[Submodule] Remove deprecated USE_TBB option and TBB submodule (#127051 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/127051 Approved by: https://github.com/cpuhrsch, https://github.com/malfet	2024-05-29 11:58:03 +00:00
PyTorch MergeBot	cdbb2c9acc	Revert "[Submodule] Remove deprecated USE_TBB option and TBB submodule (#127051 )" This reverts commit 4fdbaa794f9d5af2f171f772a51cb710c51c925f. Reverted https://github.com/pytorch/pytorch/pull/127051 on behalf of https://github.com/PaliC due to This PR needs to be synced using the import button as there is a bug in our diff train ([comment](https://github.com/pytorch/pytorch/pull/127051#issuecomment-2136428735))	2024-05-29 03:02:35 +00:00
cyy	4fdbaa794f	[Submodule] Remove deprecated USE_TBB option and TBB submodule (#127051 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/127051 Approved by: https://github.com/cpuhrsch, https://github.com/malfet	2024-05-27 03:54:03 +00:00
Shunting Zhang	db9c6aeec6	Revert "Skip test_memory_format_nn_BatchNorm2d in inductor (#125970 )" (#126594 ) This reverts commit 0a9c6e92f8d1a35f33042c8dab39f23b7f39d6e7. enable the test since it's fixed. Pull Request resolved: https://github.com/pytorch/pytorch/pull/126594 Approved by: https://github.com/huydhn ghstack dependencies: #126593	2024-05-25 01:27:02 +00:00
PyTorch MergeBot	df4b7cb5f7	Reapply "Skip test_memory_format_nn_BatchNorm2d in inductor (#125970 )" (#126594 ) This reverts commit ce6e36bf8b524c3f4b07605c5b3af2b7d5ba8fd9. Reverted https://github.com/pytorch/pytorch/pull/126594 on behalf of https://github.com/clee2000 due to broke tests on inductor? test_modules.py::TestModuleCUDA::test_cpu_gpu_parity_nn_CTCLoss_cuda_float64 `43f2f43eb3` https://github.com/pytorch/pytorch/actions/runs/9200644034/job/25308511495 ([comment](https://github.com/pytorch/pytorch/pull/126586#issuecomment-2126228689))	2024-05-23 04:54:28 +00:00
Shunting Zhang	ce6e36bf8b	Revert "Skip test_memory_format_nn_BatchNorm2d in inductor (#125970 )" (#126594 ) This reverts commit 0a9c6e92f8d1a35f33042c8dab39f23b7f39d6e7. enable the test since it's fixed. Pull Request resolved: https://github.com/pytorch/pytorch/pull/126594 Approved by: https://github.com/huydhn ghstack dependencies: #126586, #126593	2024-05-22 22:43:09 +00:00
Huy Do	0a9c6e92f8	Skip test_memory_format_nn_BatchNorm2d in inductor (#125970 ) Skipping the test in the context of https://github.com/pytorch/pytorch/issues/125967 until the issue is root caused and fixed properly. Pull Request resolved: https://github.com/pytorch/pytorch/pull/125970 Approved by: https://github.com/clee2000	2024-05-11 04:11:18 +00:00
Aaron Gokaslan	2f3b0befed	[BE]: Apply ruff FURB 118. (#124743 ) Replaces various lambdas with operator.itemgetter which is more efficient (as it's a builtin function). Particularly useful for when lambdas are used as 'key' functions. Pull Request resolved: https://github.com/pytorch/pytorch/pull/124743 Approved by: https://github.com/albanD, https://github.com/malfet	2024-04-26 14:34:52 +00:00
xinan.lin	6fcbeb3489	[ATen] Add CPU fp16 support for nll_loss and cross_entropy_loss (#123256 ) Add CPU FP16 support for nll_loss and cross_entropy_loss. Resolve issue #123328. Pull Request resolved: https://github.com/pytorch/pytorch/pull/123256 Approved by: https://github.com/jgong5, https://github.com/EikanWang, https://github.com/malfet	2024-04-18 11:44:38 +00:00
Mikayla Gawarecki	487b6d40ec	Add RMSNorm module (#121364 ) Similar to `dbeed9724b/torchmultimodal/modules/layers/normalizations.py (L51)` The implementation here is not optimized and we welcome pull requests to improve this - Use `normalized_shape` instead of singular integer `dim` to be aligned with the `nn.LayerNorm` implementation - Remove the [upcast to float and downcast ](`dbeed9724b/torchmultimodal/modules/layers/normalizations.py (L73)`) Differential Revision: [](https://our.internmc.facebook.com/intern/diff/) Differential Revision: [D55485840](https://our.internmc.facebook.com/intern/diff/D55485840) Pull Request resolved: https://github.com/pytorch/pytorch/pull/121364 Approved by: https://github.com/albanD	2024-03-29 18:05:28 +00:00
PyTorch MergeBot	8698121636	Revert "Add RMSNorm module (#121364 )" This reverts commit a7306de0dc96cda8b698d19680a88d27aa45a31d. Reverted https://github.com/pytorch/pytorch/pull/121364 on behalf of https://github.com/atalman due to Broke internal tests ([comment](https://github.com/pytorch/pytorch/pull/121364#issuecomment-2025502007))	2024-03-28 15:31:10 +00:00
Mikayla Gawarecki	cc12668053	Fix swap_tensors path in _apply for modules that inherit from RNNBase (RNN, GRU, LSTM) (#122800 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/122800 Approved by: https://github.com/albanD	2024-03-27 23:34:16 +00:00
Mikayla Gawarecki	a7306de0dc	Add RMSNorm module (#121364 ) Similar to `dbeed9724b/torchmultimodal/modules/layers/normalizations.py (L51)` The implementation here is not optimized and we welcome pull requests to improve this - Use `normalized_shape` instead of singular integer `dim` to be aligned with the `nn.LayerNorm` implementation - Remove the [upcast to float and downcast ](`dbeed9724b/torchmultimodal/modules/layers/normalizations.py (L73)`) Pull Request resolved: https://github.com/pytorch/pytorch/pull/121364 Approved by: https://github.com/albanD	2024-03-27 21:39:30 +00:00
Mikayla Gawarecki	d621e3e3b8	Add exhaustive module and optimizer tests for torch.load(state_dict, weights_only=True) (#121049 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/121049 Approved by: https://github.com/janeyx99	2024-03-05 14:27:50 +00:00
feifan	bfa71b523d	add complex32 to v3_dtypes (#120388 ) Fixes [#120290](https://github.com/pytorch/pytorch/issues/120290) Fixes https://github.com/pytorch/pytorch/issues/73502 use `v3_dtypes` and `torch._utils._rebuild_tensor_v3` to handle torch.save(complex32) result: ![image](https://github.com/pytorch/pytorch/assets/37650440/18b6cbb3-fb3f-4855-9d48-374014647988) Pull Request resolved: https://github.com/pytorch/pytorch/pull/120388 Approved by: https://github.com/albanD	2024-02-28 02:32:29 +00:00
Mikayla Gawarecki	677e67c399	Update nn.Module._apply to not gate on should_use_set_data when swap_tensors is set (#120659 ) This updates the nesting of if statements in `nn.Module._apply` such that if `torch.__future__.set_swap_module_params_on_conversion(True)`, we always try to swap regardless of whether - `torch._has_compatible_shallow_copy_type(param, fn(param)` - `torch.__future__.set_overwrite_module_params_on_conversion` is set This means that `meta_module.to_empty('device')` can now use the swap_tensors path cc @awgu Pull Request resolved: https://github.com/pytorch/pytorch/pull/120659 Approved by: https://github.com/albanD	2024-02-28 00:59:34 +00:00
rzou	b3df3e4e94	Restore OpInfo/ModuleInfo tests in Inductor-wrapped tests (#119693 ) I accidentally disabled this without realizing it. It turns out that PYTORCH_TEST_WITH_INDUCTOR=1 implies PYTORCH_TEST_WITH_DYNAMO=1, which activates skipIfTorchDynamo decorators. Test Plan: - wait for CI Pull Request resolved: https://github.com/pytorch/pytorch/pull/119693 Approved by: https://github.com/bdhirsh	2024-02-12 22:44:45 +00:00
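
The float16 precision note in commit 510c398a4f (#157498) estimates the truncation error when a small addend is aligned to a larger one. The sketch below recomputes those estimates; `truncation_bound` is a hypothetical helper written for this illustration and is not part of PyTorch. It assumes the simplified model from the commit message: 10 mantissa bits, and one dropped bit per unit of exponent difference.

```python
def truncation_bound(e_a: int, e_b: int, mantissa_bits: int = 10) -> float:
    """Rough error from aligning b (exponent e_b) to a (exponent e_a).

    Hypothetical helper: the e_a - e_b lowest mantissa bits of b are dropped,
    and one mantissa step of b is worth 2**e_b / 2**mantissa_bits.
    """
    dropped_bits = e_a - e_b
    return (2 ** dropped_bits / 2 ** mantissa_bits) * 2.0 ** e_b


# The three cases discussed in the commit message, all with e_a = 1.
bounds = {e_b: truncation_bound(1, e_b) for e_b in (-3, -2, -1)}

# Spread between the two sums quoted in the commit message.
observed_spread = 3.1816 - 3.1758
ratio = observed_spread / bounds[-3]

print(bounds)
print(f"observed spread is about {ratio:.2f} times the per-addition bound")
```

Under this model the bound does not depend on `e_b`: it is always one unit in the last place of `a`, which is why the commit message reports nearly the same figure for all three exponents.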
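
Commits 9a04cfbeff (#134106) and 8f71d4563e (#147203) both concern where RMSNorm casts between half and single precision. The following is a minimal sketch of the overflow described in #134106, not the PyTorch implementation: it emulates float16 rounding with the standard-library `struct` module, and `to_fp16`, `rms_norm_half`, and `rms_norm_upcast` are hypothetical names introduced for this illustration. The weight multiplication is left out for brevity.

```python
import math
import struct


def to_fp16(x: float) -> float:
    """Round a Python float to the nearest IEEE half-precision value."""
    try:
        return struct.unpack("<e", struct.pack("<e", x))[0]
    except OverflowError:
        # struct refuses finite values beyond the fp16 range; hardware
        # would produce an infinity here, so emulate that.
        return math.copysign(math.inf, x)


def rms_norm_half(values, eps=1e-6):
    """Every intermediate is rounded to fp16, so large squares overflow."""
    squares = [to_fp16(v * v) for v in values]
    mean_sq = to_fp16(sum(squares) / len(squares))
    scale = to_fp16(1.0 / math.sqrt(mean_sq + eps))
    return [to_fp16(v * scale) for v in values]


def rms_norm_upcast(values, eps=1e-6):
    """Compute in higher precision and round to fp16 once at the end."""
    mean_sq = sum(v * v for v in values) / len(values)
    scale = 1.0 / math.sqrt(mean_sq + eps)
    return [to_fp16(v * scale) for v in values]


inputs = [300.0, 300.0]  # 300**2 = 90000 exceeds the fp16 maximum of 65504
print(rms_norm_half(inputs))
print(rms_norm_upcast(inputs))
```

In the all-fp16 variant the squared inputs become infinite, so the scale collapses to zero, whereas the upcast variant keeps the normalized values finite.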
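
Commit edd640a95a (#148190) notes that `itertools.chain.from_iterable` adds support for infinite iterables. A small sketch of why, using only the standard library; the generator `numbered_pairs` is made up for this example.

```python
import itertools


def numbered_pairs():
    """Hypothetical infinite source: (1, -1), (2, -2), (3, -3), ..."""
    for n in itertools.count(1):
        yield (n, -n)


# chain(*numbered_pairs()) would try to unpack the whole generator first and
# never return. chain.from_iterable pulls one inner iterable at a time.
flat = itertools.chain.from_iterable(numbered_pairs())
first_six = list(itertools.islice(flat, 6))
print(first_six)

# For finite inputs the two spellings produce the same sequence.
nested = [[1, 2], [3], [4, 5]]
same = list(itertools.chain(*nested)) == list(itertools.chain.from_iterable(nested))
print(same)
```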
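
Commit 2d52f7946b (#141167) replaces `torch.log(1+x)` with `torch.log1p(x)` to satisfy the TOR107 lint rule. The commit message does not spell out the numerical motivation; the sketch below shows the usual one, using the standard-library `math.log1p` as a stand-in for the tensor function (an assumption made so the example runs without PyTorch).

```python
import math

x = 1e-17  # far smaller than double-precision machine epsilon (~2.2e-16)

# 1.0 + x rounds to exactly 1.0, so the information in x is lost
# before the logarithm is taken.
naive = math.log(1.0 + x)

# log1p works on x directly and keeps the leading term of the series.
accurate = math.log1p(x)

print(naive, accurate)
```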
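
Commit 2f3b0befed (#124743) replaces lambdas used as `key` functions with `operator.itemgetter`. A brief illustration of the equivalence; the `records` data is invented for the example.

```python
from operator import itemgetter

# Hypothetical (name, priority) records.
records = [("conv", 3), ("relu", 1), ("pool", 2)]

by_lambda = sorted(records, key=lambda r: r[1])
by_getter = sorted(records, key=itemgetter(1))  # same ordering, no lambda

print(by_getter)
```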


163 Commits