pytorch

mirror of https://github.com/pytorch/pytorch.git synced 2025-10-20 21:14:14 +08:00

Author	SHA1	Message	Date
dolpm	66f53889d5	[nativert] port semaphore to c10 util (#153504 ) Summary: nativert RFC: https://github.com/zhxchen17/rfcs/blob/master/RFC-0043-torch-native-runtime.md To land the runtime into PyTorch core, we will gradually land logical parts of the code into the Github issue and get each piece properly reviewed. This diff adds a simple semaphore interface into c10 until c++20 where we get counting_semaphore gonna need a oss build export to take a look at this... Test Plan: CI Differential Revision: D73882656 Pull Request resolved: https://github.com/pytorch/pytorch/pull/153504 Approved by: https://github.com/zhxchen17	2025-05-28 19:17:30 +00:00
Jithun Nair	24980d2641	[ROCm][CI] Update build-environment for mi300 workflows (#153134 ) so their test times are tracked separately in https://raw.githubusercontent.com/pytorch/test-infra/generated-stats/stats/test-times.json. Currently, both MI200 and MI300 test times get combined into the same key `linux-focal-rocm-py3.10` Pull Request resolved: https://github.com/pytorch/pytorch/pull/153134 Approved by: https://github.com/huydhn	2025-05-28 19:04:53 +00:00
PyTorch MergeBot	d4ab8e74f3	Revert "Fix the Problems About Defining Static Variable in Inline Function (#147095 )" This reverts commit c6fc11af760d4ad1f01cc699a3c6488ab5f41770. Reverted https://github.com/pytorch/pytorch/pull/147095 on behalf of https://github.com/izaitsevfb due to still fails to link internally at meta ([comment](https://github.com/pytorch/pytorch/pull/147095#issuecomment-2917221575))	2025-05-28 18:22:39 +00:00
henrylhtsang	1c7a70b483	[AOTI][cutlass backend] Do not remove the cutlass kernel .o file after packaging (#154155 ) Differential Revision: [D75253009](https://our.internmc.facebook.com/intern/diff/D75253009/) In general, we want to cache the cutlass kernels. Also saw an error saying .o not found. Pull Request resolved: https://github.com/pytorch/pytorch/pull/154155 Approved by: https://github.com/chenyang78	2025-05-28 17:35:19 +00:00
Laith Sakka	66ac724b56	pyfmt lint torch/_export/passes/replace_view_ops_with_view_copy_ops_pass.py (#154488 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/154488 Approved by: https://github.com/Skylion007 ghstack dependencies: #154483, #154484, #154485, #154487	2025-05-28 17:07:15 +00:00
Laith Sakka	dfe0f48123	pyfmt lint torch/_export/serde/schema.py (#154487 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/154487 Approved by: https://github.com/Skylion007 ghstack dependencies: #154483, #154484, #154485	2025-05-28 17:07:15 +00:00
Laith Sakka	92cebed1bd	pyfmt lint torch/_export/serde/serialize.py (#154485 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/154485 Approved by: https://github.com/Skylion007 ghstack dependencies: #154483, #154484	2025-05-28 17:07:07 +00:00
Laith Sakka	b4fe5ca58a	pyfmt lint torch/utils/weak.py (#154484 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/154484 Approved by: https://github.com/Skylion007 ghstack dependencies: #154483	2025-05-28 17:06:58 +00:00
Laith Sakka	4de1b25df7	Remove empty files from exclude lint rule (#154483 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/154483 Approved by: https://github.com/Skylion007	2025-05-28 17:06:50 +00:00
Sidharth	70539308ac	[dynamo] updating gb_type names for uniqueness (#154452 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/154452 Approved by: https://github.com/williamwen42	2025-05-28 16:54:10 +00:00
Isalia20	e313152a33	SDPA fix memory efficient attention for large batch dim (#154029 ) Fixes #146704 Pull Request resolved: https://github.com/pytorch/pytorch/pull/154029 Approved by: https://github.com/ngimel	2025-05-28 16:53:53 +00:00
Natalia Gimelshein	3b38989b5f	Remove MemPoolContext (#154042 ) Removes MemPoolContext from custom user mempools. The ground truth for which pool should be used is the active pool in graph_pools, and MemPoolContext just introduced an opportunity for the pool pointed to by MemPoolContext and the active pool in graph_pools to go out of sync (see all the asserts in the code to make sure that does not happen, and yet it still could happen in a multithreaded scenario; see my recent PRs (#153990)). Pull Request resolved: https://github.com/pytorch/pytorch/pull/154042 Approved by: https://github.com/albanD, https://github.com/syed-ahmed	2025-05-28 16:35:48 +00:00
Jerry Zhang	d23aa7e182	Add deprecation warning for `torch.ao.quantization` (#153892 ) Summary: att Test Plan: (ao) $ PYTHONWARNINGS='default' python Python 3.10.14 \| packaged by conda-forge \| (main, Mar 20 2024, 12:45:18) [GCC 12.3.0] on linux Type "help", "copyright", "credits" or "license" for more information. >>> from torch.ao.quantization.quantizer.xnnpack_quantizer import XNNPACKQuantizer printing warning /anaconda3/envs/ao/lib/python3.10/site-packages/torch/ao/quantization/__init__.py:36: DeprecationWarning: torch.ao.quantization is deprecated. Plan is to 1. Remove eager mode quantization (torch.ao.quantization.quantize, torch.ao.quantization.quantize_dynamic), please migrate to use torchao eager mode quantize_ API instead 2. Remove fx graph mode quantization (torch.ao.quantization.quantize_fx.prepare_fx, torch.ao.quantization.quantize_fx.convert_fx, please migrate to use torchao pt2e quantization API instead (prepare_pt2e, convert_pt2e) 3. pt2e quantization has been migrated to torchao (https://github.com/pytorch/ao/tree/main/torchao/quantization/pt2e) see https://dev-discuss.pytorch.org/t/torch-ao-quantization-migration-plan/2810 for more details warnings.warn( >>> a = XNNPACKQuantizer() /anaconda3/envs/ao/lib/python3.10/site-packages/torch/ao/quantization/quantizer/xnnpack_quantizer.py:281: DeprecationWarning: XNNPACKQuantizer is deprecated! Please use xnnpack quantizer in ExecuTorch (https://github.com/pytorch/executorch/tree/main/backends/xnnpack/quantizer) instead warnings.warn(f"{self.__class__.__name__} is deprecated! Please use xnnpack quantizer in ExecuTorch (https://github.com/pytorch/executorch/tree/main/backends/xnnpack/quantizer) instead", DeprecationWarning) >>> Reviewers: Subscribers: Tasks: Tags: Pull Request resolved: https://github.com/pytorch/pytorch/pull/153892 Approved by: https://github.com/Skylion007	2025-05-28 16:25:30 +00:00
Zhengxu Chen	5bf74753f6	[precompile] Prune local scope variables for guard serialization. (#154431 ) Summary: Prune unused local objects from serialized local scope if they are not used in guard reconstruction. This is helpful when a user program takes things like local callable functions or the function call is recursive. Test Plan: test/dynamo/test_guard_serialization.py -k test_function_locals Before pruning locals: ``` state = GuardsState(output_graph=OutputGraphGuardsState(local_scope={'x': tensor([ 0.0461, 0.4024, -1.0115]), 'g': <function ...aints=None, _guards=<torch._guards.GuardsSet object at 0x7fbccc7e9fc0>, _aotautograd_guards=[]), shape_code_parts=None) def pickle_guards_state(state: GuardsState) -> bytes: buf = io.BytesIO() pickler = GuardsStatePickler(buf) try: pickler.dump(state) except AttributeError as e: > raise torch._dynamo.exc.PackageError(str(e)) from e E torch._dynamo.exc.PackageError: Can't pickle local object 'TestGuardSerialization.test_function_locals.<locals>.foo' ``` After the diff ``` Tests finished: Pass 1. Fail 0. Fatal 0. Skip 0. Build failure 0 ``` Differential Revision: D75452123 Pull Request resolved: https://github.com/pytorch/pytorch/pull/154431 Approved by: https://github.com/jansel	2025-05-28 16:03:02 +00:00
Joel Schlosser	9db7bcb3fe	[Dynamo] Introduce hook receiving list of traced code objects (#153622 ) This PR: * Expands `Hooks` with a new, optional `frame_traced_fn` field. It should be a callable receiving the list of traced code objects * Maintains a list of `traced_code` objects in the `TracingContext` of an `OutputGraph` * Whenever an `inline_call()` is encountered, the corresponding code object is added to this set * `OutputGraph`'s associated `f_code` is added to the list just before the hook is called I believe use of this hook should enable the source code hashing that vLLM does in a better way than monkey-patching `inline_call()`. Pull Request resolved: https://github.com/pytorch/pytorch/pull/153622 Approved by: https://github.com/jansel	2025-05-28 15:40:09 +00:00
bobrenjc93	476e0a643a	[ez] add docblock for ShapeGuardPythonPrinter (#154403 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/154403 Approved by: https://github.com/jingsh ghstack dependencies: #154374, #154375, #154376, #154386, #154401, #154404, #154405, #154377, #154378, #154379, #154380, #154381, #154383, #154384, #154385, #154402	2025-05-28 14:17:17 +00:00
bobrenjc93	473a93eb58	[ez] add docblock for _ShapeGuardPrinter (#154402 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/154402 Approved by: https://github.com/jingsh ghstack dependencies: #154374, #154375, #154376, #154386, #154401, #154404, #154405, #154377, #154378, #154379, #154380, #154381, #154383, #154384, #154385	2025-05-28 14:13:22 +00:00
bobrenjc93	35a473e364	[ez] add docblock for guard_scalar (#154385 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/154385 Approved by: https://github.com/jingsh ghstack dependencies: #154374, #154375, #154376, #154386, #154401, #154404, #154405, #154377, #154378, #154379, #154380, #154381, #154383, #154384	2025-05-28 14:10:07 +00:00
bobrenjc93	ee4f433963	[ez] add docblock for _guard_or (#154384 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/154384 Approved by: https://github.com/pianpwk ghstack dependencies: #154374, #154375, #154376, #154386, #154401, #154404, #154405, #154377, #154378, #154379, #154380, #154381, #154383	2025-05-28 14:06:29 +00:00
bobrenjc93	e9b97d19b1	[ez] Make SymNodeImpl comments less misleading (#154480 ) As discussed in DS workchat, it's easy for users to get confused by guarding for these supposedly non-guarding methods. The TL;DR is in the case of non pythonic compilers like XLA, we actually do guard. I've updated the comments accordingly to reduce confusion. Pull Request resolved: https://github.com/pytorch/pytorch/pull/154480 Approved by: https://github.com/pianpwk, https://github.com/Skylion007	2025-05-28 14:04:32 +00:00
PyTorch MergeBot	a75e3a02be	Revert "[dynamo, nested graph breaks] small fixes to resume function generation (#151056 )" This reverts commit 28e7aa21c522e92ea01a62dfdc5e3b74e398d8f0. Reverted https://github.com/pytorch/pytorch/pull/151056 on behalf of https://github.com/malfet due to Not sure which one, but it broke test_error_messages, see `203b0efd63/1` ([comment](https://github.com/pytorch/pytorch/pull/151056#issuecomment-2916437433))	2025-05-28 13:53:50 +00:00
PyTorch MergeBot	9603d6382d	Revert "[dynamo, nested graph breaks] refactor codegen to minimize NULL codegen'ing (#153510 )" This reverts commit 1fe98429222a8ba5e16dd9381f50a8fb90edcf0e. Reverted https://github.com/pytorch/pytorch/pull/153510 on behalf of https://github.com/malfet due to Not sure which one, but it broke test_error_messages, see `203b0efd63/1` ([comment](https://github.com/pytorch/pytorch/pull/151056#issuecomment-2916437433))	2025-05-28 13:53:50 +00:00
PyTorch MergeBot	5fd7004dc9	Revert "[dynamo, nested graph breaks] remove block stack graph break in output_graph (#153772 )" This reverts commit 9a66c30bdc563c62375e5030c4103b67515b8dac. Reverted https://github.com/pytorch/pytorch/pull/153772 on behalf of https://github.com/malfet due to Not sure which one, but it broke test_error_messages, see `203b0efd63/1` ([comment](https://github.com/pytorch/pytorch/pull/151056#issuecomment-2916437433))	2025-05-28 13:53:50 +00:00
PyTorch MergeBot	e86439ed5b	Revert "[dynamo, nested graph breaks] add skip_frame debugging function (#153773 )" This reverts commit aadf9eae63c4793e1107a3b21ede30e5289eeaca. Reverted https://github.com/pytorch/pytorch/pull/153773 on behalf of https://github.com/malfet due to Not sure which one, but it broke test_error_messages, see `203b0efd63/1` ([comment](https://github.com/pytorch/pytorch/pull/151056#issuecomment-2916437433))	2025-05-28 13:53:50 +00:00
Howard Huang	203b0efd63	[PP] Allow unused kwargs in ZB path (#153498 ) This fixes the case where an unused kwarg is in the PP stage forward: we try to call `torch.autograd.grad()` and update its gradients when it shouldn't have gradients, leading to this error: ``` [rank3]:[rank3]: File "/data/users/howardhuang/pytorch/torch/distributed/pipelining/stage.py", line 613, in [rank3]:[rank3]: return lambda: stage_backward_input( [rank3]:[rank3]: File "/data/users/howardhuang/pytorch/torch/distributed/pipelining/_backward.py", line 199, in stage_backward_input [rank3]:[rank3]: dinputs = torch.autograd.grad( [rank3]:[rank3]: File "/data/users/howardhuang/pytorch/torch/autograd/init.py", line 503, in grad [rank3]:[rank3]: result = _engine_run_backward( [rank3]:[rank3]: File "/data/users/howardhuang/pytorch/torch/autograd/graph.py", line 824, in _engine_run_backward [rank3]:[rank3]: return Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass [rank3]:[rank3]: RuntimeError: One of the differentiated Tensors does not require grad ``` Related issue: https://github.com/pytorch/torchtitan/issues/1188 Pull Request resolved: https://github.com/pytorch/pytorch/pull/153498 Approved by: https://github.com/kwen2501	2025-05-28 13:34:04 +00:00
ILCSFNO	cf7451f279	Fix signature of torch.sparse_coo_tensor() (#152681 ) Fixes #145371 @pearu I searched the codebase and found this code, and I wonder whether it is the root cause of the issue. Could you review it? Thanks a lot! Pull Request resolved: https://github.com/pytorch/pytorch/pull/152681 Approved by: https://github.com/Skylion007, https://github.com/pearu, https://github.com/nikitaved	2025-05-28 13:16:41 +00:00
Yuanhao Ji	f58143b945	[Typing] Refactor `torch.types.Device` in `torch/cuda/__init__.py` (#153447 ) Part of: #152952 Follow up: #153027 Here is the definition of `torch.types.Device`: `ab997d9ff5/torch/types.py (L74)` So `Optional[Union[Device, int]]` is equivalent to `torch.types.Device`. Pull Request resolved: https://github.com/pytorch/pytorch/pull/153447 Approved by: https://github.com/cyyever, https://github.com/Skylion007	2025-05-28 10:09:31 +00:00
PyTorch MergeBot	fdc339003b	Revert "[AOTI] Support multi-arch when using package_cpp_only (#154414 )" This reverts commit a84d8c4a1cc515db274366537afd0b1492800c2d. Reverted https://github.com/pytorch/pytorch/pull/154414 on behalf of https://github.com/huydhn due to Sorry for reverting your change but it is failing ROCm trunk job ([comment](https://github.com/pytorch/pytorch/pull/154414#issuecomment-2915597821))	2025-05-28 09:23:31 +00:00
Laith Sakka	853958f82c	Fix: Replacements can cause runtime assertions to disappear and can cause invalid inductor code. (#153661 ) Let's first explore a couple of problems related to replacements and runtime assertions. #### example problem 1 Suppose we have a runtime assertion that u0==s0, where u0 is an input coming from mark_unbacked. A replacement u0=s0 will be added, so the function f(u0, s0) will become f(s0, s0). This leads to the assert not being inserted during insert_deferred_runtime_asserts. The reason is that the insert_deferred_runtime_asserts logic inserts each assertion once all its inputs are seen, but u0 will never be seen. The same thing can happen when we defer assertions on backed symbols, e.g. s0==s2. #### example problem 2 Consider u0==s0, where u0 is coming from a call to .item(). Imagine that later on s0 is specialized to 2. In that case s0 as an input won't be seen during insert_deferred_runtime_asserts and the assertion won't be inserted in the graph. Worse, Inductor will generate code that refers to s0 in the cpp wrapper while it does not exist, causing a failure. internal xref: https://fb.workplace.com/groups/1075192433118967/permalink/1669766396994898/ ## The solution: The runtime assertion insertion loops depend on detecting that the symbols used in the runtime assertions are seen. Note that those symbols are either graph inputs or generated in the graph from data-dependent ops like .item(). The issues above happen when symbols are graph inputs. To force the symbols to exist in the graph and to be seen by the runtime assertions, we do not do replacements on placeholder expressions during codegen and during runtime assertion insertion. This should not have performance overhead, since we already optimized the graph with replacements; the only effect is not mistakenly dropping graph inputs that are used in runtime assertions. I added extended testing. 
One unrelated follow-up that I noticed is that we might want to rename unbacked symbols in runtime assertions when we do unbacked renaming, but that's a different issue. Other approaches that did not work: #### ban replacements on unbacked. 1. Does not work when we defer runtime assertions on backed symbols, e.g. s0==s1. We could also ban such replacements, but problem 2 becomes more problematic. 2. It degrades the quality of reasoning. #### Apply specialization on runtime assertions before codegen. 1. Can fix some issues, but may also lead to runtime assertions becoming NOPs. 2. Does not fix the issue of runtime assertions not being inserted during insert_deferred_runtime_asserts due to an input not being detected. Pull Request resolved: https://github.com/pytorch/pytorch/pull/153661 Approved by: https://github.com/jansel	2025-05-28 09:08:05 +00:00
William Wen	aadf9eae63	[dynamo, nested graph breaks] add skip_frame debugging function (#153773 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/153773 Approved by: https://github.com/jansel ghstack dependencies: #151056, #153510, #153772	2025-05-28 08:54:09 +00:00
William Wen	9a66c30bdc	[dynamo, nested graph breaks] remove block stack graph break in output_graph (#153772 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/153772 Approved by: https://github.com/jansel ghstack dependencies: #151056, #153510	2025-05-28 08:54:09 +00:00
William Wen	1fe9842922	[dynamo, nested graph breaks] refactor codegen to minimize NULL codegen'ing (#153510 ) Stop codegening NULLs that we need to pop later. Some output_graph.py changes to prepare for nested graph break support. Pull Request resolved: https://github.com/pytorch/pytorch/pull/153510 Approved by: https://github.com/jansel ghstack dependencies: #151056	2025-05-28 08:54:09 +00:00
William Wen	28e7aa21c5	[dynamo, nested graph breaks] small fixes to resume function generation (#151056 ) Old: ~pack resume function stack + locals into a list: we need to be able to pass frame stack+locals in lists to hand off to nested functions in the future, so we implement this part first.~ We are no longer doing this right now since GraphModule/guard variable naming gets messed up. Going forward, our approach will be to keep the top frame unpacked, but pack the rest of the contents of other frames in a list. Pull Request resolved: https://github.com/pytorch/pytorch/pull/151056 Approved by: https://github.com/jansel	2025-05-28 08:54:09 +00:00
cyy	9d04c0f352	Remove outdated CUDA 11 conditions (#154313 ) Fixes #ISSUE_NUMBER Pull Request resolved: https://github.com/pytorch/pytorch/pull/154313 Approved by: https://github.com/eqy	2025-05-28 08:44:58 +00:00
Pian Pawakapan	1d9b7dd2d1	[PGO] suggest dynamic whitelist for recompilations (#154189 ) suggests `TORCH_COMPILE_DYNAMIC_SOURCES` based off tensor size changes in PGO code state, including parameters. Closing #153442 which took the dynamo guards approach. Pull Request resolved: https://github.com/pytorch/pytorch/pull/154189 Approved by: https://github.com/bobrenjc93	2025-05-28 07:11:43 +00:00
bobrenjc93	fe760b6636	[ez] add docblock for _free_unbacked_symbols_with_path (#154383 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/154383 Approved by: https://github.com/pianpwk ghstack dependencies: #154374, #154375, #154376, #154386, #154401, #154404, #154405, #154377, #154378, #154379, #154380, #154381	2025-05-28 05:53:50 +00:00
bobrenjc93	8e25ba6963	[ez] add docblock for find_symbol_binding_fx_nodes (#154381 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/154381 Approved by: https://github.com/pianpwk ghstack dependencies: #154374, #154375, #154376, #154386, #154401, #154404, #154405, #154377, #154378, #154379, #154380	2025-05-28 05:44:26 +00:00
bobrenjc93	08c29deb5f	[ez] add docblock to is_symbol_binding_fx_node (#154380 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/154380 Approved by: https://github.com/pianpwk ghstack dependencies: #154374, #154375, #154376, #154386, #154401, #154404, #154405, #154377, #154378, #154379	2025-05-28 05:41:19 +00:00
bobrenjc93	07405a6cff	[ez] add docblock for free_unbacked_symbols (#154379 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/154379 Approved by: https://github.com/pianpwk ghstack dependencies: #154374, #154375, #154376, #154386, #154401, #154404, #154405, #154377, #154378	2025-05-28 05:37:25 +00:00
bobrenjc93	dcdaef5206	[ez] add docblock for free_symbols (#154378 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/154378 Approved by: https://github.com/pianpwk ghstack dependencies: #154374, #154375, #154376, #154386, #154401, #154404, #154405, #154377	2025-05-28 05:34:25 +00:00
bobrenjc93	abc3fdc7ac	[ez] add docblock for _iterate_exprs (#154377 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/154377 Approved by: https://github.com/pianpwk ghstack dependencies: #154374, #154375, #154376, #154386, #154401, #154404, #154405	2025-05-28 05:28:58 +00:00
bobrenjc93	ab6cb85cb0	[ez] add docblock for _remove_effect_token_unbacked_bindings (#154405 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/154405 Approved by: https://github.com/Skylion007, https://github.com/pianpwk ghstack dependencies: #154374, #154375, #154376, #154386, #154401, #154404	2025-05-28 05:16:14 +00:00
bobrenjc93	fde8f6a8b8	[ez] add docblock for _suggest_torch_checks (#154404 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/154404 Approved by: https://github.com/Skylion007 ghstack dependencies: #154374, #154375, #154376, #154386, #154401	2025-05-28 04:45:55 +00:00
bobrenjc93	b82fb57b67	[ez] add docblock for RuntimeAssert (#154401 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/154401 Approved by: https://github.com/Skylion007 ghstack dependencies: #154374, #154375, #154376, #154386	2025-05-28 04:43:22 +00:00
bobrenjc93	d64b4a91dd	[ez] remove unused function _constrain_symbol_range (#154386 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/154386 Approved by: https://github.com/Skylion007 ghstack dependencies: #154374, #154375, #154376	2025-05-28 04:41:00 +00:00
Laith Sakka	ef90cc18d7	use definitely_contiguous for _prim_elementwise_meta short circuit (#153441 ) * This verifies that the check short circuit is not material. https://github.com/pytorch/pytorch/pull/153431 ``` import torch from torch.export import Dim, export class MyModel(torch.nn.Module): def forward(self, x, ranks): first_k = ranks.max().item() torch._check_is_size(first_k) narrow = x.narrow(dim = 1, start = 0, length = first_k) lt = narrow < narrow.size(1) return lt inps = ( torch.randn((8, 16), device="cuda"), torch.arange(8, device="cuda", dtype=torch.int8) ) spec = { "x": (Dim.AUTO, Dim.AUTO), "ranks": (Dim.AUTO,), } traced = export(MyModel(), inps, dynamic_shapes=spec, strict=True).run_decompositions({}) ``` Pull Request resolved: https://github.com/pytorch/pytorch/pull/153441 Approved by: https://github.com/jansel ghstack dependencies: #153432	2025-05-28 03:41:26 +00:00
Laith Sakka	39df901b2a	introduce definitely_contiguous and use it for reshape and tensor meta data computation. (#153432 ) When a tensor has unbacked symbols, it can be general enough to represent both contiguous and non-contiguous tensors. In that case we can't really evaluate is_contiguous. In many places in the code base, we check is_contiguous to take a fast path, but the general path usually works for both contiguous and non-contiguous tensors. In that case we probably want to use the definitely_contiguous API. This is applied to reshape in this PR and also to tensor meta data computation: the meta data will now have an attribute that says it is contiguous when it is always contiguous. We store that only if definitely_contiguous is true. Pull Request resolved: https://github.com/pytorch/pytorch/pull/153432 Approved by: https://github.com/bobrenjc93	2025-05-28 03:41:26 +00:00
Sidharth	54f1f29fed	[dynamo] dynamic gb_type -> static gb_type (#154435 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/154435 Approved by: https://github.com/williamwen42	2025-05-28 03:14:26 +00:00
ZhiweiYan-96	f12ce4e36b	[Intel GPU] convolution fusion at XPU backend (#154202 ) Pull Request resolved: https://github.com/pytorch/pytorch/pull/154202 Approved by: https://github.com/EikanWang, https://github.com/guangyey, https://github.com/etaf ghstack dependencies: #140365	2025-05-28 03:14:18 +00:00
FFFrog	c6fc11af76	Fix the Problems About Defining Static Variable in Inline Function (#147095 ) Refer to https://github.com/pytorch/pytorch/issues/125465 for more information - Remove unused header files - Move the inline function that defines the static variable to .cc Pull Request resolved: https://github.com/pytorch/pytorch/pull/147095 Approved by: https://github.com/cyyever, https://github.com/albanD	2025-05-28 02:47:16 +00:00
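
Note on d23aa7e182 (deprecation warning for `torch.ao.quantization`): the test plan in that entry runs Python with `PYTHONWARNINGS='default'` so that `DeprecationWarning` messages are printed. The general pattern can be sketched with the standard `warnings` module alone. This is a minimal illustration, not the code from that PR: `make_legacy_quantizer` and `collect_warnings` are hypothetical names and nothing here touches torch.

```python
import warnings


def make_legacy_quantizer():
    """Hypothetical stand-in for a deprecated constructor (not a torch API)."""
    warnings.warn(
        "make_legacy_quantizer is deprecated; migrate to the replacement API",
        DeprecationWarning,
        stacklevel=2,  # attribute the warning to the caller's line
    )
    return object()


def collect_warnings():
    """Call the deprecated function and return the warnings it emitted."""
    with warnings.catch_warnings(record=True) as caught:
        # Record every warning inside this block, regardless of the
        # process-wide filters (similar in spirit to PYTHONWARNINGS).
        warnings.simplefilter("always")
        make_legacy_quantizer()
    return caught


if __name__ == "__main__":
    for w in collect_warnings():
        print(w.category.__name__, "-", w.message)
```

Recording warnings inside `catch_warnings` keeps the check independent of how the interpreter was started, which is convenient in tests; the environment variable remains the simpler choice for an interactive session.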


88284 Commits