Fix AArch64 segfaults by disabling strict-aliasing in GridSamplerKernel for GCC 12 and above (#158117)

This PR disables GCC's `strict-aliasing` optimization for the code in GridSamplerKernel on all AArch64 CPUs when building with GCC 12 and above.

Pull Request #152825 upgraded the GCC version in manywheel from 11 to 13, which caused several segmentation faults in unit tests (not visible in CI workflows because the jammy GCC version has not been updated yet).

We identified that the problem also exists in GCC 12, hence the `__GNUC__ >= 12` check.
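As a rough illustration of the scoping technique, the sketch below wraps a single function in the same preprocessor guard and `#pragma GCC push_options` / `pop_options` pair, so that `no-strict-aliasing` applies only to the code between the pragmas. The macro `SKETCH_NO_STRICT_ALIASING` and the helpers `float_bits` and `self_check` are hypothetical names made up for this example; they are not part of PyTorch, and the sketch does not reproduce the actual miscompilation.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Same condition as in the patch: GCC 12 or newer, targeting AArch64.
// SKETCH_NO_STRICT_ALIASING is a hypothetical macro used only in this sketch
// so that the closing guard cannot drift out of sync with the opening one.
#if defined(__GNUC__) && __GNUC__ >= 12 && defined(__aarch64__)
#define SKETCH_NO_STRICT_ALIASING 1
#pragma GCC push_options                    // save the current optimization options
#pragma GCC optimize ("no-strict-aliasing") // applies to functions defined below
#else
#define SKETCH_NO_STRICT_ALIASING 0
#endif

// Hypothetical helper standing in for the kernel code inside the guarded
// region. memcpy is well-defined with or without strict aliasing; it is
// used here only so the example has something to compile.
static std::uint32_t float_bits(float value) {
  static_assert(sizeof(float) == sizeof(std::uint32_t), "expects 32-bit float");
  std::uint32_t bits = 0;
  std::memcpy(&bits, &value, sizeof(bits));
  return bits;
}

#if SKETCH_NO_STRICT_ALIASING
#pragma GCC pop_options                     // restore the saved options
#endif

// Code from here on is compiled with the original options again.
static void self_check() {
  // Assumes IEEE 754 single precision, where 1.0f is encoded as 0x3f800000.
  assert(float_bits(1.0f) == 0x3f800000u);
}
```

On compilers or targets that do not match the guard, the pragmas are skipped and the file compiles unchanged, which is the reason for scoping the workaround this way instead of changing build flags globally.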

Fixes #157626

Fixes these test failures when PyTorch is built with GCC 12 and above:
```
test_ops.py::TestCommonCPU::test_noncontiguous_samples_grid_sampler_2d_cpu_float32 Fatal Python error: Segmentation fault
test_ops.py::TestCommonCPU::test_dtypes_grid_sampler_2d_cpu Fatal Python error: Segmentation fault
test_ops.py::TestMathBitsCPU::test_neg_view_nn_functional_grid_sample_cpu_float64 free(): invalid next size (fast)
test_ops.py::TestCompositeComplianceCPU::test_backward_grid_sampler_2d_cpu_float32 Fatal Python error: Segmentation fault
test_ops.py::TestCommonCPU::test_dtypes_nn_functional_grid_sample_cpu Fatal Python error: Segmentation fault

```

Pull Request resolved: https://github.com/pytorch/pytorch/pull/158117
Approved by: https://github.com/malfet
Author: Robert Hardwick
Date: 2025-07-15 18:26:34 +00:00
Committed by: PyTorch MergeBot
Parent: 41971335c9
Commit: 8c3f206457

@@ -14,6 +14,12 @@
namespace at::native { namespace {
// fixes segfaults for GCC >= 12 on some AArch64 cpus https://github.com/pytorch/pytorch/issues/157626
#if defined(__GNUC__) && __GNUC__ >= 12 && defined(__aarch64__)
#pragma GCC push_options
#pragma GCC optimize ("no-strict-aliasing")
#endif
/** NOTE [ Grid Sample CPU Kernels ]
*
* Implementation of vectorized grid sample CPU kernels is divided into three
@@ -1014,6 +1020,10 @@ struct ApplyGridSample<scalar_t, 2, GridSamplerInterpolation::Bicubic,
}
};
#if defined(__GNUC__) && __GNUC__ >= 12 && defined(__aarch64__)
#pragma GCC pop_options
#endif
// ~~~~~~~~~~~~~~~~~~ grid_sample_2d_grid_slice_iterator ~~~~~~~~~~~~~~~~~~~~~~
// Function to apply a vectorized function on a grid slice tensor (without batch
// dimension).