|
bcf50636ba
|
[CI] Removing --user flag from all pip install commands (#154900)
Related to https://github.com/pytorch/pytorch/issues/148335
Python virtualenvs do not support the `--user` flag:
```
ERROR: Can not perform a '--user' install. User site-packages are not visible in this virtualenv.
+ python3 -m pip install --progress-bar off --user ninja==1.10.2
```
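The commit itself simply drops `--user` from the CI commands. As a rough illustration of the condition behind the error above, the following sketch builds a pip command and omits `--user` when the interpreter appears to run inside a virtual environment. The helper names `in_virtualenv` and `pip_install_cmd` are hypothetical and not part of the PyTorch CI scripts; the check assumes a `venv` or a recent `virtualenv`, where `sys.prefix` differs from `sys.base_prefix`.

```python
import sys
from typing import List, Optional, Sequence


def in_virtualenv() -> bool:
    """Best-effort check: inside a venv, sys.prefix points at the
    environment while sys.base_prefix points at the base interpreter."""
    return sys.prefix != getattr(sys, "base_prefix", sys.prefix)


def pip_install_cmd(
    packages: Sequence[str],
    user: bool = False,
    in_venv: Optional[bool] = None,
) -> List[str]:
    """Build a `python -m pip install` command (hypothetical helper).

    `--user` is only added when requested AND we are not in a virtualenv,
    since pip rejects `--user` installs there.
    """
    if in_venv is None:
        in_venv = in_virtualenv()
    cmd = [sys.executable, "-m", "pip", "install", "--progress-bar", "off"]
    if user and not in_venv:
        cmd.append("--user")
    return cmd + list(packages)


if __name__ == "__main__":
    # Print the command instead of running it, so the sketch has no side effects.
    print(" ".join(pip_install_cmd(["ninja==1.10.2"], user=True)))
```

Passing `in_venv` explicitly makes the helper easy to exercise without creating a real environment.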
Pull Request resolved: https://github.com/pytorch/pytorch/pull/154900
Approved by: https://github.com/jeffdaily
Co-authored-by: Jithun Nair <jithun.nair@amd.com>
|
2025-07-14 21:09:42 +00:00 |
|
|
7d39e73c57
|
Fix more URLs (#153277)
Or ignore them.
Found by running the lint_urls.sh script locally with https://github.com/pytorch/pytorch/pull/153246
Pull Request resolved: https://github.com/pytorch/pytorch/pull/153277
Approved by: https://github.com/malfet
|
2025-05-14 16:23:50 +00:00 |
|
|
bcf1031cb8
|
[ROCm] Fixes to enable VM-based MI300 CI runners (#152133)
New VM-based MI300 CI runners tested in https://github.com/pytorch/pytorch/pull/151708 exposed some issues in CI that this PR fixes:
* `HSAKMT_DEBUG_LEVEL` is an env var that was introduced to debug driver issues. On the new MI300 runners, which run inside a VM, the driver emits the debug message `Failed to map remapped mmio page on gpu_mem 0` when calling `rocminfo` or doing other GPU-related work. This causes multiple PyTorch unit tests that string-match stdout against expected output to fail.
* `HSA_FORCE_FINE_GRAIN_PCIE` was relevant for improving rccl performance, but is no longer required.
* amdsmi doesn't return metrics like [power_info](https://rocm.docs.amd.com/projects/amdsmi/en/latest/reference/amdsmi-py-api.html#amdsmi-get-power-cap-info) and [clock_info](https://rocm.docs.amd.com/projects/amdsmi/en/latest/reference/amdsmi-py-api.html#amdsmi-get-clock-info) in a VM ("Guest") environment. Return 0 as the default when amdsmi returns "N/A".
* amdsmi throws an exception when calling `amdsmi.amdsmi_get_clock_info` on the VM-based runners. The unit test is temporarily skipped on MI300 until a resolution is found.
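
The "return 0 when amdsmi returns N/A" behavior can be sketched as below. This is a minimal illustration under stated assumptions, not the code from the PR: the helper `metric_or_default` and the dictionary key `power_limit` are hypothetical, and the sketch works on a plain dict rather than calling amdsmi.

```python
from typing import Any, Mapping


def metric_or_default(info: Mapping[str, Any], key: str, default: int = 0) -> Any:
    """Return info[key], falling back to `default` when the metric is
    missing or reported as the string "N/A" (as in a VM/guest environment).

    Hypothetical helper; `info` stands in for a metrics dict.
    """
    value = info.get(key, "N/A")
    if value == "N/A":
        return default
    return value


if __name__ == "__main__":
    guest_info = {"power_limit": "N/A"}   # assumed shape of a guest-VM result
    host_info = {"power_limit": 750}      # assumed shape of a bare-metal result
    print(metric_or_default(guest_info, "power_limit"))
    print(metric_or_default(host_info, "power_limit"))
```

Treating a missing key the same way as "N/A" keeps callers from having to special-case guest environments.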
Pull Request resolved: https://github.com/pytorch/pytorch/pull/152133
Approved by: https://github.com/jeffdaily
|
2025-04-25 18:06:48 +00:00 |
|
|
0ecb071fc4
|
[BE][CI] change references from .jenkins to .ci (#92624)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/92624
Approved by: https://github.com/ZainRizvi, https://github.com/huydhn
|
2023-01-30 22:50:07 +00:00 |
|
|
b453adc945
|
[BE][CI] rename .jenkins (#92845)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/92845
Approved by: https://github.com/clee2000
|
2023-01-25 23:47:38 +00:00 |
|
|
afe6ea884f
|
Revert "[BE][CI] rename .jenkins to .ci, add symlink (#92621)"
This reverts commit 8972a9fe6aa8be8f8035c83094ed371973bfbe73.
Reverted https://github.com/pytorch/pytorch/pull/92621 on behalf of https://github.com/atalman because it breaks shipit
|
2023-01-23 15:04:58 +00:00 |
|
|
8972a9fe6a
|
[BE][CI] rename .jenkins to .ci, add symlink (#92621)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/92621
Approved by: https://github.com/huydhn, https://github.com/ZainRizvi
|
2023-01-21 02:40:18 +00:00 |
|