Debugging CI Failures without SSH Access new page

clee2000
2024-08-16 10:38:32 -07:00
parent 50ce38acea
commit 2a9034ec8b

@ -0,0 +1,11 @@
# Debugging without SSH Access
## Linux CPU job
1. Install Docker on an x86 machine.
2. In the CI job, find the step titled “Use following to pull public copy of the image”. It contains a command to pull the Docker image. Pull the image and run it (e.g. `docker run --rm -it ghcr.io/pytorch/ci-image:pytorch-linux-focal-cuda12.1-cudnn9-py3-gcc9-93520d5082026249ce8ae0413d61e4891366a9df`). The ghcr containers should be public, but firewalls and VPNs might cause permission issues.
3. Find the wheel for your job: go to the HUD page for your commit, search for “Expand to see all artifacts”, and find the build that corresponds to the test. The artifact should also be publicly available.
4. Inside the Docker container, in the jenkins folder (this should be the home folder), download the build artifact from its link and unzip it. Install the wheel found in the dist folder using pip.
5. Clone pytorch and check out the corresponding sha (it can be found at the bottom of the “Checkout PyTorch” step in the CI job).
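
The steps above can be sketched as a few shell functions. This is a rough outline, not the official workflow: `ARTIFACT_URL` and `COMMIT_SHA` are hypothetical placeholders that you fill in from the HUD page and the CI job, the archive name `artifacts.zip` is made up, and it assumes `curl`, `unzip`, `git`, and `pip` are available inside the container.

```shell
# Image tag copied from the example above; use the one printed by your CI job.
IMAGE="ghcr.io/pytorch/ci-image:pytorch-linux-focal-cuda12.1-cudnn9-py3-gcc9-93520d5082026249ce8ae0413d61e4891366a9df"

# Hypothetical placeholders: fill these in yourself.
ARTIFACT_URL="${ARTIFACT_URL:-}"   # link found under "Expand to see all artifacts"
COMMIT_SHA="${COMMIT_SHA:-}"       # sha from the "Checkout PyTorch" step

# Run on the host: pull the CI image and open an interactive shell in it.
start_container() {
  docker pull "$IMAGE" &&
  docker run --rm -it "$IMAGE"
}

# Run inside the container: download the build artifact, unzip it,
# and install the wheel from the dist folder.
install_wheel() {
  [ -n "$ARTIFACT_URL" ] || { echo "set ARTIFACT_URL first" >&2; return 1; }
  cd "$HOME" &&                       # the jenkins folder should be the home folder
  curl -L -o artifacts.zip "$ARTIFACT_URL" &&
  unzip -o artifacts.zip &&
  pip install dist/*.whl
}

# Run inside the container: clone pytorch and check out the commit under test.
checkout_source() {
  [ -n "$COMMIT_SHA" ] || { echo "set COMMIT_SHA first" >&2; return 1; }
  git clone https://github.com/pytorch/pytorch.git &&
  cd pytorch &&
  git checkout "$COMMIT_SHA"
}
```

Nothing runs until you call a function: `start_container` on the host, then `install_wheel` and `checkout_source` inside the container after setting the two placeholders there.
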

Notes:
Tests are usually run with `pytest <test file>.py -k <test name>` or `python <test file>.py -k <test name>`.
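
As an illustration of the two invocations in the note, here is a small hypothetical helper (`run_one_test` is not part of PyTorch) that builds the same command line for either runner. The file `test/test_torch.py` and the name `test_example_name` are only placeholders; substitute the failing test from your CI log.

```shell
# Hypothetical helper: run a single test by name with either runner.
#   $1 = runner ("pytest" or "python"), $2 = test file, $3 = test name
# With DRY_RUN=1 it only prints the command instead of running it.
run_one_test() {
  runner="$1"
  file="$2"
  name="$3"
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$runner $file -k $name"
  else
    "$runner" "$file" -k "$name"
  fi
}

# Preview the command without executing it.
DRY_RUN=1 run_one_test pytest test/test_torch.py test_example_name
# prints: pytest test/test_torch.py -k test_example_name
```

Run it from the root of the pytorch checkout so that the relative test path resolves.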