Changes:
1. Bump `ruff` from 0.7.4 to 0.8.4
2. Change `%`-formatted strings to f-string
3. Change arguments with the `__`-prefix to positional-only arguments with the `/` separator in function signature.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/143753
Approved by: https://github.com/Skylion007
As title, this patch prevents developers from importing third party
libraries to patch things in Dynamo, unless there's no other easy
workaround (in which case one would add the library to the allowlist in
`import_linter.py`, as instructed by the lint error).
For instance, if we remove `einops` from the allowlist, we'd get this
```verbatim
>>> Lint for torch/_dynamo/decorators.py:
Error (IMPORT) Disallowed import
importing from einops is not allowed, if you believe there's a valid
reason, please add it to import_linter.py
608 |# Note: this carefully avoids eagerly import einops.
609 |# TODO: we should delete this whole _allow_in_graph_einops logic by approximately 2024 Q2
610 |def _allow_in_graph_einops():
>>> 611 | import einops
612 |
613 | try:
614 | # requires einops > 0.6.1, torch >= 2.0
Error (IMPORT) Disallowed import
importing from einops is not allowed, if you believe there's a valid
reason, please add it to import_linter.py
612 |
613 | try:
614 | # requires einops > 0.6.1, torch >= 2.0
>>> 615 | from einops._torch_specific import ( # type: ignore[attr-defined] # noqa: F401
616 | _ops_were_registered_in_torchdynamo,
617 | )
618 |
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/143312
Approved by: https://github.com/zou3519
This is a first step towards removing builds dependency to conda.
Currently we build magma as a conda package in a pytorch conda channel, implemented in a1b372dbda/magma.
This commit adapts the logic from pytorch/builder as follows:
- use pytorch/manylinux-cuda<cuda-version> as base image
- apply patches and invoke the build.sh script directly (not anymore through conda build)
- stores license and build files along with the built artifact, in an info subfolder
- create a tarball file which resembles that created by conda, without any conda-specific metadata
A new matrix workflow is added, which runs the build for each supported cuda version, and uploads the binaries to pyorch s3 bucket.
For the upload, define an upload.sh script, which will be used by the magma windows job as well, to upload to `s3://ossci-*` buckets.
The build runs on PR and push, upload runs in DRY_RUN mode in case of PR.
Fixes#139397
Pull Request resolved: https://github.com/pytorch/pytorch/pull/139888
Approved by: https://github.com/atalman, https://github.com/malfet, https://github.com/seemethere
## This Stack
This stack does the following things to support `xformers`-style, comm-aware Triton kernels:
- Exposes `signal_pad`s as tensors in Python
- Adds a binding for `cuMemsetAsync`
These in combination aims to provide users with more flexibility to express custom signaling/synchronization patterns.
## This PR
Make `cuMemset32Async` available via `_SymmetricMemory.memset32`. We chose `cuMemset32Async` over `cudaMemsetAsync` because it allows for `uint32_t`-wise memset. This provides users with better flexibility.
To enable this, we also added the following cuda driver APIs in `c10::cuda::DriverAPI`:
- `cuDevicePrimaryCtxRetain` - for obtaining the primary context of a device in the form of `CUcontext`.
- `cuCtxGetCurrent`/`cuCtxSetCurrent` - for setting and restoring the context for cuda driver APIs such as `cuMemset32Async`.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/138755
Approved by: https://github.com/weifengpy, https://github.com/eqy, https://github.com/lw
## This Stack
This stack does the following things to support `xformers`-style, comm-aware Triton kernels:
- Exposes `signal_pad`s as tensors in Python
- Adds a binding for `cuMemsetAsync`
These in combination aims to provide users with more flexibility to express custom signaling/synchronization patterns.
## This PR
Make `cuMemset32Async` available via `_SymmetricMemory.memset32`. We chose `cuMemset32Async` over `cudaMemsetAsync` because it allows for `uint32_t`-wise memset. This provides users with better flexibility.
To enable this, we also added the following cuda driver APIs in `c10::cuda::DriverAPI`:
- `cuDevicePrimaryCtxRetain` - for obtaining the primary context of a device in the form of `CUcontext`.
- `cuCtxGetCurrent`/`cuCtxSetCurrent` - for setting and restoring the context for cuda driver APIs such as `cuMemset32Async`.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/138755
Approved by: https://github.com/weifengpy, https://github.com/eqy, https://github.com/lw
MInor, adds a linter that ensures that all jobs run on pull_request, schedule, push etc have a `if: github.repository_owner == 'pytorch'` or are dependent on a job that has that check
There is also a setting in Github repos that can disable all workflows for that repo
A lot of these are unnecessary because many jobs use reusable workflows that have that check. However, this is a one time change so I'm not that bothered
Unfortunately I can't put this at the workflow level, which would make this better
Lots of weird string parsing
Pull Request resolved: https://github.com/pytorch/pytorch/pull/138849
Approved by: https://github.com/malfet