When minifying extremely large repros, the minifier can run out of memory. This is because, for delta debugging, the minifier keeps a copy of every intermediate output in the network. This can easily put you over the memory limit for your GPU. To make matters worse, we cannot easily delta debug in such a situation, as delta debugging involves replacing intermediates with inputs, but doing so can cause an intermediate to become live longer than its actual extent in the original model (since inputs all have to be allocated up front).
The strategy in this PR is to use `load_tensor` from the previous PR to offer a low memory mode for delta debugging. Instead of putting intermediates as inputs, we instead load them in the middle of the graph in question. If, through DCE, the load_tensor ends up floating to the top of the graph, we can input-ify it. We now no longer save all intermediates in memory, but instead save them to disk. I used this to successfully minify the repro that helped us solve https://github.com/pytorch/pytorch/pull/100332
The testing is not very good. I can try to add more robust testing but it will involve a more involved refactor to FX minifier. Let me know if that's what you want.
Signed-off-by: Edward Z. Yang <ezyang@meta.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/100546
Approved by: https://github.com/anijain2305, https://github.com/voznesenskym
We are using the idiom
```py
sys.path.insert(0, path)
# do something
sys.path.remove(path)
```
three times in `torch.hub`. This is a textbook case for using a context manager. In addition, by using `try` / `finally` we can ensure that the Python path is restored to its original state even if the actual action raises an exception:
```py
import sys

path = "/tmp"

# PR
try:
    sys.path.insert(0, path)
    try:
        # Any exception raised while performing the actual functionality
        raise Exception
    finally:
        sys.path.remove(path)
except Exception:
    assert path not in sys.path

# main
try:
    sys.path.insert(0, path)
    # Any exception raised while performing the actual functionality
    raise Exception
    sys.path.remove(path)
except Exception:
    assert path in sys.path
```
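For illustration, the context-manager form of the idiom could look roughly like the sketch below. The helper name `add_to_sys_path` and the example path are hypothetical and are not taken from the PR itself.

```python
import contextlib
import sys


@contextlib.contextmanager
def add_to_sys_path(path):
    """Temporarily put `path` at the front of sys.path (hypothetical helper name)."""
    sys.path.insert(0, path)
    try:
        yield
    finally:
        # Runs on normal exit and also when the body raises.
        sys.path.remove(path)


path = "/tmp/hypothetical_hub_dir"  # placeholder path for the example
try:
    with add_to_sys_path(path):
        assert sys.path[0] == path
        raise RuntimeError("simulated failure while importing hubconf")
except RuntimeError:
    pass

# The path has been removed even though the body raised.
assert path not in sys.path
```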
Pull Request resolved: https://github.com/pytorch/pytorch/pull/75786
Approved by: https://github.com/NicolasHug
This is a new version of #15648 based on the latest master branch.
Unlike the previous PR where I fixed a lot of the doctests in addition to integrating xdoctest, I'm going to reduce the scope here. I'm simply going to integrate xdoctest, and then I'm going to mark all of the failing tests as "SKIP". This will let xdoctest run on the dashboards, provide some value, and still let the dashboards pass. I'll leave fixing the doctests themselves to another PR.
In my initial commit, I do the bare minimum to get something running with failing dashboards. The few tests that I marked as skip are causing segfaults. Running xdoctest results in 293 failed, 201 passed tests. The next commits will be to disable those tests. (unfortunately I don't have a tool that will insert the `#xdoctest: +SKIP` directive over every failing test, so I'm going to do this mostly manually.)
Fixes https://github.com/pytorch/pytorch/issues/71105
@ezyang
Pull Request resolved: https://github.com/pytorch/pytorch/pull/82797
Approved by: https://github.com/ezyang
As pointed out in #71205, `torch.hub.load` assumes that the user trusts the repo from which the code is fetched and executed. We propose a solution to make sure that the user is aware of the security threat that this can represent.
**Solution**: Adds a `trust_repo` parameter to the `load`, `list` and `help` functions in torch.hub.
For now, the default `trust_repo=None` warns that, in the future, the user will need to authorize explicitly every repo before downloading it.
Once the repo has been trusted (via `trust_repo=True` or via a command prompt input) it will be added to the list of trusted repositories.
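The decision described above can be sketched as follows. This is only an illustration under stated assumptions, not the `torch.hub` implementation: the function name `is_allowed`, the in-memory `trusted` set, and the choice to prompt for any value other than `None` or `True` are all hypothetical.

```python
import warnings


def is_allowed(repo, trust_repo, trusted, prompt=input):
    """Hypothetical sketch: decide whether code from `repo` may be loaded."""
    if repo in trusted:
        return True
    if trust_repo is None:
        # Current default: warn, but still proceed.
        warnings.warn(
            f"{repo} is not in the trusted list; a future version will "
            "require explicit approval before downloading it.",
            UserWarning,
        )
        return True
    if trust_repo is True:
        trusted.add(repo)
        return True
    # Assumption: any other value falls back to asking the user.
    answer = prompt(f"Do you trust {repo}? (y/N) ")
    if answer.strip().lower() == "y":
        trusted.add(repo)
        return True
    return False


trusted = set()
assert is_allowed("owner/repo", True, trusted)
assert "owner/repo" in trusted
```

Keeping the prompt injectable (`prompt=input`) is a design choice made for the sketch so that the logic can be exercised without a terminal.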
Pull Request resolved: https://github.com/pytorch/pytorch/pull/72060
Approved by: https://github.com/NicolasHug
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/67990
Duplicate of the following PR which was merged by mistake without ghimport
https://github.com/pytorch/pytorch/pull/67914
cc albanD NicolasHug
Test Plan: Imported from OSS
Reviewed By: H-Huang
Differential Revision: D32247560
Pulled By: jdsgomes
fbshipit-source-id: 8ba5ba7d17fc3d0d2c377da467ea805822e21ec1
Summary:
Closes https://github.com/pytorch/pytorch/issues/63753
This PR changes the assumption regarding the default branch of a repo to the following:
> If `main` exists then use `main`, otherwise use `master`
This will make torchhub more robust w.r.t. the ongoing changes where repos use `main` instead of `master` as the development / default branch.
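As a minimal sketch of that rule, assuming the list of branch names has already been fetched (the function name `pick_default_branch` is hypothetical):

```python
def pick_default_branch(branch_names):
    """Return "main" if the repo has such a branch, otherwise "master"."""
    return "main" if "main" in branch_names else "master"


assert pick_default_branch(["dev", "main", "master"]) == "main"
assert pick_default_branch(["dev", "master"]) == "master"
```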
cc nairbv NicolasHug
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64364
Reviewed By: saketh-are
Differential Revision: D30731551
Pulled By: NicolasHug
fbshipit-source-id: 7232a30e956dcccca21933a29de5eddd711aa99b
Summary:
This PR adds more detailed error messages to torchhub if the commit hash validation goes wrong, providing suggestions to the users on how to resolve the issue.
It also documents why such validation is important.
EDIT: it also avoids validating some stuff when we know "stuff" isn't a commit since there's no risk in this case
CC malfet mthrok
cc nairbv NicolasHug
Pull Request resolved: https://github.com/pytorch/pytorch/pull/64362
Reviewed By: gchanan, malfet
Differential Revision: D30731191
Pulled By: NicolasHug
fbshipit-source-id: d1ee7c2ef2591dd7a5291977af1635ada2552d1b
Summary:
This PR:
- adds a few details regarding the newly added `skip_validation` parameter https://github.com/pytorch/pytorch/pull/62139
- uses double-backticks instead of single-backticks since this is rst, not markdown.
- adds a few minor doc nits here and there
Pull Request resolved: https://github.com/pytorch/pytorch/pull/63783
Reviewed By: zou3519
Differential Revision: D30696658
Pulled By: NicolasHug
fbshipit-source-id: 6f01c7eb3cfcd7e17e4c33c09d193054fa18ad36
Summary:
This PR fixes the `help()` and `list()` torchhub functions, which were probably failing on Windows since the `/` OS separator was hardcoded.
Before merging this I need to double check whether the CI actually runs the corresponding tests on Windows or not
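To illustrate the kind of problem a hardcoded separator causes, here is a small sketch that builds a path with `os.path.join` instead. The helper name `hubconf_path` and the directory names are hypothetical; `ntpath` and `posixpath` are used only to show both platforms' behavior from a single machine.

```python
import ntpath
import os
import posixpath


def hubconf_path(repo_dir, filename="hubconf.py"):
    """Join with the platform's separator instead of a hardcoded "/"."""
    return os.path.join(repo_dir, filename)


# os.path is ntpath on Windows and posixpath elsewhere.
assert ntpath.join("C:\\cache\\repo", "hubconf.py") == "C:\\cache\\repo\\hubconf.py"
assert posixpath.join("/cache/repo", "hubconf.py") == "/cache/repo/hubconf.py"
```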
Pull Request resolved: https://github.com/pytorch/pytorch/pull/63773
Reviewed By: zou3519
Differential Revision: D30695664
Pulled By: NicolasHug
fbshipit-source-id: fac328163fd05db804a8186ae28f22b3cc3a6404
Summary:
This PR removes an outdated comment about Python2 that was originally introduced in https://github.com/pytorch/pytorch/pull/25083/files. The code has changed since then, but the comment wasn't removed.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/63757
Reviewed By: zou3519
Differential Revision: D30695656
Pulled By: NicolasHug
fbshipit-source-id: 431cf414588b9e5a1ad6acdae724ff5af1b16971
Summary:
This PR addresses an old comment about Python2 EOL by putting some parameters directly in the function signature instead of in a `**kwargs` dict.
I believe the changes are fully backward compatible.
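The general shape of such a change can be sketched as below. The function names `load_old` and `load_new` and the parameters shown are hypothetical stand-ins, not the actual `torch.hub` signatures.

```python
def load_old(repo, model, *args, **kwargs):
    # Python 2 compatible style: options are popped out of **kwargs by hand.
    force_reload = kwargs.pop("force_reload", False)
    verbose = kwargs.pop("verbose", True)
    return repo, model, args, force_reload, verbose, kwargs


def load_new(repo, model, *args, force_reload=False, verbose=True, **kwargs):
    # Python 3 style: keyword-only parameters appear in the signature.
    return repo, model, args, force_reload, verbose, kwargs


call_args = ("owner/repo", "resnet", 1)
call_kwargs = {"force_reload": True, "pretrained": False}
# Existing call sites behave the same way under both definitions.
assert load_old(*call_args, **call_kwargs) == load_new(*call_args, **call_kwargs)
```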
Pull Request resolved: https://github.com/pytorch/pytorch/pull/63755
Reviewed By: zou3519
Differential Revision: D30695634
Pulled By: NicolasHug
fbshipit-source-id: 398f347c5a04bfb58e77e46773a869cb9d0eb225
Summary:
Increase `page_idx` in the loop rather than outside of it
Break from the loop when an empty response is received, as it means there are no more items to fetch via the pagination request
Also, add an option to use a provided GitHub token (via the `GITHUB_TOKEN` environment variable)
Fixes failure with "Rate Limit Exceeded" when doing something like `torch.hub.list("pytorch/test-infra:dsf")`
Fixes https://github.com/pytorch/pytorch/issues/61755
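The loop structure described above can be sketched as follows. This is an illustration only: `fetch_all`, `auth_headers`, and the injected `fetch_page` callable are hypothetical, and no real HTTP request is made.

```python
import os


def auth_headers(environ=os.environ):
    """Build request headers, adding a token only if GITHUB_TOKEN is set."""
    headers = {"Accept": "application/vnd.github.v3+json"}
    token = environ.get("GITHUB_TOKEN")
    if token:
        headers["Authorization"] = f"token {token}"
    return headers


def fetch_all(fetch_page):
    """Collect items from every page; `fetch_page(page_idx)` returns a list."""
    items = []
    page_idx = 0
    while True:
        page_idx += 1  # incremented inside the loop, once per request
        page = fetch_page(page_idx)
        if not page:
            # An empty response means there is nothing left to fetch.
            break
        items.extend(page)
    return items


fake_pages = {1: ["main", "dev"], 2: ["release"]}
assert fetch_all(lambda idx: fake_pages.get(idx, [])) == ["main", "dev", "release"]
```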
Pull Request resolved: https://github.com/pytorch/pytorch/pull/62072
Reviewed By: jbschlosser
Differential Revision: D29868539
Pulled By: malfet
fbshipit-source-id: 206082a0ba1208e9b15ff6c9c6cb71d2da74f1c3
Summary:
We should iterate all pages of the branches API. Otherwise, even using "pytorch/vision" would fail to find master.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/56138
Reviewed By: heitorschueroff
Differential Revision: D27872346
Pulled By: ailzhang
fbshipit-source-id: 55881558f7980b1fb08b0d08ed6687a38df06edd
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/56048
This reverts commit c411017a41988e9c5184279c1ec7dd7ef4e1a6fe.
This implementation broke CI in pytorch/vision and it's not handling
tags properly. So I want to revert it first to unblock vision CI and
send out a proper fix later.
Test Plan: Imported from OSS
Reviewed By: gchanan
Differential Revision: D27771701
Pulled By: ailzhang
fbshipit-source-id: 932f9be72a1ae1816f4032643b3c2dde0cb7ae4c
Summary:
I think these can be safely removed since the min version of supported Python is now 3.6
Pull Request resolved: https://github.com/pytorch/pytorch/pull/47822
Reviewed By: smessmer
Differential Revision: D24954936
Pulled By: ezyang
fbshipit-source-id: 5d4b2aeb78fc97d7ee4abaf5fb2aae21bf765e8b
Summary:
Fixes https://github.com/pytorch/pytorch/issues/43622
- Moves the model loading part of `torch.hub.load()` into a new `torch.hub.load_local()` function that takes in a path to a local directory that contains a `hubconf.py` instead of a repo name.
- Refactors `torch.hub.load()` so that it now calls `torch.hub.load_local()` after downloading and extracting the repo.
- Updates `torch.hub` docs to include the new function + minor fixes.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/44204
Reviewed By: malfet
Differential Revision: D23817429
Pulled By: ailzhang
fbshipit-source-id: 788fd83c87a94f487b558715b2809d346ead02b2