34f681c13b
[CI] Remove inductor skip list for timm_models ( #98840 )
...
Summary: check against the expected csv file instead of skipping tests
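As a rough illustration of the approach (a hedged sketch with hypothetical file and column names, not the actual CI code), the check loads an expected-results CSV and compares each model's actual status against it instead of skipping the model outright:
```python
import csv

def load_expected(csv_path):
    # Map model name -> expected accuracy status (e.g. "pass", "fail_accuracy").
    with open(csv_path, newline="") as f:
        return {row["name"]: row["accuracy"] for row in csv.DictReader(f)}

def check_against_expected(actual_results, csv_path="expected_timm_accuracy.csv"):
    # Flag any model whose actual status diverges from the recorded expectation.
    expected = load_expected(csv_path)
    mismatches = {
        name: (expected.get(name), status)
        for name, status in actual_results.items()
        if expected.get(name) != status
    }
    if mismatches:
        raise AssertionError(f"Results diverge from expected CSV: {mismatches}")
```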
Pull Request resolved: https://github.com/pytorch/pytorch/pull/98840
Approved by: https://github.com/ezyang
2023-04-15 13:54:41 +00:00
c4de7fdef5
[CI] Mark sebotnet33ts_256 as nondeterministic ( #98356 )
...
Summary: The goal is to make sure the new dashboard doesn't give noisy alerts on this test.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/98356
Approved by: https://github.com/ezyang
2023-04-05 12:05:47 +00:00
d305d4a57f
[Dynamo] Fix TIMM benchmark compute_loss ( #97423 )
...
Fixes #97382
#95416 fixed a critical bug in the dynamo benchmarks: before that PR, AMP tests fell back to eager mode. However, after that PR, we found [a list of TIMM models that fail amp + eager + training testing](https://docs.google.com/spreadsheets/d/1DEhirVOkj15Lu4UNawIUon9MqkVLaWqyT-DQPif5NHk/edit#gid=0).
We have now identified the root cause: high loss values make gradient checking harder, because small changes in accumulation order upset the accuracy checks. We should switch to the helper function `reduce_to_scalar_loss`, which the TorchBench tests already use.
After switching to `reduce_to_scalar_loss`, the TIMM accuracy pass rate grows from 67.74% to 91.94% in my local test. The remaining 5 failing models (ese_vovnet19b_dw, fbnetc_100, mnasnet_100, mobilevit_s, sebotnet33ts_256) need further investigation and handling, but I suspect the reason is similar.
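For context, a minimal sketch of what such a reduction helper does (the actual `reduce_to_scalar_loss` in the benchmark suite handles more output types):
```python
import torch

def reduce_to_scalar_loss(out):
    # Collapse arbitrary model outputs to one scalar suitable for backward().
    if isinstance(out, torch.Tensor):
        # Averaging keeps the loss magnitude small, so gradient comparisons
        # are less sensitive to floating-point accumulation order.
        return out.float().mean()
    if isinstance(out, (list, tuple)):
        return sum(reduce_to_scalar_loss(x) for x in out) / len(out)
    if isinstance(out, dict):
        return sum(reduce_to_scalar_loss(v) for v in out.values()) / len(out)
    raise NotImplementedError(f"Cannot reduce {type(out)} to a scalar loss")
```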
Pull Request resolved: https://github.com/pytorch/pytorch/pull/97423
Approved by: https://github.com/Chillee
2023-03-24 16:50:28 +00:00
60a68477a6
Bump black version to 23.1.0 ( #96578 )
...
Pull Request resolved: https://github.com/pytorch/pytorch/pull/96578
Approved by: https://github.com/ezyang
2023-03-15 06:27:59 +00:00
02792ff16f
[CI] Make inductor-perf-test-nightly produce data for dashboard ( #95685 )
...
Pull Request resolved: https://github.com/pytorch/pytorch/pull/95685
Approved by: https://github.com/ezyang , https://github.com/huydhn
2023-03-06 03:14:03 +00:00
8d45f555d7
[BE] [1/3] Rewrite super() calls in caffe2 and benchmarks ( #94587 )
...
Rewrite Python built-in class `super()` calls. Only non-semantic changes should be applied.
- #94587
- #94588
- #94592
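For illustration (a hedged example, not lifted from the PR diff), the typical rewrite replaces the explicit two-argument form of `super()` with the zero-argument form:
```python
import torch.nn as nn

class MyModule(nn.Module):
    def __init__(self):
        # Before the rewrite: super(MyModule, self).__init__()
        # After the rewrite (zero-argument form, same behavior):
        super().__init__()
```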
Also, methods with only a `super()` call are removed:
```diff
 class MyModule(nn.Module):
-    def __init__(self):
-        super().__init__()
-
     def forward(self, ...):
         ...
```
Some cases that change the semantics should be kept unchanged. E.g.:
f152a79be9/caffe2/python/net_printer.py (L184-L190)
f152a79be9/test/test_jit_fuser_te.py (L2628-L2635)
Pull Request resolved: https://github.com/pytorch/pytorch/pull/94587
Approved by: https://github.com/ezyang
2023-02-11 18:19:48 +00:00
333e771394
Add benchmarks.py to run all benchmarks, add new file with all torchbench model names ( #94146 )
...
Pull Request resolved: https://github.com/pytorch/pytorch/pull/94146
Approved by: https://github.com/ezyang
2023-02-08 01:18:38 +00:00
498c6ed8d8
Add missing format string ( #93866 )
...
Signed-off-by: Edward Z. Yang <ezyang@meta.com >
Pull Request resolved: https://github.com/pytorch/pytorch/pull/93866
Approved by: https://github.com/albanD , https://github.com/Skylion007
2023-02-01 20:56:46 +00:00
f646126ecd
Running timm benchmarks no longer silently retries ( #93030 )
...
Pull Request resolved: https://github.com/pytorch/pytorch/pull/93030
Approved by: https://github.com/eellison
2023-01-26 03:44:38 +00:00
c52567ec18
Switch CI exclusions to use exact match. ( #92761 )
...
Since the CI exclusions are hard-coded in our script, we might as well require them to match exactly. This solved some head-scratching where I was like, "this model is not obviously excluded, why is it not showing up in CI?"
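A hedged sketch of the difference (list contents hypothetical): with exact matching, a model is excluded only when its full name appears in the list, rather than whenever a listed entry loosely matches it:
```python
# Hypothetical exclusion list; the real one is hard-coded in the CI script.
CI_SKIP = {"hf_T5_large", "tacotron2"}

def is_excluded(model_name):
    # Exact match: skip only when the full name is listed.
    return model_name in CI_SKIP

def is_excluded_loosely(model_name):
    # The looser behavior this change avoids: a listed entry that merely
    # appears inside the model name would also exclude it.
    return any(skip in model_name for skip in CI_SKIP)
```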
Signed-off-by: Edward Z. Yang <ezyang@meta.com >
Pull Request resolved: https://github.com/pytorch/pytorch/pull/92761
Approved by: https://github.com/jansel
2023-01-22 17:10:20 +00:00
0c1777acec
Dynamo benchmark: add CPU specific changes ( #88477 )
...
This PR adds some CPU-specific changes:
- Add support for IPEX backend
- https://github.com/pytorch/torchdynamo/issues/1618
- https://github.com/pytorch/torchdynamo/issues/1534
- Enable CPU launcher in runner.py.
- Fix the issue that some environment variables are not supported on CPU
Pull Request resolved: https://github.com/pytorch/pytorch/pull/88477
Approved by: https://github.com/jgong5 , https://github.com/jansel
2023-01-07 09:26:06 +00:00
84e73e1269
[inductor] small CI improvements ( #91140 )
...
Summary: 1) Increase the retry count for timm model downloads; 2) Skip certain random Triton failures.
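A hedged sketch of the retry idea (hypothetical helper, not the CI script itself):
```python
import time

def download_with_retries(download_fn, model_name, retries=5, delay=5.0):
    # Retry transient download failures a few more times before giving up.
    for attempt in range(1, retries + 1):
        try:
            return download_fn(model_name)
        except Exception as exc:  # e.g. network hiccups or rate limits
            if attempt == retries:
                raise
            print(f"Download of {model_name} failed ({exc}); retry {attempt}/{retries}")
            time.sleep(delay)
```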
Pull Request resolved: https://github.com/pytorch/pytorch/pull/91140
Approved by: https://github.com/williamwen42
2022-12-20 17:26:12 +00:00
7ebc45eadd
[dynamo] Better error message for bad timm model name ( #91049 )
...
Fixes https://github.com/pytorch/torchdynamo/issues/1995
Running `python benchmarks/dynamo/timm_models.py --performance --float32 -dcuda --output=out.csv --training --inductor --only bad_model_name` gives
```
Traceback (most recent call last):
File "benchmarks/dynamo/timm_models.py", line 338, in <module>
main(TimmRunnner())
File "/scratch/williamwen/work/pytorch/benchmarks/dynamo/common.py", line 1660, in main
return maybe_fresh_cache(run, args.cold_start_latency and args.only)(
File "/scratch/williamwen/work/pytorch/benchmarks/dynamo/common.py", line 833, in inner
return fn(*args, **kwargs)
File "/scratch/williamwen/work/pytorch/benchmarks/dynamo/common.py", line 2000, in run
) = runner.load_model(device, model_name, batch_size=batch_size)
File "benchmarks/dynamo/timm_models.py", line 215, in load_model
raise RuntimeError(f"Failed to load model '{model_name}'")
RuntimeError: Failed to load model 'bad_model_name'
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/91049
Approved by: https://github.com/ezyang
2022-12-19 22:37:34 +00:00
7c524221ba
[reland3][dynamo] Revert "Revert "[reland][dynamo] use optimizers correctly in benchmar… ( #90956 )
...
…king (#87492 )" (#90746 )"
This reverts commit ff1bbc2773a31ab839438966266ed8ee206cb8c5.
This should be okay to merge now. The flakiness of HF models will be fixed by seeding the rng (https://github.com/pytorch/pytorch/pull/90936 ), and the numeric mismatch was root-caused to three decomps (still investigating why those decomps cause this); see https://github.com/pytorch/torchdynamo/issues/1985 for more detail.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/90956
Approved by: https://github.com/desertfire
2022-12-17 06:27:15 +00:00
6bc6fb21db
Revert "[reland2][dynamo] Revert "Revert "[reland][dynamo] use optimizers correctly in benchmar… ( #90956 )"
...
This reverts commit 8bc38ae4e2037ae42813d552e5d412db77167bc0.
Reverted https://github.com/pytorch/pytorch/pull/90956 on behalf of https://github.com/desertfire due to Causing TIMM model failures
2022-12-16 19:28:05 +00:00
8bc38ae4e2
[reland2][dynamo] Revert "Revert "[reland][dynamo] use optimizers correctly in benchmar… ( #90956 )
...
…king (#87492 )" (#90746 )"
This reverts commit ff1bbc2773a31ab839438966266ed8ee206cb8c5.
This should be okay to merge now. The flakiness of HF models will be fixed by seeding the rng (https://github.com/pytorch/pytorch/pull/90936 ), and the numeric mismatch was root-caused to three decomps (still investigating why those decomps cause this); see https://github.com/pytorch/torchdynamo/issues/1985 for more detail.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/90956
Approved by: https://github.com/desertfire
2022-12-16 13:33:38 +00:00
57e2090e21
[Dynamo][TIMM][Benchmarks] Fix TIMM 0.8.0dev breaking the timm_models.py script's data config ( #90404 )
...
It seems `0.8.0dev` breaks the current argument passing: after 0dadb4a6e9, it expects a dictionary instead of a namespace.
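A hedged sketch of the fix, assuming timm's `resolve_data_config` helper is the call that changed (the exact call site in timm_models.py may differ): convert the argparse namespace to a plain dict before resolving the data config.
```python
import argparse

import timm
from timm.data import resolve_data_config

parser = argparse.ArgumentParser()
parser.add_argument("--img-size", type=int, default=None)
args = parser.parse_args([])

model = timm.create_model("resnet18", pretrained=False)
# Newer timm expects a dict of overrides rather than an argparse Namespace,
# so pass vars(args) instead of args.
data_config = resolve_data_config(vars(args), model=model)
```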
CC @desertfire @ngimel
Pull Request resolved: https://github.com/pytorch/pytorch/pull/90404
Approved by: https://github.com/ngimel
2022-12-15 22:21:19 +00:00
ff1bbc2773
Revert "[reland][dynamo] use optimizers correctly in benchmarking ( #87492 )" ( #90746 )
...
This reverts commit d91d7a322172da4d92672301f3cfa3344d544a9e.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/90746
Approved by: https://github.com/anijain2305
2022-12-13 11:37:16 +00:00
d91d7a3221
[reland][dynamo] use optimizers correctly in benchmarking ( #87492 )
...
Reland https://github.com/pytorch/pytorch/pull/87311
mlazos: updated to use SGD to not add a bunch of additional memory allocations (like Adam)
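A hedged sketch of the pattern (hypothetical scaffolding, not the benchmark harness itself): create the optimizer once, then zero gradients and step on every iteration; SGD keeps per-parameter state minimal compared to Adam.
```python
import torch

def run_training_iterations(model, example_inputs, n_iters=3):
    # SGD avoids the extra state tensors an optimizer like Adam allocates.
    optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
    for _ in range(n_iters):
        optimizer.zero_grad(set_to_none=True)
        out = model(*example_inputs)
        loss = out.sum() if isinstance(out, torch.Tensor) else sum(o.sum() for o in out)
        loss.backward()
        optimizer.step()
```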
Pull Request resolved: https://github.com/pytorch/pytorch/pull/87492
Approved by: https://github.com/desertfire
2022-12-09 20:32:53 +00:00
f7cdd3a7a0
[inductor] Use a large tolerance for botnet26t_256 ( #90383 )
...
Summary: botnet26t_256 shows random tolerance failures on CI. The root cause of this randomness is still to be investigated, but let's use a larger tolerance for now.
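A hedged sketch of per-model tolerance handling (names hypothetical; the benchmark suite keeps its own list):
```python
import torch

# Models whose accuracy comparison needs a looser tolerance, e.g. due to
# nondeterministic kernels or reduction-order sensitivity.
REQUIRE_HIGHER_TOLERANCE = {"botnet26t_256"}

def pick_tolerance(model_name, default=1e-3, relaxed=4e-2):
    return relaxed if model_name in REQUIRE_HIGHER_TOLERANCE else default

def outputs_match(model_name, eager_out, compiled_out):
    tol = pick_tolerance(model_name)
    return torch.allclose(eager_out, compiled_out, rtol=tol, atol=tol)
```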
Pull Request resolved: https://github.com/pytorch/pytorch/pull/90383
Approved by: https://github.com/ezyang
2022-12-07 19:35:06 +00:00
3162a48a77
[dynamo][benchmarks] Call zero grad ( #90026 )
...
Hoping that it might reduce some flakiness
Pull Request resolved: https://github.com/pytorch/pytorch/pull/90026
Approved by: https://github.com/williamwen42
2022-12-02 04:05:57 +00:00
68805b08d1
[benchmarks][dynamo] Trying CI - Set train() for TIMM models accuracy tests ( #89780 )
...
Moving to train mode for TIMM models and also raising the batch size for accuracy testing.
Raising the batch size seems to remove a lot of noise/instability coming from the batch_norm decomposition.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/89780
Approved by: https://github.com/ngimel
2022-11-30 12:57:35 +00:00
1b575782a0
[dynamo][benchmarks] use fresh inductor cache and raise batch size wherever possible ( #88044 )
...
cc @mlazos @soumith @voznesenskym @yanboliang @penguinwu @EikanWang @jgong5 @Guobing-Chen @chunyuan-w @XiaobingSuper @zhuhaozhe @blzheng @Xia-Weiwen @wenzhe-nrv @jiayisunx
Pull Request resolved: https://github.com/pytorch/pytorch/pull/88044
Approved by: https://github.com/ngimel
2022-10-30 17:10:17 +00:00
57b36bf353
Bring back TIMM model inductor CI test ( #87730 )
...
Summary: https://github.com/pytorch/pytorch/pull/87588 has solved the inductor compilation speed regression, so we can try to run TIMM models with fewer shards and also enable pretrained model downloading, which should resolve the flakiness we have seen previously.
cc @jansel @mlazos @soumith @voznesenskym @yanboliang @penguinwu @anijain2305
Pull Request resolved: https://github.com/pytorch/pytorch/pull/87730
Approved by: https://github.com/anijain2305
2022-10-26 00:15:35 +00:00
f047dadab9
Enable inductor CI for TIMM ( #87462 )
...
cc @jansel @lezcano @fdrocha @mlazos @soumith @voznesenskym @yanboliang
Pull Request resolved: https://github.com/pytorch/pytorch/pull/87462
Approved by: https://github.com/anijain2305
2022-10-22 05:50:00 +00:00
f38a88c4dd
Revert "[dynamo] use optimizers correctly in benchmarking ( #87311 )"
...
This reverts commit 703c19008df4700b6a522b0ae5c4b6d5ffc0906f.
Reverted https://github.com/pytorch/pytorch/pull/87311 on behalf of https://github.com/anijain2305 due to Bin (desertfire) is trying to get torchbench models in CI, and this PR prevents that. I will bring this back after models are in CI.
2022-10-20 22:01:51 +00:00
703c19008d
[dynamo] use optimizers correctly in benchmarking ( #87311 )
...
We were not setting up optimizers correctly
* This hid the issue that we see here - https://github.com/pytorch/torchdynamo/issues/1687
* This has also revealed that we are activating profilers for every dynamo-optimized model call. This could affect speedups
cc @jansel @lezcano @fdrocha
Pull Request resolved: https://github.com/pytorch/pytorch/pull/87311
Approved by: https://github.com/mlazos , https://github.com/yanboliang
2022-10-20 05:46:25 +00:00
c30cfb07ab
[dynamo][dashboard] Run 2 iterations for the correctness runs ( #87104 )
...
Pull Request resolved: https://github.com/pytorch/pytorch/pull/87104
Approved by: https://github.com/soumith
2022-10-18 15:53:40 +00:00
c7c09722ad
Move TorchDynamo into PyTorch core ( #86461 )
...
Context:
https://github.com/pytorch/torchdynamo/issues/1588
This PR moves [TorchDynamo](https://github.com/pytorch/torchdynamo ) and TorchInductor into PyTorch core.
- `torchdynamo` becomes `torch._dynamo`
- `torchinductor` becomes `torch._inductor`
This PR was generated by running `copy_to_core.sh` in https://github.com/pytorch/torchdynamo/pull/1538
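A hedged sketch of what the rename means for code that imported the standalone packages:
```python
# Before the move (standalone repositories):
#   import torchdynamo
#   compiled_fn = torchdynamo.optimize("inductor")(fn)
# After the move into PyTorch core:
import torch._dynamo

compiled_fn = torch._dynamo.optimize("inductor")(lambda x: x * 2)
```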
Pull Request resolved: https://github.com/pytorch/pytorch/pull/86461
Approved by: https://github.com/voznesenskym
2022-10-13 23:18:06 +00:00