Summary:
Currently, the `pin_memory_batch` function in the dataloader returns a batch of any unrecognized type as-is, without pinning its data, because it doesn't know how to pin it.
This behavior was preventing us from overlapping data prefetching in Mask-RCNN, whose custom `collate_fn` returns a custom batch type.
The present PR adds the ability for the user to pass a `pin_fn` alongside any custom `collate_fn` to handle such custom types.
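A minimal sketch of the dispatch this summary describes (not the actual implementation; the recursion over containers and the `pin_fn` plumbing are simplified for illustration):
```python
import torch

def pin_memory_batch(batch, pin_fn=None):
    # Pin the types we recognize, recursing into standard containers.
    if isinstance(batch, torch.Tensor):
        return batch.pin_memory()
    if isinstance(batch, (list, tuple)):
        return type(batch)(pin_memory_batch(b, pin_fn) for b in batch)
    # Delegate custom batch types (e.g. Mask-RCNN's) to the user's pin_fn.
    if pin_fn is not None:
        return pin_fn(batch)
    # Unrecognized type and no pin_fn: returned unpinned, as before.
    return batch
```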
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14171
Differential Revision: D13166669
Pulled By: soumith
fbshipit-source-id: ca965f9841d4a259b3ca4413c8bd0d8743d433ab
Summary:
This should resolve the issue on ppc64le of getting FAIL: test_proper_exit (__main__.TestDataLoader). This only happens when the CI build machine is very busy, causing the test to time out.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14107
Differential Revision: D13103859
Pulled By: soumith
fbshipit-source-id: 268be80b59840853c5025f3211af272f68608fe5
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/12794
common.py is used in base_module for almost all tests in test/. The
name of this file is so common that it can easily conflict with other dependencies
if they happen to have another common.py in the base module. Rename the file to
avoid the conflict.
Reviewed By: orionr
Differential Revision: D10438204
fbshipit-source-id: 6a996c14980722330be0a9fd3a54c20af4b3d380
Summary:
test_proper_exit in the dataloader test bucket includes
(as its docstring) a reassuring message about complaints that
may appear during the test. The message is displayed
when the tests are run in verbose mode.
But the docstring includes a line break, and the unittest
framework only prints the first line of the docstring (see
shortDescription()). As a result, the second (more reassuring)
half of the message is not displayed.
Concatenate the docstring onto a single line so all of it is visible.
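A small self-contained demonstration of the unittest behavior at play (the test class here is illustrative, not the actual test bucket):
```python
import unittest

# shortDescription() returns only the first line of the docstring, so any
# reassuring text after a line break never reaches verbose output.
class Demo(unittest.TestCase):
    def test_example(self):
        """First line of the message.
        Second, more reassuring line that verbose mode never shows."""

print(Demo('test_example').shortDescription())
# -> 'First line of the message.'
```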
Pull Request resolved: https://github.com/pytorch/pytorch/pull/12612
Differential Revision: D10368786
Pulled By: ezyang
fbshipit-source-id: 14b259a6d6a3491d4290148eae56e6ab06f2a9b6
Summary:
* switches docker files over to white rabbit release - removed custom package installs
* skips five tests that regressed in that release
* fixes some case-sensitivity issues in ROCm supplied cmake files by sed'ing them in the docker
* includes first changes to the infrastructure to support the upcoming hip-clang compiler
* prints ROCm library versions as part of the build (as discussed w/ ezyang)
* explicitly searches for miopengemm
* installs the new hip-thrust package to be able to remove the explicit Thrust checkout in a future revision
Pull Request resolved: https://github.com/pytorch/pytorch/pull/12577
Differential Revision: D10350165
Pulled By: bddppq
fbshipit-source-id: 60f9c9caf04a48cfa90f4c37e242d944a175ab31
Summary:
Modifies the DistributedSampler logic. Now each process samples elements at
a fixed interval, instead of taking a consecutive section.
This eliminates the possibility of the DataLoader using padded data while
dropping real data. That happens when:
1. DistributedSampler pads the data; and
2. DataLoader's drop_last is effectively true, and it drops fewer elements than
the number of padded ones.
In the example below, data points (10, 11, 12) are padded by
duplicating data samples (1, 2, 3).
The old sampler drops legitimate original data (3, 6, 9) and introduces duplicates
(10, 11) into the training set, while the new sampler logic samples correct data
points from the dataset.
This example has been added to the DataLoader unit tests.
example:
```
data after shuffle:  1, 2, 3, 4, 5, 6, 7, 8, 9
padded data:         1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12

old sampler -> DataLoader with (batch_size=2 and drop_last=True)
p1:  1,  2,  3  ->  1,  2
p2:  4,  5,  6  ->  4,  5
p3:  7,  8,  9  ->  7,  8
p4: 10, 11, 12  -> 10, 11

new sampler ->
p1:  1,  5,  9  ->  1,  5
p2:  2,  6, 10  ->  2,  6
p3:  3,  7, 11  ->  3,  7
p4:  4,  8, 12  ->  4,  8
```
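A simplified sketch of the two assignment schemes (using 0-indexed `rank` and `num_replicas` as in `torch.utils.data.distributed.DistributedSampler`; shuffling omitted):
```python
def assigned_indices(indices, rank, num_replicas):
    num_samples = (len(indices) + num_replicas - 1) // num_replicas
    total_size = num_samples * num_replicas
    indices = indices + indices[:total_size - len(indices)]  # pad by duplication
    # old: a consecutive section per process
    #   return indices[rank * num_samples:(rank + 1) * num_samples]
    # new: strided sampling, so padded duplicates all land at the tail positions
    return indices[rank:total_size:num_replicas]
```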
Pull Request resolved: https://github.com/pytorch/pytorch/pull/12474
Differential Revision: D10260410
Pulled By: SsnL
fbshipit-source-id: 710856571260f42ce25955b81a5b8008e04938cf
Summary:
* first integration of MIOpen for batch norm and conv on ROCm
* workaround a ROCm compiler bug exposed by elementwise_kernel through explicit capture of variables in the densest packing
* workaround a ROCm compiler bug exposed by having `extern "C" __host__` as a definition and just `__host__` in the implementation through the hipify script
* use fabs() in accordance with C++11 for double absolute, not ::abs() which is integer-only on ROCm
* enable test_sparse set on CI, skip tests that don't work currently on ROCm
* enable more tests in test_optim after the elementwise_kernel bug got fixed
* enable more tests in test_dataloader
* improvements to hipification and ROCm build
With this, resnet18 on CIFAR data trains without hang or crash in our tests.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/10612
Reviewed By: bddppq
Differential Revision: D9423872
Pulled By: ezyang
fbshipit-source-id: 22c0c985217d65c593f35762b3eb16969ad96bdd
Summary:
* some small leftovers from the last PR review
* enable more unit test sets for CI
* replace use of hcRNG w/ rocRAND (docker image was already updated w/ newer rocRAND)
* use rocBLAS instead of hipBLAS to allow convergence w/ Caffe2
* use the strided_batched gemm interface also for the internal batched gemm calls
* re-enable Dropout.cu as we now have philox w/ rocRAND
Pull Request resolved: https://github.com/pytorch/pytorch/pull/10406
Reviewed By: Jorghi12
Differential Revision: D9277093
Pulled By: ezyang
fbshipit-source-id: 7ef2f6fe4ead77e501ed7aea5c3743afe2466ca2
Summary:
In this changeset:
* improvements to `hipify-python.py`
* marking unit tests broken for ROCm
* reducing the number of jobs for the build to avoid out-of-memory issues
* switch to Thrust/cub-hip master for the CI
Pull Request resolved: https://github.com/pytorch/pytorch/pull/9653
Differential Revision: D9117791
Pulled By: ezyang
fbshipit-source-id: a6c3c7b81f2bda9825974bf9bf89a97767244352
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/9497
Fixes #7883 by using `rfft`.
It's worth noting that this is BC-breaking, and the change is impossible to detect because the two signatures before and after this change support a common subset of calling patterns, e.g., `stft(Tensor, int, int)` (some other calling patterns will raise an error).
soumith and I plan to change the current `stft` interface because it is a bit messy and non-standard. rafaelvalle suggested that `librosa` is a good reference API to align with. After discussing with soumith and ezyang, and given that `stft` has only been out for one release, I decided to go with directly changing the signature. Also, my understanding is that most researchers in this field will welcome this change, as `librosa` seems to be the gold standard here (it doesn't yet support all `pad_mode` options, but those will become available if added to `F.pad`).
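For illustration, a call in the librosa-style shape this PR moves toward (argument names follow today's `torch.stft`; `return_complex` is a later addition included here only so the snippet runs on current releases):
```python
import torch

signal = torch.randn(4000)
spec = torch.stft(signal, n_fft=400, hop_length=160,
                  window=torch.hann_window(400), return_complex=True)
print(spec.shape)  # (n_fft // 2 + 1, num_frames): onesided output via rfft
```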
Pull Request resolved: https://github.com/pytorch/pytorch/pull/9308
Reviewed By: ezyang
Differential Revision: D8806148
Pulled By: SsnL
fbshipit-source-id: f6e8777d0c34d4a4d7024e638dc9c63242e8bb58
Summary:
This will resolve some of the timeout issues in CPU and GPU tests internally.
Closes https://github.com/pytorch/pytorch/pull/9061
Reviewed By: ezyang
Differential Revision: D8707471
Pulled By: yf225
fbshipit-source-id: 9dc82a2c9da0c540ae015442f74b9b2b1a67a246
Reopening #6606 with a fix for the TEST_CUDA import issue on Windows and an improvement to how we wait for manager exit in test_manager_unclean_exit. Loop-tested on the Windows CI multiple times to make sure this actually fixes the CUDA OOM issue.
* Terminate dataloader workers properly when parent process is SIGKILL'ed
* Wait for worker processes to finish before shutting down manager process
* Add test for checking proper worker exit
* cosmetic change
* Test only if CUDA exists
* Don't call multiprocessing.set_start_method() in Python 2
* import TEST_CUDA only when we are in __main__
* Tune JOIN_TIMEOUT
* handle os.getppid() == 0 case
* Reset to original JOIN_TIMEOUT
* Use WaitForSingleObject() to check parent process status on Windows
* Fix TEST_CUDA import
* clean up
* Check main process only when index_queue.get() times out
* Change index_queues to multiprocessing.Queue
* Move manager checking logic to watchdog class (see the sketch after this list)
* Fix bugs in dataloader
* Fix TEST_CUDA import issue
* Don't import TEST_CUDA from common_nn
* Use event to signal manager exit in test
* fix lint
* Add comments
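A hypothetical sketch of the watchdog idea from the list above (class and method names are illustrative, not the PR's exact code):
```python
import os

class ManagerWatchdog(object):
    def __init__(self):
        self.manager_pid = os.getppid()

    def is_alive(self):
        # Once the parent dies the worker is re-parented and getppid()
        # changes, so the worker can stop looping instead of blocking
        # forever on index_queue.get().
        return os.getppid() == self.manager_pid
```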
While keeping compatibility, enable TensorDataset to take any number of tensors (see the usage example after the list below).
* Enable TensorDataset to get any number of tensors
* Update dataset.py: fix syntax error on Python 2.7
* Add several tests for TensorDataset
* Fix whitespaces
* Simplify args
* Update dataset.py
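A short usage example of the variadic interface (tensor names are illustrative):
```python
import torch
from torch.utils.data import TensorDataset

images = torch.randn(10, 3, 4, 4)
masks = torch.randn(10, 4, 4)
labels = torch.randint(0, 2, (10,))
dataset = TensorDataset(images, masks, labels)  # any number of tensors
img, mask, label = dataset[0]  # all tensors indexed in lockstep
```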
Added an ind_worker_queue parameter to data.DataLoader. It makes preprocessing deterministic.
DataLoader in multiprocessing mode may cause a non-determinism issue: even if the random seed is frozen, each subprocess may get its tasks in an unstable order, because I/O times vary while data loads. If you use augmentation while loading data, this makes results unreproducible. See https://discuss.pytorch.org/t/deterministic-non-deterministic-results-with-pytorch/9087
To fix this issue I have added an individual queue for each worker, so that each worker gets its tasks in a stable order. In summary, each subprocess produces stable results.
To reproduce the issue you may change ind_worker_queue to False and run the script several times.
Code to reproduce the issue is in the corresponding PR.
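A hypothetical sketch of the per-worker queue idea (names and the round-robin dealing are illustrative, not the PR's exact code):
```python
import multiprocessing as mp

# Instead of one shared index queue, where a busy worker can pick up tasks
# out of order, deal batch indices round-robin into one queue per worker so
# every worker receives its tasks in a stable, reproducible order.
num_workers = 4
index_queues = [mp.Queue() for _ in range(num_workers)]
batches = [[0, 1], [2, 3], [4, 5], [6, 7], [8, 9]]
for i, batch_indices in enumerate(batches):
    index_queues[i % num_workers].put(batch_indices)
```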
* TestIndividualWorkerQueue added to DataLoader tests
* Review fixes
* "Simplify" code by removing itertools
* Rebase conflicts fix
* Review fixes
* Fixed shutdown behavior
* Removed ind_worker_queue flag.
* Rebase on master
* Disable tests that use DataLoader with multiple workers (#5322)
* Add default PyTorch seeding and worker_init_fn to DataLoader
* generate seed using current RNG each time
* worker_seed <- main_proc_RNG_generated_seed + worker_id (see the sketch below)
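A sketch of how the scheme plays out inside a worker: torch itself is seeded with base_seed + worker_id, and a user's worker_init_fn can propagate that seed to other libraries (the use of Python's random module here is just an example):
```python
import random
import torch

def worker_init_fn(worker_id):
    # Inside a worker, torch.initial_seed() returns the per-worker seed
    # (main_proc_RNG_generated_seed + worker_id) described above.
    random.seed(torch.initial_seed() % 2**32)
```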
Here's the command I used to invoke autopep8 (in parallel!):
```
git ls-files | grep '\.py$' | xargs -n1 -P`nproc` autopep8 -i
```
Several rules are ignored in setup.cfg. The goal is to let autopep8
handle everything which it can handle safely, and to disable any rules
which are tricky or controversial to address. We may want to come back
and re-enable some of these rules later, but I'm trying to make this
patch as safe as possible.
Also configures flake8 to match pep8's behavior.
Also configures TravisCI to check the whole project for lint.
* Fix error in ELU backward
* Add --seed flag for tests
* Add test for BatchNorm eval
* Fix autograd.backward docs
* Support cc flags in cuDNN search
* Fix IndexSelect backward formula