Summary:
Recent versions of GCC split unaligned load and store intrinsics into
two 128-bit instructions. On older processors (Sandy Bridge) this was a
bit faster for unaligned data, but a bit slower for aligned data. On newer
processors (Intel Haswell+, recent AMD) splitting loads is slower for
both aligned and unaligned data.
Clang, MSVC, and ICC do not split unaligned load and store intrinsics.
There's a good explanation here:
https://stackoverflow.com/questions/52626726/why-doesnt-gcc-resolve-mm256-loadu-pd-as-single-vmovupd#tab-top
Splitting load and store intrinsics makes no sense in our AVX2
configuration because the CPUs that support AVX2 instructions are the
same CPUs where splitting is disadvantageous regardless of data alignment.
Note that this doesn't change the AVX configuration (used by CPUs that
support AVX but not AVX2). It's possible this change would be beneficial for
that configuration too (our data is usually 32-byte aligned), but I'd
prefer the conservative change for now.
torch.add generated assembly (hot loop, GCC 7.3.0):
before:
https://gist.github.com/colesbury/066376537bccd514daf8fe4ab54d8295
after:
https://gist.github.com/colesbury/8b4b948145001d44b225c51d2428bb91
Timing of `torch.add(x, y, out=z)` for size 10240 (1 thread, Broadwell,
no turbo):
before: 7.35 us
after: 6.39 us
(Take the torch.add timings with a grain of salt. The difference in timings
is much larger than I would expect.)
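For reference, a minimal sketch of how a timing like this could be reproduced; the thread pinning, warmup, and iteration counts here are assumptions for illustration, not the harness used for the numbers above.
```python
# Rough micro-benchmark sketch for torch.add(x, y, out=z) at size 10240.
# Warmup and iteration counts are assumptions, not the original setup.
import timeit
import torch

torch.set_num_threads(1)
x = torch.randn(10240)
y = torch.randn(10240)
z = torch.empty(10240)

# Warm up so one-time dispatch overhead doesn't skew the measurement.
for _ in range(1000):
    torch.add(x, y, out=z)

n = 100000
elapsed = timeit.timeit(lambda: torch.add(x, y, out=z), number=n)
print(f"{elapsed / n * 1e6:.2f} us per call")
```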
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20609
Differential Revision: D15385800
Pulled By: colesbury
fbshipit-source-id: 66415b148a3b19360b9de9881af594ab46547b6f
Summary:
Change `Inputs` to `Shape` to unify the docstring format of the CTCLoss class, and add the type of `Output` to `Shape`.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20422
Differential Revision: D15393484
Pulled By: ezyang
fbshipit-source-id: 5b49647f9740de77db49a566fa2de74fcecd9110
Summary:
CUDA 8 is no longer supported and has been removed from CI, so these checks are no longer relevant.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20482
Differential Revision: D15393438
Pulled By: ezyang
fbshipit-source-id: ac0979bf660b3314eec502c745e34ce4940bda0e
Summary:
Fixes #20568.
It looks like CMake passes `/MD` when we call `add_library`. We need to apply the same fix for C source files too.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20574
Differential Revision: D15392682
Pulled By: ezyang
fbshipit-source-id: c92034d8725fcec48fd7db6cf5322868e956dc6b
Summary:
Fixes #20523.
nn.Upsample was unable to accept tuple inputs for the scale_factor argument because of the direct cast to float introduced in #17732.
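For illustration, a minimal sketch of the restored behavior (the module configuration and sizes below are made up):
```python
# Passing a per-dimension tuple as scale_factor; sizes are arbitrary.
import torch
import torch.nn as nn

up = nn.Upsample(scale_factor=(2, 3), mode='nearest')
x = torch.randn(1, 4, 8, 8)   # NCHW input
y = up(x)
print(y.shape)                # torch.Size([1, 4, 16, 24])
```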
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20581
Differential Revision: D15392622
Pulled By: ezyang
fbshipit-source-id: b56ba8197a5bbf8891bc7e1bebf5cad63dcab04d
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20441
This op is fairly complex and the fact that it isn't formatted
correctly makes things that much harder to reason about. Clean it up.
Reviewed By: dreiss
Differential Revision: D15220006
fbshipit-source-id: 30632d8bdbf15f96e73d8b6c96c5f29c052e6e7c
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20502
Following D15307410, remove more floating point exceptions in unit tests.
Reviewed By: hx89
Differential Revision: D15340930
fbshipit-source-id: 269fc75e0800bc9d39126767a0f3ca15cd8b0cad
Summary:
(Reopens https://github.com/pytorch/pytorch/pull/20330 and fixes test error.)
After the Variable/Tensor merge, there is no guarantee that `indices` and `values` passed into the sparse tensor constructor don't contain AutogradMeta. However, we want to maintain the existing invariant that `indices_` and `values_` of a sparse tensor don't contain AutogradMeta, and to achieve this we need to do a shallow copy in the sparse tensor constructor.
Note that this is BC-breaking for code that changes the sizes / strides of the indices or values tensor after it's used to create a sparse tensor. In current master, such changes will be reflected in the sparse tensor and break sparse tensor invariants. After this PR, those changes will not be reflected in the sparse tensor, and thus the sparse tensor invariants are always preserved. Specifically, running in-place size/stride-changing ops such as `resize_` / `resize_as_` / `as_strided_` / `set_` / `transpose_` on the original values tensor will not update the sparse tensor's `values_`. For example:
```python
# Calling resize_ on non-requires-grad value tensor
i2 = torch.zeros([1, 1])
v2 = torch.ones([1, 2, 3])
t2 = torch.sparse_coo_tensor(i2, v2, torch.Size([2, 2, 3]))
v2.resize_(4, 5)
t2.coalesce().values().size()
# On current master, this throws "indices and values must have same nnz, but got nnz from indices: 1, nnz from values: 4", because resizing the original value tensor affects `values_` of the sparse tensor.
# After this PR, this prints "torch.Size([1, 2, 3])", which means resizing the original value tensor doesn't affect `values_` of the sparse tensor.
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20614
Differential Revision: D15385811
Pulled By: yf225
fbshipit-source-id: e963fcf5e4097f8c881b56145f408565d97cf5c1
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19932
In preparation for adding the int8_t data type for QTensor.
Reviewed By: zafartahirov
Differential Revision: D15137838
fbshipit-source-id: 59462c36d6fc5982986d4196bf3f32f49bb294d7
Summary:
After the Variable/Tensor merge, there is no guarantee that `indices` and `values` passed into the sparse tensor constructor don't contain AutogradMeta. However, we want to maintain the existing invariant that `indices_` and `values_` of a sparse tensor don't contain AutogradMeta, and to achieve this we need to do a shallow copy in the sparse tensor constructor.
Note that this is BC-breaking for code that changes the sizes / strides of the indices or values tensor after it's used to create a sparse tensor. In current master, such changes will be reflected in the sparse tensor and break sparse tensor invariants. After this PR, those changes will not be reflected in the sparse tensor, and thus the sparse tensor invariants are always preserved. Specifically, running in-place size/stride-changing ops such as `resize_` / `resize_as_` / `as_strided_` / `set_` / `transpose_` on the original values tensor will not update the sparse tensor's `values_`. For example:
```python
# Calling resize_ on non-requires-grad value tensor
i2 = torch.zeros([1, 1])
v2 = torch.ones([1, 2, 3])
t2 = torch.sparse_coo_tensor(i2, v2, torch.Size([2, 2, 3]))
v2.resize_(4, 5)
t2.coalesce().values().size()
# On current master, this throws "indices and values must have same nnz, but got nnz from indices: 1, nnz from values: 4", because resizing the original value tensor affects `values_` of the sparse tensor.
# After this PR, this prints "torch.Size([1, 2, 3])", which means resizing the original value tensor doesn't affect `values_` of the sparse tensor.
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20330
Differential Revision: D15373683
Pulled By: yf225
fbshipit-source-id: 32e7275d7121e17937c7cc258e8a60bb0848ff25
Summary:
Currently `bmm()` has very heavy performance overhead on CPU due to the construction/destruction of `TensorImpl`. Applying `TensorAccessor` when indexing tensor data can greatly improve the performance.
I tested this on the `fairseq` Transformer model. Results on a Xeon 6148 (20*2 cores, 2.5 GHz) indicate this PR improves Transformer training performance by approximately **10%** (seconds per iteration reduced from **3.60** to **3.21**). Given that `bmm()` takes only **14%** of the total time, a 10% overall improvement indicates `bmm()` itself improves by roughly **3x**.
Before:
```
| epoch 001: 0%| | 43/25337 [02:34<25:17:11, 3.60s/it, loss=16.179, nll_loss=16.137, ppl=72045.59, wps=1320, ups=0, wpb=4758.767, bsz=136.558, num_updates=43, lr=6.45e-06, gnorm=6.88
```
After:
```
| epoch 001: 0%| | 23/25337 [01:13<22:32:48, 3.21s/it, loss=17.072, nll_loss=17.068, ppl=137419.42, wps=1478, ups=0, wpb=4746.870, bsz=128.348, num_updates=23, lr=3.45e-06, gnorm=10.
```
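A simple way to gauge the `bmm()` change in isolation is a CPU micro-benchmark along these lines; the shapes and iteration count are arbitrary assumptions (chosen so per-batch-element overhead is visible), not the Transformer workload measured above.
```python
# CPU micro-benchmark sketch for bmm(); many small per-batch matrices so
# that per-element indexing overhead (rather than raw FLOPs) dominates.
import timeit
import torch

torch.set_num_threads(1)
a = torch.randn(512, 16, 16)
b = torch.randn(512, 16, 16)

n = 1000
elapsed = timeit.timeit(lambda: torch.bmm(a, b), number=n)
print(f"{elapsed / n * 1e3:.3f} ms per bmm call")
```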
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20266
Differential Revision: D15262201
Pulled By: cpuhrsch
fbshipit-source-id: c2e4e406c06714b04cc7534f3da71e986eddca35
Summary:
In the ONNX spec, the only supported input/output type for `And` and `Or` is `Bool`.
Thus, during export, casts to/from `Bool` are inserted for the inputs/outputs.
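As a hypothetical sketch (module name and shapes made up), an export that hits the ONNX `And` op with non-`Bool` inputs would look roughly like this; the exporter is expected to wrap the op in the inserted casts.
```python
# Hypothetical export sketch: `&` on uint8 inputs maps to ONNX `And`,
# so the exporter inserts casts to/from Bool around it.
import torch

class AndModule(torch.nn.Module):
    def forward(self, x, y):
        return x & y

x = torch.ones(2, 3, dtype=torch.uint8)
y = torch.zeros(2, 3, dtype=torch.uint8)
torch.onnx.export(AndModule(), (x, y), "and.onnx")
```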
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17894
Reviewed By: zrphercule
Differential Revision: D15103148
Pulled By: houseroad
fbshipit-source-id: 3e1068ea236c743260d42882fb11f0e3a21707e6
Summary:
The first time this was merged it broke master and was reverted. This time I do not add `set -u` to the .circleci/scripts/setup* scripts. There's still a chance that `set -u` breaks the binary builds on master, but at least those can be fixed in parallel and won't completely eliminate signal from all merges.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20540
Differential Revision: D15373444
Pulled By: pjh5
fbshipit-source-id: 0203c20865827366ecd8fa07b2db74d255549ed1
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20501
Fixing unit tests related to optimizer operators.
Reviewed By: hx89
Differential Revision: D15307410
fbshipit-source-id: e5400c26e08f26191ee542fe6b02e0a69bc4e1ae
Summary:
#19975 was split into 2 PRs.
This one:
Introduces the MemoryFormat argument to the `x.is_contiguous(memory_format=torch.channels_last)` and `y = x.contiguous(memory_format=torch.channels_last)` functions.
At this moment both functions just operate on strides and don't store any tensor state.
(Original RFC #19092)
-----
Expands the functionality of the two tensor functions `.is_contiguous` and `.contiguous` (both the Python and C++ APIs); see the usage sketch after this list.
Note: We had several complaints about the `.to(memory_format)` function, and decided not to support it.
1. `.contiguous` now supports an optional keyword-only argument, `memory_format`, which can be either `torch.contiguous_format` or `torch.channels_last`.
- Using `torch.contiguous_format` preserves the existing `.contiguous()` behavior.
- Calling `x.contiguous(memory_format=torch.channels_last)` returns a new tensor which maintains the same semantic layout (NCHW) but has a different memory allocation pattern.
`x.contiguous(memory_format=torch.channels_last)` expects the input tensor to be 3d, 4d, or 5d, and fails otherwise.
2. `.is_contiguous` now supports an optional keyword-only argument, `memory_format`, which can be either `torch.contiguous_format` or `torch.channels_last`.
- `x.is_contiguous(memory_format=torch.contiguous_format)` provides the same functionality as `x.is_contiguous()` and remains unchanged.
- `x.is_contiguous(memory_format=torch.channels_last)` returns true if A) the input tensor is contiguous in memory AND B) it is allocated in memory in NHWC (or similar for 3d/5d) format.
Note: By the end of phase one, `x.is_contiguous(memory_format=torch.channels_last)` will calculate the state of the tensor on every call. This functionality is going to be updated later.
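A short usage sketch of the extended API (tensor sizes are arbitrary):
```python
# Usage sketch for the memory_format argument; sizes are arbitrary.
import torch

x = torch.randn(2, 3, 4, 5)   # NCHW, default (contiguous) layout
print(x.is_contiguous())                                   # True
print(x.is_contiguous(memory_format=torch.channels_last))  # False

y = x.contiguous(memory_format=torch.channels_last)
print(y.shape)     # same semantic layout: torch.Size([2, 3, 4, 5])
print(y.stride())  # channels-last allocation, e.g. (60, 1, 15, 3)
print(y.is_contiguous(memory_format=torch.channels_last))  # True
```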
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20455
Differential Revision: D15341577
Pulled By: VitalyFedyunin
fbshipit-source-id: bbb6b4159a8a49149110ad321109a3742383185d
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20513
They've been using an old API; switch them to the new one instead.
Reviewed By: li-roy
Differential Revision: D15346349
fbshipit-source-id: 538eb460897ec6addebeebf88b316eb0d6b1dd6f
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20379
The legacy custom op API allowed nesting of std::unordered_map and std::vector. While we haven't yet figured out how to do that with the new API,
we at least have to keep backwards compatibility. This diff adds the feature so we can switch to the new API without breaking third party code.
Reviewed By: li-roy
Differential Revision: D15287693
fbshipit-source-id: bb5b8429fddf6298719cbf567b584ed371f8fc81
Summary:
Previously, the caller of `shallow_copy_and_detach()` was responsible for deciding whether the shallow copy should share the source TensorImpl's version counter or have its own new version counter. However, since this decision is crucial for ensuring the correctness of the shallow copy's version counter, we want to require users of `shallow_copy_and_detach()` to pass a version counter to the function call, so that they make the decision at the time of API usage, not as an afterthought.
For similar reasons, we want to require users of `shallow_copy_and_detach()` to pass `allow_tensor_metadata_change` to the function call, so that they decide "whether the TensorImpl shallow copy should allow tensor metadata change" at the time of API usage, not as an afterthought.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20496
Differential Revision: D15363620
Pulled By: yf225
fbshipit-source-id: a65e74738b10452668d6dc644b43aad5b3d8c9e6
Summary:
Remove weak_script. After recently splitting the forward() function in the MultiheadAttention module, we noticed a memory leak on GPU. Fix the problem by removing the "weak_script" decorators.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20563
Differential Revision: D15368262
Pulled By: zhangguanheng66
fbshipit-source-id: 475db93c9ee0dbaea8fb914c004e7d1e0d419bc2
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20543
All of that code for concatenating strings together adds up. Just discard it all for mobile builds.
Reviewed By: ljk53
Differential Revision: D15353447
fbshipit-source-id: a82dd0b884335d662605aabf7dd3d09dfcc1478b
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20439
This is the QTensorProto workflow for multi-group quantization on the C2 side.
Nothing related to DNNLOWP tensors is included in this PR, so once we finish the Glow side, we should be able to test this PR using ResNet-50.
Reviewed By: yinghai
Differential Revision: D15096919
fbshipit-source-id: 741eecd59eb79d24d9fe2b035f6246d42422d25c
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19816
We need this for the quantization of bias.
Add a third argument of type ScalarType to `quantize_linear`.
Differential Revision: D15094174
fbshipit-source-id: f19ec8f4716cf5fe0aa21b38d45af6d27c9ab377
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20512
Fixing typos in the schema description of one of the inputs of the BatchMatMul operator.
Reviewed By: jianyuh, BIT-silence
Differential Revision: D15343879
fbshipit-source-id: 06354e8e6b0d79fea937ed2703bb457b2d04f859
Summary:
Fix a typo in the doc.
Add an AssertionError check back to the MultiheadAttention module.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20492
Differential Revision: D15349008
Pulled By: cpuhrsch
fbshipit-source-id: 2d898345f03787c713e537673613a748ad826b34
Summary:
The current variance kernels compute the mean at the same time. We often want both statistics together, so it seems reasonable to have a kwarg/function that allows us to get both values without launching an extra kernel.
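Assuming the combined statistic ends up exposed as `torch.var_mean` (a name not stated in this summary), the intended usage would look roughly like:
```python
# Sketch of fused variance+mean usage; the function name torch.var_mean
# is an assumption, not confirmed by this summary.
import torch

x = torch.randn(4, 1000)

var, mean = torch.var_mean(x, dim=1)   # single fused reduction

# Equivalent results from two separate kernel launches:
assert torch.allclose(var, x.var(dim=1))
assert torch.allclose(mean, x.mean(dim=1))
```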
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18731
Differential Revision: D14726082
Pulled By: ifedan
fbshipit-source-id: 473cba0227b69eb2240dca5e61a8f4366df0e029