Commit Graph

210 Commits

Author SHA1 Message Date
71b24127d2 Correct sphinx-note in symeig (wrong indentation)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/16073

Differential Revision: D13692874

Pulled By: soumith

fbshipit-source-id: ea2a98e88679d382f9a2edab199e9ba7c8ce2213
2019-01-16 10:47:48 -08:00
19717224c5 Miscellaneous broken RSTs fixed (#16033)
Summary:
https://pytorch.org/docs/master/tensors.html#torch.Tensor.bernoulli_
https://pytorch.org/docs/master/torch.html#torch.addmm
https://pytorch.org/docs/master/distributed_deprecated.html#torch.distributed.deprecated.reduce_multigpu
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16033

Differential Revision: D13671202

Pulled By: soumith

fbshipit-source-id: 276e10e610affe205376573e7f0f9894695d218d
2019-01-15 09:50:12 -08:00
1065e7cd24 Add itertools.{prod, combinations, combinations_with_replacement} like op to pytorch (#9393)
Summary:
closes https://github.com/pytorch/pytorch/issues/7580
Pull Request resolved: https://github.com/pytorch/pytorch/pull/9393

Differential Revision: D13659628

Pulled By: zou3519

fbshipit-source-id: 3a233befa785709395a793ba8833413be394a6fd
2019-01-15 08:31:22 -08:00
ddece5a793 Fix ormqr docs, fixes #15565 (#15694)
Summary:
Signed-off-by: Edward Z. Yang <ezyang@fb.com>

cc meganset
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15694

Differential Revision: D13573064

Pulled By: zou3519

fbshipit-source-id: 1d0b693d7c26db91826b81e6c98b45a69b5e9bc4
2019-01-14 17:08:18 -08:00
492b7d410b doc fixes (#15990)
Summary: fixes  #15597 ,  #15283 and #10258

Differential Revision: D13649905

Pulled By: soumith

fbshipit-source-id: 753f46c2c96c61fba460019d9ed3e0d047d42ee7
2019-01-13 23:38:39 -08:00
e46e572b30 Add backward pass notes for eig() and symeig()
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/15929

Differential Revision: D13626158

Pulled By: soumith

fbshipit-source-id: ab869560926036053c39d20b217ccef8767e7d3f
2019-01-10 16:27:48 -08:00
b4c3268b23 Batched upper triangular, lower triangular (#15257)
Summary:
Changelog:

- Implements `triu` and `tril` for batches of 2D tensors.
- Remove TH/THC binding for `tril`
- Fix CUDA implementation
- Update docstrings for tril and triu.
- Remove mask-based `triu` and `tril` in cholesky forward and backward.
- Remove batched tril in torch.distributions.utils
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15257

Differential Revision: D13613888

Pulled By: mrshenli

fbshipit-source-id: 0949a05b9b8e974c1acfaf02a6284848ec5cc1c4
2019-01-09 19:46:39 -08:00
95febdfacc Add is_floating_point to docs (#15704)
Summary:
Fixes #15700 .

Changelog:

- Expose torch.*.is_floating_point to docs

Differential Revision: D13580734

Pulled By: zou3519

fbshipit-source-id: 76edb4af666c08237091a2cebf53d9ba5e6c8909
2019-01-07 10:43:22 -08:00
2d8f14cd12 clarified language of doc for torch.mul (#15664)
Summary:
see issue #15636

Please note - I build the documents but the HTML is not updated with the edited content.
I did not also build the fork.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15664

Differential Revision: D13571310

Pulled By: soumith

fbshipit-source-id: d43be0f61705693d778cc12c13e86d6b06130ac7
2019-01-03 21:39:35 -08:00
eeb14675f1 Fix torch.gesv args in doc
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/15649

Differential Revision: D13564312

Pulled By: soumith

fbshipit-source-id: b3bba2ece600880077eb09b092ce17e331995bd6
2019-01-02 00:20:22 -08:00
e4477feb15 Update cuda.get/set_rng_state doc (#14324)
Summary:
Now that `cuda.get/set_rng_state` accept `device` objects, the default value should be an device object, and doc should mention so.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14324

Reviewed By: ezyang

Differential Revision: D13528707

Pulled By: soumith

fbshipit-source-id: 32fdac467dfea6d5b96b7e2a42dc8cfd42ba11ee
2018-12-27 14:09:25 -08:00
06a7cb5901 Implementing cuda kernel for tril_indices and triu_indices (#15203)
Summary:
Followup PR of #14904, and the stretch goal of #12653.

Directly calculate coordinates in the original tensor using column index in the result tensor. Every GPU thread takes care of a column (two numbers) in the output tensor.

The implementation detects and handles precision loss during calculating the square root of a `int64_t` variable, and supports tensors with up to `row * column = 2 ^ 59` numbers.

Algorithm details are describe in [comments of TensorFactories.cu](23ddb6f58a/aten/src/ATen/native/cuda/TensorFactories.cu (L109-L255)).

zou3519
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15203

Reviewed By: zou3519

Differential Revision: D13517695

Pulled By: mrshenli

fbshipit-source-id: 86b305d22cac08c8962a3b0cf8e9e620b7ec33ea
2018-12-20 10:23:38 -08:00
7a764fe270 multi-dim standard deviation for CUDA. (#14990)
Summary:
This is the CUDA version of #14535 .
It refactors Reduce.cuh to allow more general classes of reductions to be performed -- we no longer assume that the temporary data returned during reduction is just one scalar, and instead allow an arbitrary accumulate type.
We also allow 64-bit indexing when necessary, since in general we will no longer be able to accumulate directly in the output. (In the cases when we can, we continue to split the tensors until they can be addressed with 32-bits, as before).
As an initial use-case, we implement `std` in multiple dimensions.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14990

Differential Revision: D13405097

Pulled By: umanwizard

fbshipit-source-id: a56c24dc2fd5326d417632089bd3f5c4f9f0d2cb
2018-12-20 08:56:32 -08:00
41e7e1bc40 Rename potrs to cholesky_solve (#15334)
Summary:
Changelog:
- Renames `potrs` to `cholesky_solve` to remain consistent with Tensorflow and Scipy (not really, they call their function chol_solve)
- Default argument for upper in cholesky_solve is False. This will allow a seamless interface between `cholesky` and `cholesky_solve`, since the `upper` argument in both function are the same.
- Rename all tests
- Create a tentative alias for `cholesky_solve` under the name `potrs`, and add deprecated warning to not promote usage.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15334

Differential Revision: D13507724

Pulled By: soumith

fbshipit-source-id: b826996541e49d2e2bcd061b72a38c39450c76d0
2018-12-19 12:31:24 -08:00
4bcb425490 fix cholesky call in potrs example (#15215)
Summary:
Cholesky by default returns the lower triangular matrix, see [docs](https://pytorch.org/docs/stable/torch.html#torch.cholesky).

However `torch.potrs` by default requires the upper triangular matrix. The naming of the variable `u` suggests that the example expects the upper to be returned, so I've added the flag to make that happen in the example.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15215

Differential Revision: D13476468

Pulled By: soumith

fbshipit-source-id: 7b68035f435a2b1be4d363b3f63e407394af949d
2018-12-15 04:43:34 -08:00
90f9e8103c Implement torch.tril_indices and torch.triu_indices (#12653) (#14904)
Summary:
This is an optimized implementation that does the following:

1. created an empty Tensor of correct size.
2. fill the Tensor with correct values.

The following three designs to fill in the Tensor result in roughly the same performance. Hence, the 2nd option is taken for simpler code, and to return contiguous tensors.

1. Sequential: fill row coordinates first, then columns. This results in two for-loop and more arithmetic operations.
2. Interleaved: fill in index coordinates one by one, which jumps between the two output Tensor rows in every iteration.
3. Transpose: create a n X 2 Tensor, fill the Tensor sequentially, and then transpose it.

<img width="352" alt="screen shot 2018-12-10 at 3 54 39 pm" src="https://user-images.githubusercontent.com/16999635/49769172-07bd3580-fc94-11e8-8164-41839185e9f9.png">

NOTE:

This implementation returns a 2D tensor, instead of a tuple of two tensors. It means that users will not be able to do the following:

```python
x = torch.ones(3, 3)
i = torch.tril_indices(3, 3)
x[i]  # need to first convert the 2D tensor into a tuple of two 1D tensors.
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14904

Reviewed By: zou3519

Differential Revision: D13433027

Pulled By: mrshenli

fbshipit-source-id: 41c876aafcf584832d7069f7c5929ffb59e0ae6a
2018-12-12 15:40:14 -08:00
342e62f1e3 Minor documentation mistake (#15068)
Summary:
keepdim is a optional parameter for torch.max()
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15068

Differential Revision: D13437745

Pulled By: zou3519

fbshipit-source-id: b5198c7d4ae17758cd136f6e5aecc6cb5838f174
2018-12-12 15:24:26 -08:00
fc30e2782c Remove deprecated info argument in btrifact (#14935)
Summary:
As specified in title.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14935

Differential Revision: D13394449

Pulled By: soumith

fbshipit-source-id: 569d59414f3a1a43ea641bded4b5433eb53e3490
2018-12-09 15:59:30 -08:00
1c8d41a08d Allow linspace and logspace with steps=1 and start != end like numpy (#14748)
Summary:
`torch.linspace(0, 1, 1)` fails with `RuntimeError: invalid argument 3: invalid number of points at ../aten/src/TH/generic/THTensorMoreMath.cpp:2119`, while `np.linspace(0, 1, 1)` works fine.
Looking at the code, there is even a comment by gchanan asking: "NumPy allows you to pass different points even if n <= 1 -- should we?"
I would say "yes". Currently, I would need to handle the case of `steps == 1` or `steps == 0` separately, making sure to change the `end` when calling `torch.linspace`. This is impractical. If we support `start != end`, there are two possibilities for the result: Either we ensure the first value in the resulting sequence always equals `start`, or we ensure the last value in the resulting sequence always equals `end`. Numpy chose the former, which also allows it to support a boolean `endpoint` flag. I'd say we should follow numpy.

This PR adapts `linspace` and `logspace` to mimic the behavior of numpy, adapts the tests accordingly, and extends the docstrings to make clear what happens when passing `steps=1`.

If you decide against this PR, the error message should become explicit about what I did wrong, and the documentation should be extended to mention this restriction.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14748

Differential Revision: D13356136

Pulled By: ezyang

fbshipit-source-id: db85b8f0a98a5e24b3acd766132ab71c91794a82
2018-12-06 09:30:55 -08:00
c638f379b3 Make mean function work across multiple dimensions. (#14252)
Summary:
Multi-dimensional `sum` is already implemented, and it's trivial to implement `mean` in terms of `sum`, so just do it.

Bonus: Fix incomplete language in the `torch.sum` documentation which doesn't take into account multiple dimensions when describing `unsqueeze` (at the same time as introducing similar language in `torch.mean`).
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14252

Differential Revision: D13161157

Pulled By: umanwizard

fbshipit-source-id: c45da692ba83c0ec80815200c5543302128da75c
2018-11-28 06:53:09 -08:00
b08a186153 roll along multiple dimensions
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/13874

Differential Revision: D13223669

Pulled By: nairbv

fbshipit-source-id: 1678d52529c326fa4a0614d0994b1820ad12bc04
2018-11-27 20:32:30 -08:00
1ca0ec7299 fix typo in torch.sum documentation (#14250)
Summary:
Notice that an extra colon was added to `:attr:`, so in https://pytorch.org/docs/stable/torch.html#torch.sum , `dim` shows up as ":attr::_dim_". This patch fixes the issue.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14250

Reviewed By: soumith

Differential Revision: D13146363

Pulled By: umanwizard

fbshipit-source-id: f7d03dcb0973aae248b56ab407ba8489f2b1fe36
2018-11-26 16:36:52 -08:00
a30ade1139 Batched cholesky decomposition (#14017)
Summary:
Implements batching for the Cholesky decomposition.

Performance could be improved with a dedicated batched `tril` and `triu` op, which is also impeding autograd operations.

Changes made:
- batching code
- tests in `test_torch.py`, `test_cuda.py` and `test_autograd.py`.
- doc string modification
- autograd modification
- removal of `_batch_potrf` in `MultivariateNormal`.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14017

Differential Revision: D13087945

Pulled By: ezyang

fbshipit-source-id: 2386db887140295475ffc247742d5e9562a42f6e
2018-11-17 10:49:15 -08:00
a3f39f1ebb Fix randint docs (#14083)
Summary: Closes #14079

Differential Revision: D13095904

Pulled By: soumith

fbshipit-source-id: e39319c5326bfdf6f401eaddebe94474349901c3
2018-11-16 03:04:02 -08:00
35a24a9a94 Example with edge case 0 for torch.sign (#13771)
Summary:
The behavior of the edge case 0 is not self-evident for the `torch.sign` function ( I personally expected a result of 1):
```python
>>> a = torch.tensor([0.7, -1.2, 0., 2.3])
>>> a
tensor([ 0.7000, -1.2000,  0.0000,  2.3000])
>>> torch.sign(a)
tensor([ 1., -1.,  0.,  1.])
```
This is not currently documented, I think it is worth it to give a simple example showing this behavior.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/13771

Differential Revision: D13044520

Pulled By: ailzhang

fbshipit-source-id: c3011ccbdf1c13348f6c7242b06a9aa52ebc9204
2018-11-14 09:16:09 -08:00
f112aa746a Fix document about torch.get_default_dtype() (#13890)
Summary:
Minor fix.
```
torch.get_default_dtype() → :class:`torch.dtype`
```
→
```
torch.get_default_dtype() → torch.dtype
```
:class: is not rendered in https://pytorch.org/docs/stable/torch.html
Pull Request resolved: https://github.com/pytorch/pytorch/pull/13890

Differential Revision: D13040704

Pulled By: colesbury

fbshipit-source-id: 5fadb01ad365042d5df2bac058f4ae89b281d3b7
2018-11-13 09:25:32 -08:00
7b2fb012a8 Make potrs batched (#13453)
Summary:
- This is a straightforward PR, building up on the batch inverse PR, except for one change:
  - The GENERATE_LINALG_HELPER_n_ARGS macro has been removed, since it is not very general and the resulting code is actually not very copy-pasty.

Billing of changes:
- Add batching for `potrs`
- Add relevant tests
- Modify doc string

Minor changes:
- Remove `_gesv_single`, `_getri_single` from `aten_interned_strings.h`.
- Add test for CUDA `potrs` (2D Tensor op)
- Move the batched shape checking to `LinearAlgebraUtils.h`
Pull Request resolved: https://github.com/pytorch/pytorch/pull/13453

Reviewed By: soumith

Differential Revision: D12942039

Pulled By: zou3519

fbshipit-source-id: 1b8007f00218e61593fc415865b51c1dac0b6a35
2018-11-09 15:16:26 -08:00
4fadf571fd handle flat rolling (no dim specified) T36264909 (#13588)
Summary:
update roll to behave as in numpy.roll when dimension to roll not specified.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/13588

Differential Revision: D12964295

Pulled By: nairbv

fbshipit-source-id: de9cdea1a937773033f081f8c1505a40e4e08bc1
2018-11-08 12:39:35 -08:00
619c2f8b44 small fixes regarding docu of torch tensors (#13635)
Summary:
Removed duplicate doc args block.
Made statements involving 'each element' more precise.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/13635

Differential Revision: D12946987

Pulled By: soumith

fbshipit-source-id: a17da92f69086b530ff769cf4662ae29843fd188
2018-11-06 17:24:42 -08:00
f0ed927b62 Add diag_embed to ATen and torch (#12447)
Summary:
Fixes: #12160
Pull Request resolved: https://github.com/pytorch/pytorch/pull/12447

Differential Revision: D12916234

Pulled By: SsnL

fbshipit-source-id: 512a04efb0c2e0a54295b857a61be66c3aae13da
2018-11-05 08:55:28 -08:00
07f8b61cc6 Roll operator t32802531 (#13261)
Summary:
Adding a roll operator
Pull Request resolved: https://github.com/pytorch/pytorch/pull/13261

Differential Revision: D12922575

Pulled By: nairbv

fbshipit-source-id: ff05c075d9c484a615011192b023debf47da4017
2018-11-05 08:33:36 -08:00
d714ecf879 Rename potrf to cholesky (#12699)
Summary:
This PR performs a renaming of the function `potrf` responsible for the Cholesky
decomposition on positive definite matrices to `cholesky` as NumPy and TF do.

Billing of changes
- make potrf cname for cholesky in Declarations.cwrap
- modify the function names in ATen/core
- modify the function names in Python frontend
- issue warnings when potrf is called to notify users of the change

Reviewed By: soumith

Differential Revision: D10528361

Pulled By: zou3519

fbshipit-source-id: 19d9bcf8ffb38def698ae5acf30743884dda0d88
2018-11-01 15:10:55 -07:00
518b0d0600 Fix add out=None to digamma docstring (Fixes #13225) (#13307)
Summary:
Fixes #13225
Pull Request resolved: https://github.com/pytorch/pytorch/pull/13307

Differential Revision: D12840231

Pulled By: SsnL

fbshipit-source-id: 2732a2466ac1d2f3fdabfd1eaccddec96e89ba1b
2018-10-30 11:52:35 -07:00
1fe8278559 Batched Inverse (#9949)
Summary:
Complete billing of changes:

Related to Batch Inverse:
- [x] Add batched inverse (CPU)
- [x] Add batched inverse (CUDA)
- [x] Modify autograd entry
- [x] Add tests
  - [x] test_autograd
  - [x] test_cuda
  - [x] test_torch
- [x] Modify docs
- [x] Remove `_batch_inverse` in `MultivariateNormal`.
- [x] Allow batch matrices as inputs for negative powers in `matrix_power`

Miscellaneous modifications:
- [x] Move all batch operations to BatchLinearAlgebra.cpp/.cu and provide general framework for adding more batch ops.
- [x] Add a RAII structure for MAGMA queue management.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/9949

Differential Revision: D10559089

Pulled By: zou3519

fbshipit-source-id: 7da24977f8a79d97dd42883302e13e708c1726e4
2018-10-27 23:42:46 -07:00
cf235e0894 fix lint after new flake8 release added new style constraints (#13047)
Summary:
fix lint after new flake8 release added new style constraints
Pull Request resolved: https://github.com/pytorch/pytorch/pull/13047

Differential Revision: D10527804

Pulled By: soumith

fbshipit-source-id: 6f4d02662570b6339f69117b61037c8394b0bbd8
2018-10-24 09:03:38 -07:00
e027f7a913 Fix character with wrong encodding in documentation (#12761)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/12761

, is not really , and thus it can fail some of the Python 2 import.

Reviewed By: weiyangfb

Differential Revision: D10423231

fbshipit-source-id: 3738c0b9d2f52aa47eef06250f84c5933a38783f
2018-10-17 10:20:45 -07:00
1a6071d436 fixing seq to tensors in documentation (#12741)
Summary:
Fixes #12251

In the docs the actual key word argument was supposed to be `tensors` but instead it is given as `seq` for doing `torch.cat` operation.

zou3519 can you review this code? I don't have access to request for code reviews.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/12741

Differential Revision: D10419682

Pulled By: ezyang

fbshipit-source-id: a0ec9c3f4aeba23ac3a99e2ae89bd07d2b9ddb58
2018-10-17 09:16:04 -07:00
0521c47c91 Amend nondeterminism notes (#12217)
Summary:
include atomicAdd commentary as this is less well known

There is some discussion in #12207

Unfortunately, I cannot seem to get the ..include working in `_tensor_docs.py` and `_torch_docs.py`. I could use a hint for that.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/12217

Differential Revision: D10419739

Pulled By: SsnL

fbshipit-source-id: eecd04fb7486bd9c6ee64cd34859d61a0a97ec4e
2018-10-16 23:59:26 -07:00
d34578026c Various example code fixes (#12707)
Summary:
- Fix broken sparse_coo_examples, update output
- Tensor(...) to tensor(...)
- Fix arguments to math.log to be floats

While the last might be debateable, mypy currently complains when passing an int to math.log. As it is not essential for our examples, let's be clean w.r.t. other people's expectations.

These popped up while checking examples in the context of  #12500 .
Pull Request resolved: https://github.com/pytorch/pytorch/pull/12707

Differential Revision: D10415256

Pulled By: SsnL

fbshipit-source-id: c907b576b02cb0f89d8f261173dbf4b3175b4b8d
2018-10-16 21:59:40 -07:00
81975a497f update docs for sparse tensor (#12221)
Summary:
- update docs examples at sparse tensor after print format changed
- update example to create empty sparse tensor:
```
>>> torch.sparse_coo_tensor(torch.LongTensor(size=[1,0]), [], torch.Size([1]))
tensor(indices=tensor([], size=(1, 0)),
       values=tensor([], size=(0,)),
       size=(1,), nnz=0, layout=torch.sparse_coo)
```

zou3519 SsnL yf225
Pull Request resolved: https://github.com/pytorch/pytorch/pull/12221

Differential Revision: D10412447

Pulled By: weiyangfb

fbshipit-source-id: 155b8cb0965f060e978f12239abdc1b3b41f6ab0
2018-10-16 19:56:51 -07:00
5b8a640d0b Update fft docs for new cache size (#12665)
Summary:
Follow up of #12553
Pull Request resolved: https://github.com/pytorch/pytorch/pull/12665

Differential Revision: D10385615

Pulled By: SsnL

fbshipit-source-id: 44fe9ec75cb735de37c56270f160a16a1d2bfb64
2018-10-16 01:47:36 -07:00
0740a5d521 compute_uv for SVD (#12517)
Summary:
Adds a `compute_uv` argument that defaults to `True` for optionally computing the singular vectors during SVD.

Closes https://github.com/pytorch/pytorch/issues/12420 .
Pull Request resolved: https://github.com/pytorch/pytorch/pull/12517

Differential Revision: D10384554

Pulled By: SsnL

fbshipit-source-id: 704998a257afa815eda901b8ae830e8a661695be
2018-10-15 12:35:56 -07:00
a98958d3bd dtype option for softmax (#11719)
Summary:
Add dtype argument to softmax/log_softmax functions.
Computing softmax in fp32 precision is necessary for mixed precision training, and converting output of the previous layer into fp32 and then reading it as fp32 in softmax is expensive, memory and perf-wise, this PR allows one to avoid it.
For most input data/dtype combinations, input data is converted to dtype and then softmax is computed. If input data is half type and dtype is fp32, kernels with the corresponding template arguments are called.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/11719

Reviewed By: ezyang

Differential Revision: D10175514

Pulled By: zou3519

fbshipit-source-id: 06d285af91a0b659932236d41ad63b787eeed243
2018-10-13 17:57:10 -07:00
ecb3835387 change \gamma to \Gamma (#12214)
Summary:
- revert `\gamma` changes at landed PR: https://github.com/pytorch/pytorch/pull/12126
- minor fix for docs of `torch.norm()`

SsnL
Pull Request resolved: https://github.com/pytorch/pytorch/pull/12214

Differential Revision: D10127337

Pulled By: weiyangfb

fbshipit-source-id: 15eb8abda39ec9e8b2e815e2a22096cae786995a
2018-10-01 11:31:18 -07:00
5ffc915f26 fix docs (#12126)
Summary:
- fix https://github.com/pytorch/pytorch/issues/12120
- add `torch.argsort`, `torch.pdist`, `broadcast_tensors` to *.rst files
- add parameter dim to `torch.unique` doc
- fix table and args for `torch.norm`
- test plan: make html and check docs in browser

gchanan
Pull Request resolved: https://github.com/pytorch/pytorch/pull/12126

Differential Revision: D10087006

Pulled By: weiyangfb

fbshipit-source-id: 25f65c43d14e02140d0da988d8742c7ade3d8cc9
2018-09-29 22:26:45 -07:00
817e83fc01 fix PR #11061 (#11815)
Summary:
- fix PR https://github.com/pytorch/pytorch/pull/11061 by moving `detach_()` and `set_requires_grad()` to `torch.tensor_ctor()` and `tensor.new_tensor`, and also removed warnings and `args_requires_grad` from `internal_new_from_data `
- with this patch, the returned tensor from `tensor_ctor()` and `new_tensor` will be detached from source tensor, and set requires_grad based on the input args
- `torch.as_tensor` retains its behavior as documented

gchanan apaszke
Pull Request resolved: https://github.com/pytorch/pytorch/pull/11815

Differential Revision: D9932713

Pulled By: weiyangfb

fbshipit-source-id: 4290cbc57bd449954faadc597c24169a7b2d8259
2018-09-21 11:04:19 -07:00
b91b15d86e Implementing Matrix Norm for torch.norm (#11261)
Summary:
Currently, norm function only supports vector norm. This PR extends vector norm to matrix norm.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/11261

Reviewed By: li-roy

Differential Revision: D9652379

Pulled By: yya007

fbshipit-source-id: 519b3fb80b563c17c56a24675c7b0e46bf5a3a1c
2018-09-20 14:43:13 -07:00
24e958a0a7 Move bernoulli into ATen (#10273)
Summary:
+ https://github.com/pytorch/pytorch/issues/10236 : torch.bernoulli's out kwarg is broken
  fixed in moving `bernoulli_out` to ATen
+ https://github.com/pytorch/pytorch/issues/9917 : BUG torch.bernoulli(p.expand(shape)) is broken
  fixed in moving all `bernoulli` ops in ATen to use the modern apply utils methods
+ https://github.com/pytorch/pytorch/issues/10357 : torch.bernoulli inconsistent gpu/cpu results
  fixed by adding CUDA asserts

In order to use `curand_uniform4`, I made some changes to `CUDAApplyUtils.cuh`. Specifically, I introduced an optional template parameter `int step` to the `CUDA_tensor_applyN` methods, representing that we want to process `step` values at each time for each of the `N` tensors.

The calling convention for `step = 1` (default) isn't changed. But if `step > 1`, the given lambda `op` must take in `int n` as its first argument, representing the number of valid values, because there may not be full `step` values at the boundary. E.g., here is what the `bernoulli(self, p_tensor)` call look like:
```cpp

  // The template argument `4` below indicates that we want to operate on four
  // element at each time. See NOTE [ CUDA_tensor_applyN helpers ] for details.
  at::cuda::CUDA_tensor_apply2<scalar_t, prob_t, 4>(
      ret, p,
      [seeds] __device__(
          int n, scalar_t& v1, scalar_t& v2, scalar_t& v3, scalar_t& v4,
          const prob_t& p1, const prob_t& p2, const prob_t& p3, const prob_t& p4) {
        curandStatePhilox4_32_10_t state;
        curand_init(
            seeds.first,
            blockIdx.x * blockDim.x + threadIdx.x,
            seeds.second,
            &state);
        float4 rand = curand_uniform4(&state);
        switch (n) {
          case 4: {
            assert(0 <= p4 && p4 <= 1);
            v4 = static_cast<scalar_t>(rand.w <= p4);
          }
          case 3: {
            assert(0 <= p3 && p3 <= 1);
            v3 = static_cast<scalar_t>(rand.z <= p3);
          }
          case 2: {
            assert(0 <= p2 && p2 <= 1);
            v2 = static_cast<scalar_t>(rand.y <= p2);
          }
          case 1: {
            assert(0 <= p1 && p1 <= 1);
            v1 = static_cast<scalar_t>(rand.x <= p1);
          }
        }
      }
    );
```

Benchmarking on `torch.rand(200, 300, 400)` 20 times, each time with 20 loops:

post patch
```
➜  ~ numactl --cpunodebind 1 --membind 1 -- taskset -c 12,13,14,15,16,17,18,19,20,21,22,23 env CUDA_LAUNCH_BLOCKING=1 python bern.py
torch.bernoulli(x)
6.841588497161865 +- 0.05413117632269859
torch.bernoulli(xc)
0.05963418632745743 +- 0.0008014909108169377
x.bernoulli_()
0.4024486541748047 +- 0.0021550932433456182
xc.bernoulli_()
0.02167394384741783 +- 2.3818030967959203e-05

```

pre-patch
```
➜  ~ numactl --cpunodebind 1 --membind 1 -- taskset -c 12,13,14,15,16,17,18,19,20,21,22,23 env CUDA_LAUNCH_BLOCKING=1 python bern.py
torch.bernoulli(x)
12.394511222839355 +- 0.0966421514749527
torch.bernoulli(xc)
0.08970972150564194 +- 0.0038722590543329716
x.bernoulli_()
1.654480218887329 +- 0.02364428900182247
xc.bernoulli_()
0.058352887630462646 +- 0.003094920190051198

```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/10273

Differential Revision: D9831294

Pulled By: SsnL

fbshipit-source-id: 65e0655a36b90d5278b675d35cb5327751604088
2018-09-19 16:45:47 -07:00
4ee0a78ee6 varargs for meshgrid (#11600)
Summary:
Adds vararg support for meshgrid and adds checks for all the tensor arguments to have the same dtype and device.

Fixes: [#10823](https://github.com/pytorch/pytorch/issues/10823), #11446

The earlier pull request closed without any changes because I had some rebasing issues, so I made another pull request to close out #10823. Sorry for the inconvenience.

Differential Revision: D9892876

Pulled By: ezyang

fbshipit-source-id: 93d96cafc876102ccbad3ca2cc3d81cb4c9bf556
2018-09-18 07:41:31 -07:00
407a9fee0c make copy constructed tensor a leaf variable when using torch.tensor(sourceTensor) (#11061)
Summary:
- fix https://github.com/pytorch/pytorch/issues/10876
- the cause of the bug is because copy constructor cannot distinguish between default value of requires_grad and requires_grad=False, thus it makes a copy from source tensor along with its grad_fn if requires_grad=True at source
- with this fix, the behavior becomes
```
>>> source = torch.randn(2, 2, requires_grad=True)
>>> copy = torch.tensor(source, requires_grad=True)
>>> print(copy)
tensor([[-1.2001,  1.9869],
        [-1.0134,  1.3096]], grad_fn=<CopyBackwards>)

>>> source = torch.randn(2, 2, requires_grad=True)
>>> copy = torch.tensor(source, requires_grad=False)
>>> print(copy)
tensor([[-0.7402,  0.0467],
        [ 0.4344, -0.0420]])

>>> source = torch.randn(2, 2, requires_grad=True)
>>> copy = torch.tensor(source)
>>> print(copy)
tensor([[-0.7402,  0.0467],
        [ 0.4344, -0.0420]])
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/11061

Differential Revision: D9569714

Pulled By: weiyangfb

fbshipit-source-id: ea368688bdc0f1ce5997870e164e42835b64b4a1
2018-09-17 23:29:09 -07:00