Commit Graph

95 Commits

Author SHA1 Message Date
5b78a5eadb Memory format support for contiguous and is_contiguous (#20455)
Summary:
#19975 was split into 2 PRs.

This one:

Introduce MemoryFormat argument to the `x.is_contiguous(memory_format=torch.channels_last)` and to the `y = x.contiguous(memory_format=torch.channels_last)` functions.

At this moment both functions just operate on strides and don't store any tensor state.

(Original RFC #19092)

-----

Expands functionality of two tensor functions `.is_contiguous` and `.contiguous` (both python and c++ api).

Note: We had several complaints about the `.to(memory_format)` function, and decided not to support it.

1.  `.contiguous` now supports an optional keyword-only argument, `memory_format`, which can be either `torch.contiguous_format` or `torch.channels_last`.

    - Using `torch.contiguous_format` will preserve existing `.contiguous()` behavior.

    - Calling `x.contiguous(memory_format=torch.channels_last)` returns a new tensor which maintains the same semantic layout (NCHW), but has a different memory allocation pattern.

        `x.contiguous(memory_format=torch.channels_last)` expects the input tensor to be 3d, 4d or 5d, and fails otherwise.

2. `.is_contiguous` now supports an optional keyword-only argument, `memory_format`, which can be either `torch.contiguous_format` or `torch.channels_last`.

    - `x.is_contiguous(memory_format=torch.contiguous_format)` preserves the same functionality as `x.is_contiguous()` and remains unchanged.

    - `x.is_contiguous(memory_format=torch.channels_last)` returns true if A) the input tensor is contiguous in memory AND B) it is allocated in memory in NHWC (or similar for 3d, 5d) format.
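
Since both functions only look at strides, the check can be illustrated without PyTorch. The following is a minimal pure-Python sketch for the 4d case; the helper names are hypothetical, and unlike a real implementation it does not special-case dimensions of size 1.

```python
from typing import Sequence, Tuple


def contiguous_strides(sizes: Sequence[int]) -> Tuple[int, ...]:
    """Row-major strides: the last dimension varies fastest."""
    strides = [1] * len(sizes)
    for i in range(len(sizes) - 2, -1, -1):
        strides[i] = strides[i + 1] * sizes[i + 1]
    return tuple(strides)


def channels_last_strides(sizes: Sequence[int]) -> Tuple[int, ...]:
    """Strides of an (N, C, H, W)-shaped tensor stored physically as NHWC."""
    if len(sizes) != 4:
        raise ValueError("this sketch only handles 4d (N, C, H, W) shapes")
    n, c, h, w = sizes
    # Logical order stays N, C, H, W; only the step between elements changes.
    return (h * w * c, 1, w * c, c)


def is_contiguous(sizes, strides, channels_last=False):
    """Compare the given strides with the ones the requested format implies."""
    if channels_last:
        expected = channels_last_strides(sizes)
    else:
        expected = contiguous_strides(sizes)
    return tuple(strides) == expected


sizes = (2, 3, 4, 5)
nchw = contiguous_strides(sizes)
nhwc = channels_last_strides(sizes)
```

Here the shape is the same for both stride tuples, which matches the description above: the semantic layout stays NCHW and only the memory pattern differs.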

Note: By the end of phase one, `x.is_contiguous(memory_format=torch.channels_last)` will calculate the state of the Tensor on every call. This functionality is going to be updated later.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/20455

Differential Revision: D15341577

Pulled By: VitalyFedyunin

fbshipit-source-id: bbb6b4159a8a49149110ad321109a3742383185d
2019-05-16 07:18:24 -07:00
1c5073fb4b Adding pin_memory kwarg to zeros, ones, empty, ... tensor constructors (#18952)
Summary:
Make it possible to construct a pinned memory tensor without creating a storage first and without calling pin_memory() function. It is also faster, as copy operation is unnecessary.

Supported functions:
```python
torch.rand_like(t, pin_memory=True)
torch.randn_like(t, pin_memory=True)
torch.empty_like(t, pin_memory=True)
torch.full_like(t, 4, pin_memory=True)
torch.zeros_like(t, pin_memory=True)
torch.ones_like(t, pin_memory=True)
torch.tensor([10,11], pin_memory=True)
torch.randn(3, 5, pin_memory=True)
torch.rand(3, pin_memory=True)
torch.zeros(3, pin_memory=True)
torch.randperm(3, pin_memory=True)
torch.empty(6, pin_memory=True)
torch.ones(6, pin_memory=True)
torch.eye(6, pin_memory=True)
torch.arange(3, 5, pin_memory=True)
```

Part of the bigger: `Remove Storage` plan.

Now compatible with both torch scripts:
 `  _1 = torch.zeros([10], dtype=6, layout=0, device=torch.device("cpu"), pin_memory=False)`
and
`  _1 = torch.zeros([10], dtype=6, layout=0, device=torch.device("cpu"))`

The same was checked for all similar functions: `rand_like`, `empty_like` and others.

This is a fixed version of #18455
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18952

Differential Revision: D14801792

Pulled By: VitalyFedyunin

fbshipit-source-id: 8dbc61078ff7a637d0ecdb95d4e98f704d5450ba
2019-04-16 11:06:15 -07:00
f1c8e01524 Add input information in RecordFunction calls (#18717)
Summary:
Add input information into generated RecordFunction calls in
VariableType wrappers, JIT operators and a few more locations
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18717

Differential Revision: D14729156

Pulled By: ilia-cher

fbshipit-source-id: 811ac4cbfd85af5c389ef030a7e82ef454afadec
2019-04-15 20:28:08 -07:00
b7c830b916 Revert "Adding pin_memory kwarg to zeros, ones, empty,... (#18854)
Summary:
This reverts commit c484cf43a02863efd2f4a76aad43246fb0191ab5.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18854

Differential Revision: D14778393

Pulled By: VitalyFedyunin

fbshipit-source-id: 4b5a1f5b1c091bbc4a8e75614734cc011d26b452
2019-04-05 06:25:33 -07:00
a21e256e8d Fix contiguous AD and Autogradzero inconsistency (#18633)
Summary:
Fixes #17962
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18633

Differential Revision: D14700449

Pulled By: wanchaol

fbshipit-source-id: 3d15d67c01b69b28394a0f2f001db90ed9fd31dc
2019-04-03 12:47:28 -07:00
c484cf43a0 Adding pin_memory kwarg to zeros, ones, empty, ... tensor constructors. (#18455)
Summary:
Make it possible to construct a pinned memory tensor without creating a storage first and without calling pin_memory() function. It is also faster, as copy operation is unnecessary.

Supported functions:
```python
torch.rand_like(t, pin_memory=True)
torch.randn_like(t, pin_memory=True)
torch.empty_like(t, pin_memory=True)
torch.full_like(t, 4, pin_memory=True)
torch.zeros_like(t, pin_memory=True)
torch.ones_like(t, pin_memory=True)
torch.tensor([10,11], pin_memory=True)
torch.randn(3, 5, pin_memory=True)
torch.rand(3, pin_memory=True)
torch.zeros(3, pin_memory=True)
torch.randperm(3, pin_memory=True)
torch.empty(6, pin_memory=True)
torch.ones(6, pin_memory=True)
torch.eye(6, pin_memory=True)
torch.arange(3, 5, pin_memory=True)
```

Part of the bigger: `Remove Storage` plan.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18455

Reviewed By: ezyang

Differential Revision: D14672084

Pulled By: VitalyFedyunin

fbshipit-source-id: 9d0997ec00f59500ee018f8b851934d334012124
2019-04-02 08:48:19 -07:00
173f224570 Turn on F401: Unused import warning. (#18598)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18598
ghimport-source-id: c74597e5e7437e94a43c163cee0639b20d0d0c6a

Stack from [ghstack](https://github.com/ezyang/ghstack):
* **#18598 Turn on F401: Unused import warning.**

This was requested by someone at Facebook; this lint is turned
on for Facebook by default.  "Sure, why not."

I had to noqa a number of imports in __init__.  Hypothetically
we're supposed to use __all__ in this case, but I was too lazy
to fix it.  Left for future work.

Be careful!  flake8-2 and flake8-3 behave differently with
respect to import resolution for # type: comments.  flake8-3 will
report an import unused; flake8-2 will not.  For now, I just
noqa'd all these sites.

All the changes were done by hand.

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

Differential Revision: D14687478

fbshipit-source-id: 30d532381e914091aadfa0d2a5a89404819663e3
2019-03-30 09:01:17 -07:00
ea652973f2 Fix truncation of default float values in JIT signatures. (#18044)
Summary:
In python2, float values get truncated.  We are storing default float values as floats (not 100% sure why?), which results in the defaults being truncated in the JIT and not matching the (specified) native function signatures.
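
A minimal sketch of the kind of truncation involved. It assumes the loss comes from text formatting with 12 significant digits, as Python 2's `str(float)` does; the message does not say exactly where the truncation happens, and `"%.12g"`-style formatting is used here only as a stand-in.

```python
value = 0.1 + 0.2  # 0.30000000000000004 as a double

# Assumption: ".12g" stands in for Python 2's 12-significant-digit str(float).
truncated_text = format(value, ".12g")
# repr() emits enough digits to round-trip the double exactly.
exact_text = repr(value)

truncated_roundtrip = float(truncated_text)
exact_roundtrip = float(exact_text)
```

A default written out through the truncating path no longer compares equal to the value in the native signature, while the `repr()` path does.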
Pull Request resolved: https://github.com/pytorch/pytorch/pull/18044

Reviewed By: ezyang

Differential Revision: D14469868

Pulled By: gchanan

fbshipit-source-id: a456de599e8dab106966bcac7a6033f02ce3cdd2
2019-03-15 07:43:15 -07:00
02c48cced9 Remove (almost all) TensorOptions from native_functions.yaml (#17385)
Summary:
Stacked on top of https://github.com/pytorch/pytorch/pull/17386

Brings us to 1014/1106 of writing.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17385

Differential Revision: D14248008

Pulled By: cpuhrsch

fbshipit-source-id: 033e00de91e3edf7ae01ca03ebe436c0446b3b5c
2019-03-12 08:00:00 -07:00
b290a16b2d Use return names in JIT operators
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/17638

Differential Revision: D14295606

Pulled By: cpuhrsch

fbshipit-source-id: 62040ac65434411357808735f0fe6cd33cc1c30f
2019-03-07 23:34:42 -08:00
e47aeede32 Use name for output variables instead of out in JIT (#17386)
Summary:
This adds 88 matches.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/17386

Differential Revision: D14179139

Pulled By: cpuhrsch

fbshipit-source-id: 2c3263b8e4d084db84791e53290e8c8b1b7aecd5
2019-02-27 14:03:33 -08:00
eae139e18f Support named tuple return from operators on JIT (#16253)
Summary:
Fixes: https://github.com/pytorch/pytorch/issues/16233

The following changes are made:
- Modify `TupleType` to store optional field names
- Modify schema matching to fill in those field names when creating `TupleType` as return type.
- Modify codegen of JIT to copy field names to schema string
- Modify `SchemaParser` to set field names of returned schema.
- Modify `SimpleValue::attr` to emit tuple indexing for named tuple.
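
As a rough Python analogy (not the JIT implementation), attribute access on a named tuple is just tuple indexing under a field name. The `MaxResult` type and `row_max` function below are hypothetical and exist only for illustration.

```python
from collections import namedtuple

# Hypothetical operator result type; the field names are illustrative.
MaxResult = namedtuple("MaxResult", ["values", "indices"])


def row_max(rows):
    """Return the per-row maximum and its position as a named tuple."""
    values = [max(row) for row in rows]
    indices = [row.index(max(row)) for row in rows]
    return MaxResult(values, indices)


result = row_max([[1, 5, 2], [7, 3, 4]])
```

`result.values` and `result[0]` refer to the same object, which is the equivalence the `SimpleValue::attr` change relies on.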
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16253

Reviewed By: ezyang

Differential Revision: D13954298

Pulled By: zdevito

fbshipit-source-id: 247d483d78a0c9c12d1ba36e1f1ec6c3f1a3007b
2019-02-10 18:15:56 -08:00
ac00e85e36 Remove undefined tensor in jit script (#16379)
Summary:
This PR is a follow-up to #15460. It did the following things:

* remove the undefined tensor semantic in jit script/tracing mode
* change ATen/JIT schema for at::index and other index related ops with `Tensor?[]` to align with what at::index is really doing and to adopt `optional[tensor]` in JIT
* change python_print to correctly print the exported script
* register both TensorList and ListOfOptionalTensor in JIT ATen ops to support both
* Backward compatibility for `torch.jit.annotate(Tensor, None)`

List of follow ups:

* remove the undefined tensor semantic in jit autograd, autodiff and grad_of
* remove prim::Undefined fully

For easy reviews, please turn on `hide white space changes` in diff settings.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16379

Differential Revision: D13855677

Pulled By: wanchaol

fbshipit-source-id: 0e21c14d7de250c62731227c81bfbfb7b7da20ab
2019-02-07 11:02:14 -08:00
4404762d7d Rename IntList to IntArrayRef. (#16751)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16751

This was made more complicated by the fact that ivalue::IntList
is a thing.  So I had to fix all of the sites where we were referring
to IValue post facto.

The following codemods were run, in this order:

```
codemod -m -d . --extensions cc,cpp,cu,cuh,h,hpp,py,cwrap,yaml,in IntList IntArrayRef
codemod -m -d . --extensions cc,cpp,cu,cuh,h,hpp,py,cwrap,yaml,in IntArrayRef::create IntList::create
codemod -m -d . --extensions cc,cpp,cu,cuh,h,hpp,py,cwrap,yaml,in ivalue::IntArrayRef ivalue::IntList
codemod -m -d . --extensions cc,cpp,cu,cuh,h,hpp,py,cwrap,yaml,in Tag::IntArrayRef Tag::IntList
codemod -m -d . --extensions cc,cpp,cu,cuh,h,hpp,py,cwrap,yaml,in isIntArrayRef isIntList
codemod -m -d . --extensions cc,cpp,cu,cuh,h,hpp,py,cwrap,yaml,in toIntArrayRef toIntList
codemod -m -d . --extensions cc,cpp,cu,cuh,h,hpp,py,cwrap,yaml,in 'Shared<IntArrayRef>' 'Shared<IntList>'
codemod -m -d . --extensions cc,cpp,cu,cuh,h,hpp,py,cwrap,yaml,in 'intrusive_ptr<IntArrayRef>' 'intrusive_ptr<IntList>'
```
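
The order matters: the first codemod renames every `IntList`, and the later ones restore the IValue-related names it caught by accident. A small sketch of that sequence, assuming plain literal substitution as a stand-in for the `codemod` tool:

```python
# (old, new) pairs in the same order as the codemod invocations above.
STEPS = [
    ("IntList", "IntArrayRef"),
    ("IntArrayRef::create", "IntList::create"),
    ("ivalue::IntArrayRef", "ivalue::IntList"),
    ("Tag::IntArrayRef", "Tag::IntList"),
    ("isIntArrayRef", "isIntList"),
    ("toIntArrayRef", "toIntList"),
    ("Shared<IntArrayRef>", "Shared<IntList>"),
    ("intrusive_ptr<IntArrayRef>", "intrusive_ptr<IntList>"),
]


def apply_steps(text):
    """Apply each literal replacement in order, feeding the result forward."""
    for old, new in STEPS:
        text = text.replace(old, new)
    return text


before = "void f(IntList sizes); bool b = v.isIntList(); ivalue::IntList x;"
after = apply_steps(before)
```

In the sample line, only the function parameter ends up renamed; the `isIntList()` call and the `ivalue::IntList` type are back to their original spelling.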

Some manual fixups were done afterwards; they can be reviewed separately
at https://github.com/pytorch/pytorch/pull/16752

Reviewed By: dzhulgakov

Differential Revision: D13954363

fbshipit-source-id: b5c40aacba042402155a2f5a229fa6db7992ac64
2019-02-05 14:54:34 -08:00
dfb081a7e4 Fix a lot of C++ build warnings (#16411)
Summary:
I went through my build log and did what I thought were reasonable fixes to all the C++ compilation warnings that came up
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16411

Differential Revision: D13901006

Pulled By: jamesr66a

fbshipit-source-id: 02df4e3e5a5c8dd9e69ac9f065cd3f2a80645033
2019-01-31 14:35:56 -08:00
1905bbb01d Include ATen/core/functional.h directly instead of torch/csrc/utils/functional.h. (#16377)
Summary:
One more shim removed.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16377

Differential Revision: D13821816

Pulled By: ZolotukhinM

fbshipit-source-id: 007f014d404de51841437db7eef28367a2f6e46b
2019-01-30 14:02:34 -08:00
47bf30661f Directly include headers from ATen.
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/16287

Differential Revision: D13792949

Pulled By: ZolotukhinM

fbshipit-source-id: d627d8dc469df048063c70d0b5b8d33fede809a3
2019-01-24 11:22:27 -08:00
a667767220 Add matches_jit_signature attribute to native_functions.yaml (#16040)
Summary:
If "matches_jit_signature" is set to True for a particular function, we will assume that the func syntax follows the JIT signature syntax. This is a temporary attribute and doesn't need to be set by developers outside the core team. It serves as a means of tracking an ongoing schema unification with the goal of aligning func syntax with other components of PyTorch in order to reduce overall complexity and match coverage of different function descriptions.

Followup PRs might be about removing _out from native_functions.yaml and using Tensor annotations instead, etc.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/16040

Reviewed By: ezyang

Differential Revision: D13703176

Pulled By: cpuhrsch

fbshipit-source-id: ce248e1823a6f18efa95502f9f3eebf023b4a46c
2019-01-17 12:39:08 -08:00
bebf1f7463 Torch tensor (#15224)
Summary:
Support torch.tensor in script. Already been accepted, trying to reland
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15224

Differential Revision: D13466616

Pulled By: eellison

fbshipit-source-id: f7850da07b0eb11af98f255fc15bd3cf861f2a40
2019-01-03 17:35:17 -08:00
934fc28656 Remove NoneGenerator
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/15335

Differential Revision: D13540357

Pulled By: driazati

fbshipit-source-id: a289e5944b65872103f68faac74e18f10e7c6fff
2018-12-21 16:33:37 -08:00
b89b46abfb Remove python_default_init from ATen and use Optional (#15234)
Summary:
Optional clean up. This PR removes python_default_init from the yaml files and the code-gen, and utilizes the optional type to do the work.

This also fixes the bug in #13149 to correctly adopt as_strided backward.

Fixes #9941
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15234

Differential Revision: D13502044

Pulled By: wanchaol

fbshipit-source-id: 774b61fc4414482cf11d56e22bd0275aefb352a4
2018-12-19 21:38:50 -08:00
560530aeec Optional ScalarType support for native functions & JIT (#15154)
Summary:
For #6593 and #9515

This completes the support for optional<ScalarType> in native, JIT and autograd.

Note: Mostly following the existing implementation for optional<Scalar> that was added in https://github.com/pytorch/pytorch/pull/12582.

This PR introduces a way to make functions accept an optional dtype and it will unblock #9515 by allowing the `dtype` param for type promotion interface:
```
func: name(inputs, *, ScalarType? dtype=None, Casting casting=same_kind)
```

An alternative approach could have been using `ScalarType::Undefined` for the same purpose but without optional, though it would have been a bit hacky.
```
func: name(inputs, *, ScalarType dtype=Undefined, Casting casting=same_kind)
```
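
The difference between the two signatures can be sketched in plain Python: an optional argument uses `None` to mean "no dtype requested" instead of reserving a sentinel value. The `total` function is hypothetical and Python types stand in for `ScalarType`.

```python
from typing import Optional, Sequence


def total(values: Sequence[float], *, dtype: Optional[type] = None):
    """Sum the values, converting the result only if a dtype was requested."""
    result = sum(values)
    # None means the caller did not ask for a specific type, so the
    # natural result type of the inputs is kept.
    return result if dtype is None else dtype(result)


default_result = total([1, 2, 3])
float_result = total([1, 2, 3], dtype=float)
```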

Here's an example use of this in action: 971f69eac6

There are already a bunch of native functions that were getting optional `dtype` through function overloading. https://github.com/pytorch/pytorch/pull/15133 is the attempt to migrate all of those. I will send those changes separately after this since some functions (e.g. sum) need quite a bit of change in the codebase. See the commits over there.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15154

Differential Revision: D13457760

Pulled By: tugrulates

fbshipit-source-id: 706134f0bd578683edd416b96329b49a1ba8ab48
2018-12-19 10:45:35 -08:00
73ee7fda4c Remove deprecated variable_tensor_functions (#15003)
Summary:
Removing the deprecated functions in `torch/csrc/variable_tensor_functions.h` (like `torch::CPU`) and corresponding implementations from `torch/csrc/torch.cpp` from master after the release.

ezyang gchanan soumith
Pull Request resolved: https://github.com/pytorch/pytorch/pull/15003

Differential Revision: D13418086

Pulled By: goldsborough

fbshipit-source-id: a0accdf6f7b0efa1ec07ac7b74b86ff2da37543f
2018-12-11 17:16:11 -08:00
78d594f46c Implement Device as a type in the script (#14666)
Summary:
[ note:  stacked on expect files changes, will unstack once they land ]
This adds DeviceObjType (cannot use DeviceType it is already an enum)
to the type hierarchy and an isDevice/toDevice pair to IValue.
Previous hacks which used an int[] to represent Device are removed
and at::Device is used instead.

Note: the behavior of .to is only a subset of python, we need to
fix the aten op so that it accepts Optional[Device] and Optional[ScalarType].
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14666

Reviewed By: suo

Differential Revision: D13290405

Pulled By: zdevito

fbshipit-source-id: 68b4381b292f5418a6a46aaa077f1c902750b134
2018-12-03 16:54:40 -08:00
79ceecec8e Optional undefined tensor support (#13650)
Summary:
This PR is a part of task to unblock standard library export.
* We treat None differently from Tensor and other types: when None is passed as a Tensor, it is an undefined tensor rather than the None IValue.
* Refine the type system so that we have correct tensor types hierarchy (Dynamic/Tensor/CompleteTensor), Dynamic should be at the top of the inheritance hierarchy.
* It also tries to export bilinear as an example of undefined tensor(None) input.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/13650

Differential Revision: D12967026

Pulled By: wanchaol

fbshipit-source-id: 6aedccc7ce2a12fadd13d9e620c03e1260103a5a
2018-11-09 11:29:57 -08:00
464dc31532 Add README to tools, delete defunct scripts. (#13621)
Summary:
Some extra documentation for other bits too.

Signed-off-by: Edward Z. Yang <ezyang@fb.com>
Pull Request resolved: https://github.com/pytorch/pytorch/pull/13621

Differential Revision: D12943416

Pulled By: ezyang

fbshipit-source-id: c922995e420d38c2698ce59c5bf4ffa9eb68da83
2018-11-06 11:20:53 -08:00
99a5d19591 Rename elementwise_mean to mean (#13419)
Summary:
Closes #12459
Pull Request resolved: https://github.com/pytorch/pytorch/pull/13419

Differential Revision: D12883299

Pulled By: SsnL

fbshipit-source-id: 8b4512ff73b66fdc674412904dbb3bf497ba70a7
2018-11-01 10:31:26 -07:00
e5d56659ec Delete DeviceGuard(int64_t) constructor. (#13232)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/13232

DeviceGuard should be device agnostic, which means that it shouldn't
assume that int64_t means select the CUDA device.

Reviewed By: gchanan

Differential Revision: D10858024

fbshipit-source-id: b40e8337e4046906fd8f83a95e6206367fb29dbe
2018-10-31 07:55:11 -07:00
8c2d0c831f Speed up tensor.storage_offset (#13267)
Summary:
This PR special cases tensor.storage_offset to avoid dispatches in the
common case. tensor.storage_offset is important for torch.as_strided
performance, because as_strided(sizes, strides) shares an implementation
with as_strided(sizes, strides, storage_offset) and it might not be the
best if there were two separate implementations (including backward
implementations).

This PR reduces times on a tensor.storage_offset
microbenchmark from 22ns to 2ns (these numbers are pretty stable). For
a torch.as_strided benchmark, this PR reduces numbers from 1042ns to
928ns, roughly a 100ns improvement, but this number is noisy and goes up and
down.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/13267

Reviewed By: ezyang

Differential Revision: D12829828

Pulled By: zou3519

fbshipit-source-id: df907731e2398ce2baf1c8b1860a561ccc456f78
2018-10-30 07:36:21 -07:00
d8dab6ffa8 Add tensor.to(options) (#13146)
Summary:
ezyang on the template hack
smessmer on SFINAE of the `TensorOptions(Device)`
goldsborough on the C++ API test changes
zdevito on the `jit` codegen changes
Pull Request resolved: https://github.com/pytorch/pytorch/pull/13146

Reviewed By: ezyang

Differential Revision: D12823809

Pulled By: SsnL

fbshipit-source-id: 98d65c401c98fda1c6fa358e4538f86c6495abdc
2018-10-29 16:26:06 -07:00
ce0d3e9b35 Bind inplace and _out variants into JIT (#13093)
Summary:
This commit is a minimal initial pass at adding inplace and _out variants to the JIT.
It changes gen_jit_dispatch.py to add bindings for these operators, and it also
supplements the FunctionSchema with alias information for these operators and for
viewing operators.

Tests are very minimal and will need to be improved in future commits.

Notes:

* Custom operator tests needed to be changed since _out variants add overloads, which
  the custom operator pipeline does not handle when called from python. This commit
  registers special test ops in the _test namespace for this purpose.
* Extends the schema parser to parse alias annotations more robustly.
* Extends FunctionSchema with `writes()`, a set of alias set names that the op will write to,
  and `annotatedType()` which will return AnnotatedType objects which contain the alias_set
  information that was parsed from the schema.
* Disables all optimizations in graph executor when a mutable operator is found. This
  is something that will be improved in the future but is necessary for correctness now.
* Adds annotate_ops to gen_jit_dispatch which adds aliasing information to all of the
  aten ops.
* Adds AnnotatedType to the type hierarchy which is used to mark List and Tensor types
  with their alias_set. These types only appear in schema when you call annotatedType
  and are erased from types in normal use.
* Extends jit::Type with .containedTypes() and .withContained(new_types). The first returns all types contained
  within the type (e.g. T for T[], or {T,L} for a tuple (T, L)). The second constructs a new
  version of the same type, replacing the contained types with new_types. This simplifies
  a lot of logic for recursively cleaning up types.
* Refactor List[T] into a common part that is shared with Annotated[T] and can be shared
  with Optional[T] and Future[T] when they are merged.
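
The `.containedTypes()` / `.withContained(new_types)` pair can be sketched in Python as follows. This is an illustrative model only: the `Type` class, its `kind` strings, and `erase_annotations` are hypothetical and not the `jit::Type` API.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Type:
    kind: str  # e.g. "Tensor", "List", "Tuple", "Annotated"
    contained: Tuple["Type", ...] = ()

    def contained_types(self):
        """All types directly nested inside this one."""
        return self.contained

    def with_contained(self, new_types):
        """Same kind of type, rebuilt around replacement contained types."""
        return Type(self.kind, tuple(new_types))


def erase_annotations(t: Type) -> Type:
    """Recursively strip hypothetical "Annotated" wrappers from a type."""
    if t.kind == "Annotated":
        return erase_annotations(t.contained_types()[0])
    return t.with_contained(erase_annotations(c) for c in t.contained_types())


tensor = Type("Tensor")
annotated = Type("Annotated", (Type("List", (Type("Annotated", (tensor,)),)),))
cleaned = erase_annotations(annotated)
```

With the two accessors in place, the recursive cleanup needs no per-kind cases beyond the wrapper it removes.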
Pull Request resolved: https://github.com/pytorch/pytorch/pull/13093

Differential Revision: D10848176

Pulled By: zdevito

fbshipit-source-id: d057f23eeb99cde8881129b42d3f151ed5e7655d
2018-10-26 10:37:20 -07:00
efab8e8fdf Speed up tensor.get_device(), is_cuda(), is_sparse() by avoiding dispatches (#12841)
Summary:
`tensor.get_device()` went through two dispatches: once to the native function
`get_device()`, and another when `get_device` calls `_th_get_device()`.
This PR avoids the dispatch by directly implementing the `get_device` function
as a method on Tensor.

Future Work:
- Investigate caching Device on TensorImpl. This will probably bring the
  tensor.get_device down to 2ns, but I'm not sure it's worth it.

before:
```
------------------------------------------------------------------------
Benchmark                                 Time           CPU Iterations
------------------------------------------------------------------------
BM_TensorTypeId                           0 ns          0 ns 1000000000
BM_TensorType                             8 ns          8 ns   89407911
BM_TensorIsCuda                          24 ns         24 ns   29313017
BM_TensorIsSparse                        27 ns         27 ns   26083160
BM_TensorTypeIsCuda                      11 ns         11 ns   65128120
BM_TensorNumel                           11 ns         11 ns   68314492
BM_TensorGetDevice                       71 ns         71 ns    9633125
BM_DeviceGuardCtor                      173 ns        173 ns    4067173
BM_DeviceGuard                          232 ns        232 ns    3009690
```

after:
```
------------------------------------------------------------------------
Benchmark                                 Time           CPU Iterations
------------------------------------------------------------------------
BM_TensorTypeId                           0 ns          0 ns 1000000000
BM_TensorType                            10 ns         10 ns   69803872
BM_TensorIsCuda                           2 ns          2 ns  321626683
BM_TensorIsSparse                         6 ns          6 ns  177045382
BM_TensorNumel                           12 ns         12 ns   58770533
BM_TensorGetDevice                        4 ns          4 ns  128113396
BM_DeviceGuardCtor                       52 ns         52 ns   14997278
BM_DeviceGuard                          158 ns        158 ns    5767248

```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/12841

Differential Revision: D10489353

Pulled By: zou3519

fbshipit-source-id: a596bc77352f21d5d35433c6de02c2f65aab5f9e
2018-10-25 19:57:52 -07:00
4e1c64caee Add c10::optional to type syntax (#12582)
Summary:
This PR adds the optional type to ATen native, autograd, JIT schema and Python Arg parser, closes #9513. It allows us to use optional default values (including None) for function signatures and implementations like clamp, etc., and also lets us remove the python_default_init hack.

Follow up:

remove python_default_init completely.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/12582

Differential Revision: D10417423

Pulled By: wanchaol

fbshipit-source-id: 1c80f0727bb528188b47c595629e2996be269b89
2018-10-25 16:08:29 -07:00
4c21b2f2d3 split register_aten_ops.cpp into shards (#12615)
Summary:
After an analogous breakup of VariableType.cpp, the generated
register_aten_ops.cpp is now the slowest-to-compile file in a typical
incremental rebuild by a wide margin. Therefore, give it the same
treatment - the generated code is split across several files to allow
parallel compilation.

Note that the existing code takes some care to arrange that overloads
of the same op name are given in a particular order. This diff
preserves that behavior, by treating all overloads of the same name as
a single indivisible unit, and sharding based on these groups rather
than on individual constructors.
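
A minimal sketch of sharding by overload group. The round-robin assignment and the schema strings are assumptions made for illustration; the actual generator may pick shards differently.

```python
from collections import OrderedDict


def shard_overloads(schemas, num_shards):
    """Split schemas into shards while keeping same-name overloads together."""
    groups = OrderedDict()
    for schema in schemas:
        name = schema.split("(", 1)[0]
        # Overloads keep their original relative order inside the group.
        groups.setdefault(name, []).append(schema)

    shards = [[] for _ in range(num_shards)]
    # Assumption: groups are dealt out round-robin, one whole group at a time.
    for i, overloads in enumerate(groups.values()):
        shards[i % num_shards].extend(overloads)
    return shards


schemas = [
    "add(Tensor, Tensor)",
    "mul(Tensor, Tensor)",
    "add(Tensor, Scalar)",
    "sub(Tensor, Tensor)",
]
shards = shard_overloads(schemas, 2)
```

Both `add` overloads land in the same shard and in their original order, even though another op was declared between them.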
Pull Request resolved: https://github.com/pytorch/pytorch/pull/12615

Reviewed By: ezyang

Differential Revision: D10367363

Pulled By: anderspapitto

fbshipit-source-id: 07db5f9cb79748040909716349626412a13bc86e
2018-10-15 14:12:27 -07:00
d1ac1eba3b Add bool type to IR (#11834)
Summary:
This PR adds a bool type to `IValue` and puts it into place.

* changes conds for `prim::If` and `prim::Loop` to use `bool` type
* changes operators that take `bool`s to match their native ops
* fixes ambiguous `aten` ops `aten::std` and `aten::var`
	* fixes tests in `test_jit.py TestJitGenerated`
		```
		'test_std_dim',
		'test_std_dim_1d',
		'test_std_dim_1d_neg0',
		'test_std_dim_neg0',
		'test_var_dim',
		'test_var_dim_1d',
		'test_var_dim_1d_neg0',
		'test_var_dim_neg0'
		```
* adds `prim::BoolToTensor` and `prim::TensorToBool`

apaszke zdevito
Pull Request resolved: https://github.com/pytorch/pytorch/pull/11834

Differential Revision: D9928570

Pulled By: driazati

fbshipit-source-id: 373c53df2f1a8ffa9e33d9a517002fbeef25f3eb
2018-10-03 12:40:03 -07:00
b7c302da1a Make gen_jit_dispatch runnable (#12018)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/12018

Tried to use the file and ran into a small bug, this fixes it

Differential Revision: D10013231

fbshipit-source-id: 4cf8c29cf9e2cedd7a28fa0cc0196e5144a54bf2
2018-09-24 16:09:48 -07:00
62c9d4ac96 Make .to() methods native functions (to fix JIT tracing)
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/11491

Differential Revision: D9771121

Pulled By: apaszke

fbshipit-source-id: 08d11101fb12093f8cf913b06359adddf3af9da7
2018-09-11 21:55:42 -07:00
8b196d671b Allow tracing random functions (only when using default generators) (#11539)
Summary:
Fixes #11504.

zdevito, neerajprad, fritzo
Pull Request resolved: https://github.com/pytorch/pytorch/pull/11539

Differential Revision: D9777897

Pulled By: apaszke

fbshipit-source-id: 56983260f5b93da7d5540a6242769ea7bd50eb06
2018-09-11 17:56:39 -07:00
120d769432 Add support for tracing strings (#11506)
Summary:
This enabled `torch.einsum` both in tracing and in script mode. It's used all over Pyro at the moment, and is needed for any use of the JIT in there.

Fixes #11157.

zdevito fritzo neerajprad
Pull Request resolved: https://github.com/pytorch/pytorch/pull/11506

Differential Revision: D9764787

Pulled By: apaszke

fbshipit-source-id: 9b5251b9e7c5897034602bd07ff67b425d33326c
2018-09-11 06:02:41 -07:00
0ddbe668cd Improve shape analysis to cover all most commonly used ops (#11358)
Summary:
[Here's a list](https://gist.github.com/apaszke/f0821840bdcc67a977832dc58acc1b85) of ops that are in `register_aten_ops.cpp`, but aren't supported in shape prop. Everything else should work now.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/11358

Differential Revision: D9753693

Pulled By: apaszke

fbshipit-source-id: efeae0126ce16cb56b8797fc5246405588bcae3c
2018-09-11 06:02:39 -07:00
ae635b16f7 Record tensor factory functions in trace (#10935)
Summary:
Things like torch.zeros now appear in traces rather than constants.

To continue to support our current level of ONNX export, we run
constant prop to turn these back into constants where possible before
export.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/10935

Differential Revision: D9527427

Pulled By: zdevito

fbshipit-source-id: 552a8bcc01b911251dab7d7026faafdd7a3c758a
2018-08-29 17:10:24 -07:00
c101a57a74 Build mechanism for custom operators (#10226)
Summary:
This is the last step in the custom operator implementation: providing a way to build from C++ and Python. For this I:

1. Created a `FindTorch.cmake` taken largely from ebetica with a CMake function to easily create simple custom op libraries
2. Created a `torch/op.h` header for easy inclusion of necessary headers,
3. Created a test directory `pytorch/test/custom_operator` which includes the basic setup for a custom op.
    1. It defines an op in `op.{h,cpp}`
    2. Registers it with the JIT using `RegisterOperators`
    3. Builds it into a shared library via a `CMakeLists.txt`
    4. Binds it into Python using a `setup.py`. This step makes use of our C++ extension setup that we already have. No work, yey!

The pure C++ and the Python builds are separate and not coupled in any way.

zdevito soumith dzhulgakov
Pull Request resolved: https://github.com/pytorch/pytorch/pull/10226

Differential Revision: D9296839

Pulled By: goldsborough

fbshipit-source-id: 32f74cafb6e3d86cada8dfca8136d0dfb1f197a0
2018-08-16 18:56:17 -07:00
dad6e8bb6c Remove capture specifiers in register_aten_ops when they're not needed. (#9669)
Summary:
zdevito
Pull Request resolved: https://github.com/pytorch/pytorch/pull/9669

Differential Revision: D8952335

Pulled By: resistor

fbshipit-source-id: 8fbbec7a7f55fbeeda3509cb3d339e1db90a53e6
2018-08-02 13:40:31 -07:00
080ae5ea1f Remove implicit ArrayRef -> vector conversion (#9740)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/9740

- Remove implicit ArrayRef -> vector conversion
- Fix 4 call sites that accidentally did an implicit expensive vector conversion but wouldn't have needed to
- Remove explicit vector conversion from 4 call sites that also didn't need to do that

Reviewed By: ezyang

Differential Revision: D8961693

fbshipit-source-id: 980da9f988083c0072497f9dbcbbf6f516fa311c
2018-08-01 15:34:52 -07:00
87d57dc5f5 Simplified Operator (#10080)
Summary:
zdevito explained that the attributed versions of `Operator`s are no longer necessary. This PR does two things:

1. Removes all code associated with attributed operators,
2. Adds a second kind of state to `Operator` where it is constructed with an `Operation` directly instead of an `OperationCreator`. This will be useful to test custom operators which don't require a node (you can just retrieve it directly).

Now rebased on top of https://github.com/pytorch/pytorch/pull/9801

zdevito
Pull Request resolved: https://github.com/pytorch/pytorch/pull/10080

Differential Revision: D9113668

Pulled By: goldsborough

fbshipit-source-id: 1276a191c7cf89da1c38488769f2105ce2664750
2018-08-01 09:41:08 -07:00
5e5c15dd42 Add (constant size) TensorLists to JIT, use them in cat and stack nodes (#9948)
Summary:
zdevito
Pull Request resolved: https://github.com/pytorch/pytorch/pull/9948

Reviewed By: ezyang

Differential Revision: D9033666

Pulled By: apaszke

fbshipit-source-id: 02d75e391ed6dee62500842df50f0b6ee5e38846
2018-07-31 07:39:52 -07:00
8cb1eef7b9 Unify IR operator representation (stop using attributes in the JIT) (#9807)
Summary:
Based on top of #9763 (first 3 commits belong to that PR). The first commits from this PR are "Stop using attributes ..."

I tried to separate the changes into fairly meaningful commits. I can't split them up into smaller PRs, because everything starts working and all tests pass only after the whole sequence, but hopefully this will make reviewing somewhat easier.

Known issues/regressions/future tasks:
- `aten::lerp` and `aten::clamp` are no longer fusable
- `CreateAutodiffSubgraphs` needs a rewrite
  - It is much more strict now, and will miss a lot of opportunities, especially when viewing ops are involved. Our previous approach was "ignore the assumption on shape availability in gradient formulas to determine differentiability, and hope that shape prop will be robust enough to actually deliver them before we differentiate", which obviously doesn't scale well to more complex cases. We should either work on reducing the size dependency of grad formulas (feasible e.g. for `view`/`reshape`, unfeasible for `squeeze`/`unsqueeze`), or make `CreateAutodiffSubgraphs` integrate some kind of "I could integrate this node into an AD subgraph, but will I be able to infer the shape of its input" reasoning (kind of like a limited shape prop, that doesn't infer anything, and only tells if it *could* infer something).
  - It sometimes creates constant-only (or constants + one node) graphs, which is useless
- Broken `aten::add` in auto-batching, because it gained a non-tensor input. I changed the test for pointwise operations to use `aten::mul` instead, but I needed to disable the LSTM cell test. I'm not sure how scalar constants should be implemented in this case, because I don't fully understand our format. cc: ChunliF
- Graph import does some hacks to recover the type of constants. This code should be removed once we gain the ability to export the IR along with value types.
- There's still a fair amount of dead code that can be removed. I didn't want to make this diff any bigger, and removing it is an easy task.
- Graph fuser could be improved to use signature matching (possibly using `OperatorSet`) instead of relying on node kinds.
- Manual constant propagation for the `ListConstruct` node in `torch/onnx/utils.py` should be replaced with a proper constant propagation pass (or we should ensure that the one we have handles at least this case before we remove this code).

zdevito
Pull Request resolved: https://github.com/pytorch/pytorch/pull/9807

Reviewed By: ezyang

Differential Revision: D9004285

Pulled By: apaszke

fbshipit-source-id: fe88026a765f6b687354add034c86402362508b7
2018-07-26 22:11:50 -07:00
302adb7cc8 added torch.rot90() to ATen (#8628)
Summary:
1. fixes #6271
2. implemented torch.rot90() following [numpy.rot90()](6a58e25703/numpy/lib/function_base.py (L54-L138))
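
For readers unfamiliar with the `numpy.rot90()` convention, the sketch below shows the rotation on a plain 2-D matrix: a positive `k` rotates counterclockwise by `k` quarter turns and a negative `k` rotates clockwise. It only illustrates the semantics; it is not the ATen implementation, and `Matrix`, `rot90Once`, and `rot90` are hypothetical names used here.

```cpp
#include <cstddef>
#include <vector>

using Matrix = std::vector<std::vector<int>>;

// Rotates a rectangular matrix by one quarter turn counterclockwise.
Matrix rot90Once(const Matrix& m) {
  const std::size_t rows = m.size();
  const std::size_t cols = rows == 0 ? 0 : m[0].size();
  Matrix out(cols, std::vector<int>(rows));
  for (std::size_t i = 0; i < cols; ++i) {
    for (std::size_t j = 0; j < rows; ++j) {
      // The last column of the input becomes the first row of the output.
      out[i][j] = m[j][cols - 1 - i];
    }
  }
  return out;
}

// Applies k quarter turns; negative k rotates clockwise.
Matrix rot90(const Matrix& m, int k = 1) {
  const int turns = ((k % 4) + 4) % 4;  // normalise k into 0..3
  Matrix out = m;
  for (int t = 0; t < turns; ++t) out = rot90Once(out);
  return out;
}
```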
Pull Request resolved: https://github.com/pytorch/pytorch/pull/8628

Reviewed By: ezyang

Differential Revision: D8987860

Pulled By: weiyangfb

fbshipit-source-id: 8dac3b2a1f6d3288672977aba8b547706ce97fe9
2018-07-25 15:11:44 -07:00
a949245a86 Switch interpreter to use IValue's primitive int/floats (#9718)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/9718

This patch switches the interpreter to use IValue's primitive numbers rather than tensors for computing on integers and floats. In addition to preparing the interpreter for first-class support of other types, this cleans up the handling of primitive numbers, making it possible to just use the normal operator overloading dispatch to find the right implementation for numbers. As a result of this change, a lot of other functionality needed to be updated, since this is the first time we use non-tensors in a lot of places in the code base.

Notes:
* Fixes code_template.py so that multi-line strings are indented correctly when used on a standalone line
* Cast operators (`int(x)`) now are functional. Some tests have additional conversions to integers because
we no longer allow implicit tensor -> integer conversions, following the same convention as in python
* prim::ListConstruct/createList has been added to the interpreter for creating lists and this has
replaced aten::stack for integer lists
* gen_jit_dispatch.py has been refactored so that non-tensor types use operators on IValues to extract
the primitives
* IValue gains a .to<T> method that is the equivalent of tensor_as but for IValue instead of at::Tensor
* `constant_as<T>` is switched over to using IValue's `.to<T>` method, to make conversion from constant->IValue->C++ type
more consistent. This functionality combined with `toIValue(Value*)` replaces the `tensor_as` and `as_tensor` family of functions.
* conditional expressions (if, loop) and operators related to them are now computed on integers rather than tensors
* IValue gains constructors for constructing from at::Scalar and converting to it. However, IValue itself will always store
the scalars as a double or int64.
* To align with python 3 syntax, TK_INT, TK_FLOAT, and TK_BOOL have been removed from the parser, and int/float/bool are just treated as special identifiers in the compiler,
along with print. These are represented as special sugared values with a `call` method implemented. For int/float/bool this implements casting behavior.
* Dropped shared_from_this from Type/Module. They were not needed and they made debugging harder because they internally throw/catch exceptions.
* Shape propagation has been updated to support running nodes that include floating point primitive types; this required some refactoring of internal functions.
* TensorToNum and NumToTensor have actual implementations as operators now
* register_prim_ops now contains implementations of math operators for float/int primitive types, and for mixed (prim <+> tensor) versions. This removes the need for special handling in compiler.cpp
* Primitive math is now entirely handled by letting the compiler choose the right overloads. This removes tons of special casing in the compiler.
* Incorporates eellison's change to allow casting from return values. Due to the addition of primitive support, the code needed slight modifications, so I just pre-merged it here.
* stack.h gains generic vararg versions of push/pop that know how to convert to/from C++ types:

```
at::Tensor a;
at::Scalar b;
pop(stack, a, b);
at::Tensor c = a + b;
push(stack, c);
```
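
How such vararg helpers can be written is sketched below on a toy stack. This is an assumption-laden miniature, not the code in stack.h: `Value` is a hypothetical stand-in for IValue that holds only an int64 or a double, and the ordering (the last argument receives the top of the stack) is chosen here to match the usage shown above.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <variant>
#include <vector>

// Hypothetical miniature of IValue: holds either an int64 or a double.
using Value = std::variant<std::int64_t, double>;
using Stack = std::vector<Value>;

// Base case: no outputs left to assign.
inline void assignFrom(const Stack&, std::size_t) {}

// Assigns stack[index] to `first`, then continues with the remaining outputs.
template <typename T, typename... Rest>
void assignFrom(const Stack& stack, std::size_t index, T& first, Rest&... rest) {
  first = std::get<T>(stack[index]);  // throws std::bad_variant_access on a type mismatch
  assignFrom(stack, index + 1, rest...);
}

// Pops sizeof...(Args) values; the last argument receives the top of the stack.
template <typename... Args>
void pop(Stack& stack, Args&... args) {
  assert(stack.size() >= sizeof...(Args));
  const std::size_t base = stack.size() - sizeof...(Args);
  assignFrom(stack, base, args...);
  stack.resize(base);
}

// Pushes each argument in order, converting it to a Value.
template <typename... Args>
void push(Stack& stack, Args... args) {
  (stack.emplace_back(args), ...);  // C++17 fold expression
}
```

The sketch requires C++17 for `std::variant` and the fold expression.
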
apaszke
Pull Request resolved: https://github.com/pytorch/pytorch/pull/9584

Reviewed By: apaszke

Differential Revision: D8910546

Pulled By: zdevito

fbshipit-source-id: 0f3e60d4d22217f196a8f606549430e43b7e7e30
2018-07-23 14:11:11 -07:00
1d4d9fc7da Prepare to stop using attributes in the JIT (#9505)
Summary:
This PR adds machinery to cache the schema in an IR node, and allows lookups of (possibly) constant inputs by their names (instead of position). The new methods are:

- `at::optional<T> get<T>(Symbol name)` - if the argument called `name` is a constant, casts it to type `T` and returns it. If it's not constant, returns `nullopt`. Raises an error if there's no argument with that name.
- `at::optional<IValue> get<T>(Symbol name)` - like above, but packs the result in an IValue
- `Value* getValue(Symbol name)` - retrieves a `Value*` for an argument (no need to know its position).
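
The three outcomes of the typed lookup (constant found, input not constant, no such argument) can be illustrated with a small standalone model. Everything in it is hypothetical scaffolding: `Constant`, `Input`, and this `Node` are simplified stand-ins, names are plain strings rather than `Symbol`s, and `std::optional` is used in place of `at::optional`.

```cpp
#include <cstdint>
#include <map>
#include <optional>
#include <stdexcept>
#include <string>
#include <variant>

// Hypothetical stand-ins: a constant payload and a node input.
using Constant = std::variant<std::int64_t, double>;
struct Input {
  std::optional<Constant> constant;  // empty when the input is not a constant
};

struct Node {
  std::map<std::string, Input> inputs;  // keyed by schema argument name

  // Models the described behaviour of get<T>(name):
  //  - throws if no argument has that name,
  //  - returns nullopt if the argument is not constant,
  //  - otherwise returns the constant as T (no numeric conversion is attempted here).
  template <typename T>
  std::optional<T> get(const std::string& name) const {
    auto it = inputs.find(name);
    if (it == inputs.end()) throw std::out_of_range("no argument named " + name);
    if (!it->second.constant) return std::nullopt;
    return std::get<T>(*it->second.constant);
  }
};
```

Looking inputs up by name rather than by position means callers do not have to track where an argument sits in the flat input list.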

All of the above functions currently inspect the attributes as well, but that's only so that I could start using them in other places in the JIT without disrupting our current functionality. I wanted this diff to be a preparation that doesn't change the semantics too much, and so both the tracer and script still create nodes with attributes. The next PR will put a stop to that, and hopefully the changes we need to make to other components will be simpler thanks to what I did here.

One more thing I'd like to do before actually stopping creating the non-attributed nodes is to have a convenient way of creating a schema programmatically, matching nodes against it, and creating them without having to pack inputs into flat argument lists (which is quite error prone).

zdevito
Pull Request resolved: https://github.com/pytorch/pytorch/pull/9505

Reviewed By: ezyang

Differential Revision: D8915496

Pulled By: apaszke

fbshipit-source-id: 39d14fc9a9d73d8494f128367bf70357dbba83f5
2018-07-20 10:56:00 -07:00