1715 Commits

Author SHA1 Message Date
4007dd76e2 Add missing ONNX symbolics and fix fusible expand logic (#5654)
This includes various fixes required to export the NMT decoder to ONNX

* Add missing ONNX symbolics and fix fusible expand logic

* Update comments and use of at::optional

* Use _unimplemented
2018-03-12 15:39:39 -04:00
e9d1a5f6d5 support non-Variable arguments to functions in symbolic overrides (#5645)
simply pass them through unmodified. These are just the final tweaks,
after the bulk of the work of getting rid of ExportProxy
2018-03-10 17:51:49 -05:00
7f44c0d011 rename onnx/utils/__init__.py -> onnx/utils.py (#5639) 2018-03-08 22:17:59 -05:00
06df037d9a do away with ExportProxy hack in onnx export (#5614)
ExportProxy was a mechanism to reuse the code that supported exporting
autograd Functions to support overriding arbitrary python
functions. However, it had some serious downsides:

- only works on some functions (all args must be Variable)
- complicated
- bad error messages in some cases

Instead, just expose enough functionality to python to perform the
necessary logic explicitly.
2018-03-08 22:17:30 -05:00
28b1c94f0f allow application of @symbolic decorators without circular imports (#5595) 2018-03-08 12:44:16 -05:00
ef0ef70cf5 Don't spuriously raise warning for Constant nodes, fixes #5101 (#5469)
Signed-off-by: Edward Z. Yang <ezyang@fb.com>
2018-03-01 12:08:48 -05:00
c6d47f6386 add @torch.jit.script, @torch.jit.compile, torch.jit.CompilationUnit(str) (#5367)
* torch.jit.trace annotation now creates a GraphExecutor

The other torch.jit.trace, which was used for testing purposes and for ONNX to get the trace graph, is now called torch.jit.get_trace_graph.

* @script annotation, and compilation unit for strings
2018-02-26 13:22:45 -08:00
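
A minimal sketch of the annotations described above, assuming this era's torch.jit API (names taken from the commit title; @torch.jit.compile is omitted):

    import textwrap
    import torch

    # Compile script code from a source string via a CompilationUnit.
    src = textwrap.dedent('''
        def double_it(x):
            return x + x
    ''')
    cu = torch.jit.CompilationUnit(src)
    doubled = cu.double_it(torch.randn(3))

    # Compile a Python function directly with the @script annotation.
    @torch.jit.script
    def fused(x, y):
        return torch.tanh(x + y)

    out = fused(torch.randn(3), torch.randn(3))
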
30ec06c140 Merge Variable and Tensor classes (#5225)
This replaces the torch.Tensor constructors with factories that produce
Variables. Similarly, functions on the torch module (e.g. torch.randn)
now return Variables.

To keep the PR to a reasonable size, I've left most of the unused tensor
code. Subsequent PRs will remove the dead code, clean up calls to
torch.autograd.Variable, and rename Variable to Tensor everywhere.

There are some breaking changes because Variable and Tensors had
slightly different semantics. There's a list of those changes here:

 https://github.com/pytorch/pytorch/wiki/Breaking-Changes-from-Variable-and-Tensor-merge
2018-02-23 18:03:31 -05:00
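
A short sketch of what the merge means in user code (0.4-era API assumed): factory functions now return Tensors that carry autograd state, so explicit Variable wrapping is unnecessary.

    import torch

    # torch.randn returns a Tensor that participates in autograd directly.
    x = torch.randn(3, requires_grad=True)
    y = (x * x).sum()
    y.backward()
    print(x.grad)   # gradient lives on the tensor itself
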
77036704aa add a third output in LSTM onnx export (#5359)
since that output has been added to the ONNX spec
2018-02-23 10:58:45 -05:00
1848cad108 [ready] Layer Normalization (#4922)
* at::maybe_data_ptr and Check.h => TensorUtils.h

* THNN support for optional BN running_*

* ATen support for optional BN running_*

* Python nn.* support for optional BN running_*; Improve IN and BN doc

* Add tests for IN and BN new option

* Layer Norm

* Fix LRN doc

* functional interface for LN and IN

* Layer norm tests

* fix BN double backward returning undefined tensors

* fix jit test using wrong dim inputs for BN

* add/improve BN, IN and LN GPU tests with half type

* Update docs to be consistent with Conv notation
Fix onnx
Clarified onnx symbolic wrapper

* fix typo

* Address comments
2018-02-22 11:56:41 -05:00
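
A brief usage sketch of the Layer Normalization module and functional interface added here (shapes are illustrative):

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    x = torch.randn(20, 5, 10)            # (batch, seq, features)

    # Module form: normalize over the last dimension.
    layer_norm = nn.LayerNorm(10)
    out = layer_norm(x)

    # Functional form added alongside the module.
    out_f = F.layer_norm(x, normalized_shape=(10,))
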
f96f3c312d Implement symbolic for slice operation (#5204) 2018-02-13 10:12:59 -08:00
8243e898ab allow dropout in RNN ONNX export except in training mode (#5160) 2018-02-09 16:04:27 -05:00
65fb885467 Bidirectional RNN export to ONNX (Elman/LSTM/GRU) (#5120) 2018-02-08 20:30:50 -05:00
c111cdfd1d Add onnx support for InstanceNorm (#4626)
* Add ONNX symbolic for instancenorm

* Fix some bugs
2018-02-07 10:54:30 -05:00
b2cfd961d3 Handle sequence lengths correctly when exporting RNNs to ONNX (#4695)
* PackedSequence: store batch_sizes as tensor

rather than converting to a list of python integers. This maintains
the invariant that a module's inputs/outputs are collections of
Variables.

In particular, this causes the JIT to no longer choke when flattening
and unflattening arguments.

* Handle sequence lengths correctly when exporting RNNs to ONNX

- when uniform sequence lengths are provided, correctly omit the
  argument when constructing the ONNX graph, so as not to fix the
  graph to the batch size.

- handle PackedSequences by floating them through the graph and
  eliminating them in an optimization pass. ONNX does not have packed
  sequences, but operates on a representation equivalent to
  PaddedSequence, so we hide the representation-switching from ONNX

- as a preliminary step towards handling PackedSequences, not directly
  tied to ONNX export, change batch_sizes from being an argument to
  the RNN operators into being an argument to the forward() function
  of those RNN operators. This more closely models the reality that
  batch_sizes are effectively part of the input sequences.
2018-02-06 21:40:27 -05:00
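
For context, a sketch of the PackedSequence workflow this refers to: batch_sizes is stored on the PackedSequence and flows into the RNN's forward() rather than being baked into the operator (torch.nn.utils.rnn API; shapes illustrative):

    import torch
    import torch.nn as nn
    from torch.nn.utils.rnn import pack_padded_sequence, pad_packed_sequence

    lstm = nn.LSTM(input_size=4, hidden_size=8, batch_first=True)
    x = torch.randn(3, 5, 4)              # batch of 3, padded to length 5
    lengths = [5, 3, 2]                   # true lengths, longest first

    packed = pack_padded_sequence(x, lengths, batch_first=True)
    # packed.batch_sizes travels with the input into forward().
    out_packed, (h, c) = lstm(packed)
    out, out_lengths = pad_packed_sequence(out_packed, batch_first=True)
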
5b43c22f73 Add symbolic_override_first_arg_based (#4799)
* Add symbolic_override_first_arg_based

* flake fix

* comment

* remove comment (keep forgetting about this PR)
2018-01-30 16:41:43 +01:00
27505e6429 Fix #4480 by tracing inputs before running function. (#4807)
* Fix #4480 by tracing inputs before running function.

The DCE trick says that if I have y = f(x), and f is internally implemented as
g, it's OK to trace both g and f. Recall the tracing algorithm is:

    enter f(x)
    compute its result y
    trace y = f(x)
    return from f

So when you run the example above, you'll do this:

    # suppose x is mapped to %1
    enter f(x)
    enter g(x)
    result of g is y
    trace y = g(x a.k.a. %1) (mapping y to %2)
    return from g
    result of f is y
    trace y = f(x a.k.a. %1) (remapping y to %3)
    return from f

and end up with a trace like this:

    %2 = g(%1)
    %3 = f(%1)

... only %3 is live, because %2 was killed from the mapping...  Subsequent DCE
will eliminate the invocation of g and you'll only see f in the final trace.

However, if f and g are inplace functions, the machinery breaks:

    # suppose x is mapped to %1
    enter f(x)
    enter g(x)
    result of g is x
    trace x = g(x a.k.a. %1) (remapping x to %2)
    return from g
    result of f is x
    trace x = f(x a.k.a. %2) (remapping x to %3)
    return from f
    resulting in:

    %2 = g(%1)
    %3 = f(%2) # OOPS

This commit changes the strategy so we instead do this:

    enter f(x)
    trace f(x)
    compute its result y
    trace y = f(x)  (computed above)
    return from f

Now we get the correct Value before it is overwritten.
Here is what the new trace code looks like:

    jit::tracer::PreTraceInfo trace_info;
    if (jit::tracer::isTracing( self, index )) {
      trace_info = jit::tracer::preRecordTrace( "index_fill", { self, index } );
      setattr(trace_info.n, jit::Symbol("dim"), dim);
      setattr(trace_info.n, jit::Symbol("value"), value);
    }
    baseType->index_fill_(self_, dim, index_, value);
    increment_version(self);
    rebase_history(self, grad_fn);
    if (trace_info.state != nullptr) {
      jit::tracer::postRecordTrace( trace_info,  { self } );
    }

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

* Revert "Hot patch ONNX _run_symbolic_function"

This reverts commit d1c973fee1a20da86d60d526e253ce89f5840baf.

* lintfix

Signed-off-by: Edward Z. Yang <ezyang@fb.com>

* Add missing expect file

Signed-off-by: Edward Z. Yang <ezyang@fb.com>
2018-01-23 18:06:55 -05:00
70f0436335 add Elman RNN export to ONNX (#4613) 2018-01-23 13:56:11 -05:00
b42f163835 [ONNX] export sum, prod, sqrt and improve log_softmax. (#4579)
* ONNX: export sum, prod, sqrt, improve log_softmax, and fix a typo in doc.

Signed-off-by: HE, Tao <sighingnow@gmail.com>

* Add new exported op to doc.

Signed-off-by: HE, Tao <sighingnow@gmail.com>

* Double quotes.

Signed-off-by: HE, Tao <sighingnow@gmail.com>

* Update trace log of log_softmax.

Signed-off-by: HE, Tao <sighingnow@gmail.com>

* Improve export when dim is None; axes_i should be a list of ints.

Signed-off-by: HE, Tao <sighingnow@gmail.com>

* Fix prod when no dim given.

Signed-off-by: HE, Tao <sighingnow@gmail.com>

* Update line ends in test expected file.

Signed-off-by: HE, Tao <sighingnow@gmail.com>
2018-01-12 07:44:56 -05:00
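
A small export sketch exercising the newly supported ops (module and filename are made up for illustration; torch.onnx.export call shape assumed):

    import torch
    import torch.nn.functional as F

    class Reduce(torch.nn.Module):
        def forward(self, x):
            # sum, prod and sqrt are newly exportable; log_softmax export improved
            reduced = torch.sqrt(x.sum(dim=1)) * x.prod(dim=1)
            return reduced + F.log_softmax(x, dim=1).sum(dim=1)

    torch.onnx.export(Reduce(), torch.rand(2, 3), "reduce.onnx")
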
77523df413 Add more check on softmax ONNX exporting logic (#4592)
* Add more check on softmax exporting logic

* Add more comments about axis and dim
2018-01-11 15:14:33 -05:00
d3b6c5e556 Support output_padding in ConvTranspose while doing ONNX exporting (#4583) 2018-01-11 12:31:06 -05:00
a3f4fa254c support GRU export to ONNX (#4390) 2018-01-08 19:56:29 -05:00
d1c973fee1 Hot patch ONNX _run_symbolic_function 2018-01-04 13:17:21 -05:00
5b91b240d2 adds missing argument (#4446) 2018-01-04 01:51:47 -05:00
e6cbe84bf6 Handle repeated inputs in JIT tracer 2018-01-03 17:29:27 +01:00
f05ca657dd added fix for #4408 and a test (#4452)
* added fix for #4408 and a test

* forgot import

* moved test to onnxbot/onnx-fb-universe
2018-01-03 10:23:50 -05:00
20b5e82155 Implement embedding in ATen (#4322)
Implements nn.Embedding (lookup table) in ATen.

Breaking change: new optional argument padding_idx in F.embedding to
match nn.Embedding.

Note that there are a few bugs in Embedding that are inherited from the
previous code:

 - CUDA renorm has race conditions if index contains duplicate entries
 - sparse gradient doesn't work with scale_grad_by_freq
2018-01-02 15:44:46 -05:00
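
A sketch of the breaking change noted above: F.embedding now takes an optional padding_idx, matching nn.Embedding.

    import torch
    import torch.nn.functional as F

    weight = torch.randn(10, 3)                   # 10-row embedding table
    idx = torch.tensor([[1, 2, 0], [4, 0, 0]])

    # Functional form with the new optional argument; gradients for the
    # padding row are zeroed during backward.
    out = F.embedding(idx, weight, padding_idx=0)

    # Matching module form.
    emb = torch.nn.Embedding(10, 3, padding_idx=0)
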
fec3d4a079 RNN support has been implemented (#4409)
* RNN support has been implemented

4447b80b5e was merged in, and RNN is now supported
2017-12-30 09:26:36 +09:00
410fd58b4f support RNN export (#4163)
Currently only 1-layer RNNs are supported
2017-12-27 18:10:53 -05:00
5b8fe5cbb5 Batchnorm in ATen (#4285)
* Batchnorm in ATen

This commit moves BatchNorm derivatives into ATen, eliminating
torch/csrc/autograd/functions/batch_normalization.cpp

Some refactoring along the way:

- Functions got renamed to remove _forward from their names
- CuDNN batchnorm forward was modified to return save_mean/save_std instead of
  taking them as parameters. To avoid returning undefined Variables, these return
  (small) uninitialized tensors when they are not used.
- THNN batch normalization takes care of resizing save_mean and save_std on
  forward.
- There are some shenanigans re batchnorm backwards in eval mode. I'm tracking
  that in #4284
- I decided not to introduce buffers as a proper concept in ATen, which means
  that tensors like running_mean/running_var are variables in ATen.  This meant
  there needed to be some adjustments to how we *trace* such variables; the
  new strategy is if we can't find a Value for a variable, we look and see
  if we have a Value for the buffer pointed to by the variable, before
  finally falling back on constant.
- This PR finally reliably triggered OOM on Travis builds; I fixed this by reducing
  the number of parallel jobs.
- Stop using std::string when it's not necessary.
- Remove training parameter from cudnn_batch_norm_backward, because it
  doesn't make sense; cuDNN doesn't implement the math for evaluation mode
  batchnorm backwards.
- batchnorm_double_backward is now in an anonymous namespace, as it
  no longer needs to be called from torch/csrc

Signed-off-by: Edward Z. Yang <ezyang@fb.com>
2017-12-21 11:38:31 -05:00
b6a30f7ede Move SELU to ATen (#4269)
Fuse scale multiplication into ELU
2017-12-20 16:32:21 -05:00
689ef9cba3 Move upsampling to ATen (#4264) 2017-12-20 15:12:07 -05:00
a88a8ec827 Convolution derivatives in ATen (#4116)
* Convolution derivatives in ATen

This PR introduces ATen implementation of convolution, which dispatches to
THNN/CuDNN/nnpack based on input parameters. The general strategy is to compose
this function out of the various forward-backward pairs of specific
implementations, rather than write a monolithic function with backwards (which
is what we did before because the boilerplate of doing it otherwise would have
been very high.) The new API provides the following functions:

  - _convolution, which is a fully generic, native convolution implementation
    that dispatches to various other convolution implementations depending on
    input characteristics. This is prefixed with an underscore because it
    explicitly takes benchmark, deterministic and cudnn_enabled which are
    implementation details for CuDNN. The intent is to eventually provide a
    convolution that reads these parameters out of the context using #4104.
  - _convolution_nogroup is a convolution implementation for non-CuDNN
    algorithms which don't support group convolution natively.
  - _convolution_double_backward is the generic double-backwards implementation
    for convolution.

In more detail:

- Most functionality from torch/csrc/autograd/functions/convolution.cpp has been
  moved into aten/src/ATen/native/Convolution.cpp
- We continue to make use of ConvParams, but we now construct the parameters
  upon entry to a function from the function signature (which does not use
  ConvParams; having convolution take ConvParams directly would require teaching
  the code generator how to accept these as parameters, complicating ATen's API
  model) and destruct them when making subprocedure calls.
- I introduce a new idiom, input_r, which represents a const Tensor& reference,
  which will subsequently be assigned to a local Tensor input. This is helpful
  because a lot of the existing algorithms relied on being able to assign to
  locals, which is not permitted with a const reference.
- The native argument parser now supports std::array<bool,2> inputs (NB: there
  MUST NOT be a space; this is the same hack as is applied to derivatives.yaml)
- Native parser now supports Tensor? arguments, which indicates a nullable
  tensor. Previously this function was only used by NN methods.
- Documentation updates on THNN library
- I added an extra fgradInput argument to VolumetricConvolutionMM_updateOutput
  and VolumetricConvolutionMM_accGradParameters so that its buffer list lines up
  with the backward argument list. This makes it possible to write derivative
  for conv3d which previously was not supported (commented out in
  derivatives.yaml)
- Extra double_backward declarations for all convolution backwards functions were
  added.
- You can now use the syntax Tensor? in native_functions.yaml to indicate that a
  tensor argument is nullable.  There are adjustments to propagate this to the
  Python argument parser.
- NNPACK was ported to ATen, and ATen now builds and links against NNPACK if
  possible. New AT_NNPACK_ENABLED macro. The nnpack functions are named
  nnpack_spatial_convolution.
- Some modest CuDNN convolution refactoring to remove _forward from names.
- There's a new cudnn_convolution_backward function to deal with the fact that
  CuDNN convolution double backward requires you to have computed all gradients
  in one go.
- Variable set_flags now checks if the tensor is undefined, fixing a silent memory
  corruption.
- checkSameType updated to not raise an exception if called with Variable arguments
- "no ATen declaration found for" error message is improved to say what available declarations are
- make_variable now accepts undefined tensors, and returns an undefined tensor in this case.
2017-12-20 14:19:27 -05:00
b476d10c64 Move max_pool1d to ATen (#4257) 2017-12-19 20:10:11 -05:00
8c8114801b Fix onnx export of replication pad (#4263) 2017-12-19 20:07:01 -05:00
cb4f6c3148 conv_tbc (#3730)
attempt to rebase

skip conv_tbc in preprocess_nn_functions

Add conv_tbc symbolic

Fix backward issue with dBias

ConvTBC nn wrapper and unit test
2017-12-18 23:52:36 -05:00
cab5921227 Improve symbolic hack a bit (#4143) 2017-12-16 18:44:26 +01:00
6d72c82985 Trace ATen native functions as themselves, not their implementations. (#4127)
* Trace ATen non-primitive functions as themselves, not their implementations.

Previously, if I invoked an ATen non-primitive function foo, which in turn
called subfoo, I would always see 'subfoo' in the trace (e.g., tracing
'inlines' all of these operations.)  Such inlining is bad for ONNX
(and can be bad for optimization) as it prevents high-level
optimizations from taking advantage of the structure.  It might
be right to inline, but give the optimizer a chance to work before
inlining happens!

The implementation here is surprisingly simple, because it uses
the "DCE trick".  Essentially, it doesn't matter if the constituent
calls perform tracing, because you can always trace it again, and
override the trace nodes associated with the returned variables.
The original trace becomes dead and can be DCE'd.

While implementing this, I also refactored how 'isTracing' and
'trace_outputs' works:

- isTracing was previously a single function with overloads for
  both Tensor and Variable arguments.  Unfortunately, such overloads
  are not safe, because of how C++ implicit conversions work.  You
  would think that C++ should never confuse an overload for
  Variable with ArrayRef<Tensor>, but this is exactly what can
  happen: Tensor is convertible to both Variable and ArrayRef<Tensor>,
  thus it's ambiguous and C++ doesn't like it.  The last time I ran
  into this problem, I applied initializer lists to everything and
  called it a day.  A more robust fix is to separate out the
  Variable and Tensor overloads, which I have done in this patch.

- trace_outputs was fed as an initializer list, which doesn't work
  when you have heterogeneous inputs.  So instead we first feed
  everything through 'flatten', which has overloads for each of the
  argument patterns in ATen, which then goes on to the recordTrace
  (which takes an ArrayRef).  This is *no less efficient*, because
  we were allocating a vector anyway (to do the conversion from
  vector of Tensor to vector of Variable).

These fixes mean that 'index' can properly be traced... although the
JIT still does not support it.  A failing test case has been added to
this effect.

Some knock-on effects:

- The fuser now knows about chunk as well as split.  They're pretty
  similar so there is no problem.

- There is a new 'canonicalize' pass in the JIT which renumbers a graph
  so that all structurally equivalent graphs render the same.

- We run DCE before the fuser tests, to make sure dead nodes don't
  block fusion.

- There are new ONNX exports for the newly introduced higher level ATen
  operations.  This includes type_as (no-op case only), chunk, select.

Zach didn't like the extra use of 'native' in the new codegen, so
we've introduced a new concept, 'abstract'.  An abstract function
is one that is implemented in derived types (e.g., CPUDoubleType),
whereas a concrete one is implemented in the base type (Type).

Signed-off-by: Edward Z. Yang <ezyang@fb.com>
2017-12-15 13:50:32 -05:00
595c6dea71 Create an ONNX ATen exporting mode (#3489) 2017-12-14 22:36:53 -05:00
def4b78b6f adding index_select to symbolic.py (#4061) 2017-12-14 22:33:53 -05:00
84d8e81311 Fix the symbolic for view 2017-12-05 02:31:49 -05:00
fcc142386b Make pool export compliant with onnx spec 2017-12-05 02:27:02 -05:00
7ddcb91c7f Add more ONNX symbolics 2017-12-04 07:15:35 -05:00
de00aab720 PyTorch now uses operator versioning.
Also move some of the exporter info out of the ModelProto constructor.

Signed-off-by: Edward Z. Yang <ezyang@fb.com>
2017-11-30 23:09:45 -05:00
ab0a7eb7bf Add ONNX symbolics for several ops (#3956) 2017-11-30 16:40:13 -05:00
b97dfc8a92 Pretty names: support names set via export or Variable constructor (#3371)
Add (fully opt-in) functionality to support setting pretty names for
nodes in the graph. In particular

- Variable now has a `name` parameter in the constructor
- export now has `input_names` and `export_names` parameters

Nodes that are not named via this mechanism continue to be named
internally with unique integers.

Names have a few rules.

- They must all be unique in the graph.
- They may not be integers (because of potential conflicts with
  internally generated names).
2017-11-16 21:11:34 -05:00
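
A hedged sketch of the opt-in naming at export time, using the input_names parameter mentioned in the message (model and filename are illustrative; the Variable-constructor name argument is era-specific and omitted):

    import torch

    model = torch.nn.Linear(4, 2)
    dummy = torch.randn(1, 4)

    # Name the graph inputs instead of relying on auto-generated integers.
    torch.onnx.export(model, dummy, "linear.onnx",
                      input_names=["features"])
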
5b4a438563 Implement bmm symbolic (#3681) 2017-11-16 19:57:02 -05:00
ef4b19f767 Refactor ir.h to distinguish Nodes and Values
This commit adds a Value type similar to the one @ezyang suggested a while
ago for handling multi-return nodes.

Previously if we had a graph like:

  a = op1(b)
  c, d = op2(a)

Then its in-memory format would look like:

  %0 = op1(b)
  %1 = op2(%0)
  %2 = select(%1, 0)
  %3 = select(%1, 1)

Select nodes were used only to handle the multi-output case. In the
single-output case ops referred directly to their uses.

This required special handling for the single- and multi- output cases,
and was confusing when used with ONNX which distinguishes values (the
inputs/outputs of a node) from the nodes themselves (e.g. a Conv).

This commit adds the Node/Value distinction to the IR. In the example
above, `a`, `b`, `c`, and `d` are now Value objects, while `op1` and
`op2` are now Node objects. Inputs/Outputs to the graph are values.

* Nodes now always have multiple outputs, accessible through their `output()`
  method.
* Methods exist for adding/removing outputs from a node.
* Nodes own their output Values; destroying a node destroys its outputs, and it
is only valid to destroy a node when no uses of its outputs remain.
* Unlike select, Values do not appear in the nodes list.
* The method `node()` on `Value` retrieves its defining node. Calling it
is always valid. For inputs, its kind is "Param". Like "Return", there is a single Param
node representing all inputs.
* For single-output Nodes, the method `output()` retrieves the single
output Value, asserting that the node is in-fact single output.
* Functions are the same, but some functions like `type()` have moved to
Value.
* `replaceAllUsesWith` is now sanely defined for both Values and Nodes.
In the case of Nodes, it replaces all outputs of the node with the outputs
of the replacement node.
* stage is defined on both Node and Value. This is because inputs require a stage.
* Apart from changing data types from Node->Value most passes remain the same.
  Things that previously assumed single-output nodes now have to call output()
  to get the node.
* This removes the uses = [...] field in the outputs because it was
getting confusing even before this commit: uses would refer to nodes,
but we print the names of Values. The lint pass validates the use list,
so printing it out seems less necessary.
2017-11-15 11:47:18 -08:00
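
To make the Node/Value split concrete, a small Python-side sketch that walks a traced graph (today's torch.jit inspection API is used for illustration; these bindings expose the C++ IR described above):

    import torch

    def f(x):
        y = x * 2
        a, b = y.chunk(2, dim=0)          # one Node producing two output Values
        return a + b

    traced = torch.jit.trace(f, torch.randn(4, 3))
    for node in traced.graph.nodes():     # Nodes are the ops
        outs = list(node.outputs())       # Values are the data they define
        print(node.kind(), [v.debugName() for v in outs])
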
1c1519d7cf Fix export for recent changes in ONNX (#3708) 2017-11-14 21:46:53 -05:00
47ac468504 Remove dilations for pooling in onnx export and other small fixes (#3698)
* fix optimization pass issues

* remove pool dilations
2017-11-14 14:28:05 -05:00