Files
pytorch/test/quantization/serialized/TestSerialization.test_conv2d_nobias_graph_v3.input.pt
David Reiss d63c236fb3 Introduce quantized convolution serialization format 3 (#60241)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/60241

We're going to make a forward-incompatible change to this serialization
format soon, so I'm taking the opportunity to do a little cleanup.

- Use int for the version.  This was apparently not possible when V2
  was introduced, but it works fine now as long as we use int64_t.
  (Note that the 64 bits are only used in memory.  The serializer will
  use 1 byte for small non-negative ints.)
- Remove the "packed params" tensor and replace it with a list of ints.
- Replace the "transpose" field with "flags" to allow more binary flags
  to be packed in.
- Unify required and optional tensors.  I just made them all optional
  and added an explicit assertion for the one we require.  (A sketch of
  the resulting layout follows this list.)
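A minimal Python sketch of the layout described above, for illustration
only: the function names, the exact ordering of the int list, and the bit
assignment in `flags` are assumptions, not the actual PyTorch
serialization code.

```python
from typing import List, Optional, Tuple

import torch

# Illustrative v3 layout: (version, config ints, optional tensors).
ConvPackedV3 = Tuple[int, List[int], List[Optional[torch.Tensor]]]


def pack_conv_params_v3(
    weight: torch.Tensor,
    bias: Optional[torch.Tensor],
    stride: List[int],
    padding: List[int],
    dilation: List[int],
    groups: int,
    transpose: bool,
) -> ConvPackedV3:
    # Binary options are packed into a single "flags" int so that more
    # flags can be added later without changing the structure.
    flags = int(transpose)  # bit 0: transpose (hypothetical bit layout)
    # Scalar configuration goes into a flat list of ints instead of a
    # "packed params" tensor.
    config: List[int] = [*stride, *padding, *dilation, groups, flags]
    # All tensors are optional; the one we actually require is checked
    # explicitly at unpack time.
    tensors: List[Optional[torch.Tensor]] = [weight, bias]
    return (3, config, tensors)


def unpack_conv_params_v3(packed: ConvPackedV3) -> torch.Tensor:
    version, config, tensors = packed
    assert version == 3
    weight = tensors[0]
    assert weight is not None, "weight is required"
    return weight
```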

A bit of a hack: I added an always-absent tensor to the front of the
tensor list.  Without this, when passing unpacked params from Python to
the ONNX JIT pass, the type would be inferred as `List[Tensor]` if all
tensors were present, making it impossible to cast to
`std::vector<c10::optional<at::Tensor>>` without jumping through hoops.
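
A hedged illustration of that trick, extending the sketch above (the
tensor names are hypothetical; the real cast happens on the C++/JIT
side):

```python
from typing import List, Optional

import torch

weight = torch.randn(8, 4, 3, 3)
bias: Optional[torch.Tensor] = None

# With only real tensors present, JIT type inference would produce
# List[Tensor]; prepending an always-absent entry forces
# List[Optional[Tensor]], which maps cleanly onto
# std::vector<c10::optional<at::Tensor>> on the C++ side.
tensors: List[Optional[torch.Tensor]] = [None, weight, bias]

assert tensors[1] is not None, "weight is required"
```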

The plan is to ship this, along with another diff that adds a flag to
indicate numerical requirements, wait a few weeks for a forward-compatibility
(FC) grace period, then flip the serialization version.

Test Plan: CI.  BC tests.

Reviewed By: vkuzo, dhruvbird

Differential Revision: D29349782

Pulled By: dreiss

fbshipit-source-id: cfef5d006e940ac1b8e09dc5b4c5ecf906de8716
2021-06-24 20:52:43 -07:00

1.1 KiB