Avoid temporary buffers for tensors with torch.save. (#80404)

Fix the torch.save _open_zipfile_writer optimization that uses a C++ stream when `f` is an os.PathLike.
This fast path requires that we avoid calling `open()` in Python when possible, so don't do it unconditionally.

Fix the PyTorchStreamWriter constructor binding that takes a buffer object:
use py::memoryview instead of py::bytes, as the former doesn't copy the data.
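The Python-level analogue of this distinction (a rough illustration only, not the binding itself): `bytes` snapshots the underlying buffer, while `memoryview` wraps it without copying.

```python
data = bytearray(b"x" * 8)

snapshot = bytes(data)   # copies the buffer, like py::bytes
view = memoryview(data)  # wraps the buffer, like py::memoryview

data[0] = ord("y")       # mutate the underlying storage

assert snapshot[0] == ord("x")  # the copy is unaffected
assert view[0] == ord("y")      # the view sees the change: no copy was made
```

For the writer callback this matters because the serialized chunks can be large; handing the consumer's `write()` a view avoids one full copy per chunk.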

Validated with a trivial benchmark that calls torch.save in a loop 20 times with a 10M-element float32 tensor,
on either CPU or CUDA, saving to /dev/null.

Tried two variants, 'str' and 'open':
    In 'str' we pass the string "/dev/null" to torch.save.
    In 'open' we pass `open("/dev/null", "wb")` to torch.save.
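A minimal sketch of such a harness (hypothetical helper names; shown with pickle.dump as a stand-in serializer so it runs without torch — substitute torch.save and a real tensor to reproduce the numbers above):

```python
import pickle
import time

def bench(fn, iters=20):
    """Time `iters` calls of the zero-argument thunk `fn`, in seconds."""
    start = time.perf_counter()
    for _ in range(iters):
        fn()
    return time.perf_counter() - start

payload = bytes(1_000_000)  # stand-in for a 10M-element float32 tensor

def save_str():
    # 'str' variant: the saver receives a path and opens it itself
    # (with torch.save, this is what hits the C++-stream fast path)
    with open("/dev/null", "wb") as f:
        pickle.dump(payload, f)

def save_open():
    # 'open' variant: the caller passes an already-open binary file object
    f = open("/dev/null", "wb")
    pickle.dump(payload, f)
    f.close()

print(f"str  :: {bench(save_str):.3f}")
print(f"open :: {bench(save_open):.3f}")
```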

Timings are in seconds.

Before this patch:
str-cpu :: 0.757
open-cpu :: 0.757
str-cuda :: 1.367
open-cuda :: 1.366

After this patch:
str-cpu :: 0.256
open-cpu :: 0.251
str-cuda :: 0.896
open-cuda :: 0.834

Fixes #ISSUE_NUMBER

Pull Request resolved: https://github.com/pytorch/pytorch/pull/80404
Approved by: https://github.com/jamesr66a
This commit is contained in:
Rodrigo Kumpera
2022-06-30 00:19:42 +00:00
committed by PyTorch MergeBot
parent fa6b6842e1
commit b4e491798c
2 changed files with 10 additions and 8 deletions


@@ -1339,8 +1339,9 @@ void initJITBindings(PyObject* module) {
       .def(py::init<std::string>())
       .def(py::init([](const py::object& buffer) {
         auto writer_func = [=](const void* data, size_t size) {
-          auto bytes = py::bytes(reinterpret_cast<const char*>(data), size);
-          buffer.attr("write")(std::move(bytes));
+          auto memory_view = py::memoryview::from_memory(
+              reinterpret_cast<const char*>(data), size);
+          buffer.attr("write")(std::move(memory_view));
           return size;
         };
         return std::make_unique<PyTorchStreamWriter>(std::move(writer_func));