Avoid temporary buffers for tensors with torch.save. (#80404)

Fix the torch.save _open_zipfile_writer optimization that uses a C++ stream when `f` is an os.PathLike.
This fast path requires that we avoid calling `open()` in Python when possible, so don't do it unconditionally.

Fix the PyTorchStreamWriter constructor binding that takes a buffer object:
use py::memoryview instead of py::bytes, as the former doesn't copy the data.
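The Python-level analogue of this distinction (a rough illustration only, not the binding itself): `bytes` snapshots the underlying buffer, while `memoryview` wraps it without copying.

```python
data = bytearray(b"x" * 8)

snapshot = bytes(data)   # copies the buffer, like py::bytes
view = memoryview(data)  # wraps the buffer, like py::memoryview

data[0] = ord("y")       # mutate the underlying storage

assert snapshot[0] == ord("x")  # the copy is unaffected
assert view[0] == ord("y")      # the view sees the change: no copy was made
```

For the writer callback this matters because the serialized chunks can be large; handing the consumer's `write()` a view avoids one full copy per chunk.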

Validated with a trivial benchmark that calls torch.save in a loop 20 times with a 10M-element float32 tensor,
on either CPU or CUDA, saving to /dev/null.

Tried two variants, 'str' and 'open':
    In 'str' we pass the string "/dev/null" to torch.save.
    In 'open' we pass `open("/dev/null", "wb")` to torch.save.
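A minimal sketch of such a harness (hypothetical helper names; shown with pickle.dump as a stand-in serializer so it runs without torch — substitute torch.save and a real tensor to reproduce the numbers above):

```python
import pickle
import time

def bench(fn, iters=20):
    """Time `iters` calls of the zero-argument thunk `fn`, in seconds."""
    start = time.perf_counter()
    for _ in range(iters):
        fn()
    return time.perf_counter() - start

payload = bytes(1_000_000)  # stand-in for a 10M-element float32 tensor

def save_str():
    # 'str' variant: the saver receives a path and opens it itself
    # (with torch.save, this is what hits the C++-stream fast path)
    with open("/dev/null", "wb") as f:
        pickle.dump(payload, f)

def save_open():
    # 'open' variant: the caller passes an already-open binary file object
    f = open("/dev/null", "wb")
    pickle.dump(payload, f)
    f.close()

print(f"str  :: {bench(save_str):.3f}")
print(f"open :: {bench(save_open):.3f}")
```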

Timings are in seconds.

Before this patch:
str-cpu :: 0.757
open-cpu :: 0.757
str-cuda :: 1.367
open-cuda :: 1.366

After this patch:
str-cpu :: 0.256
open-cpu :: 0.251
str-cuda :: 0.896
open-cuda :: 0.834

Fixes #ISSUE_NUMBER

Pull Request resolved: https://github.com/pytorch/pytorch/pull/80404
Approved by: https://github.com/jamesr66a
This commit is contained in:
Rodrigo Kumpera
2022-06-30 00:19:42 +00:00
committed by PyTorch MergeBot
parent fa6b6842e1
commit b4e491798c
2 changed files with 10 additions and 8 deletions


@@ -1339,8 +1339,9 @@ void initJITBindings(PyObject* module) {
       .def(py::init<std::string>())
       .def(py::init([](const py::object& buffer) {
         auto writer_func = [=](const void* data, size_t size) {
-          auto bytes = py::bytes(reinterpret_cast<const char*>(data), size);
-          buffer.attr("write")(std::move(bytes));
+          auto memory_view = py::memoryview::from_memory(
+              reinterpret_cast<const char*>(data), size);
+          buffer.attr("write")(std::move(memory_view));
           return size;
         };
         return std::make_unique<PyTorchStreamWriter>(std::move(writer_func));