2025, Nov 01 13:00

In-place outputs in pybind11 callbacks: avoiding copies with reference-like views, Eigen::Ref, or shared_ptr

Learn why pybind11 copies break in-place C++ to Python callback updates, and fix them with reference-like views (Eigen::Ref), shared_ptr, or NumPy buffers.

Bridging a C++ optimization core with a Python API often hinges on callbacks that fill output buffers in place. A common surprise with pybind11 is that a Python-side write appears to succeed, yet the original C++ object remains unchanged. If your callback takes a Vector by non-const reference and still nothing sticks after the call, you’re most likely hitting an ownership and copy boundary.

Problem setup

Consider a solver that passes two arguments into a Python callback: the input state and an output container that the callback should update in place. The intent is clear, yet after the call the C++ container is unchanged.

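// C++ side (pybind11 bindings and call site)
#include "DenseVec.hpp"
#include <cstddef>
#include <functional>
#include <pybind11/pybind11.h>
#include <pybind11/functional.h>  // needed to convert a Python callable to std::function
namespace py = pybind11;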

void optimize(const std::function<void(const DenseVec&, DenseVec&)>& apply_constraints) {
  const DenseVec xi = ...;
  DenseVec bounds = ...;
  apply_constraints(xi, bounds);
}

PYBIND11_MODULE(corebind, m) {
  py::class_<DenseVec>(m, "DenseVec")
    .def(py::init<size_t>(), "Constructor")
    .def("__getitem__", [](const DenseVec& v, size_t i) {
      return v[i];
    })
    .def("__setitem__", [](DenseVec& v, size_t i, double val) {
      v[i] = val;
    });

  m.def("optimize", &optimize);
}

# Python side
import corebind

def apply_constraints(xx, out_v):
    out_v[0] = ...  # placeholder: some function of xx
    out_v[1] = ...  # placeholder: some function of xx
    ...

corebind.optimize(apply_constraints)

Despite writes on the Python object, the C++ container doesn’t reflect those changes after the callback returns.

What’s really happening

When the std::function wrapper forwards bounds to the Python callable, pybind11 has to turn the C++ reference into a Python object, and for an lvalue reference its default is to copy. Conceptually, the Python wrapper the callback receives owns its own DenseVec, copy-constructed from yours. Python therefore mutates the copy held by the wrapper, not the original object you created on the C++ stack, and that copy is discarded once the callback returns. Adjusting return value policies like py::return_value_policy::reference_internal won’t fix this specific situation: a policy passed to def() governs values returned from C++ to Python, whereas here the wrapper is built for an argument passed into a callback, and it still owns a copied payload.
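The effect can be reproduced without pybind11. The sketch below is a plain C++ model, not pybind11's actual implementation: OwningWrapper, call_with_copy and call_with_pointer are hypothetical names, and std::vector<double> stands in for DenseVec. It only illustrates why writes through an owning copy never reach the caller's object, while writes through a pointer do.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

// Stand-in for DenseVec (assumption: any copyable container behaves the same).
using Vec = std::vector<double>;

// Hypothetical model of a wrapper that owns a copy of the argument.
struct OwningWrapper {
  Vec payload;  // copy-constructed from the caller's object
  explicit OwningWrapper(const Vec& src) : payload(src) {}
  void set(std::size_t i, double val) {
    assert(i < payload.size());
    payload[i] = val;
  }
};

// Models the copy boundary: the callback only ever sees the wrapper's copy.
inline Vec call_with_copy(const std::function<void(OwningWrapper&)>& callback) {
  Vec bounds(2, 0.0);
  OwningWrapper wrapper(bounds);  // the copy happens here
  callback(wrapper);              // writes land in wrapper.payload
  return bounds;                  // untouched by the callback
}

// Same call, but the callback receives a pointer to the original storage.
inline Vec call_with_pointer(const std::function<void(Vec*)>& callback) {
  Vec bounds(2, 0.0);
  callback(&bounds);  // writes land in bounds itself
  return bounds;
}
```

With a callback that writes 1.5 and 2.5 into the two slots, call_with_copy still returns two zeros, while call_with_pointer returns the written values.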

A practical way out: reference-like view

One effective approach is to expose a lightweight view that only points to the underlying C++ data. Instead of wrapping the vector itself, wrap a reference-like object that forwards reads and writes to the original instance. Copying that wrapper only copies a pointer, so Python modifications hit the original storage. You must be very careful with lifetimes, though: Python owns the view, not the data it points to, and can keep the view alive beyond the scope where the pointee is valid. Any access through it after that point is undefined behavior. A const-view flavor also helps avoid accidental copies where mutation isn’t needed.

// C++: a thin reference-like view
template <typename Vec>
struct DenseVecRefT {
  Vec* p = nullptr;
  explicit DenseVecRefT(Vec& vec) : p(&vec) {}
  double get(size_t i) const { return (*p)[i]; }
  void set(size_t i, double val) { (*p)[i] = val; }
};

using DenseVecRef = DenseVecRefT<DenseVec>;

// Updated solver: pass a view into the callback
void optimize(const std::function<void(const DenseVec&, DenseVecRef&)>& apply_constraints) {
  const DenseVec xi = ...;
  DenseVec bounds = ...;
  DenseVecRef out_view(bounds);
  apply_constraints(xi, out_view);
  // bounds has been updated through out_view
}

PYBIND11_MODULE(corebind, m) {
  py::class_<DenseVec>(m, "DenseVec")
    .def(py::init<size_t>(), "Constructor")
    .def("__getitem__", [](const DenseVec& v, size_t i) { return v[i]; })
    .def("__setitem__", [](DenseVec& v, size_t i, double val) { v[i] = val; });

  py::class_<DenseVecRef>(m, "DenseVecRef")
    .def("__getitem__", [](const DenseVecRef& ref, size_t i) { return ref.get(i); })
    .def("__setitem__", [](DenseVecRef& ref, size_t i, double val) { ref.set(i, val); });

  m.def("optimize", &optimize);
}

# Python: same callback, now receives a view that writes through
import corebind

def apply_constraints(xx, out_view):
    out_view[0] = ...  # placeholder: some function of xx
    out_view[1] = ...  # placeholder: some function of xx
    ...

corebind.optimize(apply_constraints)

The view behaves like a span: it does not own the data, and copying it copies only a pointer. A read-only counterpart can be exposed similarly if you want a const view. In practice, this reference-like strategy has worked well.
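As a rough illustration of the read-only counterpart, here is a minimal sketch. ConstVecView and sum_through_view are hypothetical names, and std::vector<double> again stands in for DenseVec. The view stores a pointer to const data and has no set(), so a write through it does not compile.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Stand-in for DenseVec (assumption for this sketch).
using Vec = std::vector<double>;

// Hypothetical read-only view: points at const data and exposes no setter.
struct ConstVecView {
  const Vec* p = nullptr;
  explicit ConstVecView(const Vec& vec) : p(&vec) {}
  double get(std::size_t i) const {
    assert(p != nullptr && i < p->size());
    return (*p)[i];
  }
  std::size_t size() const { return p->size(); }
};

// Reads every element through the view; the vector itself is never copied.
inline double sum_through_view(ConstVecView view) {
  double total = 0.0;
  for (std::size_t i = 0; i < view.size(); ++i) {
    total += view.get(i);
  }
  return total;
}
```

Bound with only __getitem__, a type like this would give Python a cheap way to read the input state without being able to modify it. The same lifetime caveat applies: the view is valid only while the vector it points to is alive.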

Alternative directions and trade-offs

It’s also possible to store a std::shared_ptr inside the Python object. If you don’t want Python to manage the lifetime, an empty deleter can be installed, but that still allocates a control block and is unsafe: user code can keep the Python object after the C++ object has been destroyed, and any later access is undefined behavior. From a safety perspective, allocating with make_shared and letting Python share ownership is cleaner, at the cost of the overhead of shared ownership.
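The two shared_ptr options can be compared in plain C++. This is a sketch under assumptions: borrow and make_owned are hypothetical helper names, and std::vector<double> stands in for DenseVec.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Stand-in for DenseVec (assumption for this sketch).
using Vec = std::vector<double>;

// Option 1: non-owning shared_ptr with an empty deleter. Writes reach the
// original, but nothing keeps `target` alive for copies that outlive it.
inline std::shared_ptr<Vec> borrow(Vec& target) {
  return std::shared_ptr<Vec>(&target, [](Vec*) { /* intentionally empty */ });
}

// Option 2: shared ownership from the start. The vector lives until the last
// shared_ptr copy is released, whoever holds it.
inline std::shared_ptr<Vec> make_owned(std::size_t n) {
  assert(n > 0);
  return std::make_shared<Vec>(n, 0.0);
}
```

On the pybind11 side, the second option pairs with declaring std::shared_ptr as the holder type, as in py::class_<DenseVec, std::shared_ptr<DenseVec>>, so that the Python object and the C++ code share one control block.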

Another route is to lean on NumPy and expose your matrix through the buffer protocol. You can provide both read-only and read-write views. However, creating the buffer from the Python object requires copying the C++ object into the Python wrapper, so the data cannot stay on the stack. To avoid the copy you would need to manually construct a NumPy array pointing to stack memory, which makes the code less safe. Alternatively, you can construct the Python object first (for example by casting to py::object) and have the C++ side read and write that copy.

If you already use Eigen, its integration is available out of the box. The same limitation applies, though: passing matrices by reference requires Eigen::Ref, which is the same reference-like idea described above.

Why this matters

When a callback is supposed to fill outputs in place, an unnoticed copy silently breaks correctness and performance. You pay extra allocations, write to the wrong buffer, and may not notice until downstream checks fail. Understanding how pybind11 wraps C++ storage helps you design bindings that preserve semantics and avoid subtle ownership bugs.

Takeaways

The crux is that the Python wrapper handed to the callback owns a copy of your C++ object, so writes from Python never reach the original. For in-place updates from Python back into C++, route writes through an object that references the original storage. A reference-like view is a direct and efficient fix if you manage lifetimes carefully. If you prefer shared ownership, wrap the data in std::shared_ptr and let Python share control of it. For matrix-heavy code, consider Eigen with Eigen::Ref, or NumPy buffers with an eye on where copies occur and what that means for safety. Choose the approach that matches your lifetime model and performance constraints, and treat ownership rules as a first-class part of your binding design.

The article is based on a question from StackOverflow by Charlie Vanaret - the Uno guy and an answer by Ahmed AEK.