2026, Jan 11 07:00
Safely shift the z-channel in batched 3D point tensors in PyTorch without breaking autograd
Learn how to add a per-batch z-offset to 3D point tensors in PyTorch without in-place view ops. Preserve autograd gradients with an out-of-place approach.
Shifting only the z-channel of a batched point set in PyTorch looks trivial until autograd gets involved. A single in-place assignment on a view can break the computational graph and throw a runtime error. Below is a clear, reproducible path from the problematic pattern to a safe, out-of-place solution that preserves gradients.
Problem setup
Assume a tensor of 3D points with shape (B, 3, N), where the second dimension corresponds to x, y, z channels, and a per-batch z-shift of shape (B, 1). The goal is to add the respective z-shift to all points in each batch while keeping x and y intact.
import torch
# 2 batches, 3 channels (x, y, z), 5 points
pts = torch.rand(2, 3, 5, requires_grad=True)
# one z shift per batch
z_offset = torch.tensor([[1.0], [10.0]], requires_grad=True)
# naive attempt (in-place on a view)
pts[:, 2, :] += z_offset
This pattern works when gradients are not tracked, but as soon as pts is a leaf tensor with requires_grad=True, the in-place assignment itself raises an error:
RuntimeError: a view of a leaf Variable that requires grad is being used in an in-place operation.
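For contrast, the same in-place write is legal once gradient tracking is out of the picture, for example on a tensor created without requires_grad. A minimal sketch:
import torch
# no requires_grad, so there is no graph for autograd to protect
pts_plain = torch.rand(2, 3, 5)
z_offset = torch.tensor([[1.0], [10.0]])
# the in-place write on the view succeeds because nothing is being tracked
pts_plain[:, 2, :] += z_offset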
Why this fails
In PyTorch, tensors created directly by the user are leaf tensors, and basic slicing such as pts[:, 2, :] returns a view that shares the leaf's underlying storage. An in-place assignment on that view would mutate the leaf's data without any recorded operation that autograd could later differentiate through, so the gradients flowing back to the original values could no longer be computed correctly. Rather than risk silently wrong results, PyTorch disallows in-place operations on views of leaf tensors that require gradients and raises the error shown above.
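A short experiment makes the storage sharing visible. This sketch uses a tensor without gradient tracking purely to inspect how the view behaves (untyped_storage is available on recent PyTorch releases):
import torch
pts_plain = torch.zeros(2, 3, 5)
z_view = pts_plain[:, 2, :]   # basic slicing returns a view, not a copy
z_view += 1.0                 # writing through the view...
print(pts_plain[:, 2, :])     # ...mutates the original tensor: all ones
# both objects point at the same underlying storage
print(z_view.untyped_storage().data_ptr() == pts_plain.untyped_storage().data_ptr())  # True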
Safe and differentiable approach
The fix is to avoid in-place updates and construct a new tensor out-of-place. Instead of writing into the z-channel directly, build a fresh tensor by stacking the untouched x and y channels with the shifted z channel.
import torch
# original inputs
pts = torch.rand(2, 3, 5, requires_grad=True)
z_offset = torch.tensor([[1.0], [10.0]], requires_grad=True)
# out-of-place construction preserves autograd graph integrity
pts_shifted = torch.stack([
    pts[:, 0, :],
    pts[:, 1, :],
    pts[:, 2, :] + z_offset,
], dim=1)
This approach leaves the original storage untouched and creates a new tensor whose construction is fully recorded in the graph. The (B, 1) z_offset broadcasts against the (B, N) z-slice, so every point in a batch receives that batch's shift.
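A quick sanity check, using a plain sum as a stand-in for a real loss, confirms that gradients flow back to both inputs:
import torch
pts = torch.rand(2, 3, 5, requires_grad=True)
z_offset = torch.tensor([[1.0], [10.0]], requires_grad=True)
pts_shifted = torch.stack([
    pts[:, 0, :],
    pts[:, 1, :],
    pts[:, 2, :] + z_offset,
], dim=1)
# backpropagate a scalar to populate .grad on the leaves
pts_shifted.sum().backward()
print(pts.grad.shape)   # torch.Size([2, 3, 5]); every input point contributes once
print(z_offset.grad)    # tensor([[5.], [5.]]); each shift is broadcast to 5 points in its batch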
Why you should care
Autograd is strict about operations that can silently corrupt the graph. In-place edits on views look harmless but can invalidate gradient computation. Understanding the distinction between leaf tensors, their views, and how storage is shared helps avoid brittle code paths and hard-to-debug training issues.
Takeaways
When working with tensors that require gradients, avoid in-place modifications on views. If you need to change one channel or slice, rebuild the output tensor out-of-place by composing the untouched slices with the modified one, as with torch.stack above. The result is functionally identical, differentiable, and robust inside any nn.Module forward pass.
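To make that last point concrete, here is a minimal sketch of the same pattern wrapped in a module; the ZShift class name is only a placeholder for illustration:
import torch
import torch.nn as nn

class ZShift(nn.Module):
    """Adds a per-batch z-offset to (B, 3, N) point tensors, out-of-place."""
    def forward(self, pts: torch.Tensor, z_offset: torch.Tensor) -> torch.Tensor:
        return torch.stack([
            pts[:, 0, :],
            pts[:, 1, :],
            pts[:, 2, :] + z_offset,
        ], dim=1)

pts = torch.rand(2, 3, 5, requires_grad=True)
z_offset = torch.tensor([[1.0], [10.0]], requires_grad=True)
shifted = ZShift()(pts, z_offset)
shifted.mean().backward()   # gradients reach both pts and z_offset without errors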