2025, Nov 09 07:00

How to Resolve PyTorch 'list' object has no attribute 'to' When Loading Batches from S3 with s3torchconnector

Fix PyTorch DataLoader crashes on SageMaker: s3torchconnector S3MapDataset.from_prefix may yield list-like batches. Normalize and move Tensors to device.

Training directly from S3 is convenient until the first batch crashes with a cryptic attribute error. If a pipeline expects a single Tensor per batch but the data source yields a list-like structure, a simple device move like samples.to(device) will fail. That’s exactly what can happen when using s3torchconnector.S3MapDataset.from_prefix in a SageMaker environment.

Repro: when a batch isn’t a Tensor

The following setup initializes a dataset from an S3 prefix and feeds it to a standard DataLoader. The model and loop are typical, but the critical detail is that the incoming batch isn't always a Tensor; with the S3-backed dataset it arrives as a list, which triggers 'list' object has no attribute 'to'.

from PIL import Image
import torch
import torchvision
from torchvision import transforms
import s3torchconnector
# Transform: each dataset item arrives as an s3torchconnector S3Reader,
# a file-like object that also exposes the object's .key. Return the
# key together with a float32 image tensor.
def fetch_img(obj_ref):
    pic = Image.open(obj_ref)
    resizer = transforms.Resize(size=(224, 224))
    pic = resizer(pic)
    pic = transforms.functional.pil_to_tensor(pic)
    return (obj_ref.key, torchvision.transforms.functional.convert_image_dtype(pic, dtype=torch.float32))
# S3-backed dataset
train_ds = s3torchconnector.S3MapDataset.from_prefix(
    cfg.IMAGES_URI,
    region=cfg.REGION,
    transform=fetch_img,
)
# DataLoader (train_sampler and cfg are defined elsewhere in the training script)
train_loader = torch.utils.data.DataLoader(
    train_ds,
    sampler=train_sampler,
    batch_size=cfg.batch_size,
    num_workers=cfg.num_workers,
    pin_memory=cfg.pin_mem,
    drop_last=True,
)
# Model
model = models_mae.__dict__[cfg.model](norm_pix_loss=cfg.norm_pix_loss)
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model.to(device)
# Training loop (fragment)
for ep in range(cfg.start_epoch, cfg.epochs):
    if cfg.distributed:
        train_loader.sampler.set_epoch(ep)
    for batch in train_loader:
        samples = batch
        samples = samples.to(device, non_blocking=True)  # raises: 'list' object has no attribute 'to'

What’s going on

The data pipeline is mixing two expectations. The training step assumes a Tensor so it can call .to(device, non_blocking=True). But the transform above returns a (key, tensor) tuple for each object, and the DataLoader's default collate function turns a batch of such tuples into a list: the object keys end up at index 0 and the stacked image tensor at index 1. As soon as .to is invoked on that list, the attribute error appears. The same loop can work with a local dataset whose samples are already bare Tensors, which is why the issue only surfaces after switching to S3.
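The structure is easy to verify without touching S3. A minimal sketch, using toy keys and random tensors, shows how PyTorch's default_collate reshapes a batch of (key, tensor) pairs:

import torch
from torch.utils.data import default_collate

# Two (key, tensor) samples, shaped like the output of fetch_img
samples = [
    ("images/0001.jpg", torch.rand(3, 224, 224)),
    ("images/0002.jpg", torch.rand(3, 224, 224)),
]

batch = default_collate(samples)
print(type(batch))     # <class 'list'>
print(batch[0])        # ['images/0001.jpg', 'images/0002.jpg']
print(batch[1].shape)  # torch.Size([2, 3, 224, 224])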

Fix: normalize the batch right before moving it to device

The practical resolution is to coerce the incoming batch to the expected Tensor. Because the transform returns a (key, tensor) pair, the collated list carries the keys at index 0 and the stacked image tensor at index 1; when the batch is already a Tensor, it can be moved directly. This conditional keeps both local and S3 training paths working.

for ep in range(cfg.start_epoch, cfg.epochs):
    if cfg.distributed:
        train_loader.sampler.set_epoch(ep)
    for batch in train_loader:
        inputs = batch
        if isinstance(inputs, list):
            # S3 path: batch is [keys, images]; take the image tensor
            inputs = inputs[1].to(device, non_blocking=True)
        else:
            # Local path: batch is already a Tensor
            inputs = inputs.to(device, non_blocking=True)
        # proceed with forward/backward using `inputs`
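If the object keys aren't needed during training, an alternative is to normalize at the DataLoader itself rather than inside the loop. The sketch below assumes exactly that; images_only_collate is an illustrative helper, not part of s3torchconnector:

import torch

def images_only_collate(batch):
    # Each sample is a (key, tensor) pair from fetch_img; stack just
    # the tensors into a single [B, C, H, W] batch Tensor.
    return torch.stack([tensor for _key, tensor in batch])

train_loader = torch.utils.data.DataLoader(
    train_ds,
    sampler=train_sampler,
    batch_size=cfg.batch_size,
    num_workers=cfg.num_workers,
    pin_memory=cfg.pin_mem,
    drop_last=True,
    collate_fn=images_only_collate,
)

With this in place, every batch is a plain Tensor and the original samples.to(device, non_blocking=True) line works unchanged.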

Why this matters

Consistency of the batch interface is essential when swapping data sources. A training step generally codifies assumptions about input shape and type. If one source yields a Tensor and another produces a list-like wrapper, the same model code will behave differently. Normalizing the batch at the handoff point ensures the loop can run against a large S3-hosted dataset as reliably as it does against locally stored files in SageMaker.
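One way to codify the handoff is a small helper that accepts either shape. A minimal sketch, assuming the [keys, images] batch layout described above:

import torch

def as_device_tensor(batch, device):
    # Accept either a bare Tensor (local dataset) or the
    # [keys, images] list produced by the S3-backed pipeline.
    if isinstance(batch, (list, tuple)):
        batch = batch[1]
    return batch.to(device, non_blocking=True)

The loop body then reduces to inputs = as_device_tensor(batch, device), keeping the source-specific details in one place.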

Takeaways

If a training run fails with 'list' object has no attribute 'to' after switching to s3torchconnector, inspect what the batch actually contains and select the Tensor before calling .to(...). A compact isinstance check that picks the image tensor out of the collated list and moves it to the device restores compatibility with both local datasets and S3-backed loading, with no further changes to the rest of the training loop.

The article is based on a StackOverflow question and self-answer by AlternativeWaltz.