2025, Oct 17 09:00

Use SQLAlchemy 2.5 to Run a MySQL UPDATE ... JOIN with LEFT JOIN and IS NULL—No Cartesian Products

Perform a MySQL-style UPDATE ... JOIN in SQLAlchemy 2.5: use INNER JOIN plus LEFT JOIN with IS NULL to avoid Cartesian products and update child timestamps.

Updating rows with data sourced from joins is a common need, but translating it cleanly to SQLAlchemy 2.5 without triggering Cartesian products can be tricky. The target here is a MySQL-style UPDATE ... JOIN with an inner join to a parent table and a left join to a third table, updating a timestamp when no related row exists in the third table.

Problem setup and failing attempt

The goal is to update child rows by pulling a value from parent, only when there is no matching row in other_child. A straightforward subquery approach ends up producing a Cartesian product or errors.

from sqlalchemy import DateTime, ForeignKey, Integer, create_engine, func, select, update
from sqlalchemy.orm import DeclarativeBase, mapped_column, sessionmaker

class ModelBase(DeclarativeBase):
pass

class ParentRow(ModelBase):
__tablename__ = "parent"
id = mapped_column(Integer, primary_key=True, nullable=False)
updated_at = mapped_column(DateTime, nullable=True)

class ChildRow(ModelBase):
__tablename__ = "child"
id = mapped_column(Integer, primary_key=True, nullable=False)
parent_id = mapped_column(Integer, ForeignKey("parent.id"))
last_status_change = mapped_column(DateTime, nullable=True)

class SiblingRow(ModelBase):
__tablename__ = "other_child"
id = mapped_column(Integer, primary_key=True, nullable=False)

db = create_engine("mysql://root:@127.0.0.1/dev?charset=utf8mb4")
ModelBase.metadata.create_all(db)
SessionFactory = sessionmaker(bind=db)

flt = (
select(ParentRow.id, ParentRow.updated_at)
.outerjoin(SiblingRow)
.where(SiblingRow.id.is_(None))
).subquery()

upd_stmt = (
update(ChildRow)
.where(ChildRow.parent_id.in_(select(flt.c.id)))
.values(last_status_change=func.CONVERT_TZ(flt.c.updated_at, "Europe/Paris", "UTC"))
)

with SessionFactory() as db_sess:
db_sess.execute(upd_stmt)

What’s going on

The approach above tries to correlate an UPDATE on child with a subquery that filters parents lacking other_child rows, and then uses that subquery to source the value to set. In practice this does not form the intended UPDATE ... JOIN and can result in a Cartesian product. The target SQL is meant to look like a MySQL UPDATE with an inner and a left join, along with the IS NULL filter on the left-joined table.

Working solution: update a join construct directly

You don’t need a subquery here. SQLAlchemy can target a join construct directly in the UPDATE, which produces the desired MySQL syntax.

from sqlalchemy import DateTime, ForeignKey, Integer, create_engine, func, join, update
from sqlalchemy.orm import DeclarativeBase, Session, mapped_column

class ModelBase(DeclarativeBase):
pass

class ParentRow(ModelBase):
__tablename__ = "parent"
id = mapped_column(Integer, primary_key=True, nullable=False)
updated_at = mapped_column(DateTime, nullable=True)

class ChildRow(ModelBase):
__tablename__ = "child"
id = mapped_column(Integer, primary_key=True, nullable=False)
parent_id = mapped_column(Integer, ForeignKey("parent.id"))
last_status_change = mapped_column(DateTime, nullable=True)

class SiblingRow(ModelBase):
__tablename__ = "other_child"
id = mapped_column(Integer, primary_key=True, nullable=False)
parent_id = mapped_column(Integer, ForeignKey("parent.id"))

engine = create_engine("mysql+pymysql://", echo=True)
ModelBase.metadata.create_all(engine)

with Session(engine) as db_sess:
jn = join(
join(ChildRow, ParentRow, ParentRow.id == ChildRow.parent_id),
SiblingRow,
SiblingRow.parent_id == ParentRow.id,
isouter=True,
)

stmt = (
update(jn)
.values({
ChildRow.last_status_change: func.CONVERT_TZ(
ParentRow.updated_at, "Europe/Paris", "UTC"
)
})
.where(SiblingRow.id.is_(None))
)

db_sess.execute(stmt)

This emits the intended MySQL query.

UPDATE child
INNER JOIN parent ON parent.id = child.parent_id
LEFT OUTER JOIN other_child ON other_child.parent_id = parent.id
SET
child.last_status_change = CONVERT_TZ(parent.updated_at, %(CONVERT_TZ_1)s, %(CONVERT_TZ_2)s)
WHERE
other_child.id IS NULL

Why this matters

When the business rule depends on the presence or absence of related rows, using a join as the UPDATE target aligns the ORM statement with the SQL you actually want. It avoids unintended Cartesian products and mirrors MySQL’s native UPDATE ... JOIN form, while keeping everything in SQLAlchemy rather than dropping down to raw SQL. You can also inspect the generated SQL using engine echo or by compiling a statement, which makes it easier to validate complex updates.

Takeaways

If you need to update rows based on multiple tables, including a left join and an IS NULL filter, build an explicit join construct and pass it to update(). Keep the filter on the left-joined table and assign values directly from the joined tables. When in doubt, inspect the emitted SQL to confirm it matches the intended MySQL UPDATE with joins.

The article is based on a question from StackOverflow by edg and an answer by python_user.