https://pytroubles.com/en/posts/id262-sqlalchemy-core-identifier-quoting-in-raw-sql-dialect-aware-cross-database-safe-dynamic-names

SQLAlchemy Core Identifier Quoting in Raw SQL: Dialect-Aware, Cross-Database Safe Dynamic Names

Safely Quote Dynamic Table, Schema, and Column Names in SQLAlchemy Core Using the Dialect IdentifierPreparer

SQLAlchemy Core Identifier Quoting in Raw SQL: Dialect-Aware, Cross-Database Safe Dynamic Names

Safely quote SQL identifiers in raw SQL using SQLAlchemy Core's IdentifierPreparer. Create cross-database queries with dynamic table, schema and column names.

2025-09-28T21:00:05+03:00

Safely quoting identifiers in raw SQL is a recurring pain point when you need dynamic table, schema, or column names in SQLAlchemy Core. Manually escaping is brittle and often tied to one database. There is, however, a dialect-aware way to handle this without guessing each backend’s rules.ProblemSuppose you want to run raw SQL with variable table, schema, and column names, and you tried to escape those identifiers yourself. The intent is to construct a cross-database-safe query, but a handcrafted escape helper tends to be backend-specific and uncertain.import re, sqlalchemy def _safe_ident(token, force_quotes=False): # WARNING: Works in Postgres only, and is not intended for production code. if not force_quotes and re.match(r'^[a-z_][a-z0-9$_]*$', token, re.IGNORECASE): return token return '"' + token.replace('"', '""') + '"' def fetch_row(tbl_name, pk_col, pk_value, sch_name='public'): qi = _safe_ident stmt = sqlalchemy.sql.text(f''' SELECT * FROM {qi(sch_name)}.{qi(tbl_name)} WHERE {qi(pk_col)} = :k; ''') args = dict(k=pk_value) # execute the statement with args and return data # ... This approach mirrors Postgres QUOTE_IDENT behavior and does not generalize cleanly to other dialects like SQLite, MySQL, or MSSQL. Even for Postgres, confidence that it covers all corner cases is limited, and relying on database-side functions for pre-quoting adds an extra round-trip you probably don’t want.What’s really going onIdentifier quoting rules are not uniform. Different backends use different quoting characters and escaping rules. Hardcoding a regex and quote logic ties the code to one database and risks subtle bugs when names collide with reserved words, contain spaces, or use mixed case. SQLAlchemy already encapsulates these rules per dialect; raw string assembly just bypasses that machinery.SolutionUse the dialect’s IdentifierPreparer via SQLAlchemy. It exposes methods that apply the correct quoting rules for the target database. The entry point is available from an Engine, and you can use it both to quote individual identifiers and to handle schema names.from sqlalchemy import create_engine # Create an engine for your target DB; make sure the driver is installed # e.g., psycopg2 for PostgreSQL or pyodbc for MSSQL eng = create_engine('postgresql://') prep = eng.dialect.identifier_preparer # Quote identifiers using the dialect rules quoted_tbl = prep.quote_identifier('foo bar') # yields "foo bar" quoted_sch = prep.quote_schema('public') quoted_col = prep.quote_identifier('id') # Build raw SQL safely with the quoted pieces from sqlalchemy import sql stmt = sql.text(f''' SELECT * FROM {quoted_sch}.{quoted_tbl} WHERE {quoted_col} = :key ''') # execute stmt with parameters as usual This keeps raw-SQL ergonomics while delegating the hard part—identifier quoting—to the right layer. You will need the appropriate driver package for the engine you create, for example psycopg2 for PostgreSQL or pyodbc for MSSQL.Dialects without creating an EngineIf you cannot or do not want to create an Engine, it may be possible to construct the dialect directly. This avoids a live connection but relies on implementation details and is not guaranteed across versions.from sqlalchemy import URL url = URL.create('mssql') entrypoint = url._get_entrypoint() dialect = entrypoint.get_dialect_cls(url)() # or get_async_dialect_cls quoted = dialect.identifier_preparer.quote_identifier('foo bar') # yields [foo bar] This works by accessing a private method and is best treated as brittle. Prefer the Engine-backed approach when available.Why this mattersDynamic SQL that interpolates identifiers is a common need in tooling, migrations, and query builders. Getting quoting wrong can break portability, produce syntax errors that are hard to diagnose, or even expose you to injection risks. By leaning on the dialect’s own IdentifierPreparer, you keep raw SQL while staying aligned with backend-specific rules.Practical wrap-upWhen you need variable table, schema, or column names with SQLAlchemy Core, don’t reinvent identifier quoting. Use eng.dialect.identifier_preparer.quote_identifier and, when relevant, quote_schema. Ensure the correct driver is available so the dialect can be constructed. If you must avoid creating an Engine, you can derive the dialect via URL and a private entrypoint, but treat that path as an implementation detail that may change. Keep values as bound parameters as you already do, and reserve string interpolation for identifiers prepared by the dialect.

SQLAlchemy Core, IdentifierPreparer, quote SQL identifiers, raw SQL, dynamic table names, schema names, column names, dialect-aware quoting, cross-database, PostgreSQL, MySQL, SQLite, MSSQL

2025

2025, Sep 28 21:00

Safely Quote Dynamic Table, Schema, and Column Names in SQLAlchemy Core Using the Dialect IdentifierPreparer

Safely quote SQL identifiers in raw SQL using SQLAlchemy Core's IdentifierPreparer. Create cross-database queries with dynamic table, schema and column names.

Problem

Suppose you want to run raw SQL with variable table, schema, and column names, and you tried to escape those identifiers yourself. The intent is to construct a cross-database-safe query, but a handcrafted escape helper tends to be backend-specific and uncertain.

import re, sqlalchemy
def _safe_ident(token, force_quotes=False):
    # WARNING: Works in Postgres only, and is not intended for production code.
    if not force_quotes and re.match(r'^[a-z_][a-z0-9$_]*$', token, re.IGNORECASE):
        return token
    return '"' + token.replace('"', '""') + '"'
def fetch_row(tbl_name, pk_col, pk_value, sch_name='public'):
    qi = _safe_ident
    stmt = sqlalchemy.sql.text(f'''
                                SELECT *
                                FROM {qi(sch_name)}.{qi(tbl_name)}
                                WHERE {qi(pk_col)} = :k;
                                ''')
    args = dict(k=pk_value)
    # execute the statement with args and return data
    # ...

This approach mirrors Postgres QUOTE_IDENT behavior and does not generalize cleanly to other dialects like SQLite, MySQL, or MSSQL. Even for Postgres, confidence that it covers all corner cases is limited, and relying on database-side functions for pre-quoting adds an extra round-trip you probably don’t want.

What’s really going on

Identifier quoting rules are not uniform. Different backends use different quoting characters and escaping rules. Hardcoding a regex and quote logic ties the code to one database and risks subtle bugs when names collide with reserved words, contain spaces, or use mixed case. SQLAlchemy already encapsulates these rules per dialect; raw string assembly just bypasses that machinery.

Solution

Use the dialect’s IdentifierPreparer via SQLAlchemy. It exposes methods that apply the correct quoting rules for the target database. The entry point is available from an Engine, and you can use it both to quote individual identifiers and to handle schema names.

from sqlalchemy import create_engine
# Create an engine for your target DB; make sure the driver is installed
# e.g., psycopg2 for PostgreSQL or pyodbc for MSSQL
eng = create_engine('postgresql://')
prep = eng.dialect.identifier_preparer
# Quote identifiers using the dialect rules
quoted_tbl = prep.quote_identifier('foo bar')   # yields "foo bar"
quoted_sch = prep.quote_schema('public')
quoted_col = prep.quote_identifier('id')
# Build raw SQL safely with the quoted pieces
from sqlalchemy import sql
stmt = sql.text(f'''
                 SELECT *
                 FROM {quoted_sch}.{quoted_tbl}
                 WHERE {quoted_col} = :key
                 ''')
# execute stmt with parameters as usual

This keeps raw-SQL ergonomics while delegating the hard part—identifier quoting—to the right layer. You will need the appropriate driver package for the engine you create, for example psycopg2 for PostgreSQL or pyodbc for MSSQL.

Dialects without creating an Engine

If you cannot or do not want to create an Engine, it may be possible to construct the dialect directly. This avoids a live connection but relies on implementation details and is not guaranteed across versions.

from sqlalchemy import URL
url = URL.create('mssql')
entrypoint = url._get_entrypoint()
dialect = entrypoint.get_dialect_cls(url)()  # or get_async_dialect_cls
quoted = dialect.identifier_preparer.quote_identifier('foo bar')  # yields [foo bar]

This works by accessing a private method and is best treated as brittle. Prefer the Engine-backed approach when available.

Why this matters

Dynamic SQL that interpolates identifiers is a common need in tooling, migrations, and query builders. Getting quoting wrong can break portability, produce syntax errors that are hard to diagnose, or even expose you to injection risks. By leaning on the dialect’s own IdentifierPreparer, you keep raw SQL while staying aligned with backend-specific rules.

Practical wrap-up

When you need variable table, schema, or column names with SQLAlchemy Core, don’t reinvent identifier quoting. Use eng.dialect.identifier_preparer.quote_identifier and, when relevant, quote_schema. Ensure the correct driver is available so the dialect can be constructed. If you must avoid creating an Engine, you can derive the dialect via URL and a private entrypoint, but treat that path as an implementation detail that may change. Keep values as bound parameters as you already do, and reserve string interpolation for identifiers prepared by the dialect.

The article is based on a question from StackOverflow by doekman and an answer by snakecharmerb.

python python-3.x sqlalchemy