2025, Sep 26 01:00

Ruff and environment-provided globals: silencing F821 for Databricks spark via builtins

Working with notebooks in a managed environment like Databricks often means relying on globals that the platform injects for you. One of the most common examples is spark. Locally, however, your linter has no knowledge of that runtime context and reports F821 (undefined name). It's a legitimate signal from the tool, but a nuisance in this case.

Minimal reproducible example

The snippet below is perfectly valid in a Databricks notebook, yet ruff check reports F821:

# Databricks notebook source
result_frame = spark.sql("SELECT 'hello world'")
result_frame.show()
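
Running ruff check on this file produces a diagnostic roughly like the one below. The exact rendering varies by Ruff version and output format (newer versions also print the annotated source line); the notebook.py file name here is just a placeholder:

$ ruff check notebook.py
notebook.py:2:16: F821 Undefined name `spark`
Found 1 error.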

Why this triggers F821

The code references a name that isn’t declared in the file itself. In the Databricks runtime, spark is provided by the environment, but the linter only sees the source and treats the identifier as undefined, which leads to the F821 undefined-name error.

Obvious workarounds exist, such as defining the name yourself at the top of the file or sprinkling # noqa: F821 comments where necessary (both are sketched below), but these approaches add either boilerplate or clutter. There's a cleaner way to declare such names for Ruff.
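
For illustration, the two workarounds might look like this. The SparkSession import is standard PySpark, but building a session locally only makes sense if PySpark is actually installed in your linting environment; treat this as a sketch, not a recommendation:

# Workaround 1: define the name explicitly (requires a local PySpark install)
from pyspark.sql import SparkSession
spark = SparkSession.builder.getOrCreate()

# Workaround 2 (instead of 1): suppress the check on each offending line
result_frame = spark.sql("SELECT 'hello world'")  # noqa: F821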

Solution: tell Ruff about your globals via builtins

Ruff supports a setting that declares additional built-in names. By listing spark there, you instruct the linter to treat it as a known global and stop flagging it as undefined. You can place this configuration in either pyproject.toml or ruff.toml.

# pyproject.toml
[tool.ruff]
builtins = ["spark"]

# ruff.toml
builtins = ["spark"]

After adding this entry, Ruff will no longer raise F821 for usages of spark in your notebooks.
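
If you prefer not to declare spark as a builtin for the whole project, Ruff's per-file-ignores setting can scope the suppression to notebook paths instead. The notebooks/** glob below is an assumed layout; adjust it to wherever your notebooks live:

# pyproject.toml
[tool.ruff.lint.per-file-ignores]
"notebooks/**" = ["F821"]

Keep in mind that this disables every undefined-name check in the matched files, which is a blunter instrument than declaring the single known global via builtins.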

Why it’s worth doing

Declaring environment-provided globals in Ruff keeps the code focused on business logic, without extra imports or per-line suppressions. It also codifies an intentional contract with the execution environment in a single, centralized place—the linter configuration—so the behavior is consistent across the codebase.

Conclusion

If your Python notebooks depend on Databricks-provided globals like spark, configure Ruff to recognize them through the builtins setting. Keep the code clean, avoid unnecessary noise, and retain helpful static checks everywhere else.

The article is based on a question from StackOverflow by Fabitosh and an answer by InSync.