2025, Oct 26 07:00

Generate index.html for Every Directory in a Static Site with a Clean Python os.walk Approach

Generate index.html in every folder of a static site with Python os.walk. Avoid hardcoded paths and FileNotFoundError using a single-pass walk site-wide.

Generating an index.html for every directory of a static site sounds simple, yet it often turns into a tangle of hardcoded paths, manual recursion and FileNotFoundError surprises. If your script keeps “losing” the current folder or needs an index.py per directory to make progress, you are fighting the filesystem instead of traversing it. There is a cleaner way.

What goes wrong with the naive approach

A common pattern is to aim the script at one folder, write an index.html there, and then try to kick off more scripts for subfolders. That quickly leads to brittle path handling and duplicated logic. Here is a minimized version of that pattern, which illustrates why paths drift and why hardcoding becomes necessary.

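from pathlib import Path


def build_listing_html(base_loc):
    base_loc = Path(base_loc)
    # Hardcoded subfolder: the function only ever indexes this one directory.
    base_loc = base_loc / 'FOLDER A'
    with open(base_loc / 'index.html', 'w') as fp:
        for entry in sorted(base_loc.iterdir()):
            fp.write(f'<a href="{entry.name}">{entry.name}</a>\n')


if __name__ == '__main__':
    build_listing_html('.')

# Chain into a per-folder script. Its relative paths still resolve against
# the current working directory, not against SUBFOLDER B.
with open("./FOLDER A/SUBFOLDER B/index.py") as fh:
    exec(fh.read())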

This code hardcodes one specific subdirectory and then executes another script for a deeper folder. Because everything runs from the project root, any relative path inside the executed script resolves against the root rather than against the script's own folder, so it targets the wrong place unless you hand-write full locations. That is where the FileNotFoundError comes from and why it feels necessary to keep an index.py in every directory.

The essence of the issue

The filesystem traversal is manual and fragmented. Each step assumes a particular working directory and depends on explicit string paths. As soon as the execution context shifts, relative references break. The result is a proliferation of entry points and duplicated scripts across folders.
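To make the working-directory dependence concrete, here is a minimal sketch. The helper resolve_from is hypothetical (it is not part of the original question or answer); it only shows that the same relative name, "index.html", maps to different absolute locations depending on where the process is running.

```python
import os
import tempfile
from pathlib import Path


def resolve_from(cwd, relative):
    """Return the absolute path that `relative` maps to when run from `cwd`."""
    previous = os.getcwd()
    os.chdir(cwd)
    try:
        return Path(relative).resolve()
    finally:
        # Always restore the original working directory.
        os.chdir(previous)


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp).resolve()
    sub = root / "FOLDER A"
    sub.mkdir()
    # Same relative name, two different working directories.
    from_root = resolve_from(root, "index.html")
    from_sub = resolve_from(sub, "index.html")
    print(from_root)
    print(from_sub)
```

The two printed paths differ only because the working directory differs, which is the drift that manual, per-folder scripts run into.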

A simpler solution: walk the tree once and write indexes as you go

Instead of hopping between directories or executing multiple scripts, iterate over the entire directory tree and emit index.html in every folder during that single pass. os.walk() hands you the current directory explicitly on each iteration, which removes the need for ad hoc path fiddling. If you want to keep certain filenames out of the listing, filter them out before writing. If you need more expressive HTML, a templating package such as Jinja can help structure the output, but the core traversal stays the same.

import os


def emit_indexes(site_root):
    index_name = "index.html"
    # Names to leave out of every listing, including the index file itself.
    ignore_set = {index_name, "some_other_file.py"}
    link_tpl = '<a href="{filename}">{filename}</a>'

    # os.walk yields every directory under site_root exactly once.
    for dir_here, subdirs, files_here in os.walk(site_root):
        index_path = os.path.join(dir_here, index_name)
        files_here = [x for x in files_here if x not in ignore_set]
        with open(index_path, "w") as outf:
            for fname in files_here:
                outf.write(link_tpl.format(filename=fname) + "\n")


SITE_BASE = "D:/temp/StackOverflow/foo"
emit_indexes(SITE_BASE)

This approach writes an index.html into every directory encountered. It also avoids self-linking by excluding the output file name from the listing, and you can add other entries to ignore as needed. Note that each listing contains only the files in that directory; subdirectory names arrive separately in subdirs and are not linked. The process is self-contained and does not require a separate index.py file in each folder.
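The function above writes links for files only, since os.walk reports subdirectory names separately. If you also want each index to link to its subfolders, one possible variant is sketched here. The name emit_indexes_with_dirs and its parameters are assumptions for illustration, not part of the original answer; the sketch also percent-encodes href values and escapes link text so that names like "FOLDER A" or "a&b.txt" produce valid markup.

```python
import html
import os
from urllib.parse import quote


def emit_indexes_with_dirs(site_root, index_name="index.html", ignore=()):
    """Write an index into every directory, linking subfolders and files."""
    ignore_set = {index_name, *ignore}
    written = []
    for dir_here, subdirs, files_here in os.walk(site_root):
        subdirs.sort()  # in-place sort also makes the walk order stable
        # Subfolders first (with a trailing slash), then the filtered files.
        entries = [d + "/" for d in subdirs]
        entries += sorted(f for f in files_here if f not in ignore_set)
        index_path = os.path.join(dir_here, index_name)
        with open(index_path, "w", encoding="utf-8") as outf:
            for name in entries:
                href = quote(name)  # percent-encodes spaces, "&", etc.; "/" is kept
                outf.write(f'<a href="{href}">{html.escape(name)}</a>\n')
        written.append(index_path)
    return written
```

The subfolder links end in "/" and so assume the web server serves index.html for directory URLs. If the pages are opened straight from disk, appending index_name to those href values is probably the safer choice.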

Why this matters

Relying on a single traversal eliminates path confusion and makes failures easier to reproduce, because every output path is derived from the directory that os.walk reports. You remove the need to hardcode locations, reduce duplicated scripts, and keep relative href values predictable. It also scales naturally as the tree changes; new subfolders receive an index.html on the next run without additional wiring.

Practical wrap-up

Let the filesystem guide the logic. Walk the tree once, write the index for each directory you visit, and filter out files you do not want to expose. If you need richer markup, integrate a template engine like Jinja. If something still looks off, print the current directory and output path during the walk to see exactly where the script is writing. With this structure in place, you can stop micromanaging paths and let your static site index itself.
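The debugging tip can be sketched as a dry run that walks the tree and reports where each index would be written, without touching any files. The function name preview_index_paths is hypothetical and only serves this illustration.

```python
import os


def preview_index_paths(site_root, index_name="index.html"):
    """Print and return (directory, output path) pairs without writing anything."""
    planned = []
    for dir_here, _subdirs, _files in os.walk(site_root):
        index_path = os.path.join(dir_here, index_name)
        # Absolute paths make it obvious which location is really targeted.
        print(f"dir={os.path.abspath(dir_here)} -> {os.path.abspath(index_path)}")
        planned.append((dir_here, index_path))
    return planned
```

Calling preview_index_paths(".") from the project root lists every directory the real run would visit, which makes a wrong starting folder easy to spot before any index.html is created.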

The article is based on a question from StackOverflow by chungusG and an answer by JonSG.