2025, Oct 31 01:00

How to build an ffmpeg-style CLI in Python: per-file options with argparse and argv pre-processing

Learn why Python argparse can't scope flags to individual files and how to enable ffmpeg-style per-file options by pre-processing sys.argv and parsing slices.

Designing a CLI that accepts multiple input files with per-file options looks straightforward on paper. You might expect a call like the one below to work out of the box: a flag placed before one file applies only to that file, a flag placed before the second file applies only to that one, and so on. Tools like ffmpeg popularized this style:

python my.py -a file_a -b file_b --do-stuff=x file_c

Reproducing the issue with argparse

Here is a minimal setup that attempts to define a couple of flags and a positional list of input files:

import argparse
import pathlib
cli = argparse.ArgumentParser()
cli.add_argument("-a", action="store_true")
cli.add_argument("-b", action="store_true")
cli.add_argument("--do-stuff", type=str)
cli.add_argument("files", type=pathlib.Path, nargs="+")
parsed = cli.parse_args()

Run it as shown earlier and you will get:

error: unrecognized arguments: file_b file_c

The goal behind this syntax is to derive a mapping like this for each file:

file_a: {a: True}
file_b: {b: True}
file_c: {do_stuff: 'x'}

Why it fails

Flags like -a and -b are boolean and take no values, so file_a and file_b are not consumed as flag values; they are treated as positional arguments. With the default parse_args, the files positional is filled from a single contiguous run of arguments, so it takes file_a only, and the later file_b and file_c are left over as unrecognized. More fundamentally, Python’s ArgumentParser does not associate options with the nearest positional argument or consider the order of options relative to positionals when building the result. Even the intermixed parsing mode described in the docs (https://docs.python.org/3/library/argparse.html#intermixed-parsing) is not sufficient here. A call to parse_intermixed_args or parse_known_intermixed_args accepts all three files, but it extracts every option into one shared namespace regardless of where it appears relative to file_a or any other positional input. In other words, there is no built-in way to say “this flag belongs to the next file only.”
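To make the two behaviors concrete, here is a minimal sketch that feeds the example command line to the same parser twice. It passes an explicit argv list instead of reading sys.argv, and it uses parse_known_args rather than parse_args so the leftovers can be inspected instead of aborting with an error; both choices are for illustration only.

```python
import argparse

parser = argparse.ArgumentParser()
parser.add_argument("-a", action="store_true")
parser.add_argument("-b", action="store_true")
parser.add_argument("--do-stuff", type=str)
parser.add_argument("files", type=str, nargs="+")

# The example command line, passed explicitly for illustration
argv = ["-a", "file_a", "-b", "file_b", "--do-stuff=x", "file_c"]

# Default parsing: the positional takes only the first file;
# the later files are returned as unrecognized extras.
known, extras = parser.parse_known_args(argv)
print(known.files)  # ['file_a']
print(extras)       # ['file_b', 'file_c']

# Intermixed parsing: all files are collected, but every option
# lands in one shared namespace with no link to a specific file.
merged = parser.parse_intermixed_args(argv)
print(merged.files)                        # ['file_a', 'file_b', 'file_c']
print(merged.a, merged.b, merged.do_stuff)  # True True x
```

Neither result records which flag preceded which file, and that ordering is exactly the information the per-file mapping needs.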

A workable approach with pre-processing

To emulate the desired behavior, you can pre-process sys.argv, slice the arguments up to each file, and parse each slice separately. The idea is to first collect the files, then for each file parse only the arguments that precede it. One adjustment is required: read file names as str so they can be located later in the original argv sequence.

import sys
import argparse
import pathlib
cli = argparse.ArgumentParser()
cli.add_argument("-a", action="store_true")
cli.add_argument("-b", action="store_true")
cli.add_argument("--do-stuff", type=str)
# Read file names as str so we can find them in argv later
cli.add_argument("files", type=str, nargs="+")
# Collect file names while ignoring the options' positions
file_list = cli.parse_intermixed_args().files
per_input = {}
remaining = sys.argv[1:]
for fname in file_list:
    # Find the current file in the remaining argv slice
    cut = remaining.index(fname) + 1
    # Parse only the args relevant to this file (everything up to and including it)
    per_input[pathlib.Path(fname)] = cli.parse_args(remaining[:cut])
    # Remove what we just parsed and continue
    del remaining[:cut]

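This isolates the flags that occur before each file and parses them as if you had invoked the program separately for each input. The intermixed parsing step serves only to collect the list of files regardless of where the options appear; the option values it returns are discarded. Two caveats apply. list.index returns the first matching string, so a file name that also appears earlier as an option value (for example --do-stuff x x) is cut at the wrong position. And any options that follow the last file are never parsed into a per-file result.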
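The same steps can be wrapped in a function that takes argv explicitly, which makes the behavior easy to check without touching sys.argv. This is a sketch under a few assumptions: the names build_parser and parse_per_file are hypothetical, the result is stored as plain dicts with the files entry removed, and options left over after the last file are rejected with parser.error. None of these choices come from the original answer.

```python
import argparse
import pathlib


def build_parser():
    """Build the parser used for both the file scan and the per-file slices."""
    parser = argparse.ArgumentParser()
    parser.add_argument("-a", action="store_true")
    parser.add_argument("-b", action="store_true")
    parser.add_argument("--do-stuff", type=str)
    # Keep file names as str so they can be located in argv
    parser.add_argument("files", type=str, nargs="+")
    return parser


def parse_per_file(argv):
    """Map each file to the options that precede it (hypothetical helper)."""
    parser = build_parser()
    # First pass: collect the file names, ignoring option positions
    file_list = parser.parse_intermixed_args(argv).files
    remaining = list(argv)
    per_input = {}
    for fname in file_list:
        # Slice up to and including the first occurrence of this file
        cut = remaining.index(fname) + 1
        opts = vars(parser.parse_args(remaining[:cut]))
        opts.pop("files")  # the key already identifies the file
        per_input[pathlib.Path(fname)] = opts
        del remaining[:cut]
    if remaining:
        # Assumption: trailing options with no file after them are an error
        parser.error("options after the last file: " + " ".join(remaining))
    return per_input


result = parse_per_file(["-a", "file_a", "-b", "file_b", "--do-stuff=x", "file_c"])
for path, opts in result.items():
    print(path, opts)
```

Each slice is parsed from scratch, so every entry also carries the defaults of the flags that were not given (for example b is False and do_stuff is None for file_a). Because the dict is keyed by path, passing the same file twice keeps only the last set of options.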

Why this matters

If you’re aiming for an ffmpeg-like command-line experience where options can be scoped to individual inputs, assuming ArgumentParser will respect positional context leads to surprising results. Understanding that the parser does not associate options with adjacent positionals helps you avoid brittle interfaces and makes it clear when pre-processing is required.

Takeaways

Python’s ArgumentParser does not support per-file option scoping based on argument order. Intermixed parsing comes close in spirit but still aggregates options irrespective of their relative position to files. When you need per-input configuration, segment the raw argv yourself, parse each segment, and assemble the per-file mapping explicitly. That keeps the original command style while producing the clear, file-specific option mapping you intended.

The article is based on a question from StackOverflow by anatolyg and an answer by anatolyg.