2025, Oct 18 21:00

How to call ffmpeg from Python subprocess without errors: split flags and values, use shlex to tokenize safely

Running ffmpeg via Python subprocess fails with 'Unrecognized option' or 'Invalid stream specifier'? Learn how to pass arguments correctly: give every flag and value its own list element, and use shlex to split command strings.

When invoking ffmpeg from Python, it’s easy to run into cryptic errors even if the same command line works flawlessly in Terminal. A common pitfall is how arguments are passed to subprocess. The shell and Python treat spaces very differently, and that subtle mismatch is enough to produce errors like “Unrecognized option” or “Invalid stream specifier”. Here is a concise walkthrough of what goes wrong and how to fix it without changing the actual ffmpeg options you intend to use.

Context and the working shell command

The task is straightforward: convert a FLAC file to MP3 320k while preserving metadata and setting ID3v2 version 3. On macOS Sequoia 15.5 with Python 3 and ffmpeg 4.2.1, the following Terminal command runs as expected:

/usr/local/bin/ffmpeg -i "/Volumes/MainData/Media/Media Server/_Stage/03 - People Like Us - Aaron Tippin.flac" -b:a 320k -map_metadata 0 -id3v2_version 3 "/Volumes/MainData/Media/Media Server/_Stage/03 - People Like Us - Aaron Tippin.mp3"

The failing Python example

Translating that directly into a subprocess call by stuffing options with values into single list elements looks reasonable at first glance, but it breaks:

import subprocess

argv = [
    '/usr/local/bin/ffmpeg',
    '-i',
    '/Volumes/MainData/Media/Media Server/_Stage/03 - People Like Us - Aaron Tippin.flac',
    '-b:a 320k',
    '-map_metadata 0',
    '-id3v2_version 3',
    '/Volumes/MainData/Media/Media Server/_Stage/03 - People Like Us - Aaron Tippin.mp3',
]

exit_status = subprocess.call(argv)

This produces errors such as “Unrecognized option 'id3v2_version 3'” and, after removing that, “Invalid stream specifier: a 320k.”

What’s actually wrong

The shell splits a command line into tokens on whitespace, except where quotes hold pieces together. Python’s subprocess, when given a list, performs no splitting inside list elements: each item is passed verbatim as one argument. A list element like “-id3v2_version 3” therefore reaches ffmpeg as a single option name containing a space, which it does not recognize. The same applies to “-b:a 320k”: ffmpeg treats everything after the colon as the stream specifier, so it reads “a 320k” as the specifier and rejects it. ffmpeg expects “-b:a” and “320k” as two separate arguments, not one item containing both.
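
To see this behavior directly, here is a minimal sketch (not part of the original question) that launches a child Python process and reports the arguments it received. The helper name show_args is made up for illustration; ffmpeg is not involved, so this runs anywhere Python does.

```python
import subprocess
import sys

# Child program: print the repr of each argument it received, one per line.
CHILD = "import sys; print('\\n'.join(repr(a) for a in sys.argv[1:]))"

def show_args(args):
    """Run a child Python process with the given args and return what it saw."""
    result = subprocess.run(
        [sys.executable, "-c", CHILD, *args],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.splitlines()

fused = show_args(["-b:a 320k"])      # one list element containing a space
split = show_args(["-b:a", "320k"])   # two separate list elements

print(len(fused), fused)
print(len(split), split)
```

The fused form arrives in the child as one argument and the split form as two, which is exactly the difference ffmpeg is complaining about.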

The fix: pass each token as its own list element

Keep every flag and every value as distinct items. Don’t embed spaces within a single element of the argv list. The corrected invocation looks like this:

import subprocess

cmd_parts = [
    '/usr/local/bin/ffmpeg',
    '-i',
    '/Volumes/MainData/Media/Media Server/_Stage/03 - People Like Us - Aaron Tippin.flac',
    '-b:a',
    '320k',
    '-map_metadata',
    '0',
    '-id3v2_version',
    '3',
    '/Volumes/MainData/Media/Media Server/_Stage/03 - People Like Us - Aaron Tippin.mp3',
]

status = subprocess.call(cmd_parts)

This keeps the program, options, and option values identical to the working shell command, while expressing them in the form subprocess actually expects.

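If you start from a single command string and need to split it safely, use shlex.split from the standard library. It tokenizes the string using shell-like quoting rules, so quoted paths with spaces stay in one piece while flags and values come apart. It does not perform shell expansions such as variables or globs.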
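
As a rough sketch of that approach, the snippet below feeds the working Terminal command to shlex.split and prints the resulting tokens. The variable names command_line and tokens are illustrative, and the subprocess call is left commented out so the example runs without ffmpeg installed.

```python
import shlex

# The same command line that works in Terminal, as one string.
command_line = (
    '/usr/local/bin/ffmpeg '
    '-i "/Volumes/MainData/Media/Media Server/_Stage/03 - People Like Us - Aaron Tippin.flac" '
    '-b:a 320k -map_metadata 0 -id3v2_version 3 '
    '"/Volumes/MainData/Media/Media Server/_Stage/03 - People Like Us - Aaron Tippin.mp3"'
)

# shlex.split honors the double quotes, so each path stays a single token,
# while every flag and every value becomes its own token.
tokens = shlex.split(command_line)

for token in tokens:
    print(repr(token))

# The token list can then be handed to subprocess unchanged:
# import subprocess
# status = subprocess.call(tokens)
```

The result should match the hand-written list in the corrected invocation: ten elements, with the quoted paths intact and the quotes themselves removed.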

Why this detail matters

Mixing shell quoting rules with Python’s argument handling is a source of fragile scripts and misleading diagnostics. ffmpeg’s error messages here are accurate but indirect: the real issue isn’t ffmpeg’s behavior, it’s how the arguments arrive. Ensuring each option and its value are separate elements gives you predictable, cross-environment behavior and eliminates the subtle bugs that appear only when paths include spaces or when an option requires a value.
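
One way to catch this mistake before the command runs is a small heuristic check on the argument list. The sketch below is an assumption-laden illustration, not something from the original answer: the function name find_fused_options is hypothetical, and its rule (an element that starts with a dash and contains whitespace is suspicious) can misfire on legitimate values that begin with a dash.

```python
def find_fused_options(argv):
    """Return elements that look like a flag and its value fused together.

    Heuristic only: flags an element if it starts with '-' and contains
    whitespace. Paths with spaces are not flagged unless they start with '-'.
    """
    return [
        arg for arg in argv
        if arg.startswith('-') and any(ch.isspace() for ch in arg)
    ]

suspect = find_fused_options([
    'ffmpeg',
    '-i',
    'input file.flac',      # space in a path: fine
    '-b:a 320k',            # flag and value fused: suspicious
    '-map_metadata 0',      # flag and value fused: suspicious
    'output file.mp3',
])
print(suspect)
```

Running a check like this in a test or just before the subprocess call gives a clearer diagnostic than ffmpeg’s own message.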

Conclusion

When calling command-line tools like ffmpeg via Python, treat the argv list as a sequence of already-tokenized arguments. Keep flags and values separate, avoid fusing a flag and its value into a single list item, and, when you must split a command string, tokenize it with shlex.split rather than by hand. This small discipline saves time, prevents puzzling errors, and keeps automation scripts robust.

The article is based on a question from StackOverflow by Brian and an answer by J Earls.