2025, Dec 22 05:00
Silent InfluxDB 2.0 writes in Docker: why Python clients stall and how token auth helps
Troubleshooting silent InfluxDB 2.0 write stalls in Dockerized Python clients: symptoms, code tips, and a practical fix—switch to token-based authentication.
Silent write failures are among the most frustrating issues when you run long-lived clients in Docker. In a setup with InfluxDB 2.0 and a Python writer container, data keeps flowing in, notifications arrive on time, and the database looks healthy — yet writes to InfluxDB suddenly stop without errors or exceptions. A container restart restores normal operation, which points the finger at the client session or runtime state rather than the database itself.
Code example that mirrors the symptoms
The pattern below initializes the Python influxdb-client with a SYNCHRONOUS write API, recreates the connection on an hourly timer, and retries a write once if a session-related error message appears. Despite all that, writes can stall after hours with no visible error.
from influxdb_client import InfluxDBClient
from influxdb_client.client.write_api import SYNCHRONOUS
from datetime import datetime
import time, threading, logging


class InfluxSink:
    def __init__(self, endpoint, workspace, vault, user, secret):
        self.endpoint = endpoint
        self.workspace = workspace
        self.vault = vault
        self.user = user
        self.secret = secret
        # Username/password authentication against InfluxDB 2.0.
        self.conn = InfluxDBClient(url=endpoint, username=user, password=secret, org=workspace)
        self.writer = self.conn.write_api(write_options=SYNCHRONOUS)
        # Background thread that rebuilds the client once an hour.
        threading.Thread(target=self._hourly_rebind, daemon=True).start()

    def _hourly_rebind(self):
        while True:
            time.sleep(3600)
            self._rebind()

    def _rebind(self):
        # Drop the current client/session and create a fresh one.
        self.writer.close()
        self.conn.close()
        self.conn = InfluxDBClient(url=self.endpoint, username=self.user, password=self.secret, org=self.workspace)
        self.writer = self.conn.write_api(write_options=SYNCHRONOUS)

    def push_event_to_influx(self, series, payload, ts):
        data = {"measurement": series, "fields": payload, "time": ts}
        try:
            self.writer.write(bucket=self.vault, record=data)
            self.writer.flush()
        except Exception as ex:
            # Retry once with a fresh client if the error looks session-related.
            if "session not found" in str(ex):
                self._rebind()
                self.writer.write(bucket=self.vault, record=data)
                self.writer.flush()
            else:
                logging.error(f"Write error: {ex}")
What’s actually happening from the outside
The system behaves normally for a few hours. Notifications continue to arrive in the writer container, confirming the container stays healthy and the data pipeline upstream is intact. InfluxDB itself is stable and is not restarted. The only moving part that consistently restores writes is restarting the writer container. Switching between SYNCHRONOUS and ASYNCHRONOUS, forcing reconnects on a timer, manually handling a potential session timeout, and trimming logs did not change the outcome. The effect looks like the client session or connection becomes unusable over time, but without raising errors to the application level.
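One way to narrow this down while it is happening is to probe the server from inside the writer container at the moment writes stall: if the probe succeeds while points still do not land, the problem is more likely the client's session than the database. A minimal sketch, assuming a client version that exposes ping() (older releases offer health() instead); the URL and interval are placeholders:

import logging
import time

from influxdb_client import InfluxDBClient

def probe(url, token, org, interval=300):
    """Periodically confirm the server answers, independently of the writer's session."""
    while True:
        try:
            # A fresh client per probe, so a stale session cannot mask a healthy server.
            with InfluxDBClient(url=url, token=token, org=org) as client:
                reachable = client.ping()
            logging.info("InfluxDB reachable: %s", reachable)
        except Exception as ex:
            logging.error("Probe failed: %s", ex)
        time.sleep(interval)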
A change that helped in practice: token-based auth
Using a token instead of username/password significantly improved stability in this scenario. The client kept writing longer, with the same runtime and the same InfluxDB instance. This observation does not explain why other services continue to work with user/pass, but the token switch proved effective enough to adopt and monitor.
Issuing a token is straightforward. Create one for the organization and assign the needed bucket permissions:
influx auth create \
  --org <ORG_NAME> \
  --read-buckets \
  --write-buckets \
  --description "Token for My App"
Then initialize the client with that token:
client = InfluxDBClient(url="http://localhost:8086", token="my-token", org="my-org")
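In a Docker setup it is common to pass the token through an environment variable rather than hard-coding it in the image. A small sketch under that assumption; the variable names are illustrative, not taken from the original deployment:

import os

from influxdb_client import InfluxDBClient

# Hypothetical variable names; set them in the container's environment.
url = os.environ.get("INFLUXDB_URL", "http://localhost:8086")
token = os.environ["INFLUXDB_TOKEN"]
org = os.environ.get("INFLUXDB_ORG", "my-org")

client = InfluxDBClient(url=url, token=token, org=org)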
Same logic, token-based client
Below is the same write flow, still using SYNCHRONOUS mode, periodic reconnects, and a retry when the session error text appears; only the authentication switches to a token.
from influxdb_client import InfluxDBClient
from influxdb_client.client.write_api import SYNCHRONOUS
from datetime import datetime
import time, threading, logging


class InfluxSink:
    def __init__(self, endpoint, workspace, vault, token):
        self.endpoint = endpoint
        self.workspace = workspace
        self.vault = vault
        self.token = token
        # Token authentication instead of username/password.
        self.conn = InfluxDBClient(url=endpoint, token=token, org=workspace)
        self.writer = self.conn.write_api(write_options=SYNCHRONOUS)
        # Background thread that rebuilds the client once an hour.
        threading.Thread(target=self._hourly_rebind, daemon=True).start()

    def _hourly_rebind(self):
        while True:
            time.sleep(3600)
            self._rebind()

    def _rebind(self):
        # Drop the current client/session and create a fresh one.
        self.writer.close()
        self.conn.close()
        self.conn = InfluxDBClient(url=self.endpoint, token=self.token, org=self.workspace)
        self.writer = self.conn.write_api(write_options=SYNCHRONOUS)

    def push_event_to_influx(self, series, payload, ts):
        data = {"measurement": series, "fields": payload, "time": ts}
        try:
            self.writer.write(bucket=self.vault, record=data)
            self.writer.flush()
        except Exception as ex:
            # Retry once with a fresh client if the error looks session-related.
            if "session not found" in str(ex):
                self._rebind()
                self.writer.write(bucket=self.vault, record=data)
                self.writer.flush()
            else:
                logging.error(f"Write error: {ex}")
Why it’s worth knowing
Long-running write clients in containers can fail in quiet ways. If the database and upstream producers look fine but writes stop without logs or exceptions, authentication mode can be the deciding factor even when everything else remains unchanged. Token-based auth is a low-friction change that can reduce the likelihood of silent stalls in such setups.
Conclusion
If a Python client writing to InfluxDB 2.0 in Docker appears to hang silently while the service stays responsive and the database remains healthy, try switching from user/password to a token. Keep the client’s write flow and reconnect logic as is, observe for stability over time, and retain simple retry-on-session-text if that fits your operational model. Monitoring remains essential, but in practice the token approach can extend uptime for long-lived write paths.