https://pytroubles.com/en/posts/id2660-fixing-sliding-iqr-outlier-detection-for-tachometer-sensors-why-windows-fail-and-how-to-diagnose

Fixing Sliding IQR Outlier Detection for Tachometer Sensors: Why Windows Fail and How to Diagnose

Sliding IQR Filter Pitfalls on Tachometer Fan-Speed Data: Iteration-Wise Diagnostics and Safer Guardrails

Learn why a sliding-window IQR outlier filter misses low tachometer fan speed values, and how to add diagnostics, limits, and jump checks to catch sensor errors

2025-12-24T09:00:12+03:00

Filtering bad tachometer readings with a sliding IQR rule looks straightforward until it suddenly stops flagging values you’re sure are wrong. If your sensor occasionally returns numbers far below what a full-speed fan can physically produce, you may expect a robust outlier mask to catch them every time. Yet with a short window and shifting quartiles, the mask can intermittently let obvious spikes through. Below is a minimal example that demonstrates why this happens and how to collect iteration-wise diagnostics so you can evaluate the behavior across time rather than one window at a time.

sliding IQR, outlier detection, tachometer readings, fan speed sensor, I2C sensor, sliding window, IQR filter, boxplot fences, quartiles, diagnostics, hard limits, jump checks, sensor data

Problem setup

The goal is to smooth speed readings from an I2C-based sensor by excluding outliers with an IQR-based mask. Valid full-speed values lie roughly between 1700 and 2300 RPM, while the sensor sometimes reports values below 1000 that cannot be correct during steady full-power operation. The mask is rebuilt on every iteration from a small sliding window, and it sometimes fails to classify the low values as outliers.

Code that reproduces the behavior

The snippet below appends a new reading, maintains a fixed-size window, computes Q1, Q3, the IQR, lower and upper fences, and applies a boolean mask to drop outliers. Identifiers are different, but the operational logic matches the described approach.

from traceback import print_exc
from sys import exit
N0 = 0
N1 = 1
WIN = 5
def runner():
    from board import I2C
    from adafruit_emc2101 import EMC2101
    from time import sleep
    import numpy as np
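    # Note: the window starts as WIN zeros, so the first few windows mix
    # placeholder zeros with real readings, and the else branch below never runs.
    ring = np.zeros(WIN)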
    rc = N0
    try:
        bus = I2C()
        if bus is None:
            raise Exception('No Bus?')
        ctl = EMC2101(bus)
        if ctl is None:
            raise Exception('No Controller?')
        tick = N0
        while True:
            rpm = ctl.fan_speed
            ring = np.append(ring, rpm)
            if ring.size > WIN:
                ring = np.delete(ring, N0)
                q1 = np.quantile(ring, 0.25)
                q3 = np.quantile(ring, 0.75)
                iqr = q3 - q1
                lo = q1 - 1.5 * iqr
                hi = q3 + 1.5 * iqr
                print(f"{tick:04} | Q1: {q1}, Q3: {q3}")
                print(f"{tick:04} | IQR: {iqr}, L: {lo}, U: {hi}")
                print(f"{tick:04} | Window: {ring}")
                rej_mask = (ring < lo) | (ring > hi)
                print(f"{tick:04} | Mask: {rej_mask}")
                print(f"{tick:04} | Outliers: {ring[rej_mask]}")
                kept = ring[~rej_mask]
                avg = np.mean(kept)
                print(f"{tick:04} | Speed: {rpm:0.2f}, Average: {avg:0.2f}")
            else:
                print(f"{tick:04} | Speed: {rpm}")
            tick += N1
            sleep(N1)
    except KeyboardInterrupt:
        print("Main Keyboard Interrupt")
    except OSError as err:
        print(f"Error: {err.errno} '{err.strerror}'")
    except Exception:
        print_exc()
    finally:
        return rc
if __name__ == '__main__':
    exit(runner())

Why the mask flips from “works” to “not working”

The IQR fences are recomputed on each iteration from a very small window. When an extremely low value lands at the Q1 position, it drags Q1 down and inflates the IQR. With WIN = 5 and NumPy's default linear interpolation, Q1 is the second-smallest value in the window and Q3 the second-largest, so this happens once two low readings share a window. Both fences then become so wide that nothing in that window is flagged as an outlier. The logged values illustrate this shift clearly: in one iteration Q1 is near the bulk of the data and the low value is excluded; in the next iteration the low value sits at Q1, the IQR balloons, the lower bound dives to an unexpected negative number, and the mask lets every point pass. As soon as the window shifts again and the distribution recenters, the low value is again marked as an outlier. This is a natural consequence of a dynamic boxplot on a short sliding window.
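
To make the flip concrete, here is a small sketch that applies the same 1.5 * IQR rule to two five-element windows. The readings are made up for illustration and are not taken from the real sensor log; iqr_fences is a hypothetical helper name. In a window of five, np.quantile with its default linear method returns the second-smallest value for Q1, so one low reading leaves the fences tight while two low readings put a bad value at Q1.

```python
import numpy as np

def iqr_fences(window):
    """Return (q1, q3, lower, upper) using the 1.5 * IQR boxplot rule."""
    q1, q3 = np.quantile(window, [0.25, 0.75])
    iqr = q3 - q1
    return q1, q3, q1 - 1.5 * iqr, q3 + 1.5 * iqr

# Hypothetical readings (assumption): one bad low value vs. two in the same window.
one_low = np.array([2000.0, 2010.0, 1990.0, 2005.0, 800.0])
two_low = np.array([800.0, 2000.0, 900.0, 2010.0, 1990.0])

for name, win in (("one low", one_low), ("two low", two_low)):
    q1, q3, lo, hi = iqr_fences(win)
    flagged = win[(win < lo) | (win > hi)]
    print(f"{name}: Q1={q1}, Q3={q3}, fences=({lo}, {hi}), flagged={flagged}")
```

In the first window the fences stay close to 2000 and the 800 reading is flagged. In the second, Q1 becomes 900, the lower fence drops to -750, and nothing is flagged.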

What to do instead of relying on a single dynamic boxplot

It helps to step back and observe how the fences, mean, and spread evolve across iterations. You can partition the run by iteration index, collect per-iteration statistics, and then examine the entire sequence. Visualization such as a jointplot can be informative for these distributions. If you can obtain bounds or behavior details from the manufacturer, that may be even better for setting limits. If that is not available, consider complementing a boxplot-based rule with additional criteria such as hard boundaries, spikes, drops, or maximum jumps.
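
As one possible shape for such guardrails, the sketch below combines a hard range check with a maximum-jump check against the last accepted reading. All three constants are assumptions: the limits reuse the rough 1700 to 2300 range quoted earlier, which only holds during steady full-power operation, and MAX_JUMP is a hypothetical value you would need to tune. The function names are illustrative as well.

```python
HARD_MIN = 1700.0   # assumed floor for full-speed readings
HARD_MAX = 2300.0   # assumed ceiling for full-speed readings
MAX_JUMP = 400.0    # hypothetical largest plausible change between accepted readings

def is_plausible(rpm, last_good=None):
    """Checks that do not depend on the window's quartiles."""
    if not (HARD_MIN <= rpm <= HARD_MAX):
        return False
    if last_good is not None and abs(rpm - last_good) > MAX_JUMP:
        return False
    return True

def filter_stream(readings):
    """Keep readings that pass the guardrails; jumps are measured from the last kept value."""
    kept, last_good = [], None
    for rpm in readings:
        if is_plausible(rpm, last_good):
            kept.append(rpm)
            last_good = rpm
    return kept

# Hypothetical stream with two bad lows in a row.
sample = [2000.0, 2010.0, 800.0, 900.0, 1990.0, 2005.0]
print(filter_stream(sample))
```

Only readings that pass these checks would then be appended to the sliding window, so the IQR rule never sees the values that are physically implausible. One trade-off to keep in mind: if the fan speed changes legitimately, fixed limits and a jump check measured from the last accepted value can reject good readings until they are adjusted.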

Collecting iteration-wise diagnostics

The following pattern creates a per-iteration store and records the boxplot bounds along with basic moments from the current window. After the run, you can iterate through this mapping to inspect, plot, and test the behavior holistically.

import numpy as np
snapshots = {}
# BEGIN LOOP
# assume 'ring' holds the current fixed-size window (same as above)
# and 'tick' is the iteration counter
q1, q3 = np.percentile(ring, [25, 75])
iqr = q3 - q1
lo = float(q1 - 1.5 * iqr)
hi = float(q3 + 1.5 * iqr)
mu = ring.mean()
sd = ring.std()
snapshots[tick] = {}
snapshots[tick]["mean"] = mu
snapshots[tick]["std"] = sd
snapshots[tick]["lower"] = lo
snapshots[tick]["upper"] = hi
snapshots[tick]["min"] = ring.min()
snapshots[tick]["max"] = ring.max()
snapshots[tick]["window"] = ring.copy()
# END LOOP
# AFTER ALL ITERATIONS: iterate over 'snapshots' to examine values,
# visualize, or run statistical checks across time.
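
A possible way to use the collected snapshots afterwards is sketched here on a simulated stream, since the real sensor is not available in this example. The readings, the PLAUSIBLE_FLOOR constant, and the "missed" check are all assumptions made for illustration; the idea is simply to list the ticks where the window contained an implausibly low value that the lower fence did not exclude.

```python
import numpy as np

WIN = 5
PLAUSIBLE_FLOOR = 1700.0  # assumed lower bound for a healthy full-speed reading

# Hypothetical stream standing in for ctl.fan_speed: two bad lows in a row.
stream = [2000.0, 2010.0, 1990.0, 2005.0, 1995.0, 800.0, 900.0,
          2000.0, 2010.0, 1990.0, 2005.0, 1995.0]

snapshots = {}
ring = np.array(stream[:WIN])                 # start from a full window of real values
for tick, rpm in enumerate(stream[WIN:]):
    ring = np.append(ring[1:], rpm)           # slide the window by one reading
    q1, q3 = np.percentile(ring, [25, 75])
    iqr = q3 - q1
    snapshots[tick] = {
        "lower": float(q1 - 1.5 * iqr),
        "upper": float(q3 + 1.5 * iqr),
        "min": float(ring.min()),
        "window": ring.copy(),
    }

# Ticks where the window holds an implausibly low value that the fence did not catch.
missed = [t for t, s in snapshots.items()
          if s["min"] < PLAUSIBLE_FLOOR and s["min"] >= s["lower"]]
print("ticks where a low reading slipped through:", missed)
```

With this stream the low readings are caught while only one of them is in the window and slip through on the ticks where both are present.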

Why this matters

Outlier detection is contextual. A rule that recalculates quartiles on a tiny window can change character from one second to the next. For streaming sensor data, that means intermittent misclassification even when the data generation process hasn’t changed. Seeing the evolution of fences and simple statistics over the whole run reveals when and why the rule relaxes or tightens, and whether you need to add guardrails such as fixed limits or change the way you segment the data.

Takeaways

The sliding IQR filter failed intermittently because extreme low readings shifted the quartiles in a small window and expanded the fences so far that no point looked like an outlier. Capture per-iteration statistics to understand these shifts end to end, visualize the joint behavior if possible, and, where available, use manufacturer information to set meaningful bounds. If that’s not an option, augment the dynamic boxplot with simple guardrails like hard thresholds or checks for sudden jumps and drops so that obviously wrong readings don’t slip through when the window realigns.

python statistics