2025, Nov 20 01:00

Prevent SimPy closing-time race conditions: gate the queue with acceptance events, not resource grabs

Stop SimPy closing-time race conditions with PriorityResource: event-gated queues enforce hours and cancel late requests. Keep simulations consistent.

Closing-time policies in discrete-event simulations often look straightforward until concurrency sneaks in. In this SimPy scenario, a shop operates between 8am and 8pm, stops accepting at 6pm, and enforces a hard close by claiming all resources at 8pm for 12 hours. Intermittently, entities still manage to slip in and grab a resource right after the close action is initiated, even though the closure requests use a stronger priority. The outcome is confusing: some resources are claimed exactly at close, while others are delayed, and a couple of entities briefly obtain and release a resource during the supposed closed period. Below is a walk-through of the issue and a robust way to enforce opening hours without racing the queue.

Minimal example that exhibits the race

The model below reproduces the behavior with a deterministic random seed (10). It uses a PriorityResource: entities request at priority 1, while a closure process requests every unit at priority -1 (in SimPy a lower number means a higher priority) and holds it for 12 hours, expecting to fully occupy the resource pool from 8pm to 8am. Names are intentionally chosen to be descriptive and readable.

import simpy
import random
import math
import numpy as np

class BaseSettings:
    random.seed(10)
    sim_duration = 10080
    runs = 1
    capacity_units = 4
    interarrival = 120
    service_duration = 240
    open_hour = 8
    cutoff_hour = 18
    close_hour = 20

class Customer:
    def __init__(self, ident):
        self.ident = ident
        self.arrived_at = np.nan
        self.seized_at = np.nan
        self.left_at = np.nan

class StoreSim:
    def __init__(self, run_idx, cfg):
        self.env = simpy.Environment()
        self.cfg = cfg
        self.counter = 0
        self.run_idx = run_idx
        self.pool = simpy.PriorityResource(self.env, capacity=cfg.capacity_units)

    def clock_parts(self, t):
        day = math.floor(t / (24*60))
        dow = day % 7
        hour = math.floor((t % (24 * 60)) / 60)  # minutes into the current day, as whole hours
        return day, dow, hour

    def generate_arrivals(self):
        yield self.env.timeout(1)
        while True:
            self.counter += 1
            person = Customer(self.counter)
            self.env.process(self.visit_flow(person))
            gap = round(random.expovariate(1.0 / self.cfg.interarrival))
            yield self.env.timeout(gap)

    def scheduler_close(self):
        while True:
            if self.env.now == 0:
                closed_hours = self.cfg.open_hour
                next_span = self.cfg.close_hour
            else:
                closed_hours = 12
                next_span = 24

            print(f'--!!!!!CLOSING SHOP AT {self.env.now} FOR {closed_hours * 60} MINS!!!!!--')
            k = 0
            for _ in range(self.cfg.capacity_units):
                k += 1
                print(f'--claiming resource {k}')
                self.env.process(self.occupy_slot(closed_hours * 60))
            print(f'--!!!!!SHOP CLOSED!!!!!--')

            yield self.env.timeout(next_span * 60)

    def occupy_slot(self, span):
        with self.pool.request(priority=-1) as rq:
            yield rq
            print(f'--resource claimed for close at {self.env.now} for {span} mins')
            yield self.env.timeout(span)

    def visit_flow(self, cust):
        cust.arrived_at = self.env.now
        day, dow, hour = self.clock_parts(cust.arrived_at)
        print(f'entity {cust.ident} starting process at {cust.arrived_at}')

        if ((hour >= self.cfg.cutoff_hour) and (hour < self.cfg.close_hour)):
            next_close = ((day * 60 * 24) + (self.cfg.close_hour * 60))
            pause = (next_close - cust.arrived_at) + 1
            print(f'entity {cust.ident} arrived in queue after stop accepting.  Time out {pause} mins until close')
            yield self.env.timeout(pause)
            print(f'entity {cust.ident} has waited until close at {self.env.now}')

        attempt = 1
        with self.pool.request(priority=1) as rq:
            now = self.env.now
            print(f'entity {cust.ident} requesting resource at time {now} - attempt {attempt}')
            day, dow, hour = self.clock_parts(now)
            next_cutoff_day = (day + 1 if hour >= self.cfg.cutoff_hour else day)
            next_cutoff = ((next_cutoff_day * 60 * 24) + (self.cfg.cutoff_hour * 60))
            until_cutoff = (next_cutoff - now) + 1
            print(f'entity {cust.ident} has {until_cutoff} mins until shop stops accepting')

            yield rq | self.env.timeout(until_cutoff)

            while not rq.triggered:
                attempt += 1
                print(f'entity {cust.ident} did not get resource as shop stopped accepting at {self.env.now}, entity waiting 2 hours then rejoining queue for attempt {attempt}')
                yield self.env.timeout(((self.cfg.close_hour - self.cfg.cutoff_hour) * 60) + 1)

                print(f'entity {cust.ident} requesting resource at time {self.env.now} - attempt {attempt}')
                now = self.env.now
                day, dow, hour = self.clock_parts(now)
                next_cutoff_day = (day + 1 if hour >= self.cfg.cutoff_hour else day)
                next_cutoff = ((next_cutoff_day * 60 * 24) + (self.cfg.cutoff_hour * 60))
                until_cutoff = (next_cutoff - now) + 1
                print(f'entity {cust.ident} has {until_cutoff} until shop stops accepting')

                yield rq | self.env.timeout(until_cutoff)

            print(f'entity {cust.ident} got resource at {self.env.now} on attempt {attempt}')
            cust.seized_at = self.env.now
            day, dow, hour = self.clock_parts(self.env.now)
            next_close_day = (day + 1 if hour >= self.cfg.close_hour else day)
            hard_close = ((next_close_day * 60 * 24) + (self.cfg.close_hour * 60)) - 1
            window = hard_close - self.env.now
            service = round(random.expovariate(1.0 / self.cfg.service_duration))
            yield self.env.timeout(min(service, window))

            cust.left_at = self.env.now

    def start(self):
        self.env.process(self.generate_arrivals())
        self.env.process(self.scheduler_close())
        self.env.run(until=self.cfg.sim_duration)

def execute(cfg):
    for r in range(cfg.runs):
        print(f"Run {r+1} of {cfg.runs}")
        model = StoreSim(r, cfg)
        model.start()

execute(BaseSettings)

What’s going wrong

Empirically, two things are visible in the logs. First, at the exact close timestamp, the closure routine submits high-priority requests to occupy all capacity for the night. Second, entity activity sometimes interleaves with the individual closure claims: some units are claimed immediately at close while the remaining claims are delayed, and during that gap entities obtain a resource and leave shortly afterwards. Two properties of the code make this possible. A PriorityResource is not preemptive, so priority -1 only moves a closure request to the front of the waiting queue; it cannot evict an entity that already holds a unit. And an entity whose wait times out at the cutoff never withdraws its request: the request stays queued inside the with block while the entity sleeps, so it can still be granted later, for example when a unit is released just before close. The effect shows up at some close boundaries and not at others, depending on how requests, grants, and releases line up around the boundary. With the seed fixed, the run itself is reproducible.

A robust approach: gate the queue with an event

The fix below stops racing the queue at close. Instead of trying to concurrently seize all resources at 8pm, the shop’s state is modeled explicitly with an acceptance event and a simple scheduler. When the shop is not accepting or closed, entity requests are either not allowed to proceed or are canceled and retried at the next opening. This replaces the need to flood the resource with close-time claims and prevents any entity from slipping through during the off hours. The solution also tracks entities in a list to keep visibility into the waiting population. The random seed remains deterministic at 10.

import simpy
import random
import math
import numpy as np

class BaseSettings:
    random.seed(10)
    sim_duration = 10080
    runs = 1
    capacity_units = 4
    interarrival = 120
    service_duration = 240
    open_hour = 8
    cutoff_hour = 18
    close_hour = 20
    active_days = [0, 1, 2, 3, 4]

class Customer:
    def __init__(self, ident):
        self.ident = ident
        self.arrived_at = np.nan
        self.seized_at = np.nan
        self.left_at = np.nan

class StoreSim:
    def __init__(self, run_idx, cfg):
        self.env = simpy.Environment()
        self.cfg = cfg
        self.counter = 0
        self.run_idx = run_idx
        self.pool = simpy.PriorityResource(self.env, capacity=cfg.capacity_units)
        self.waitlist = []
        self.is_open = False
        self.accepting = False
        self.accept_evt = self.env.event()

    def clock_parts(self, t):
        day = math.floor(t / (24*60))
        dow = day % 7
        hour = math.floor((t % (24 * 60)) / 60)  # minutes into the current day, as whole hours
        return day, dow, hour

    def schedule_open_state(self, day, dow, hour):
        accepting = ((dow in self.cfg.active_days) and ((hour >= self.cfg.open_hour) and (hour < self.cfg.cutoff_hour)))
        open_flag = ((dow in self.cfg.active_days) and ((hour >= self.cfg.open_hour) and (hour < self.cfg.close_hour)))
        return accepting, open_flag

    def generate_arrivals(self):
        yield self.env.timeout(1)
        while True:
            self.counter += 1
            person = Customer(self.counter)
            self.env.process(self.visit_flow(person))
            gap = round(random.expovariate(1.0 / self.cfg.interarrival))
            yield self.env.timeout(gap)

    def update_accept_event(self):
        if self.accepting:
            if not self.accept_evt.triggered:
                self.accept_evt.succeed()
        else:
            if self.accept_evt.triggered:
                self.accept_evt = self.env.event()

    def gatekeeper(self):
        while True:
            now = self.env.now
            day, dow, hour = self.clock_parts(now)
            should_accept, should_open = self.schedule_open_state(day, dow, hour)

            if should_accept and not self.accepting:
                print(f'!!!!!SHOP START ACCEPTING AT {now}!!!!!')
                self.accepting = True
                self.is_open = True
                self.update_accept_event()

            elif not should_accept and self.accepting:
                print(f'!!!!!SHOP STOP ACCEPTING AT {now}!!!!!')
                self.accepting = False
                self.update_accept_event()

            elif not should_open and self.is_open:
                print(f'!!!!!SHOP CLOSING AT {now}!!!!!')
                self.is_open = False
                self.accepting = False
                if dow < 4:
                    closed_span = (24 - self.cfg.close_hour + self.cfg.open_hour) * 60
                else:
                    closed_span = (72 - self.cfg.close_hour + self.cfg.open_hour) * 60
                print(f'!!!!!SHOP CLOSED!!!!!')
                yield self.env.timeout(closed_span)
                print(f'!!!!!SHOP REOPENING AT {self.env.now}!!!!!')
                self.is_open = True
                self.accepting = True
                self.update_accept_event()

            yield self.env.timeout(60)

    def visit_flow(self, cust):
        cust.arrived_at = self.env.now
        day, dow, hour = self.clock_parts(cust.arrived_at)
        print(f'entity {cust.ident} starting process at {cust.arrived_at}')
        self.waitlist.append(cust)

        got_it = False
        req = None

        while not got_it:
            if not self.accepting:
                yield self.accept_evt

            print(f'entity {cust.ident} requesting resource at {self.env.now}')
            req = self.pool.request()

            now = self.env.now
            day, dow, hour = self.clock_parts(now)
            next_cutoff_day = (day + 1 if hour >= self.cfg.cutoff_hour else day)
            next_cutoff = ((next_cutoff_day * 60 * 24) + (self.cfg.cutoff_hour * 60))
            until_cutoff = max((next_cutoff - now), 1)

            yield req | self.env.timeout(until_cutoff)

            if not req.triggered:
                # Cutoff reached before a unit was granted: withdraw the request so it
                # cannot be granted after hours, then wait for the next acceptance window.
                req.cancel()
                print(f'entity {cust.ident} did not get resource at {self.env.now} - retrying...')
                continue

            got_it = True
            self.waitlist.remove(cust)
            print(f'++ entity {cust.ident} got resource at {self.env.now}')

        # Service runs only once a unit is held, capped to end one minute before hard close.
        cust.seized_at = self.env.now
        day, dow, hour = self.clock_parts(self.env.now)
        next_close_day = (day + 1 if hour >= self.cfg.close_hour else day)
        hard_close = ((next_close_day * 60 * 24) + (self.cfg.close_hour * 60)) - 1
        window = hard_close - self.env.now
        service = round(random.expovariate(1.0 / self.cfg.service_duration))
        yield self.env.timeout(min(service, window))

        self.pool.release(req)
        print(f'-- entity {cust.ident} released resource at {self.env.now}')
        cust.left_at = self.env.now

    def start(self):
        self.env.process(self.generate_arrivals())
        self.env.process(self.gatekeeper())
        self.env.run(until=self.cfg.sim_duration)

def execute(cfg):
    for r in range(cfg.runs):
        print(f"Run {r+1} of {cfg.runs}")
        model = StoreSim(r, cfg)
        model.start()

execute(BaseSettings)

Why this works

The improved model formalizes two states: accepting and open. Entities block on the acceptance event and can proceed only while the shop is accepting. When the shop stops accepting, pending requests are canceled and the process waits for the next acceptance window to try again. At the hard close, the shop flips to closed and remains so for the precomputed overnight or weekend duration, with no need to occupy all resources. This eliminates the interleaving that previously let entities request and briefly hold resources between individual close-time claims.

Why this matters

Discrete-event models that simulate opening hours, SLAs, or downtime can produce misleading results if boundary conditions are not enforced atomically. A queue that appears to respect cutoffs but occasionally admits late requests may skew utilization, waiting times, and throughput, especially when resource capacity is small. Gating with explicit events removes ambiguity and keeps the model faithful to the policy.
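One way to check that a run honors the policy is to scan the recorded seize times for values outside the acceptance window. The sketch below is an assumption-laden helper, not part of the model above: the function name seized_outside_window is hypothetical, its defaults mirror BaseSettings, and times are whole minutes since the start of the simulation, with day 0 treated as the first active day.

```python
MINUTES_PER_DAY = 24 * 60

def clock_parts(t):
    """Split simulation minutes into (day index, day of week, hour of day)."""
    day = t // MINUTES_PER_DAY
    dow = day % 7
    hour = (t % MINUTES_PER_DAY) // 60
    return day, dow, hour

def seized_outside_window(seize_times, open_hour=8, cutoff_hour=18,
                          active_days=(0, 1, 2, 3, 4)):
    """Return the seize times that fall outside the acceptance window."""
    violations = []
    for t in seize_times:
        _, dow, hour = clock_parts(t)
        in_window = dow in active_days and open_hour <= hour < cutoff_hour
        if not in_window:
            violations.append(t)
    return violations

# 480 is 08:00 and 1079 is 17:59 on day 0 (allowed); 1080 is 18:00,
# 1202 is 20:02, and 7800 falls on day 5, outside the active days.
print(seized_outside_window([480, 1079, 1080, 1202, 7800]))
```

Whether a grant in the exact cutoff minute counts as a violation is a policy choice; this sketch treats the window as half-open, so 18:00 itself is flagged.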

Takeaways

If you must stop accepting requests at a specific time and fully close at a later time, treat the shop schedule as first-class state rather than racing the resource with high-priority requests. Use a single event to control admission, cancel pending requests when acceptance ends, and reattempt only when the shop is accepting again. This keeps the queue predictable, respects closing rules, and avoids edge-case interleavings at the boundaries. The deterministic seed of 10 makes the behavior reproducible for validation and debugging.