2025, Nov 30 03:00

Why .first() Is Slow in Django QuerySets and How Explicit order_by on time_epoch Makes It Instant

Learn why Django queryset .first() is slow from implicit ordering, and how order_by on an indexed timestamp finds the earliest row in 24 hours instantly.

Django queryset performance often looks straightforward until a subtle default kicks in. A common case: a fast-growing table receives new rows every few seconds, and you need the first row from the past 24 hours. The query feels simple, yet it takes seconds with .first() or Min('id'), while other approaches are instant. The difference comes from implicit ordering and how the database uses available indexes.

Problem overview

You have a model where rows arrive continuously, and timestamp is indexed. The goal is to locate the first entry in the last 24 hours. Fetching the last row is instant, but the first one is unexpectedly slow when using typical patterns.

from django.db import models

class EventRow(models.Model):
    time_epoch = models.PositiveBigIntegerField(db_index=True)

Naïve queries that should return quickly instead take seconds:

import time
from django.db.models import Min

cutoff_epoch = int(time.time()) - 24 * 60 * 60

first_row = EventRow.objects.filter(time_epoch__gte=cutoff_epoch).first()
first_row_id_min = EventRow.objects.filter(time_epoch__gte=cutoff_epoch).aggregate(Min('id'))

In contrast, getting the minimum timestamp is instant, and even pulling all matching ids into Python and computing min locally feels fast in this setup:

fast_ts_min = EventRow.objects.filter(time_epoch__gte=cutoff_epoch).aggregate(Min('time_epoch'))
unsafe_fast = EventRow.objects.filter(time_epoch__gte=cutoff_epoch)[0]  # not guaranteed to be the earliest by time
python_min = min(list(EventRow.objects.filter(time_epoch__gte=cutoff_epoch)
                              .values_list('id', flat=True)))

What is actually happening

The crux is implicit ordering. When you call .first() without an explicit order_by, Django applies a default order on the primary key (id). That means the database now has to satisfy a filter on time_epoch and simultaneously determine the smallest id among the filtered rows. You do have indexes on id and on time_epoch, but this combination makes the query planner avoid using them efficiently for the intended operation.

On the other hand, aggregating Min('time_epoch') aligns with the indexed time_epoch field and completes almost instantly. Likewise, grabbing the first slice unsafe_fast looks fast because it does not guarantee any order at all, so it returns an arbitrary matching row, which is not the correct “first by time” definition.

Clean solution

Make the ordering explicit and aligned with the filter and the indexed field. If you want the earliest record in the last 24 hours, order by the timestamp and take the first row:

import time

cutoff_epoch = int(time.time()) - 24 * 60 * 60
first_by_ts = (
    EventRow.objects
    .filter(time_epoch__gte=cutoff_epoch)
    .order_by('time_epoch')
    .first()
)

This leverages the existing index on time_epoch and avoids the implicit order on id. In practice, that makes the query return essentially instantly in this scenario.

Why you should care

Implicit behavior in query building can derail performance, especially on large, append-heavy tables. Relying on defaults means you may accidentally sort on an unintended field, confusing the database and turning a straightforward index-friendly scan into an expensive operation. The outcome is visible as multi-second delays for what should be a simple lookup.

Takeaways

Always state the ordering that reflects the business definition of “first” or “last”. If the operation is about time, order by the timestamp explicitly. Avoid leaning on implicit ordering via .first(), and remember that a quick array index like queryset[0] is not the same as “earliest by timestamp”. When an aggregate needs to be fast, align it with a field that is indexed and matches your filter condition.

Being deliberate about order_by eliminates ambiguity, keeps your queries index-friendly, and protects latency under high write rates.