2025, Dec 12 03:00

Python Iterables vs Iterators: Why next() Raises TypeError on spaCy Doc and How to Iterate Correctly

Learn the difference between iterable and iterator in Python with a spaCy Doc example. Fix TypeError 'object is not an iterator' using iter() and next().

Sometimes Python looks inconsistent at first glance: you call next() on an object and get a TypeError, but a for loop over the same object works flawlessly. This is a classic moment of confusion between an iterator and an iterable. Let’s walk through a concrete example, see why it happens, and learn how to write code that does exactly what you intend.

Reproducing the issue

The sequence below shows the behavior in an interactive session. Calling next() on a spaCy Doc fails, yet iterating with a for loop prints tokens as expected.

>>> import spacy
>>> pipe = spacy.load("en_core_web_sm")
>>> text_doc = pipe("Berlin looks like a nice city")
>>> next(text_doc)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: 'spacy.tokens.doc.Doc' object is not an iterator
>>> for tok in text_doc:
...     print(tok)
...
Berlin
looks
like
a
nice
city

What’s actually going on

The key is the distinction between an iterable and an iterator. An iterable is an object you can loop over: passing it to iter() returns an iterator. An iterator is the object that actually produces the items one by one, each time next() is called on it. Many everyday Python objects are iterables but are not themselves iterators; lists, strings, tuples, and range objects all fit this pattern, just like spaCy’s Doc in the example above.
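The distinction can be checked directly with the abstract base classes in collections.abc. The following is a minimal sketch that uses only built-in types; the names samples and list_iter are illustrative and do not come from the spaCy example.

```python
from collections.abc import Iterable, Iterator

# A few everyday built-ins: all iterable, none of them an iterator.
samples = [[1, 2, 3], "abc", (1, 2), range(3)]

for obj in samples:
    # Prints the type name, then True for Iterable and False for Iterator.
    print(type(obj).__name__, isinstance(obj, Iterable), isinstance(obj, Iterator))

# iter() hands back a separate iterator object for the iterable.
list_iter = iter(samples[0])
print(isinstance(list_iter, Iterator))  # True
print(next(list_iter))                  # 1
```

The same two isinstance checks are a quick way to find out which kind of object you are holding before calling next() on it.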

When using iterables, it is usually not necessary to call iter() or deal with iterator objects yourself. The for statement does that automatically for you, creating a temporary unnamed variable to hold the iterator for the duration of the loop.
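Roughly speaking, a for loop can be spelled out by hand as a while loop around iter() and next(). The sketch below uses a plain list of words as a stand-in for the Doc so that it runs without spaCy; hidden_iter and collected are illustrative names.

```python
# Stand-in for the Doc: any iterable behaves the same way here.
words = ["Berlin", "looks", "like", "a", "nice", "city"]

collected = []
hidden_iter = iter(words)          # the for statement obtains an iterator first
while True:
    try:
        word = next(hidden_iter)   # then asks it for one item at a time
    except StopIteration:
        break                      # the iterator is exhausted: leave the loop
    collected.append(word)         # this is the loop body

print(collected)
```

This is an approximation of what the for statement does, not its literal implementation, but it shows where iter() and next() come into play.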

That explains the asymmetry. next(text_doc) raises a TypeError because text_doc is an iterable, not an iterator. The for loop works because Python implicitly calls iter(text_doc) behind the scenes to obtain an iterator and then calls next() on it until the iterator is exhausted and raises StopIteration.
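The same behavior is easy to reproduce without spaCy. The class below is a hypothetical container, not how Doc is actually implemented: it defines __iter__ but no __next__, which is enough to make it iterable without making it an iterator.

```python
class TokenBox:
    """Hypothetical container that is iterable but not an iterator."""

    def __init__(self, text):
        self._tokens = text.split()

    def __iter__(self):
        # Return a fresh iterator on every call; no __next__ is defined here.
        return iter(self._tokens)


box = TokenBox("Berlin looks like a nice city")

try:
    next(box)                      # fails: box itself is not an iterator
    error_message = None
except TypeError as exc:
    error_message = str(exc)

print(error_message)
print(list(box))                   # looping works, because iter(box) is called
print(list(box))                   # and it works again: each pass gets a new iterator
```

Because every call to iter() returns a new iterator, an iterable like this can be looped over repeatedly, whereas a single iterator can only be consumed once.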

How to fix it and write it clearly

If your goal is to loop, just use for; Python will manage the iterator for you. If you need to manually pull items one at a time, first obtain an iterator from the iterable using iter(), and then use next() on that iterator.

import spacy

runner = spacy.load("en_core_web_sm")
doc_obj = runner("Berlin looks like a nice city")

# Option 1: idiomatic looping (Python calls iter() for you)
for item in doc_obj:
    print(item)

# Option 2: manual iteration when you need explicit control
walker = iter(doc_obj)
print(next(walker))  # Berlin
print(next(walker))  # looks
# ... continue as needed
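One detail to keep in mind with manual iteration: once the iterator runs out of items, next() raises StopIteration. The sketch below shows both ways of handling that, using a short list as a stand-in for the Doc; the flag exhausted and the variable leftover are illustrative.

```python
# Stand-in for the Doc so the example runs without spaCy.
tokens = ["Berlin", "looks"]

walker = iter(tokens)
first = next(walker)
second = next(walker)

# A third call has nothing left to return and raises StopIteration.
try:
    next(walker)
    exhausted = False
except StopIteration:
    exhausted = True

# Passing a default as the second argument returns it instead of raising.
leftover = next(walker, None)

print(first, second, exhausted, leftover)
```

The two-argument form of next() is usually the tidier choice when running out of items is an expected outcome rather than an error.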

Why this matters

Understanding the split between iterable and iterator saves time when debugging and reading code. It clarifies why next() sometimes fails and why for ... in ... works on a broad range of objects. With the right mental model, you choose the right tool: a simple loop for most cases, or explicit control via iter() and next() when you need stepwise consumption.

Takeaways

Not every object you can loop over is an iterator. When next() raises a TypeError saying the object is not an iterator, you’re most likely holding an iterable. Let the for loop do the work for you, or, if you need to advance manually, first call iter() to obtain an iterator and then use next() on that iterator. Keeping this distinction in mind eliminates a common source of confusion in Python iteration.