https://pytroubles.com/en/posts/id1883-pandas-dataframe-indexing-explained-brackets-loc-vs-iloc-label-vs-position-selection

Pandas DataFrame Indexing Explained: Brackets, loc vs iloc, Label vs Position Selection

Understand Pandas DataFrame Indexing: How Brackets, loc and iloc Differ by Labels and Positions

Learn how pandas DataFrame indexing works: brackets for columns, loc for labels, iloc for positions. Clear rules, code examples, and tips to avoid confusion.

2025-11-14T23:00:09+03:00

pandas gives you three ways to index a DataFrame: square brackets, loc and iloc (with ix in the past). What looks redundant at first is the result of early design decisions that shaped how selection works. Understanding those choices removes the confusion around label-based and position-based access and helps you write predictable, readable code.

Problem overview

The root of the confusion is that square brackets on a DataFrame are not a two-dimensional slicer. Instead, the brackets prioritize selecting a lower-dimensional slice, i.e., a single column. At the same time, pandas emphasizes Index objects (labels) over integer positions, yet the default row labels are the integers 0, 1, 2 and so on. Those integer labels happen to coincide with row positions, so label-based access looks like positional access even though the two are different concepts.

Code example: where confusion starts

The following snippet shows how the same-looking number can refer to a label or to a position depending on the accessor you choose:

import pandas as pd
frame = pd.DataFrame({"c1": [10, 20, 30], "c2": [40, 50, 60]}, index=[0, 1, 2])
# Brackets prioritize a single column selection (lower-dimensional slice)
column_series = frame["c1"]
# Label-based access via loc: the number 0 here is a label
row_by_label = frame.loc[0]
# Position-based access via iloc: the number 0 here is a position
row_by_position = frame.iloc[0]

At a glance these look interchangeable, but they follow different rules. With loc you address labels; with iloc you address integer positions. Using default numeric row labels blurs the distinction, which is why code becomes ambiguous if you are not explicit about intent.
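The two rules are easier to tell apart once the integer labels stop matching the row positions. The sketch below is purely illustrative: the frame `shuffled` and its index `[2, 0, 1]` are invented for this example and are not part of the original snippet.

```python
import pandas as pd

# Hypothetical frame: integer row labels deliberately out of positional order.
shuffled = pd.DataFrame(
    {"c1": [10, 20, 30], "c2": [40, 50, 60]},
    index=[2, 0, 1],
)

# loc looks up the row whose label is 0 (the second row here).
by_label = shuffled.loc[0]

# iloc looks up the row at position 0 (the first row, which carries label 2).
by_position = shuffled.iloc[0]

print(by_label["c1"])     # 20
print(by_position["c1"])  # 10
```

With the default index both calls would return the same row, which is exactly why the difference tends to go unnoticed.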

What is actually going on

Two design choices explain the current behavior. First, square brackets were designed to pull out lower-dimensional data rather than perform two-dimensional slicing. In other words, for DataFrames the bracket operator targets a Series by column name:

“the primary function of indexing with [] is selecting out lower-dimensional slices”, i.e. for dataframes df[colname] should “return the series corresponding to colname”.

Second, pandas puts label-aware Index objects at the center of selection and alignment. Labels are hashable keys, not positions. Label-based lookups raise KeyError when a key is missing, while iloc, which deliberately ignores labels, raises IndexError when a position is out of range. Because default row labels are numeric, users easily conflate labels with positions. One way to disambiguate would have been to route every label lookup through the Index object itself; instead, label semantics are exposed directly on the DataFrame through loc, which makes the intent explicit in the API.
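The difference in exception types can be checked directly. This is a minimal sketch, assuming a small frame like the one used earlier; `describe_failure` is a hypothetical helper written only for this illustration and is not a pandas function.

```python
import pandas as pd

frame = pd.DataFrame({"c1": [10, 20, 30], "c2": [40, 50, 60]}, index=[0, 1, 2])


def describe_failure(lookup):
    """Hypothetical helper: run a lookup and name the exception it raises, if any."""
    try:
        lookup()
    except (KeyError, IndexError) as exc:
        return type(exc).__name__
    return "ok"


# Label-based lookups: a missing row label or column name is a missing key.
missing_label = describe_failure(lambda: frame.loc[99])
missing_column = describe_failure(lambda: frame["c9"])

# Position-based lookup: position 99 does not exist in a three-row frame.
out_of_range = describe_failure(lambda: frame.iloc[99])

print(missing_label, missing_column, out_of_range)
```

Seeing KeyError in a traceback therefore points at a label problem, while IndexError points at a positional one.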

Solution: be explicit about labels vs positions

The practical fix is to encode intent in your code. Use brackets to select columns, loc for label-based selection and iloc for integer positions. This removes ambiguity when row labels are numeric.

import pandas as pd
grid = pd.DataFrame({"alpha": [1, 2, 3], "beta": [4, 5, 6]}, index=[0, 1, 2])
# Column selection (lower-dimensional):
col_alpha = grid["alpha"]
# Label-based row selection:
first_row_by_label = grid.loc[0]
# Position-based row selection:
first_row_by_pos = grid.iloc[0]
# Explicit 2D selection by labels:
rows_l = [0]
cols_l = ["alpha"]
label_cut = grid.loc[rows_l, cols_l]
# Explicit 2D selection by positions:
rows_p = [0]
cols_p = [0]
positional_cut = grid.iloc[rows_p, cols_p]

If a label is absent, label-oriented selection raises KeyError; if a position is out of range, positional selection raises IndexError. Keeping that mental model helps you immediately diagnose surprises during selection.
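One common source of such surprises is filtering, because the surviving rows keep their original labels while their positions start again from zero. The sketch below assumes the same `grid` frame as above; the filter condition and the names `subset` and `renumbered` are chosen only for illustration.

```python
import pandas as pd

grid = pd.DataFrame({"alpha": [1, 2, 3], "beta": [4, 5, 6]}, index=[0, 1, 2])

# Keep rows where alpha > 1; the surviving rows keep their labels 1 and 2.
subset = grid.loc[grid["alpha"] > 1]

# Position 0 now holds the row labelled 1.
first_remaining = subset.iloc[0]

# Label 0 was filtered out, so a label-based lookup for it raises KeyError.
try:
    subset.loc[0]
    label_zero_found = True
except KeyError:
    label_zero_found = False

# reset_index(drop=True) renumbers the labels so they match positions again.
renumbered = subset.reset_index(drop=True)

print(list(subset.index), label_zero_found, list(renumbered.index))
```

Whether to renumber after filtering is a design choice: keeping the original labels preserves the link back to the source rows, while resetting them makes labels and positions coincide again.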

Why this matters

These choices mean DataFrames are not treated purely as 2D matrices. Brackets are optimized for the “dict-like collection of Series” view, while loc and iloc serve two-dimensional slicing with clearly different semantics. Over time, loc and iloc became the everyday tools, with position-based indexing through iloc regaining prominence alongside label-aware selection through loc. Direct reliance on numeric row labels became less common precisely because it obscures intent.
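The dict-like view shows up in more places than the bracket operator. The following sketch, which reuses the column names from the earlier example as an assumption, illustrates that iteration, membership tests and items() on a DataFrame all work on column labels.

```python
import pandas as pd

frame = pd.DataFrame({"c1": [10, 20, 30], "c2": [40, 50, 60]})

# Iterating a DataFrame yields column labels, much like iterating a dict yields keys.
column_names = list(frame)

# Membership tests check column labels, not row labels or cell values.
has_c1 = "c1" in frame
has_row_zero = 0 in frame

# items() pairs each column label with its Series, similar to dict.items().
column_types = {name: type(col).__name__ for name, col in frame.items()}

print(column_names, has_c1, has_row_zero, column_types)
```

Row label 0 exists in this frame, yet `0 in frame` is False because the test looks only at the columns.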

Takeaways

Think in terms of “labels vs positions”, not “one accessor fits all”. Reach for brackets when you need a column, loc when your logic is keyed by labels and iloc when you target integer positions. This matches how pandas evolved: the bracket operator is about lower-dimensional selection, while loc and iloc are the unambiguous, two-dimensional workhorses. Adopting this habit makes your data slicing clear, predictable and easier to maintain.
