Pandas — Indexing, Selection & Boolean Masks

Once you have a DataFrame, the next step is to select the specific rows and columns you need for your analysis. Pandas provides several ways to do this, but the two most important are the .loc[] and .iloc[] indexers. Both can be combined with boolean masks to filter rows by a condition.

.loc[]: Selection by Label

Use .loc[] for label-based indexing. This means you refer to rows and columns by their names (i.e., their index labels and column names).

Syntax: df.loc[row_labels, column_labels]

Python

import pandas as pd

data = {'Math': [85, 90, 78], 'Science': [92, 88, 94], 'History': [80, 85, 82]}
index = ['Alice', 'Bob', 'Charlie']
df = pd.DataFrame(data, index=index)
print("Original DataFrame:")
print(df)

# Select a single row (returns a Series)
alice_scores = df.loc['Alice']
print("\nAlice's scores:")
print(alice_scores)

# Select multiple rows and a single column
bob_charlie_math = df.loc[['Bob', 'Charlie'], 'Math']
print("\nBob and Charlie's Math scores:")
print(bob_charlie_math)

# Select a range of rows and columns (slicing)
# Note: When slicing with labels, the endpoint is INCLUDED.
all_scores_sci_hist = df.loc['Alice':'Charlie', 'Science':'History']
print("\nAll scores for Science and History:")
print(all_scores_sci_hist)
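
The shape of the result depends on how the labels are passed. The following is a minimal sketch that rebuilds the same DataFrame so it runs on its own; the variable names (one_row, one_row_frame, one_cell) are illustrative and not part of the original example.

```python
import pandas as pd

# Same data as the example above, rebuilt so this snippet is self-contained
df = pd.DataFrame(
    {'Math': [85, 90, 78], 'Science': [92, 88, 94], 'History': [80, 85, 82]},
    index=['Alice', 'Bob', 'Charlie'],
)

one_row = df.loc['Alice']            # single row label -> Series
one_row_frame = df.loc[['Alice']]    # list with one label -> DataFrame with one row
one_cell = df.loc['Alice', 'Math']   # row label + column label -> single value

print(type(one_row).__name__)
print(type(one_row_frame).__name__)
print(one_cell)
```

Wrapping a label in a list is a common way to keep the result as a DataFrame when later code expects one.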

.iloc[]: Selection by Position

Use .iloc[] for integer position-based indexing. This means you refer to rows and columns by their integer position, starting from 0.

Syntax: df.iloc[row_positions, column_positions]

Python

# Using the same DataFrame as above

# Select the first row (index position 0)
first_row = df.iloc[0]
print("\nFirst row (Alice):")
print(first_row)

# Select the last row (index position -1)
last_row = df.iloc[-1]
print("\nLast row (Charlie):")
print(last_row)

# Select rows 0 and 2, and columns 0 and 2
subset = df.iloc[[0, 2], [0, 2]]
print("\nAlice & Charlie's Math & History scores:")
print(subset)

# Select a range of rows and columns
# Note: When slicing with integers, the endpoint is EXCLUDED.
first_two_rows_cols = df.iloc[0:2, 0:2]
print("\nFirst two rows and columns:")
print(first_two_rows_cols)
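
The endpoint rules for the two indexers are easy to mix up, so it can help to compare them side by side. This is a small sketch that rebuilds the same DataFrame to stay self-contained; the variable names are illustrative.

```python
import pandas as pd

# Same data as the example above, rebuilt so this snippet is self-contained
df = pd.DataFrame(
    {'Math': [85, 90, 78], 'Science': [92, 88, 94], 'History': [80, 85, 82]},
    index=['Alice', 'Bob', 'Charlie'],
)

# Label slice: the endpoint 'Bob' is INCLUDED
rows_by_label = df.loc['Alice':'Bob']

# Position slice: the endpoint 2 is EXCLUDED, so this covers positions 0 and 1
rows_by_position = df.iloc[0:2]

# Both selections contain the same two rows
print(rows_by_label.equals(rows_by_position))

# The same rule applies to columns
cols_by_label = df.loc[:, 'Math':'Science']
cols_by_position = df.iloc[:, 0:2]
print(cols_by_label.equals(cols_by_position))
```

Here the label slice 'Alice':'Bob' and the position slice 0:2 select the same rows, even though the position slice names an endpoint one step further along.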

Boolean Masking: Conditional Selection

Boolean masking filters rows by a condition. Evaluating the condition produces a "mask": a Series of True/False values with one entry per row. Applying the mask keeps only the rows where the value is True.

1. Create the condition: This results in a Series of booleans.
2. Apply the mask: Pass this Series into the DataFrame using [] or .loc[].

Python

# Find all students who scored above 80 in Science
science_mask = df['Science'] > 80
print("\nBoolean mask for Science > 80:")
print(science_mask)

# Apply the mask to the DataFrame
high_science_scorers = df[science_mask]
print("\nStudents who scored > 80 in Science:")
print(high_science_scorers)

# You can combine multiple conditions with & (and) and | (or).
# Remember to wrap each condition in parentheses!
high_math_and_history = df[(df['Math'] > 80) & (df['History'] > 80)]
print("\nStudents who scored > 80 in both Math and History:")
print(high_math_and_history)
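
The examples above apply masks with [] and combine conditions with &. The sketch below covers the two remaining cases mentioned in the text: passing a mask to .loc[] (which also lets you pick columns in the same step) and combining conditions with |. The thresholds and variable names are chosen for illustration only.

```python
import pandas as pd

# Same data as the example above, rebuilt so this snippet is self-contained
df = pd.DataFrame(
    {'Math': [85, 90, 78], 'Science': [92, 88, 94], 'History': [80, 85, 82]},
    index=['Alice', 'Bob', 'Charlie'],
)

# Mask passed to .loc[]: filter rows and select a column in one step
science_mask = df['Science'] > 90
math_of_top_science = df.loc[science_mask, 'Math']
print(math_of_top_science)

# | (or): keep rows where at least one condition is True.
# Each condition still needs its own parentheses.
math_or_history = df[(df['Math'] > 85) | (df['History'] > 81)]
print(math_or_history)
```

With this data, the first selection keeps Alice and Charlie, and the second keeps Bob and Charlie.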
