Pandas — Series & DataFrame Basics

What is Pandas?

Pandas is a Python library that provides fast, flexible, and expressive data structures designed to make working with "relational" or "labeled" data both easy and intuitive. It is one of the most widely used tools for practical, real-world data analysis in Python.

The Pandas Series

A Series is a one-dimensional, labeled array capable of holding any data type (integers, strings, floats, Python objects, etc.). Think of it as a single column in a spreadsheet. It has two main components:

Data: The actual values.
Index: A label for each data point. If you don't specify an index, Pandas creates a default integer index from 0 to N-1.

Python

import pandas as pd # It's a strong convention to import pandas as pd

# Creating a Series from a list
population = pd.Series([990_000, 850_000, 3_400_000], name='Population')
print("--- Default Index ---")
print(population)

# Creating a Series with a custom index
population_labeled = pd.Series(
    [990_000, 850_000, 3_400_000],
    index=['San Jose', 'San Francisco', 'Los Angeles'],
    name='Population'
)
print("\n--- Custom Index ---")
print(population_labeled)

# Accessing data via index label
print(f"\nPopulation of Los Angeles: {population_labeled['Los Angeles']}")
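
To make the two components concrete, the sketch below separates a Series into its index and its data, and compares label-based with position-based access. It reuses the figures from the example above; the variable names (default_idx, labeled, by_label, by_position) are illustrative and not part of the original example.

```python
import pandas as pd

# Same figures as the example above; variable names are illustrative.
default_idx = pd.Series([990_000, 850_000, 3_400_000], name='Population')
labeled = pd.Series(
    [990_000, 850_000, 3_400_000],
    index=['San Jose', 'San Francisco', 'Los Angeles'],
    name='Population',
)

# The index holds the labels; without an explicit one, pandas uses 0..N-1.
default_labels = list(default_idx.index)
custom_labels = list(labeled.index)

# The data component, exposed as a NumPy array.
values = labeled.to_numpy()

# .loc looks up by label, .iloc by integer position.
by_label = labeled.loc['Los Angeles']
by_position = labeled.iloc[2]

print(default_labels)
print(custom_labels)
print(values)
print(by_label, by_position)
```

Using .loc and .iloc explicitly, rather than plain square brackets, keeps it unambiguous whether a lookup is by label or by position.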

The Pandas DataFrame

A DataFrame is a two-dimensional, size-mutable, and potentially heterogeneous tabular data structure with labeled axes (rows and columns). It's the most commonly used Pandas object. Think of it as a spreadsheet, an SQL table, or a dictionary of Series objects.

A DataFrame has both a row index and a column index.

Python

# Creating a DataFrame from a dictionary
data = {
    'City': ['San Jose', 'San Francisco', 'Los Angeles'],
    'Population': [990_000, 850_000, 3_400_000],
    'State': ['CA', 'CA', 'CA']
}
df = pd.DataFrame(data)
print(df)
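
As a rough illustration of the two indexes and of the "dictionary of Series" view, the sketch below rebuilds the same DataFrame and inspects its row labels, its column labels, and a single column. Promoting City to the row index with set_index is an optional step added here for illustration; the original example does not do this.

```python
import pandas as pd

data = {
    'City': ['San Jose', 'San Francisco', 'Los Angeles'],
    'Population': [990_000, 850_000, 3_400_000],
    'State': ['CA', 'CA', 'CA'],
}
df = pd.DataFrame(data)

row_labels = list(df.index)       # default row index: 0..N-1
column_labels = list(df.columns)  # column index: the dictionary keys

# Selecting one column returns a Series that shares the row index.
pop = df['Population']

# Optional: promote a column to the row index for label-based lookups.
by_city = df.set_index('City')
la_population = by_city.loc['Los Angeles', 'Population']

print(row_labels)
print(column_labels)
print(type(pop).__name__, pop.name)
print(la_population)
```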

Inspecting Your DataFrame

Once you've loaded your data, you'll want to inspect it. Here are some essential methods:

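df.head(): View the first 5 rows by default; pass a number, such as df.head(10), to change it.
df.tail(): View the last 5 rows by default; it accepts a row count in the same way.
df.shape: Get the dimensions as a (rows, columns) tuple. This is an attribute, so it takes no parentheses.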
df.info(): Get a concise summary, including data types and non-null counts.
df.describe(): Get descriptive statistics for numerical columns (count, mean, std, etc.).

Python

# Using the DataFrame created above
print("\n--- DataFrame Info ---")
df.info()

print("\n--- Descriptive Stats ---")
print(df.describe())
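
The remaining inspection tools from the list can be tried on the same small DataFrame. The following is a minimal sketch under that assumption; with only three rows, head and tail are given explicit row counts so that their output differs.

```python
import pandas as pd

df = pd.DataFrame({
    'City': ['San Jose', 'San Francisco', 'Los Angeles'],
    'Population': [990_000, 850_000, 3_400_000],
    'State': ['CA', 'CA', 'CA'],
})

first_two = df.head(2)     # head/tail accept a row count; the default is 5
last_one = df.tail(1)
n_rows, n_cols = df.shape  # shape is an attribute, not a method

# By default, describe() summarizes only the numeric columns.
stats = df.describe()
mean_population = stats.loc['mean', 'Population']

print(first_two)
print(last_one)
print(n_rows, n_cols)
print(list(stats.columns))
print(mean_population)
```

Note that df.info() prints its summary directly and returns None, which is why the earlier example calls it without wrapping it in print().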
