Pandas

A repository for perfecting my data manipulation and data wrangling skills.

📖 What is Pandas?

Pandas is a powerful Python library used for data manipulation and analysis. It provides fast, flexible data structures that make working with structured data intuitive and efficient.

Pandas is built around two core data structures:

🔹 Series

A Series is a one‑dimensional labeled array that can hold any data type (integers, floats, strings, etc.).

Think of it as:

A single column in a table
A labeled NumPy array
A Python dictionary with ordered keys

Key characteristics:

Has an index and values
Homogeneous dtype (one dtype for all values; mixed values are stored as object)
Vectorized operations
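
The characteristics above can be seen in a minimal sketch. The fruit names and prices are made-up illustration data, not taken from the lesson notebooks.

```python
import pandas as pd

# Build a Series from a dict: keys become the index, values become the data.
# (Illustrative data only.)
prices = pd.Series({"apple": 1.2, "banana": 0.5, "cherry": 3.0}, name="price")

# A Series has an index (labels) and values, stored under a single dtype.
labels = list(prices.index)
dtype_name = str(prices.dtype)

# Vectorized operation: applied to every element without an explicit loop.
doubled = prices * 2

# Label-based lookup, similar to a dictionary.
banana_price = prices["banana"]

print(labels, dtype_name, banana_price)
print(doubled)
```

Because the operation is vectorized, `prices * 2` returns a new Series with the same index, so labels stay attached to their values.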

🔹 DataFrame

A DataFrame is a two‑dimensional labeled data structure with columns that can have different data types.

Think of it as:

A spreadsheet table
A SQL table
A collection of Series sharing the same index

Key characteristics:

Rows and columns
Heterogeneous data types
Powerful indexing and alignment
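
As a rough illustration of these points, the sketch below builds a small DataFrame with mixed column types and shows index alignment. All names and values are hypothetical.

```python
import pandas as pd

# Columns can hold different data types (illustrative data only).
df = pd.DataFrame(
    {
        "name": ["Ana", "Ben", "Cy"],
        "age": [31, 25, 40],
        "member": [True, False, True],
    },
    index=["a", "b", "c"],
)

# Each column is a Series, and all columns share the DataFrame's index.
ages = df["age"]

# Alignment: arithmetic matches rows by index label, not by position.
bonus = pd.Series({"c": 5, "a": 1})
aged = df["age"] + bonus  # label "b" has no match, so its result is NaN

print(df.dtypes)
print(aged)
```

Alignment is why the order of `bonus` does not matter here: pandas pairs values by label before adding them.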

🧠 Python Prerequisites

Before working through these lessons, readers should be comfortable with the following Python fundamentals:

✅ Core Python Basics

Variables and data types (int, float, str, bool)
Lists, tuples, dictionaries, and sets
Basic operators and expressions
Writing and calling functions
Lambda functions

✅ Control Flow

if / elif / else
for and while loops
List comprehensions

✅ Working with Python Objects

Indexing and slicing
Basic error handling
Understanding mutability vs immutability

✅ Helpful (but not strictly required)

Basic NumPy familiarity
File paths and working directories
Virtual environments

📌 Recommended level: Early intermediate Python.

📌 Overview

This repository documents my journey toward mastering data manipulation with Pandas. It contains structured lessons, practice notebooks, and examples covering the most essential concepts used in real-world data analysis and data science workflows.

The goal of this repository is to build strong, practical skills in reading, transforming, analyzing, and combining datasets efficiently using Python and Pandas.

📚 Lessons Covered

1️⃣ Creating, Reading and Writing

You can’t work with data if you can’t read it.

Creating Series and DataFrame objects
Reading data from:
- CSV files
- Excel files
- JSON files
Writing data back to files
Understanding dataset structure

📂 Focus: Importing and exporting data properly.
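
A minimal sketch of this lesson's round trip, assuming in-memory text buffers instead of real files so that it runs anywhere. The product data is made up for illustration.

```python
import io

import pandas as pd

# Create a DataFrame from a dict of columns (illustrative data only).
df = pd.DataFrame({"product": ["pen", "book"], "price": [1.5, 12.0]})

# Write to CSV text; to_csv returns a string when no path is given.
csv_text = df.to_csv(index=False)

# Read the CSV back. A file path would work the same way as the buffer.
from_csv = pd.read_csv(io.StringIO(csv_text))

# The same round trip through JSON, one object per row.
json_text = df.to_json(orient="records")
from_json = pd.read_json(io.StringIO(json_text), orient="records")

# Inspect the structure of what was loaded.
shape = from_csv.shape
columns = list(from_csv.columns)
print(shape, columns)
```

Excel files follow the same pattern with `pd.read_excel` and `DataFrame.to_excel`, which additionally need an Excel engine such as openpyxl installed.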

2️⃣ Selecting, Filtering & Assigning

Core skills used daily by data professionals.

Selecting columns and rows
.loc[] and .iloc[]
Boolean filtering
Conditional selection
Assigning new columns
Modifying existing data

📂 Focus: Accessing exactly the data you need.
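
The topics in this lesson can be sketched on a small, made-up inventory table. Column names and values are hypothetical.

```python
import pandas as pd

# Illustrative data only.
df = pd.DataFrame(
    {
        "product": ["pen", "book", "lamp", "desk"],
        "price": [1.5, 12.0, 30.0, 150.0],
        "stock": [100, 40, 0, 5],
    },
    index=["p1", "p2", "p3", "p4"],
)

# Selecting: a single column is a Series.
prices = df["price"]

# .loc selects by label, .iloc selects by integer position.
row_by_label = df.loc["p2"]
row_by_pos = df.iloc[1]

# Label slices with .loc include both endpoints.
subset = df.loc["p2":"p3", ["product", "price"]]

# Boolean filtering: combine conditions with & and parentheses.
in_stock_cheap = df[(df["stock"] > 0) & (df["price"] < 20)]

# Assigning a new column from existing ones.
df["value"] = df["price"] * df["stock"]

# Modifying existing data: use .loc with a condition and a column label.
df.loc[df["stock"] == 0, "price"] = 25.0

print(in_stock_cheap)
print(df)
```

Writing through `.loc` in a single step avoids chained indexing, which may silently modify a temporary copy instead of the original DataFrame.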

3️⃣ Summary Functions and Maps

Extract insights from raw data.

Summary statistics (mean, median, describe, etc.)
Value counts
Unique values
map() and apply()
Custom functions on columns

📂 Focus: Turning data into information.
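
A short sketch of the summary and mapping functions listed above, using made-up scores. The helper `label_row` is a hypothetical name introduced only for this example.

```python
import pandas as pd

# Illustrative data only.
df = pd.DataFrame(
    {"category": ["a", "b", "a", "c", "a"], "score": [10, 20, 30, 40, 50]}
)

# Summary statistics on a numeric column.
mean_score = df["score"].mean()
median_score = df["score"].median()
summary = df["score"].describe()  # count, mean, std, min, quartiles, max

# Frequency of each value, and the distinct values in order of appearance.
counts = df["category"].value_counts()
unique_cats = list(df["category"].unique())

# map(): transform a Series element by element.
centered = df["score"].map(lambda s: s - mean_score)


# apply() with axis=1: run a custom function on each row.
def label_row(row):
    return f"{row['category']}-{row['score']}"


labels = df.apply(label_row, axis=1)

print(summary)
print(counts)
print(labels)
```

For simple arithmetic like centering, the vectorized form `df["score"] - mean_score` gives the same result and is usually faster than `map()`.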

4️⃣ Grouping and Sorting

Scale up your level of insight.

groupby() operations
Aggregations
Multiple aggregations
Sorting values
Sorting by index
Ranking within groups

📂 Focus: Analyzing complex datasets efficiently.
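
The operations in this lesson might look like the sketch below. The teams, players, and points are made-up illustration data.

```python
import pandas as pd

# Illustrative data only.
df = pd.DataFrame(
    {
        "team": ["x", "x", "y", "y", "y"],
        "player": ["a", "b", "c", "d", "e"],
        "points": [10, 30, 20, 50, 40],
    }
)

# groupby + a single aggregation.
total_by_team = df.groupby("team")["points"].sum()

# Multiple aggregations at once: one output column per function.
stats = df.groupby("team")["points"].agg(["min", "max", "mean"])

# Sorting by values, then restoring the original order by index.
sorted_df = df.sort_values("points", ascending=False)
by_index = sorted_df.sort_index()

# Ranking within groups: 1 is the highest score in each team.
df["rank_in_team"] = df.groupby("team")["points"].rank(
    ascending=False, method="dense"
)

print(total_by_team)
print(stats)
print(df)
```

The grouped `rank()` returns a result aligned with the original rows, which is why it can be assigned straight back as a new column.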

5️⃣ Data Types and Missing Values

Handle common real-world data problems.

Data types (int, float, object, category)
Type conversion
Detecting missing values
Handling NaN
Filling missing data
Dropping missing data

📂 Focus: Cleaning and preparing data for analysis.
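
A minimal cleaning sketch for this lesson, assuming a small made-up table in which numbers arrive as text and some entries are missing.

```python
import numpy as np
import pandas as pd

# Illustrative data only: "age" arrives as text, and each column has a gap.
df = pd.DataFrame(
    {
        "age": ["25", "31", None, "40"],
        "city": ["Oslo", None, "Oslo", "Rome"],
        "score": [1.5, np.nan, 3.0, 4.5],
    }
)

# Type conversion: text to numbers (the missing entry becomes NaN).
df["age"] = pd.to_numeric(df["age"])

# Type conversion: repeated text values to the category dtype.
df["city"] = df["city"].astype("category")

# Detecting missing values: count them per column.
missing_per_column = df.isna().sum()

# Filling missing data, with a different fill value per column.
filled = df.fillna({"age": df["age"].mean(), "score": 0.0})

# Dropping missing data: keep only rows without any missing value.
dropped = df.dropna()

print(df.dtypes)
print(missing_per_column)
print(filled)
```

Whether to fill or drop depends on the analysis; filling with the mean is only one possible choice, shown here for illustration.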

6️⃣ Renaming and Combining

Make sense of data from multiple sources.

Renaming columns and indices
Concatenation
Merging datasets
Joining datasets
Handling multi-index structures

📂 Focus: Building complete datasets from multiple pieces.
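
The sketch below walks through these steps on two made-up tables. The table and column names (`customers`, `orders`, `cust_id`, and so on) are hypothetical.

```python
import pandas as pd

# Illustrative data only.
customers = pd.DataFrame({"cust_id": [1, 2, 3], "nm": ["Ana", "Ben", "Cy"]})
orders = pd.DataFrame({"cust_id": [1, 1, 3, 4], "amount": [20, 35, 15, 50]})

# Renaming a column to something clearer.
customers = customers.rename(columns={"nm": "name"})

# Concatenation: stack rows from tables that share the same columns.
more_customers = pd.DataFrame({"cust_id": [4], "name": ["Di"]})
all_customers = pd.concat([customers, more_customers], ignore_index=True)

# Merging: SQL-style join on a shared key column.
merged = orders.merge(all_customers, on="cust_id", how="left")

# Joining: combine on the index rather than on a column.
totals = orders.groupby("cust_id")["amount"].sum().rename("total")
joined = all_customers.set_index("cust_id").join(totals)

# Multi-index: index by two levels, then select on the first level.
multi = merged.set_index(["cust_id", "name"]).sort_index()
orders_for_1 = multi.loc[1]

print(merged)
print(joined)
print(orders_for_1)
```

In `joined`, customer 2 has no orders, so the `total` for that row is NaN; a left-style join keeps every row of the left table.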

🛠 Technologies Used

Python 3.x
Pandas
Jupyter Notebook / VS Code

🎯 Goals of This Repository

Strengthen practical Pandas skills
Improve data cleaning techniques
Master data transformation workflows
Build a strong foundation for:
- Data Analysis
- Machine Learning
- Data Science projects

🚀 Who This Repository Is For

Aspiring data analysts
Future machine learning engineers
Python developers working with data
Anyone wanting production‑ready Pandas skills

Progress: Ongoing and continuously improving.
