Rapido SQL Mini Case Study 🚲🚕

Overview

This repository contains a SQL-based analytical case study on a fictional Rapido ride-hailing dataset, designed to demonstrate real-world data analysis skills using SQL.

The analysis focuses on:

User behavior and engagement
Ride distance patterns
Vehicle-type performance
Signup cohort analysis
Data segmentation and aggregation

All queries are written in BigQuery SQL and emphasize clarity, correctness, and analytical intent. See the Problem Statements - Queries/ folder for the full set of SQL queries.

Repository Structure

DATASETS/ → Downloadable CSV and Excel files for the Rapido dataset
SCHEMA/ → Table definitions and ER diagram
Problem Statements - Queries/ → 15 business-driven SQL problem statements
RESULTS/ → Sample outputs and validation notes

Entity Relationship Diagram (ERD)

The dataset follows a simple 1-to-many, star-style schema with:

USERS as the dimension table
RIDES as the fact table

See SCHEMA/Rapido ER.png for the visual ER diagram.

Dataset Description

See the DATASETS/ folder to download:

Rapido DATASET - rides, users: an Excel file for easy reference
users and rides tables as CSV files

Use these tables to create the Rapido dataset in any SQL dialect.

USERS Table

Column Name	Description
user_id	Unique user identifier
first_name	User's first name
last_name	User's last name
signup_date	Date the user registered on Rapido

RIDES Table

Column Name	Description
ride_id	Unique ride identifier
user_id	User who took the ride
vehicle_type	Type of vehicle used
start_location	Ride start location
end_location	Ride end location
distance_km	Distance travelled (in km)
captain_rating	Rating given to the captain (0–5)
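
The 1-to-many link between the two tables can be sketched as a join on user_id. This is a minimal illustration, not one of the repository's 15 queries: the inline rows, names, locations, and vehicle_type values are made up so the query runs without loading the CSV files, and the column types are assumed from the descriptions above.

```sql
-- Hypothetical sample rows for illustration only; the real data is in DATASETS/.
WITH users AS (
  SELECT 1 AS user_id, 'Asha' AS first_name, 'Rao' AS last_name,
         DATE '2024-01-05' AS signup_date
  UNION ALL
  SELECT 2, 'Vikram', 'Shah', DATE '2024-02-11'
),
rides AS (
  SELECT 101 AS ride_id, 1 AS user_id, 'Bike' AS vehicle_type,
         'Indiranagar' AS start_location, 'Koramangala' AS end_location,
         4.5 AS distance_km, 4.8 AS captain_rating
  UNION ALL
  SELECT 102, 1, 'Auto', 'Koramangala', 'HSR Layout', 6.0, 4.2
)
SELECT
  u.user_id,
  u.first_name,
  COUNT(r.ride_id) AS ride_count,               -- counts matched rides only, so 0 for users with none
  COALESCE(SUM(r.distance_km), 0) AS total_km   -- SUM over no rows is NULL; show 0 instead
FROM users AS u
LEFT JOIN rides AS r
  ON r.user_id = u.user_id                      -- one user, many rides
GROUP BY u.user_id, u.first_name
ORDER BY u.user_id;
```

A LEFT JOIN is used here so that users without any rides still appear in the result, which matters for the inactive-user questions later on.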

Key Business Questions Answered

Which users travel the most?
How does ride distance vary by vehicle type?
Which users are highly engaged vs inactive?
How do signup cohorts behave over time?
Which vehicle types attract diverse usage?
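
As a rough sketch of how the vehicle-type questions might be approached (the repository's own queries may differ), the following aggregates rides per vehicle type. The inline rows and the 'Bike' and 'Auto' values are hypothetical.

```sql
-- Hypothetical sample rows for illustration only.
WITH rides AS (
  SELECT 101 AS ride_id, 1 AS user_id, 'Bike' AS vehicle_type, 4.5 AS distance_km
  UNION ALL SELECT 102, 1, 'Auto', 6.0
  UNION ALL SELECT 103, 2, 'Bike', 2.5
  UNION ALL SELECT 104, 3, 'Bike', 5.0
)
SELECT
  vehicle_type,
  COUNT(*) AS ride_count,
  COUNT(DISTINCT user_id) AS distinct_riders,   -- one possible proxy for "diverse usage"
  ROUND(AVG(distance_km), 2) AS avg_distance_km
FROM rides
GROUP BY vehicle_type
ORDER BY avg_distance_km DESC;
```

Counting distinct riders is only one way to read "diverse usage"; counting distinct start or end locations per vehicle type would be an equally reasonable interpretation.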

SQL Concepts Demonstrated

Aggregations (SUM, AVG, COUNT)
GROUP BY with CASE WHEN for conditional buckets and grouping
Filtering with HAVING
CTEs (common table expressions)
Subqueries (correlated & non-correlated)
CASE expressions for segmentation
UNION DISTINCT
Date-based cohort filtering
Conditional aggregation
Analytical thinking with averages & thresholds
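
Several of these concepts can be combined in a single query. The sketch below uses CTEs, a date-based cohort filter, conditional aggregation, and a CASE expression for segmentation. The sample rows, the January 2024 cohort window, and the engagement thresholds (3 or more rides = high, 1 to 2 = medium) are assumptions chosen for illustration and are not taken from the repository.

```sql
-- Hypothetical sample rows for illustration only.
WITH users AS (
  SELECT 1 AS user_id, DATE '2024-01-05' AS signup_date
  UNION ALL SELECT 2, DATE '2024-01-20'
  UNION ALL SELECT 3, DATE '2024-03-02'
),
rides AS (
  SELECT 101 AS ride_id, 1 AS user_id, 'Bike' AS vehicle_type, 4.5 AS distance_km
  UNION ALL SELECT 102, 1, 'Auto', 6.0
  UNION ALL SELECT 103, 1, 'Bike', 3.0
  UNION ALL SELECT 104, 3, 'Bike', 5.0
),
per_user AS (
  SELECT
    u.user_id,
    COUNT(r.ride_id) AS ride_count,
    -- Conditional aggregation: count only the rides that match a condition.
    SUM(CASE WHEN r.vehicle_type = 'Bike' THEN 1 ELSE 0 END) AS bike_rides
  FROM users AS u
  LEFT JOIN rides AS r
    ON r.user_id = u.user_id
  -- Date-based cohort filter: users who signed up in January 2024 (assumed window).
  WHERE u.signup_date >= DATE '2024-01-01'
    AND u.signup_date <  DATE '2024-02-01'
  GROUP BY u.user_id
)
SELECT
  user_id,
  ride_count,
  bike_rides,
  -- CASE segmentation with assumed thresholds.
  CASE
    WHEN ride_count >= 3 THEN 'high'
    WHEN ride_count >= 1 THEN 'medium'
    ELSE 'inactive'
  END AS engagement_segment
FROM per_user
ORDER BY user_id;
```

With these sample rows, user 3 falls outside the cohort window and is excluded, while user 2 stays in the result with zero rides because the filter is applied to the users side of the LEFT JOIN.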

How to Use This Repository

Review the problem statement at the top of each SQL file.
Examine the query logic and SQL patterns used.
Run the queries in BigQuery or MySQL, or adapt them for other SQL dialects.
Refer to the screenshots in the RESULTS/ folder for expected outputs and row counts.

Author

Shivaling Battarki
Email: shivalingb09@gmail.com
