Hazardous9hub/RAPIDO-MINI-CASE-STUDY

Rapido SQL Mini Case Study 🚲🚕

Overview

This repository contains a SQL-based analytical case study on a fictional Rapido ride-hailing dataset, designed to demonstrate real-world data analysis skills using SQL.

The analysis focuses on:

  • User behavior and engagement
  • Ride distance patterns
  • Vehicle-type performance
  • Signup cohort analysis
  • Data segmentation and aggregation

All queries are written in BigQuery SQL and emphasize clarity, correctness, and analytical intent. See the Problem Statements - Queries/ folder for all SQL queries.


Repository Structure

  • DATASETS/ → Downloadable CSV and Excel files for the Rapido dataset
  • SCHEMA/ → Table definitions and ER diagram
  • Problem Statements - Queries/ → 15 business-driven SQL problem statements
  • RESULTS/ → Sample outputs and validation notes

Entity Relationship Diagram (ERD)

The dataset follows a simple 1-to-many, star-style schema with:

  • USERS as the dimension table
  • RIDES as the fact table

See SCHEMA/Rapido ER.png for the visual ER diagram.

Dataset Description

Refer to the DATASETS folder to access and download:

  • Rapido DATASET - rides, users: an Excel file for easy reference
  • users and rides tables: CSV files

Use these tables to create the RAPIDO dataset in any SQL dialect.

USERS Table

| Column Name | Description |
|---|---|
| user_id | Unique user identifier |
| first_name | User's first name |
| last_name | User's last name |
| signup_date | Date the user registered on Rapido |

RIDES Table

| Column Name | Description |
|---|---|
| ride_id | Unique ride identifier |
| user_id | User who took the ride |
| vehicle_type | Type of vehicle used |
| start_location | Ride start location |
| end_location | Ride end location |
| distance_km | Distance travelled (in km) |
| captain_rating | Rating given to the captain (0–5) |
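
As a rough illustration of how the two tables relate, the Python sketch below builds them in an in-memory SQLite database. The column names come from the tables above; the column types, the constraints, and every sample row (names, locations, vehicle types) are assumptions made for illustration. The actual definitions live in SCHEMA/ and may differ, for example BigQuery would use types such as STRING, DATE, and FLOAT64.

```python
import sqlite3

# Hypothetical DDL: column names follow the README; the types and
# constraints are assumptions, not the repository's actual schema.
SCHEMA = """
CREATE TABLE users (
    user_id     INTEGER PRIMARY KEY,
    first_name  TEXT,
    last_name   TEXT,
    signup_date TEXT              -- ISO date string, e.g. '2024-01-10'
);
CREATE TABLE rides (
    ride_id        INTEGER PRIMARY KEY,
    user_id        INTEGER NOT NULL REFERENCES users(user_id),
    vehicle_type   TEXT,
    start_location TEXT,
    end_location   TEXT,
    distance_km    REAL,
    captain_rating REAL CHECK (captain_rating BETWEEN 0 AND 5)
);
"""

def build_db():
    """Create an in-memory database with the assumed schema and made-up rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite does not enforce FKs by default
    conn.executescript(SCHEMA)
    conn.executemany(
        "INSERT INTO users VALUES (?, ?, ?, ?)",
        [(1, "Asha", "Rao", "2024-01-10"), (2, "Vikram", "Shah", "2024-02-05")],
    )
    conn.executemany(
        "INSERT INTO rides VALUES (?, ?, ?, ?, ?, ?, ?)",
        [
            (101, 1, "Bike", "Indiranagar", "Koramangala", 5.2, 4.5),
            (102, 1, "Auto", "Koramangala", "HSR Layout", 3.8, 4.0),
            (103, 2, "Bike", "Whitefield", "Marathahalli", 7.1, 5.0),
        ],
    )
    return conn

conn = build_db()
# One user can have many rides: count rides per user through the join key.
rides_per_user = conn.execute(
    "SELECT u.user_id, COUNT(r.ride_id) FROM users u "
    "LEFT JOIN rides r ON r.user_id = u.user_id "
    "GROUP BY u.user_id ORDER BY u.user_id"
).fetchall()
print(rides_per_user)
```

USERS sits on the "one" side and RIDES on the "many" side, so every analytical query joins the two tables on user_id.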

Key Business Questions Answered

  • Which users travel the most?
  • How does ride distance vary by vehicle type?
  • Which users are highly engaged vs inactive?
  • How do signup cohorts behave over time?
  • Which vehicle types attract diverse usage?
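
To give a feel for how questions like these translate into SQL, here is a minimal sketch that runs three of them against an in-memory SQLite database from Python. It is not one of the repository's queries: the sample rows are made up, the SQL is written for SQLite rather than BigQuery, and treating "inactive" as "has no rides at all" is an assumption.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (user_id INTEGER PRIMARY KEY, first_name TEXT,
                    last_name TEXT, signup_date TEXT);
CREATE TABLE rides (ride_id INTEGER PRIMARY KEY, user_id INTEGER, vehicle_type TEXT,
                    start_location TEXT, end_location TEXT,
                    distance_km REAL, captain_rating REAL);
-- Made-up rows for illustration only; not taken from the repository's dataset.
INSERT INTO users VALUES (1, 'Asha', 'Rao', '2024-01-10'),
                         (2, 'Vikram', 'Shah', '2024-02-05'),
                         (3, 'Meera', 'Iyer', '2024-02-20');
INSERT INTO rides VALUES (101, 1, 'Bike', 'A', 'B', 5.0, 4.5),
                         (102, 1, 'Auto', 'B', 'C', 3.0, 4.0),
                         (103, 2, 'Bike', 'C', 'D', 7.0, 5.0),
                         (104, 2, 'Cab',  'D', 'A', 12.0, 3.5);
""")

# Which users travel the most? Total distance per user, highest first.
top_travellers = conn.execute("""
    SELECT u.user_id, u.first_name, SUM(r.distance_km) AS total_km
    FROM users AS u
    JOIN rides AS r ON r.user_id = u.user_id
    GROUP BY u.user_id, u.first_name
    ORDER BY total_km DESC
""").fetchall()

# How does ride distance vary by vehicle type? Ride count and average distance.
by_vehicle = conn.execute("""
    SELECT vehicle_type, COUNT(*) AS rides, AVG(distance_km) AS avg_km
    FROM rides
    GROUP BY vehicle_type
    ORDER BY vehicle_type
""").fetchall()

# Which users are inactive? Here: users with no matching ride row (an assumption).
inactive = conn.execute("""
    SELECT u.user_id, u.first_name
    FROM users AS u
    LEFT JOIN rides AS r ON r.user_id = u.user_id
    WHERE r.ride_id IS NULL
""").fetchall()

print(top_travellers)
print(by_vehicle)
print(inactive)
```

The LEFT JOIN in the last query keeps users who have no rides, which an inner join would silently drop.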

SQL Concepts Demonstrated

  • Aggregations (SUM, AVG, COUNT)
  • GROUP BY and CASE WHEN for conditional buckets and grouping
  • Filtering with HAVING
  • CTEs (common table expressions)
  • Subqueries (correlated & non-correlated)
  • CASE expressions for segmentation
  • UNION DISTINCT
  • Date-based cohort filtering
  • Conditional aggregation
  • Analytical thinking with averages & thresholds
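
Several of the concepts listed above can be combined in a short sketch: a CTE, conditional aggregation, a CASE expression for segmentation, and HAVING with a non-correlated subquery. The example below is illustrative only. It runs on SQLite from Python rather than BigQuery, uses made-up rows, and the 5 km and 10 km segment thresholds are hypothetical values, not ones taken from the repository.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE rides (ride_id INTEGER PRIMARY KEY, user_id INTEGER,
                    vehicle_type TEXT, distance_km REAL);
-- Made-up rows for illustration only; not taken from the repository's dataset.
INSERT INTO rides VALUES (1, 1, 'Bike', 2.0), (2, 1, 'Bike', 3.0), (3, 1, 'Auto', 2.5),
                         (4, 2, 'Cab', 15.0), (5, 3, 'Bike', 1.0);
""")

# CTE + conditional aggregation + CASE segmentation.
segments = conn.execute("""
    WITH user_stats AS (                 -- CTE: one summary row per user
        SELECT user_id,
               COUNT(*)         AS ride_count,
               SUM(distance_km) AS total_km,
               -- Conditional aggregation: count only the Bike rides
               SUM(CASE WHEN vehicle_type = 'Bike' THEN 1 ELSE 0 END) AS bike_rides
        FROM rides
        GROUP BY user_id
    )
    SELECT user_id,
           ride_count,
           bike_rides,
           CASE                          -- hypothetical distance thresholds
               WHEN total_km >= 10 THEN 'long-distance'
               WHEN total_km >= 5  THEN 'medium'
               ELSE 'short'
           END AS distance_segment
    FROM user_stats
    ORDER BY user_id
""").fetchall()

# HAVING with a non-correlated subquery: users whose average ride
# distance exceeds the overall average ride distance.
above_avg = conn.execute("""
    SELECT user_id, AVG(distance_km) AS avg_km
    FROM rides
    GROUP BY user_id
    HAVING AVG(distance_km) > (SELECT AVG(distance_km) FROM rides)
""").fetchall()

print(segments)
print(above_avg)
```

WHERE filters individual rows before grouping, while HAVING filters the grouped results, which is why the comparison against the overall average appears in HAVING.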

How to Use This Repository

  1. Review the problem statement at the top of each SQL file.
  2. Examine the query logic and SQL patterns used.
  3. Run the queries in BigQuery or MySQL, or adapt them for other SQL dialects.
  4. Refer to the screenshots in the RESULTS/ folder for expected outputs and row counts.

Author

Shivaling Battarki
Email: shivalingb09@gmail.com