Skip to content

Latest commit

 

History

History
225 lines (195 loc) · 6.42 KB

File metadata and controls

225 lines (195 loc) · 6.42 KB

Summarizing Data with SQL

Summary Statistics

  1. How many rows are in the pets table? --13 rows
SELECT *
FROM pets
  1. How many female pets are in the pets table? --7
SELECT *
FROM pets
WHERE sex='F';
  1. How many female cats are in the pets table? --4
SELECT *
FROM pets
WHERE species='cat' AND sex='F';
  1. What's the mean age of pets in the pets table? --5.23076923076923
SELECT AVG(age)
FROM pets;
  1. What's the mean age of dogs in the pets table? --6.5
SELECT AVG(age)
FROM pets
WHERE species='dog';
  1. What's the mean age of male dogs in the pets table? - 8.33333333333333
SELECT AVG(age)
FROM pets
WHERE species='dog' AND sex='M';
  1. What's the count, mean, minimum, and maximum of pet ages in the pets table? --13 count, 5.23076923076923 mean, 1 min, 10 max
    • NOTE: SQLite doesn't have built-in formulas for standard deviation or median!
SELECT COUNT(age), AVG(age), MIN(age), MAX(age)
FROM pets;
  1. Repeat the previous problem with the following stipulations:
    • Round the average to one decimal place.
    • Give each column a human-readable column name (for example, "Average Age")
SELECT 
    ROUND(COUNT(age), 1) AS count_of_ages, 
    ROUND(AVG(age), 1) AS average_age, 
    ROUND(MIN(age), 1) AS min_age, 
    ROUND(MAX(age), 1) AS max_age
FROM pets;
  1. How many rows in employees_null have missing salaries? -- 10 rows
SELECT *
FROM employees_null
WHERE salary IS NULL;
  1. How many salespeople in employees_null having nonmissing salaries? --60 rows
SELECT * 
FROM employees_null
WHERE salary IS NOT NULL AND job='Sales';
  1. What's the mean salary of employees who joined the company after 2010? Go back to the usual employees table for this one. -- 78554.0882352941
    • Hint: You may need to use the CAST() function for this. To cast a string as a float, you can do CAST(x AS REAL)
SELECT AVG(CAST(salary AS DECIMAL(10,2))) AS mean_salary
FROM employees
WHERE CAST(strftime('%Y', startdate) AS INTEGER) > 2010;
  1. What's the mean salary of employees in Swiss Francs? --75727.5605
    • Hint: Swiss Francs are abbreviated "CHF" and 1 USD = 0.97 CHF.
SELECT AVG(salary * 0.97) AS mean_salary_chf
FROM employees;
  1. Create a query that computes the mean salary in USD as well as CHF. Give the columns human-readable names (for example "Mean Salary in USD"). Also, format them with comma delimiters and currency symbols.
    • NOTE: Comma-delimiting numbers is only available for integers in SQLite, so rounding (down) to the nearest dollar or franc will be done for us.
    • NOTE2: The symbols for francs is simply Fr. or fr.. So an example output will look like 100,000 Fr..

Aggregating Statistics with GROUP BY

  1. What is the average age of pets by species? --cat 4.33333333333333, dog 6.5, lobster 3.0
SELECT AVG(age)
FROM pets
GROUP BY species;
  1. Repeat the previous problem but make sure the species label is also displayed! Assume this behavior is always being asked of you any time you use GROUP BY. --cat 4.33333333333333, dog 6.5, lobster 3.0
SELECT species, AVG(age)
FROM pets
GROUP BY species;
  1. What is the count, mean, minimum, and maximum age by species in pets?
SELECT species, COUNT(age), AVG(age), MIN(age), MAX(age)
FROM pets
GROUP BY species;
  1. Show the mean salaries of each job title in employees. --Administrator 71986.1428571429, IT 71381.0, Operations 74055.25, Sales 80778.0441176471
SELECT job, AVG(salary)
FROM employees
GROUP BY job;
  1. Show the mean salaries in New Zealand dollars of each job title in employees.
    • NOTE: 1 USD = 1.65 NZD.
SELECT job, AVG(salary * 1.65) AS mean_salary_nzd
FROM employees
GROUP BY job;
  1. Show the mean, min, and max salaries of each job title in employees, as well as the numbers of employees in each category.
SELECT job, AVG(salary), MIN(salary), MAX(salary), COUNT(salary)
FROM employees
GROUP BY job;
  1. Show the mean salaries of each job title in employees sorted descending by salary.
SELECT job, AVG(salary)
FROM employees
GROUP BY job
ORDER BY salary DESC;
  1. What are the top 5 most common first names among employees?
SELECT firstname, COUNT(firstname) as freq
FROM employees
GROUP BY firstname
ORDER BY freq DESC
LIMIT 5;
  1. Show all first names which have exactly 2 occurrences in employees.
SELECT firstname, COUNT(firstname) as freq
FROM employees
GROUP BY firstname
HAVING freq=2;
  1. Take a look at the transactions table to get a idea of what it contains. Note that a transaction may span multiple rows if different items are purchased as part of the same order. The employee who made the order is also given by their ID.
SELECT *
FROM transactions;
  1. Show the top 5 largest orders (and their respective customer) in terms of the numbers of items purchased in that order.
SELECT customer, quantity
FROM transactions
ORDER BY quantity DESC
LIMIT 5;
  1. Show the total cost of each transaction.
    • Hint: The unit_price column is the price of one item. The customer may have purchased multiple.
    • Hint2: Note that transactions here span multiple rows if different items are purchased.
SELECT order_id, SUM(unit_price * quantity) AS total_cost
FROM transactions
GROUP BY order_id;
  1. Show the top 5 transactions in terms of total cost.
SELECT order_id, SUM(unit_price * quantity) AS total_cost
FROM transactions
GROUP BY order_id
ORDER BY total_cost DESC
LIMIT 5;
  1. Show the top 5 customers in terms of total revenue (ie, which customers have we done the most business with in terms of money?)
SELECT customer, SUM(unit_price * quantity) AS total_cost
FROM transactions
GROUP BY customer
ORDER BY total_cost DESC
LIMIT 5;
  1. Show the top 5 employees in terms of revenue generated (ie, which employees made the most in sales?)
SELECT employee_id, SUM(unit_price * quantity) AS total_cost
FROM transactions
GROUP BY employee_id
ORDER BY total_cost DESC
LIMIT 5;
  1. Which customer worked with the largest number of employees?

    • Hint: This is a tough one! Check out the DISTINCT keyword.
  2. Show all customers who've done more than $80,000 worth of business with us.

SELECT customer, SUM(unit_price * quantity) AS total_cost
FROM transactions
GROUP BY customer
HAVING total_cost>80000
ORDER BY total_cost DESC;