For this project, I designed a SQL database to hold the historical employee records of the Pewlett Hackard Corporation.
This project was performed in three steps:
- Data Engineering/Data Modelling;
- Data Analysis; and
- Data Visualisation.
First, the historical CSV records were inspected and an ERD of the tables was sketched out. I used http://www.quickdatabasediagrams.com for this purpose; the ERD of the historical CSVs is shown below.
I was able to initialise the tables in PostgreSQL by exporting the information and the SQL query from quickdatabasediagrams.com.
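As a rough illustration of this step, the sketch below creates a set of tables and loads one CSV into them. It is an assumption-heavy stand-in rather than the exported script itself: the table and column names are hypothetical (inferred from the fields mentioned in this write-up), the two department rows are made-up sample data, and Python's built-in `sqlite3` is used in place of PostgreSQL so the example runs on its own.

```python
import csv
import io
import sqlite3

# Hypothetical schema inferred from the fields named in this write-up.
# The real table and column names come from the ERD export and may differ.
SCHEMA = """
CREATE TABLE titles (
    title_id TEXT PRIMARY KEY,
    title    TEXT NOT NULL
);
CREATE TABLE employees (
    emp_no       INTEGER PRIMARY KEY,
    emp_title_id TEXT NOT NULL REFERENCES titles (title_id),
    first_name   TEXT NOT NULL,
    last_name    TEXT NOT NULL,
    sex          TEXT NOT NULL,
    hire_date    TEXT NOT NULL
);
CREATE TABLE departments (
    dept_no   TEXT PRIMARY KEY,
    dept_name TEXT NOT NULL
);
CREATE TABLE dept_emp (
    emp_no  INTEGER NOT NULL REFERENCES employees (emp_no),
    dept_no TEXT NOT NULL REFERENCES departments (dept_no),
    PRIMARY KEY (emp_no, dept_no)
);
CREATE TABLE dept_manager (
    dept_no TEXT NOT NULL REFERENCES departments (dept_no),
    emp_no  INTEGER NOT NULL REFERENCES employees (emp_no),
    PRIMARY KEY (dept_no, emp_no)
);
CREATE TABLE salaries (
    emp_no INTEGER PRIMARY KEY REFERENCES employees (emp_no),
    salary INTEGER NOT NULL
);
"""


def create_tables(conn):
    """Run the whole schema script against an open connection."""
    conn.executescript(SCHEMA)


def load_csv(conn, table, csv_file):
    """Insert every row of a CSV file (with a header row) into `table`."""
    reader = csv.reader(csv_file)
    header = next(reader)  # column names come from the first CSV line
    columns = ", ".join(header)
    placeholders = ", ".join("?" for _ in header)
    conn.executemany(
        f"INSERT INTO {table} ({columns}) VALUES ({placeholders})", reader
    )


conn = sqlite3.connect(":memory:")
create_tables(conn)

# Made-up sample rows standing in for one of the historical CSV files.
sample_csv = io.StringIO("dept_no,dept_name\nd001,Sales\nd002,Development\n")
load_csv(conn, "departments", sample_csv)
```

Tables that are referenced by foreign keys (here `titles` and `departments`) are created first, which is also the order in which the CSV files would need to be imported when the constraints are enforced.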
After importing the CSV files into the relevant tables in the database, I completed the following queries:
- Listed the following details of each employee: employee number, last name, first name, sex, and salary.
- Listed first name, last name, and hire date for employees who were hired in 1986.
- Listed the manager of each department with the following information: department number, department name, the manager's employee number, last name, first name.
- Listed the department of each employee with the following information: employee number, last name, first name, and department name.
- Listed first name, last name, and sex for employees whose first name is "Hercules" and whose last name begins with "B".
- Listed all employees in the Sales department, including their employee number, last name, first name, and department name.
- Listed all employees in the Sales and Development departments, including their employee number, last name, first name, and department name.
- Listed the frequency count of employee last names, i.e., how many employees share each last name, in descending order.
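A few of the queries listed above could look roughly like the sketch below. It is illustrative only: the table and column names are assumptions, the rows are invented toy data, and an in-memory SQLite database stands in for PostgreSQL so the example is self-contained. The SQL itself is plain enough that it should carry over largely unchanged.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical, trimmed-down tables with invented rows (not the real records).
conn.executescript("""
CREATE TABLE employees (
    emp_no INTEGER PRIMARY KEY, first_name TEXT, last_name TEXT,
    sex TEXT, hire_date TEXT
);
CREATE TABLE departments (dept_no TEXT PRIMARY KEY, dept_name TEXT);
CREATE TABLE dept_emp (emp_no INTEGER, dept_no TEXT);

INSERT INTO employees VALUES
    (1, 'Hercules', 'Baer',  'M', '1986-03-14'),
    (2, 'Ana',      'Lopez', 'F', '1990-07-01'),
    (3, 'Hercules', 'Stone', 'M', '1986-11-30'),
    (4, 'Mei',      'Baer',  'F', '1995-01-09');
INSERT INTO departments VALUES ('d1', 'Sales'), ('d2', 'Development');
INSERT INTO dept_emp VALUES (1, 'd1'), (2, 'd2'), (3, 'd2'), (4, 'd1');
""")

# Employees hired in 1986: filter on a closed date range.
hired_1986 = conn.execute("""
    SELECT first_name, last_name, hire_date
    FROM employees
    WHERE hire_date BETWEEN '1986-01-01' AND '1986-12-31'
    ORDER BY emp_no
""").fetchall()

# First name "Hercules" and last name beginning with "B".
hercules_b = conn.execute("""
    SELECT first_name, last_name, sex
    FROM employees
    WHERE first_name = 'Hercules' AND last_name LIKE 'B%'
""").fetchall()

# Employees in the Sales and Development departments, joined through
# the employee-to-department link table.
sales_dev = conn.execute("""
    SELECT e.emp_no, e.last_name, e.first_name, d.dept_name
    FROM employees AS e
    JOIN dept_emp AS de ON de.emp_no = e.emp_no
    JOIN departments AS d ON d.dept_no = de.dept_no
    WHERE d.dept_name IN ('Sales', 'Development')
    ORDER BY e.emp_no
""").fetchall()

# How many employees share each last name, most common first.
name_counts = conn.execute("""
    SELECT last_name, COUNT(*) AS name_count
    FROM employees
    GROUP BY last_name
    ORDER BY name_count DESC, last_name
""").fetchall()

for label, rows in [
    ("Hired in 1986", hired_1986),
    ("Hercules B...", hercules_b),
    ("Sales and Development", sales_dev),
    ("Last name counts", name_counts),
]:
    print(label, rows)
```

One difference to keep in mind: `LIKE` is case-insensitive for ASCII characters in SQLite by default but case-sensitive in PostgreSQL, so the "B" pattern behaves more strictly in the real database.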
To generate a visualisation of the data, I followed the steps below:
- Imported the SQL database into Pandas.
- Created a histogram to visualise the most common salary ranges for employees.
- Created a bar chart of average salary by title.
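The three steps above can be sketched as follows. This is a minimal example under stated assumptions, not the notebook used for the project: the table and column names are hypothetical, the rows are invented, and an in-memory SQLite connection stands in for the PostgreSQL database (against PostgreSQL, a SQLAlchemy engine would typically be passed to `pd.read_sql_query` instead).

```python
import sqlite3

import matplotlib

matplotlib.use("Agg")  # render off-screen; drop this line in a notebook
import matplotlib.pyplot as plt
import pandas as pd

conn = sqlite3.connect(":memory:")

# Hypothetical tables with invented rows, purely for illustration.
conn.executescript("""
CREATE TABLE titles (title_id TEXT PRIMARY KEY, title TEXT);
CREATE TABLE employees (emp_no INTEGER PRIMARY KEY, emp_title_id TEXT);
CREATE TABLE salaries (emp_no INTEGER PRIMARY KEY, salary INTEGER);

INSERT INTO titles VALUES ('t1', 'Engineer'), ('t2', 'Manager');
INSERT INTO employees VALUES (1, 't1'), (2, 't1'), (3, 't2'), (4, 't2');
INSERT INTO salaries VALUES (1, 48000), (2, 52000), (3, 66000), (4, 74000);
""")

# Step 1: pull the joined salary and title data into a DataFrame.
query = """
    SELECT t.title, s.salary
    FROM employees AS e
    JOIN salaries AS s ON s.emp_no = e.emp_no
    JOIN titles AS t ON t.title_id = e.emp_title_id
"""
df = pd.read_sql_query(query, conn)

# Average salary per title, used for the bar chart.
avg_salary = df.groupby("title")["salary"].mean().sort_values()

fig, (ax_hist, ax_bar) = plt.subplots(1, 2, figsize=(10, 4))

# Step 2: histogram of salaries to show the most common salary ranges.
df["salary"].plot.hist(bins=5, ax=ax_hist, title="Salary distribution")
ax_hist.set_xlabel("Salary")

# Step 3: bar chart of average salary by title.
avg_salary.plot.bar(ax=ax_bar, title="Average salary by title")
ax_bar.set_ylabel("Average salary")

fig.tight_layout()
# Call plt.show() or fig.savefig("salaries.png") to view the charts.
```

Doing the join in SQL keeps the DataFrame down to the two columns the charts need; the same result could also be reached by loading each table separately and merging in Pandas.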
