College Data Cleaning

Introduction:

This was a personal project I completed on my own. I had been introduced to data cleaning and learned how important it is in the workplace, largely because of how time-consuming it can be, so I decided to practice the skill. I browsed the web for a really dirty dataset but couldn't find one that matched what I wanted, so I created my own in Excel. The workbook consisted of 2 worksheets: the first held student data, with 15 records and 13 columns, and the second held university data, with 14 records and 10 columns. The main goal of this project was to practice and demonstrate my data cleaning skills.

Description:

I started by generating the fake, dirty data, entering it manually. Since there were only a small number of records, this process was relatively easy. With the dirty data in place, the next step was cleaning it. I used a wide range of Excel functions and features, including: Remove Duplicates, data validation rules, TRIM, PROPER, Find and Replace, nested IF, changing data types, filling in missing values, CONCAT, TEXT, XLOOKUP, LEFT, and Text to Columns.
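
As a rough illustration only (the project itself was done entirely in Excel), a few of these steps could be sketched in Python as shown here. The sample records, column names, default GPA value, and the clean_students helper are all hypothetical and are not taken from the actual worksheets.

```python
# Hypothetical sample records; NOT the project's actual data.
students = [
    {"name": "  john SMITH ", "university_id": "U1", "gpa": "3.5"},
    {"name": "john smith", "university_id": "U1", "gpa": "3.5"},
    {"name": "MARIA  garcia", "university_id": "U2", "gpa": ""},
]
# Hypothetical lookup table standing in for the university worksheet.
universities = {"U1": "State University", "U2": "City College"}


def clean_students(rows, lookup, default_gpa=0.0):
    """Trim, proper-case, convert types, fill blanks, look up, de-duplicate."""
    cleaned, seen = [], set()
    for row in rows:
        # Like TRIM + PROPER: collapse extra whitespace, then title-case.
        name = " ".join(row["name"].split()).title()
        # Change the data type and fill a missing value with a default.
        gpa = float(row["gpa"]) if row["gpa"].strip() else default_gpa
        # Like XLOOKUP with an if-not-found fallback.
        university = lookup.get(row["university_id"], "Unknown")
        record = (name, university, gpa)
        # Like Remove Duplicates: keep only the first copy of each record.
        if record not in seen:
            seen.add(record)
            cleaned.append(
                {"name": name, "university": university, "gpa": gpa}
            )
    return cleaned


cleaned = clean_students(students, universities)
for row in cleaned:
    print(row)
```

The order matters in this sketch: names are normalized before duplicates are removed, so that two rows differing only in spacing or capitalization are treated as the same record.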

Results:

The end result was amazing. The 2 worksheets started out very dirty and unusable, and ended up as a clean dataset with usable data, ready for analysis. My aim had been to create a dirty dataset myself and apply all the functions and features I had learned from courses to clean it. After completing this project, I feel much more confident applying data cleaning techniques in Excel in a work environment.

© 2025 by Mohammad Shiha. Powered and secured by Wix.