MyDukaan Data Analysis
Introduction:
This was a case competition project for my college course: MIS 304--Using and Managing Information Systems. My team and I (5 people) were tasked with analyzing data for MyDukaan--an Amazon-like fictional ecommerce company based in India. Then we needed to present this data to 2 judges. I led the data analytics team, which was comprised of me and 2 other teammates. The other 2 teammates were in charge of the machine learning side. There were 8 tables: Customers, Orders, OrderDetails, OrderPayments, Products, Ratings, Sellers, and PostalCodes. The main point of this project was to develop 3-5 strategic questions that stakeholders would be interested in, and then using data analytics tools like Excel, SQL, and Tableau to explore and analyze the data and derive meaningful insights to present to the judges.
Description:
I first started by importing the 8 Excel workbooks into one workbook using Power Query in Excel. Then I transformed the data and did some data cleaning. Then I used Power Pivot to add the 8 tables to the data model and connect them to form the relationships. This was done so I could then use pivot tables across the 8 connected tables to explore the data to find any trends or patterns. After creating numerous pivot tables and successfully conducting exploratory data analysis, I moved to SQL. I used SQL to derive even more insights, because Excel is quite limited. I wrote complex queries using subqueries, complex joins, and CTE's. After writing a couple queries, I chose 3 of them to be the focus for my presentation. I exported the query results to Tableau. I used Tableau to visualize these query results, because a visual looks much nicer than a table output! I created bar graphs and scatter plots.
The 3 questions I answered using these tools were:
-
How many times did a seller wait 20 or more days to respond to 1-3 star customer review 3 or more times?
-
Which sellers had products that arrived after 30 days from when the customer ordered 15 or more times?
-
Does the length of the product name, length of the description, or number of images impact how well a product sells?
Results:
The end result was one of my greatest achievements. My team and I won this case competition, and there was only a handful of winners. Moreover, I learned so much because for the first time, I had worked with 8 related tables with complex relationships and hundreds of thousands of records. Joining these tables together, whether it was Excel, SQL, or Tableau, was not easy and I ran into some obstacles but learned so much along the way. Although this data was fictional, it was still quite representative of real data, due to its complexity and size. After completing this project and winning the case competition, I feel so much more confident in tackling real world data projects using Excel, SQL, and Tableau.