8 WEEKS SQL CHALLENGE - Case study 1

8 WEEKS SQL CHALLENGE - Case study 1

Danny's Diner

·

6 min read

I came across this challenge after learning SQL on Datacamp and figured it would help me practice and master the concepts. It would also look good in my portfolio, right? These challenges are the best for beginners because they equip you with a taste of what it's like to solve real-life problems using SQL.

Problem Statement
"Danny wants to use the data to answer a few simple questions about his customers, especially about their visiting patterns, how much money they’ve spent and also which menu items are their favourite. Having this deeper connection with his customers will help him deliver a better and more personalised experience for his loyal customers. He plans on using these insights to help him decide whether he should expand the existing customer loyalty program - additionally, he needs help to generate some basic datasets so his team can easily inspect the data without needing to use SQL. Danny has provided you with a sample of his overall customer data due to privacy issues - but he hopes that these examples are enough for you to write fully functioning SQL queries to help him answer his questions! Danny has shared with you 3 key datasets for this case study: sales, menu and members"

The full code for these queries can be found on my GitHub

After creating the database and tables, I had to answer the following questions.

CASE STUDY QUESTIONS

This is the fun part. I used Postgresql to write all my queries.

  1. What is the total amount each customer spent at the restaurant?

  2. How many days has each customer visited the restaurant?

  3. What was the first item from the menu purchased by each customer?

  4. What is the most purchased item on the menu and how many times was it purchased by all customers?

  5. Which item was the most popular for each customer?

  6. Which item was purchased first by the customer after they became a member?

  7. Which item was purchased just before the customer became a member?

  8. What are the total items and amount spent for each member before they became a member?

  9. If each $1 spent equates to 10 points and sushi has a 2x points multiplier - how many points would each customer have?

  10. In the first week after a customer joins the program (including their join date) they earn 2x points on all items, not just sushi - how many points do customers A and B have at the end of January?

What is the total amount each customer spent at the restaurant?

I joined the tables to get the price and customer id which were in different tables and used the SUM() and GROUP BY functions.

Conclusion: A has spent the most while C has spent the least amount

How many days has each customer visited the restaurant?

I used COUNT(DISTINCT) and GROUP BY to get the answer.

Conclusion: B is the most frequent customer while C is the least frequent customer

What was the first item from the menu purchased by each customer?

I SELECT DISTINCT from a subquery to get unique combinations of customer_id and first_buy in case someone bought the same item twice on the first day. I also use the subquery to get the minimum /earliest order date for each customer from the sales table. The GROUP BY clause groups the rows by customer_id and calculates the minimum order_date for each group.

I then joined the subquery and sales s2 and then joined the outcome to the menu(m) for the final result

Conclusion: Customer A's first items were curry and sushi, B bought curry and C bought ramen.

What is the most purchased item on the menu and how many times was it purchased by all customers?

First I wrote a subquery to get the most purchased item product_idthen used it in a WHERE clause during the joining of the sales and menu tables to get the product_name and product_count. I learnt that In SQL when you use the GROUP BY clause, you need to include all non-aggregated columns that are present in the SELECT clause otherwise, it will throw you an error.

Conclusion: Seems like ramen is the most popular product. Bought 3 times by A and C and 2 times by B.

Which item was the most popular for each customer?

I learnt about using RANK() OVER and PARTITION BY to rank the items for each customer based on the purchase count. The outer query selects the rows where the rank = 1, indicating the most popular item for each customer.

Conclusion: Ramen is A and C's favourite and B likes everything equally.

Which item was purchased first by the customer after they became a member?

I used a subquery that got the customer_id, product_id, order_date, and assigned a row number to each row partitioned by customer_id. The inner join condition ensured that only sales made after a customer became a member were included. I later used an outer query to join the subquery with the menu table (m) based on the matching product_id.

Conclusion: A's first purchase after becoming a member was curry and B's was sushi.

Which item was purchased just before the customer became a member?

This question only differs from the one above because we are looking for the item last purchased before the customer became a member. The logic is quite the same.

Conclusion: A purchased sushi and curry just before becoming a member while B purchased sushi.

What are the total items and amount spent for each member before they became a member?

The main query joins the sales table (s) with the menu table to get the price of each item, then later joined with the subquery to ensure the order date is before they became a member.

Conclusion: B bought 3 items worth 40 dollars before becoming a member whereas A bought 2 items worth 25 dollars.

If each $1 spent equates to 10 points and sushi has a 2x points multiplier - how many points would each customer have?

The subquery sets the points earned for each product and it's joined to the main query to get the total points for each customer by grouping the outcome by customer_id

Conclusion: B has the most points followed by A and C has the least points.

In the first week after a customer joins the program (including their join date) they earn 2x points on all items, not just sushi - how many points do customers A and B have at the end of January?

First I joined all the tables to get data for customers A and B. The CASE statement sets the points calculator to x20 when the order date is within 7 days after the join date, x20 when the product is sushi any day and x10 if none of the other conditions are met. The WHERE statement filters out sales that were not done in January. Aim was to get the total points at the end of January, including orders made before the customer became a member during that month.

Then did the aggregation function to get the total points. I also learnt about a function to extract the month from a date for filtering.

Conclusion: A had the most points that month followed by B.

I really had fun solving these problems
fun fact: I always almost tear up whenever i solve a problem, I get a kick from finding solutions.

If you have any suggestions on areas I could improve on, feel free to tell me.