Skip to content

Latest commit

 

History

History
444 lines (368 loc) · 28.5 KB

File metadata and controls

444 lines (368 loc) · 28.5 KB

Case Study #1 - Danny's Diner👨🏻‍🍳

Coding

Contents

In early 2021, Danny follows his passion for Japanese food and opens "Danny's Diner," a charming restaurant offering sushi, curry, and ramen. However, lacking data analysis expertise, the restaurant struggles to leverage the basic data collected during its initial months to make informed business decisions. Danny's Diner seeks assistance in using this data effectively to keep the restaurant thriving.

Danny aims to utilize customer data to gain valuable insights into their visiting patterns, spending habits, and favorite menu items. By establishing a deeper connection with his customers, he can provide a more personalized experience for his loyal patrons.

He plans to use these insights to make informed decisions about expanding the existing customer loyalty program. Additionally, Danny seeks assistance in generating basic datasets for his team to inspect the data conveniently, without requiring SQL expertise.

Due to privacy concerns, he has shared a sample of his overall customer data, hoping it will be sufficient for you to create fully functional SQL queries to address his questions.

The case study revolves around three key datasets:

  • Sales
  • Menu
  • Members

Coding
  1. What is the total amount each customer spent at the restaurant?
  2. SELECT S.customer_id, SUM(M.price) AS total_amnt
    FROM sales S
    JOIN menu M ON S.product_id = M.product_id
    GROUP BY S.customer_id
    ORDER BY customer_id
    Answer:
    Coding
    • The SQL query retrieves the customer_id and calculates the total amount spent (total_amnt) by each customer at the restaurant.
    • It combines data from the sales and menu tables based on matching product_id.
    • The results are grouped by customer_id.
    • The query then calculates the total sum of price for each group of sales records with the same customer_id.
    • Finally, the results are sorted in ascending order based on the customer_id.
  3. How many days has each customer visited the restaurant?
  4. SELECT customer_id, COUNT(DISTINCT order_date) AS No_Days
    FROM sales
    GROUP BY customer_id
    Answer:
    Coding
    • The SQL query selects the customer_id and counts the number of distinct order dates (No_Days) for each customer.
    • It retrieves data from the sales table.
    • The results are grouped by customer_id.
    • The COUNT(DISTINCT order_date) function calculates the number of unique order dates for each customer.
    • Finally, the query presents the total number of unique order dates as No_Days for each customer.
  5. What was the first item from the menu purchased by each customer?
  6. WITH CTE AS
    (SELECT S.customer_id,DENSE_RANK() OVER(PARTITION BY S.customer_id ORDER BY S.order_date)AS rn,M.product_name
    FROM sales S
    JOIN menu M ON S.product_id=M.product_id)
    
    SELECT customer_id,product_name
    FROM CTE
    WHERE rn=1
    Answer:
    Coding
    • The SQL query uses a Common Table Expression (CTE) named CTE to generate a temporary result set.
    • Within the CTE, it selects the customer_id, assigns a dense rank to each row based on the order date for each customer, and retrieves the corresponding product_name from the menu table.
    • The sales table is joined with the menu table on matching product_id.
    • The DENSE_RANK() function assigns a rank to each row within the partition of each customer_id based on the order_date in ascending order.
    • Each customer_id has its own partition and separate ranks based on the order dates of their purchases.
    • Next, the main query selects the customer_id and corresponding product_name from the CTE.
    • It filters the results and only includes rows where the rank rn is equal to 1, which means the earliest purchase for each customer_id.
    • As a result, the query returns the first purchased product for each customer.
  7. What is the most purchased item on the menu and how many times was it purchased by all customers?
  8. SELECT M.product_name,COUNT(S.product_id)AS most_ordered
    FROM Sales S
    JOIN menu M ON S.product_id=M.product_id
    GROUP BY M.product_name
    ORDER BY most_ordered DESC
    LIMIT 1
    Answer:
    Coding
    • The SQL query selects the product_name from the menu table and counts the number of times each product was ordered (most_ordered).
    • It retrieves data from the Sales table and joins it with the menu table based on matching product_id.
    • The results are grouped by product_name.
    • The COUNT(S.product_id) function calculates the number of occurrences of each product_id in the Sales table.
    • The query then presents the product_name and its corresponding count as most_ordered for each product.
    • Next, the results are sorted in descending order based on the most_ordered column, so the most ordered product appears first.
    • The LIMIT 1 clause is used to restrict the result to only one row, effectively showing the most ordered product.
  9. Which item was the most popular for each customer?
  10. WITH CTE AS
    (SELECT S.customer_id,M.product_name,COUNT(M.product_id)AS order_count,DENSE_RANK() OVER(PARTITION BY S.customer_id ORDER BY COUNT(S.product_id)DESC) AS rnk
    FROM sales S
    JOIN menu M ON S.product_id=M.product_id
    GROUP BY S.customer_id,M.product_name)
    
    SELECT customer_id,product_name,order_count
    FROM CTE
    WHERE rnk=1
    Answer:
    Coding
    • The SQL query uses a Common Table Expression (CTE) named CTE to generate a temporary result set.
    • Within the CTE, it selects the customer_id, product_name, and counts the number of times each product was ordered (order_count) for each customer.
    • It retrieves data from the sales table and joins it with the menu table based on matching product_id.
    • The results are grouped by customer_id and product_name to get the count of orders for each product of each customer.
    • The COUNT(M.product_id) function calculates the number of occurrences of each product_id in the menu table.
    • The DENSE_RANK() function assigns a rank to each row within the partition of each customer_id based on the order count of products in descending order.
    • Each customer_id has its own partition and separate ranks based on the number of product orders.
    • Next, the main query selects the customer_id, product_name, and order_count from the CTE.
    • It filters the results and only includes rows where the rank rnk is equal to 1, which means the most ordered product for each customer_id.
    • As a result, the query returns the customer's ID, the most ordered product, and the number of times it was ordered by that customer.
  11. Which item was purchased first by the customer after they became a member?
  12. SELECT DISTINCT ON (s.customer_id)
         s.customer_id,
         m.product_name
    FROM sales s
    JOIN members mbr ON s.customer_id = mbr.customer_id
    JOIN menu m ON s.product_id = m.product_id
    WHERE s.order_date > mbr.join_date
    ORDER BY s.customer_id;
    Answer:
    Coding
    • The SQL query retrieves distinct rows for each unique customer_id with their corresponding product_name from the sales and menu tables.
    • It filters the data based on the condition that the order_date in the sales table is greater than the join_date of the customer in the members table.
    • The sales table is aliased as s, the members table is aliased as mbr, and the menu table is aliased as m.
    • The query performs inner joins between sales and members tables on matching customer_id and between sales and menu tables on matching product_id.
    • Only rows that meet the join condition and the order_date > join_date condition are considered in the result set.
    • The query selects the customer_id and corresponding product_name for each customer who has placed an order after their join_date.
    • The results are sorted in ascending order based on the customer_id.
    • The DISTINCT ON (s.customer_id) clause ensures that only the first occurrence of each customer_id is included in the result set.
    • As a result, the query returns a unique list of customer_id along with the first product_name they ordered after joining as a member.
  13. Which item was purchased just before the customer became a member?
  14. SELECT DISTINCT ON (s.customer_id)
      	s.customer_id,
         m.product_name
    FROM sales s
    JOIN members mbr ON s.customer_id = mbr.customer_id
    JOIN menu m ON s.product_id = m.product_id
    WHERE s.order_date < mbr.join_date
    ORDER BY s.customer_id;
    Answer:
    Coding
    • The SQL query retrieves distinct rows for each unique customer_id with their corresponding product_name from the sales and menu tables.
    • It filters the data based on the condition that the order_date in the sales table is less than the join_date of the customer in the members table.
    • The sales table is aliased as s, the members table is aliased as mbr, and the menu table is aliased as m.
    • The query performs inner joins between sales and members tables on matching customer_id and between sales and menu tables on matching product_id.
    • Only rows that meet the join condition and the order_date < join_date condition are considered in the result set.
    • The query selects the customer_id and corresponding product_name for each customer who has placed an order before their join_date.
    • The results are sorted in ascending order based on the customer_id.
    • The DISTINCT ON (s.customer_id) clause ensures that only the first occurrence of each customer_id is included in the result set.
    • As a result, the query returns a unique list of customer_id along with the first product_name they ordered before joining as a member.
  15. What is the total items and amount spent for each member before they became a member?
  16. SELECT S.customer_id,
      COUNT(S.product_id) AS total_item,
      SUM(M.price) AS total_amont
    FROM sales S
    JOIN menu M ON S.product_id=M.product_id
    JOIN members ME ON S.customer_id=ME.customer_id
    WHERE S.order_date<ME.join_date
    GROUP BY S.customer_id
    ORDER BY S.customer_id
    Answer:
    Coding
    • The SQL query retrieves the customer_id along with the total count of items ordered (total_item) and the total amount spent (total_amont) by each customer.
    • It retrieves data from the sales table and joins it with the menu table based on matching product_id.
    • It also joins the sales table with the members table based on matching customer_id.
    • The results are filtered based on the condition that the order_date in the sales table is less than the join_date of the customer in the members table.
    • The COUNT(S.product_id) function calculates the number of occurrences of each product_id in the sales table, giving the total number of items ordered by each customer.
    • The SUM(M.price) function calculates the sum of the price from the menu table, providing the total amount spent by each customer.
    • Results are grouped by customer_id to get the totals for each customer.
    • The query then presents the customer_id, total_item, and total_amont for each customer who placed orders before joining as a member.
    • Finally, the results are sorted in ascending order based on the customer_id.
  17. If each $1 spent equates to 10 points and sushi has a 2x points multiplier - how many points would each customer have?
  18. SELECT s.customer_id,
         SUM(CASE
                 WHEN m.product_name = 'sushi' THEN price * 2
                 ELSE price
             END) * 10 AS total_points
    FROM sales s
    JOIN menu m ON s.product_id = m.product_id
    GROUP BY s.customer_id
    ORDER BY s.customer_id;
    Answer:
    Coding
    • The SQL query retrieves the customer_id and calculates the total points (total_points) earned by each customer based on their purchases from the sales and menu tables.
    • It retrieves data from the sales table and joins it with the menu table based on matching product_id.
    • The query uses a CASE statement to differentiate between 'sushi' and other products.
    • If the product name is 'sushi', the price is multiplied by 2 to give double points.
    • Otherwise, the regular price is considered.
    • The SUM function calculates the total points for each customer by adding up the points earned from their purchases.
    • The total points are then multiplied by 10 to give a scaled value.
    • Results are grouped by customer_id to get the total points for each customer.
    • The query then presents the customer_id and the scaled total_points for each customer based on their purchases.
    • Finally, the results are sorted in ascending order based on the customer_id.
  19. In the first week after a customer joins the program (including their join date) they earn 2x points on all items, not just sushi - how many points do customer A and B have at the end of January?
  20. WITH dates_cte AS (
    SELECT 
      customer_id, 
      join_date, 
      join_date + INTERVAL '6 days' AS valid_date, 
      DATE_TRUNC('month', '2021-01-31'::DATE) + INTERVAL '1 month' - INTERVAL '1 day' AS last_date
    FROM members
    )
    
    SELECT 
    s.customer_id, 
    SUM(CASE
      WHEN m.product_name = 'sushi' OR (s.order_date BETWEEN dates.join_date AND dates.valid_date) THEN 2 * 10 * m.price
      ELSE 10 * m.price END) AS points
    FROM sales s
    INNER JOIN dates_cte AS dates
    ON s.customer_id = dates.customer_id
    AND dates.join_date <= s.order_date
    AND s.order_date <= dates.last_date
    INNER JOIN menu m
    ON s.product_id = m.product_id
    GROUP BY s.customer_id
    ORDER BY s.customer_id;
    Answer:
    Coding
  • The SQL query starts by creating a Common Table Expression (CTE) named dates_cte.
  • Within the CTE, it selects customer_id, join_date, join_date + INTERVAL '6 days' as valid_date, and the last day of the month for the date '2021-01-31' as last_date.
  • The CTE is used to generate date ranges for each customer, from their join_date to 6 days later, and the last day of the month for January 2021.
  • Next, the main query selects the customer_id and calculates the total points (points) earned by each customer based on their purchases from the sales and menu tables.
  • It retrieves data from the sales table and joins it with the dates_cte CTE using a combination of JOIN and ON clauses.
  • The query uses a CASE statement to differentiate between 'sushi' purchases and other products.
  • If the product name is 'sushi' or the order date falls within the range of join_date to valid_date, the points are calculated as 2 times 10 times the price of the product.
  • Otherwise, for other products, the points are calculated as 10 times the price of the product.
  • The SUM function calculates the total points for each customer by adding up the points earned from their purchases.
  • Results are grouped by customer_id to get the total points for each customer.
  • The query then presents the customer_id and the calculated points for each customer based on their purchases.
  • Finally, the results are sorted in ascending order based on the customer_id.
Join All The Things
WITH customer_member_status AS (
  SELECT
    s.customer_id,
    s.order_date,
    m.product_name,
    m.price,
    CASE
      WHEN mbr.join_date <= s.order_date THEN 'Y'
      ELSE 'N'
    END AS member
  FROM sales s
  INNER JOIN menu m ON s.product_id = m.product_id
  LEFT JOIN members mbr ON s.customer_id = mbr.customer_id
)
SELECT
  customer_id,
  order_date,
  product_name,
  price,
  member
FROM customer_member_status
ORDER BY
  customer_id,
  member DESC,
  order_date;
Answer:
Coding
  • The SQL query starts by creating a Common Table Expression (CTE) named customer_member_status.
  • Within the CTE, it selects customer_id, order_date, product_name, price, and uses a CASE statement to determine whether the customer is a member ('Y') or not ('N') based on their join date in the members table.
  • The sales table is aliased as s, the menu table as m, and the members table as mbr.
  • The query performs inner joins between sales and menu tables on matching product_id and left join between sales and members tables on matching customer_id.
  • For each row, the CASE statement checks if the join_date from the members table is less than or equal to the order_date from the sales table.
  • If true, it assigns 'Y' (member) to the member column, otherwise 'N' (non-member).
  • Next, the main query selects the customer_id, order_date, product_name, price, and member from the customer_member_status CTE.
  • Results are ordered first by customer_id in ascending order, then by member in descending order (so members appear first), and finally by order_date in ascending order.
  • The query presents the final results with the selected columns for each customer, showing whether they are a member or not for each order and sorted accordingly.
Rank All The Things

Danny needs additional information about the ranking of customer products. However, he specifically requires null ranking values for non-member purchases, as he is not interested in ranking customers who are not yet part of the loyalty program.

WITH customers_data AS (
    SELECT 
        sales.customer_id, 
        sales.order_date,  
        menu.product_name, 
        menu.price,
        CASE
            WHEN members.join_date > sales.order_date THEN 'N'
            WHEN members.join_date <= sales.order_date THEN 'Y'
            ELSE 'N' 
        END AS member_status
    FROM sales
    LEFT JOIN members ON sales.customer_id = members.customer_id
    INNER JOIN menu ON sales.product_id = menu.product_id
)

SELECT 
    customer_id,
    order_date, 
    product_name, 
    price,
    member_status AS member,
    CASE
        WHEN member_status = 'N' THEN NULL
        ELSE RANK() OVER (PARTITION BY customer_id, member_status ORDER BY order_date)
    END AS ranking
FROM customers_data;
Answer:
Coding
  • The SQL query starts by creating a Common Table Expression (CTE) named customers_data.
  • Within the CTE, it selects customer_id, order_date, product_name, price, and uses a CASE statement to determine the member_status based on whether the join_date in the members table is greater, equal, or less than the order_date in the sales table.
  • The sales table is aliased as sales, the members table as members, and the menu table as menu.
  • The query performs left join between sales and members tables on matching customer_id and inner join between sales and menu tables on matching product_id.
  • For each row, the CASE statement checks the join_date from the members table and compares it to the order_date from the sales table, assigning 'Y' if the join_date is less than or equal to the order_date (customer is a member) and 'N' otherwise (non-member).
  • Next, the main query selects the customer_id, order_date, product_name, price, and member_status from the customers_data CTE.
  • It also uses a CASE statement to calculate the ranking for each customer's orders if they are a member ('Y'). The RANK() function is used with PARTITION BY to partition the ranking within each customer and their membership status and ordered by order_date.
  • If the customer is not a member ('N'), the ranking is set to NULL.
  • Results are presented with the selected columns and ranking for each customer's orders, showing whether they are a member or not and the order of their orders.
  • Customer Spending: The total amount spent by each customer at Danny's Diner varies widely. Some customers have spent significantly more than others, indicating potential high-value customers or loyal patrons.
  • Customer Visits: The number of days each customer visited the restaurant also varies, showing different patterns of customer engagement. Some customers visit frequently, while others may visit less often.
  • First Purchases: Understanding the first items purchased by each customer can help Danny identify popular entry items and potentially attract new customers.
  • Most Popular Item: The most purchased item on the menu is valuable information for Danny. He can use this insight to optimize inventory management and capitalize on the popularity of the item.
  • Personalized Recommendations: Knowing the most popular item for each customer allows Danny to make personalized menu suggestions, enhancing the dining experience for his customers.
  • Customer Loyalty: The data about purchases before and after joining the loyalty program helps Danny evaluate the effectiveness of the loyalty program and its impact on customer behavior.
  • Bonus Points for New Members: By offering 2x points to new members during their first week, Danny incentivizes more spending, encouraging members to engage more actively with the program.
  • Member Points: The points earned by each member can be used to assess their loyalty and potentially offer targeted rewards and promotions.
  • Data Visualization: Creating visualizations based on the data and insights can further aid Danny in understanding trends and making data-driven decisions.
  • Customer Segmentation: By analyzing customer spending habits, Danny can segment his customer base and tailor marketing strategies accordingly.
  • Expanding Membership: Danny can use the insights from this data to refine his loyalty program and attract new members, leveraging the success of the program.
  • Inventory Management: Knowing the most popular and least popular items can help Danny optimize his inventory, reduce wastage, and maximize profits.
  • Menu Optimization: Danny can use the data to evaluate the performance of different menu items and consider introducing new dishes based on customer preferences.
  • Customer Engagement: Analyzing customer behavior can help Danny understand what keeps customers coming back and help him create more engaging experiences.
  • Long-Term Growth: By leveraging data analysis, Danny can make informed decisions that contribute to the long-term growth and success of Danny's Diner.