Skip to content

The project involved designing a MySQL database to store and normalize transaction, customer, and product data. SQL queries were used to extract business insights like sales trends, product performance, and customer behaviors.

Notifications You must be signed in to change notification settings

Chiagoziemchidera/Retail-Store-Database-Design-and-Key-Business-Metrics-Analysis

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

9 Commits
 
 
 
 
 
 

Repository files navigation

Retail-Store-Database-Design-and-Key-Business-Metrics-Analysis


Project Overview

BrightMart is a fictitious retail chain with stores across multiple regions, offering a wide selection of products including electronics, fashion, groceries, and home essentials. Although BrightMart has built a strong customer base, it identified a significant opportunity to leverage its rich transaction and customer data for more strategic decision-making. To unlock this potential, BrightMart embarked on a data analysis project aimed at transforming its raw sales data into actionable insights.

Through this project, BrightMart organized its data into a structured relational database and analyzed revenue trends, product profitability, customer demographics, and store performance. By understanding its top-selling products, high-performing store locations, and customer purchasing patterns, BrightMart is now equipped to optimize inventory management, tailor marketing efforts, and improve the customer experience. This data-driven approach positions BrightMart to enhance operational efficiency, drive profitability, and foster deeper customer loyalty across its expanding network.

Project Workflow

  1. Database Creation: Created a MySQL database to store the data.
  2. Data Loading and Cleaning: Imported data from CSV, ensured date columns were formatted correctly, removed duplicates, and replaced empty cells with NULL.
  3. Data Normalization: Normalized the data into multiple tables to improve data organization and minimize redundancy.
  4. Data Model and ERD Diagram: Designed a relational model to structure data and visualize relationships between entities. Created an ERD diagram to outline the tables, keys, and relationships.
  5. Querying and Analysis: Used SQL queries to discover business insights.
  6. Data Visualization: Loaded data from the created MySQL database into Power BI to visualize metrics and insights.

Data Structure Overview

The database is designed to capture and manage key transactional data across orders, customers, stores, and products. The structure is normalized to ensure efficient data storage, reduce redundancy, and maintain data integrity.

  • Original Dataset An initial table, SalesData, was created with columns mirroring the structure of the original CSV file. TransactionID, OrderNumber, LineItem, OrderDate, DeliveryDate, Quantity, CustomerID, CustomerGender, CustomerName, CustomerCity, CustomerStateCode, CustomerState, CustomerZip, CustomerCountry, CustomerContinent, CustomerDOB, StoreID, StoreCountry, StoreState, StoreSqMeters, StoreOpenDate, ProductID, ProductName, ProductBrand, ProductColor, ProductCost, ProductPrice, ProductCategoryID, ProductCategory, ProductSubcategoryID, ProductSubcategory Note: Year 2021 transaction data consists of only January and February.

  • Normalized Structure (3NF)

ERD transaction

The Entity-Relationship Diagram (ERD) illustrates the tables, fields, primary keys, and relationships within the database. This visual guide provides a high-level overview of the entire data model.

All SQL codes used from creating the database to populating normalized tables can be viewed in Salessql.

Relationships

The database has several key relationships to support detailed analytics and ensure consistency:

  • Customers ↔ Orders: Each customer can have multiple orders, allowing for analysis of customer lifetime value, new customers and repeat purchases.
  • Stores ↔ Orders: Orders are linked to stores to analyze store-level performance by region.
  • Products ↔ Orders: Each order contains product details, which enables revenue analysis by product and category.
  • ProductCategories ↔ ProductSubcategories ↔ Products: The categories and subcategories help categorize products, supporting profit margin analysis by category.

Data Analysis and Querying

Various SQL queries were crafted to extract key metrics and insights from the database, enabling a deeper understanding of business performance. These queries helped calculate essential KPIs, uncover trends, and analyze patterns across multiple dimensions, including revenue, profit, customer data, and product performance.

  1. What is the total revenue generated by the business?
SELECT SUM(ProductPrice * Quantity) AS TotalRevenue
FROM orders
JOIN products ON orders.ProductID = products.ProductID;
  • Output
Total Revenue
$55,755,479.59
  1. How has the total revenue varied by year?
SELECT YEAR(OrderDate) AS Year, SUM(ProductPrice * Quantity) AS Revenue
FROM orders
JOIN products ON orders.ProductID = products.ProductID
GROUP BY YEAR(OrderDate)
ORDER BY Year;
  • Output
Year Total Revenue
2019 $18,264,382.48
2018 $12,788,960.66
2020 $9,294,632.14
2017 $7,421,422.27
2016 $6,946,793.56
2021 $1,039,288.48
  1. What was the revenue generated in the last six months?
SELECT MONTH(OrderDate) AS Month, YEAR(OrderDate) AS Year, SUM(Quantity * ProductPrice) AS RevenueLast6Months
FROM orders
JOIN products ON orders.ProductID = products.ProductID
WHERE OrderDate >= DATE_SUB((SELECT MAX(OrderDate) FROM orders), INTERVAL 6 MONTH)
GROUP BY Year, Month;
  • Output
Month Year Revenue (Last 6 Months)
8 2020 $145,869.05
9 2020 $380,431.52
10 2020 $245,647.59
11 2020 $256,701.02
12 2020 $651,526.44
1 2021 $513,021.58
2 2021 $526,266.90
  1. What is the total profit earned by the business?
SELECT SUM((ProductPrice - ProductCost) * Quantity) AS TotalProfit
FROM orders
JOIN products ON orders.ProductID = products.ProductID;
  • Output
Total Profit
$32,662,688.38
  1. What are the profit margins across different product subcategories?
SELECT ProductSubcategoryID, 
       SUM((ProductPrice - ProductCost) * Quantity) / SUM(ProductPrice * Quantity) * 100 AS ProfitMargin
FROM orders
JOIN products ON orders.ProductID = products.ProductID
GROUP BY ProductSubcategoryID;
  • Output
Product Subcategory Profit Margin (%)
Monitors 62.58
Projectors & Screens 62.57
Digital SLR Cameras 61.76
Televisions 61.08
Refrigerators 61.15
Movie DVD 60.98
Camcorders 60.10
Coffee Machines 59.31
Microwaves 59.10
Cell Phones Accessories 58.88
MP4 & MP3 58.46
Bluetooth Headphones 58.45
Printers, Scanners & Fax 58.42
Digital Cameras 57.65
Cameras & Camcorders Accessories 57.88
Lamps 57.51
Water Heaters 57.50
Smart Phones & PDAs 56.69
Laptops 56.68
Desktops 56.82
Car Video 56.43
Recording Pen 56.55
Home & Office Phones 56.50
Touch Screen Phones 56.38
Computers Accessories 56.18
Boxed Games 56.48
Washers & Dryers 56.49
Air Conditioners 56.49
VCD & DVD 55.73
Home Theater System 55.40
Download Games 54.47
Fans 53.82
  1. How much revenue is generated by each product category, and what is the quantity sold per category?
SELECT ProductCategoryID, 
       SUM(ProductPrice * Quantity) AS Revenue, 
       SUM(Quantity) AS QuantitySold
FROM orders
JOIN products ON orders.ProductID = products.ProductID
GROUP BY ProductCategoryID;
  • output
Product Category Revenue Quantity Sold
Computers $19,301,595.46 44,151
Home Appliances $10,795,478.59 18,401
Cameras and Camcorders $6,520,168.02 17,609
Cell Phones $6,183,791.22 31,477
TV and Video $5,928,982.69 11,236
Audio $3,169,627.74 23,490
Music Movies and Audio Books $3,131,006.44 28,802
Games and Toys $724,829.43 22,591
  1. Which product category generates the highest revenue each year, and what is the quantity sold for that category?
SELECT Year, ProductCategory, Revenue, QuantitySold
FROM (
    SELECT 
        YEAR(OrderDate) AS Year, 
        pc.ProductCategory,
        SUM(Quantity * ProductPrice) AS Revenue, 
        SUM(Quantity) AS QuantitySold,
        ROW_NUMBER() OVER (PARTITION BY YEAR(OrderDate) ORDER BY SUM(Quantity * ProductPrice) DESC) AS row_rank
    FROM orders
    JOIN products p ON orders.ProductID = p.ProductID
    JOIN productcategories pc ON p.ProductCategoryID = pc.ProductCategoryID
    GROUP BY YEAR(OrderDate), pc.ProductCategory
) AS ranked_categories
WHERE row_rank = 1;
  • output
Year Product Category Revenue Quantity Sold
2016 Home Appliances $2,010,476.12 3,363
2017 Computers $2,228,431.91 5,269
2018 Computers $4,507,043.96 10,321
2019 Computers $6,958,430.89 16,203
2020 Computers $3,665,122.72 7,941
2021 Computers $445,745.65 982
  1. What is the average order value for customer purchases?
SELECT AVG(OrderTotal) AS AverageOrderValue
FROM (
    SELECT SUM(quantity * productprice) AS OrderTotal
    FROM orders
    JOIN products ON orders.ProductID = products.ProductID
    GROUP BY transactionID
) AS OrderTotals;
  • output
Average Order Value
$886.64
  1. How does revenue distribution vary across different stores location?
SELECT StoreCountry AS StoresLocation, SUM(ProductPrice * Quantity) AS Revenue
FROM orders
JOIN products ON orders.ProductID = products.ProductID
JOIN stores ON orders.StoreID = stores.StoreID
GROUP BY StoresLocation;
  • output
Store Location Revenue
Online $11,404,324.63
Australia $2,099,141.07
Canada $3,611,561.79
France $1,229,545.95
Germany $4,246,279.22
Italy $2,059,086.81
Netherlands $1,591,344.48
United Kingdom $5,749,769.78
United States $23,764,425.86
  1. Which are the top 5 highest-performing stores, and where are they located (state and country)?
SELECT StoreID, StoreState, StoreCountry, SUM(ProductPrice * Quantity) AS Revenue
FROM orders
JOIN products ON orders.ProductID = products.ProductID
JOIN stores ON orders.StoreID = stores.StoreID
GROUP BY StoreID, StoreState, StoreCountry
ORDER BY Revenue DESC
LIMIT 5;
  • output
Store ID Store State Store Country Revenue
0 Online Online $11,404,324.63
55 Nevada United States $1,417,885.41
50 Kansas United States $1,394,738.06
54 Nebraska United States $1,384,396.24
9 Northwest Territories Canada $1,336,150.06
  1. Which stores perform the best each year, and where are they located (state and country)?
SELECT Year, StoreID, StoreState, StoreCountry, Revenue
FROM (
    SELECT 
        YEAR(OrderDate) AS Year, 
        s.StoreID, 
        s.StoreState, 
        s.StoreCountry, 
        SUM(Quantity * ProductPrice) AS Revenue,
        ROW_NUMBER() OVER (PARTITION BY YEAR(OrderDate) ORDER BY SUM(Quantity * ProductPrice) DESC) AS store_rank
    FROM orders
    JOIN stores s ON orders.StoreID = s.StoreID
    JOIN products ON orders.ProductID = products.ProductID
    GROUP BY YEAR(OrderDate), s.StoreID, s.StoreState, s.StoreCountry
) AS ranked_stores
WHERE store_rank = 1;
  • output
Year Store ID Store State Store Country Revenue
2016 0 Online Online $1,169,315.86
2017 0 Online Online $1,388,497.44
2018 0 Online Online $2,582,287.13
2019 0 Online Online $3,908,495.70
2020 0 Online Online $2,069,950.90
2021 0 Online Online $285,777.60
  1. How many customers place repeat orders each year?
SELECT 
    Year, 
    COUNT(DISTINCT CustomerID) AS RepeatCustomers
FROM (
    SELECT 
        CustomerID, 
        YEAR(OrderDate) AS Year, 
        COUNT(OrderNumber) AS OrderCount
    FROM orders
    GROUP BY CustomerID, YEAR(OrderDate)
    HAVING OrderCount > 1
) AS RepeatOrders
GROUP BY Year;
  • output
Year Repeat Customers
2016 1,776
2017 2,012
2018 3,447
2019 4,903
2020 2,718
2021 319

Insights and Findings

rev over time

1. Yearly Revenue Trends

  • Total Revenue: The business has generated a significant total revenue of $55,755,479.59, indicating a strong market presence and sustained demand across various product categories.
  • Revenue has shown growth year-over-year, with peaks in 2018 ($12.79M) and 2019 ($18.26M), indicating substantial expansion during these periods.
  • There are recurring spikes in revenue, particularly in the fourth quarter (Q4) each year. This likely corresponds to holiday season sales, where consumer spending tends to increase. Revenue usually declines in the first quarter (Q1), reflecting a post-holiday slowdown, a common pattern in retail and consumer-oriented industries.
  • Revenue drops sharply in early 2020, with revenue dropping to $9.29M, aligning with the onset of the COVID-19 pandemic. Lockdowns, supply chain disruptions, and decreased consumer spending likely contributed to this decline.

2. Revenue in the Last Six Months

  • Revenue from the final months of 2020 through early 2021 remained steady, totaling $2.72M, showing consistent customer engagement and resilience during the period. There is a slight increase in early 2021. This increase at the start of 2021 could indicate a positive trend that, if sustained, may lead to improved yearly performance.

3. Profit Analysis

  • Total Profit: The total profit stands at $32.66M, reflecting strong profit margins across various product offerings.
  • Top-Performing Subcategories by Profit Margin: Categories such as Monitors, Projectors & Screens, and Digital Cameras have consistently high profit margins, exceeding 60%, making them highly valuable product segments.
  • Poor-Performing Subcategories: Categories like "Fans" (53.82%) have lower margins, indicating they may require strategic review, such as better sourcing or adjusted pricing.

4. Revenue and Sales Volume by Product Category

  • Computers: The leading revenue generator with $19.30M and the highest sales volume with 44,151 units, indicating its central role in the product portfolio.
  • Other Key Categories: Home Appliances ($10.8M) and Cameras and Camcorders ($6.5M) also perform well, highlighting a wide product range that appeals to multiple market segments.

5. Top Product Category per Year

  • Each year, Computers or Home Appliances dominated revenue, with Computers taking the lead from 2017 onward, showing sustained demand for tech-related products and home essentials.

6. Average Order Value

  • Average Order Value: The average customer purchase is $886.64, giving insight into customer spending habits. This enables tailored promotions and marketing strategies around this spending range.

7. Revenue Distribution by Store Location

  • Online Sales: Represent the highest revenue source, totaling $11.40M, demonstrating the effectiveness of the online sales channel.
  • Top Physical Locations: The United States ($23.76M), followed by the UK ($5.75M) and Germany ($4.25M), underscore strong performance in key geographic regions.

8. Top Performing Stores

  • Online: This channel ranks as the highest-performing store across all years.
  • Physical Stores: Locations in the United States (Nevada and Kansas) and Canada (Northwest Territories) show notable performances, making them strategic assets for continued focus.

9. Repeat Customer Rate

  • Customer loyalty has been increasing over time, with repeat purchases growing from 1,776 in 2016 to 4,903 in 2019. However, a decrease in 2020 suggests an opportunity to enhance customer retention efforts, potentially through targeted campaigns or loyalty programs.
POWERBI DASHBOARD

These findings provide a comprehensive view of the business’s financial and operational performance, highlighting:

  • Key Product Categories: Technology and home products remain highly profitable and should be prioritized in marketing efforts.
  • Sales Channels: The success of online sales supports increased investment in digital channels.
  • Customer Engagement: Increasing customer retention rates through loyalty programs and targeted campaigns can drive further revenue growth.

Recommendations

1. Seasonal Marketing Boost

  • Capitalize on Q4 Revenue Spikes: The data indicates a consistent increase in revenue during Q4, likely due to holiday season demand. Launching targeted marketing campaigns and promotions during this period can help capture additional sales and maximize revenue.

2. Inventory and Resource Allocation

  • Align Inventory with Seasonal Demand: Leveraging insights from quarterly revenue trends allows for strategic inventory management. Adjusting stock levels, staffing, and promotional efforts to align with seasonal peaks and dips can enhance operational efficiency and reduce associated costs.

3. Enhance Marketing for Top Product Categories

  • Focus on High-Profit Subcategories: With categories like Monitors, Projectors & Screens, and Digital Cameras boasting profit margins over 60%, targeted marketing for these products can amplify profits.
  • Leverage Seasonal Promotions: Use seasonal campaigns to boost sales in high-demand periods, especially for Computers and Home Appliances, which are top revenue contributors.

4. Invest in Online Sales Expansion

  • Strengthen Online Presence: Given that Online Sales generated $11.4M, further investment in e-commerce platforms, social media ads, and SEO can continue driving online traffic and conversions.
  • Personalized Online Experience: Implement recommendations and personalization features on the website to increase Average Order Value ($886.64) by suggesting add-on products.

5. Optimize Sales in Key Regions

  • Expand U.S. and U.K. Marketing Efforts: The United States and United Kingdom are leading revenue sources, suggesting a strong brand presence and high customer base in these markets. Focused campaigns and new product launches in these regions could boost sales.
  • Explore Emerging Markets: Consider expanding physical and online offerings in high-potential markets like Germany and Canada, where revenues are consistently strong.

6. Increase Customer Retention

  • Introduce Loyalty Programs: A loyalty program can help retain repeat customers, which dropped in 2020. Rewarding repeat purchases with points, discounts, or exclusive offers will foster customer loyalty and improve long-term revenue.
  • Engage Through Targeted Campaigns: Use targeted email and social media campaigns to reach past customers with product updates and special offers, particularly for customers in regions with high repeat purchases (e.g., U.S.).

7. Maximize Revenue from High-Performing Stores

  • Online Store Optimization: Since the online store consistently ranks as the top performer, invest in UI/UX improvements, faster loading times, and checkout ease to maximize conversions.
  • Enhance Inventory and Promotions for Top Physical Stores: Stores in Nevada and Kansas (U.S.), and Northwest Territories (Canada) can benefit from enhanced stock levels, in-store promotions, and exclusive product releases to maintain high performance.

8. Expand Product Lines in High-Margin Categories

  • Introduce New Models in Top Categories: Adding new models or variants within high-profit categories (e.g., Digital Cameras, Televisions) can attract interest and further increase profitability.
  • Cross-Sell Accessories: Boost revenue by promoting related accessories for high-demand items like Computers and Cell Phones.

Appendix

power bi model

All DAX codes for Power BI dashboard design.

  • Date Dimension Table Creation
DateDim = 
VAR StartDate = DATE(2016, 1, 1)  -- Adjust Start Date
VAR EndDate = DATE(2021, 2, 28)   -- Adjust End Date
RETURN
ADDCOLUMNS(
    CALENDAR(StartDate, EndDate),
    "Year", YEAR([Date]),
    "Month Number", MONTH([Date]),
    "Month Short", FORMAT([Date], "MMM"),
    "Quarter Number", QUARTER([Date]),
    "Quarter Text", "Q" & QUARTER([Date]),
    "Day", DAY([Date]),
    "Week of Year", WEEKNUM([Date], 2)
)
  • Revenue
Revenue = 
CALCULATE(
    SUMX(
        'sales orders',
        'sales orders'[Quantity] * RELATED('sales products'[ProductPrice])
    )
)
  • Total Cost
Total Cost = 
CALCULATE(
    SUMX(
        'sales orders',
        'sales orders'[Quantity] * RELATED('sales products'[ProductCost])
    )
)
  • Total Profit
Total Profit = 
[Revenue] - [Total Cost]
  • Profit Margin Per subcategory
Total Profit Margin (%) = 
DIVIDE(
    [Total Profit], 
    [Revenue],
    0
) * 100
  • Average Order Value
AverageOrderValue = 
DIVIDE(
    SUMX('sales orders', 'sales orders'[Quantity] * RELATED('sales products'[ProductPrice])),
    DISTINCTCOUNT('sales orders'[TransactionID])
)

About

The project involved designing a MySQL database to store and normalize transaction, customer, and product data. SQL queries were used to extract business insights like sales trends, product performance, and customer behaviors.

Topics

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published