BrightMart is a fictitious retail chain with stores across multiple regions, offering a wide selection of products including electronics, fashion, groceries, and home essentials. Although BrightMart has built a strong customer base, it identified a significant opportunity to leverage its rich transaction and customer data for more strategic decision-making. To unlock this potential, BrightMart embarked on a data analysis project aimed at transforming its raw sales data into actionable insights.
Through this project, BrightMart organized its data into a structured relational database and analyzed revenue trends, product profitability, customer demographics, and store performance. By understanding its top-selling products, high-performing store locations, and customer purchasing patterns, BrightMart is now equipped to optimize inventory management, tailor marketing efforts, and improve the customer experience. This data-driven approach positions BrightMart to enhance operational efficiency, drive profitability, and foster deeper customer loyalty across its expanding network.
- Database Creation: Created a MySQL database to store the data.
- Data Loading and Cleaning: Imported data from CSV, ensured date columns were formatted correctly, removed duplicates, and replaced empty cells with
NULL. - Data Normalization: Normalized the data into multiple tables to improve data organization and minimize redundancy.
- Data Model and ERD Diagram: Designed a relational model to structure data and visualize relationships between entities. Created an ERD diagram to outline the tables, keys, and relationships.
- Querying and Analysis: Used SQL queries to discover business insights.
- Data Visualization: Loaded data from the created MySQL database into Power BI to visualize metrics and insights.
The database is designed to capture and manage key transactional data across orders, customers, stores, and products. The structure is normalized to ensure efficient data storage, reduce redundancy, and maintain data integrity.
-
Original Dataset An initial table,
SalesData, was created with columns mirroring the structure of the original CSV file.TransactionID,OrderNumber,LineItem,OrderDate,DeliveryDate,Quantity,CustomerID,CustomerGender,CustomerName,CustomerCity,CustomerStateCode,CustomerState,CustomerZip,CustomerCountry,CustomerContinent,CustomerDOB,StoreID,StoreCountry,StoreState,StoreSqMeters,StoreOpenDate,ProductID,ProductName,ProductBrand,ProductColor,ProductCost,ProductPrice,ProductCategoryID,ProductCategory,ProductSubcategoryID,ProductSubcategoryNote: Year 2021 transaction data consists of only January and February. -
Normalized Structure (3NF)
The Entity-Relationship Diagram (ERD) illustrates the tables, fields, primary keys, and relationships within the database. This visual guide provides a high-level overview of the entire data model.
All SQL codes used from creating the database to populating normalized tables can be viewed in Salessql.
The database has several key relationships to support detailed analytics and ensure consistency:
- Customers ↔ Orders: Each customer can have multiple orders, allowing for analysis of customer lifetime value, new customers and repeat purchases.
- Stores ↔ Orders: Orders are linked to stores to analyze store-level performance by region.
- Products ↔ Orders: Each order contains product details, which enables revenue analysis by product and category.
- ProductCategories ↔ ProductSubcategories ↔ Products: The categories and subcategories help categorize products, supporting profit margin analysis by category.
Various SQL queries were crafted to extract key metrics and insights from the database, enabling a deeper understanding of business performance. These queries helped calculate essential KPIs, uncover trends, and analyze patterns across multiple dimensions, including revenue, profit, customer data, and product performance.
- What is the total revenue generated by the business?
SELECT SUM(ProductPrice * Quantity) AS TotalRevenue
FROM orders
JOIN products ON orders.ProductID = products.ProductID;- Output
| Total Revenue |
|---|
| $55,755,479.59 |
- How has the total revenue varied by year?
SELECT YEAR(OrderDate) AS Year, SUM(ProductPrice * Quantity) AS Revenue
FROM orders
JOIN products ON orders.ProductID = products.ProductID
GROUP BY YEAR(OrderDate)
ORDER BY Year;- Output
| Year | Total Revenue |
|---|---|
| 2019 | $18,264,382.48 |
| 2018 | $12,788,960.66 |
| 2020 | $9,294,632.14 |
| 2017 | $7,421,422.27 |
| 2016 | $6,946,793.56 |
| 2021 | $1,039,288.48 |
- What was the revenue generated in the last six months?
SELECT MONTH(OrderDate) AS Month, YEAR(OrderDate) AS Year, SUM(Quantity * ProductPrice) AS RevenueLast6Months
FROM orders
JOIN products ON orders.ProductID = products.ProductID
WHERE OrderDate >= DATE_SUB((SELECT MAX(OrderDate) FROM orders), INTERVAL 6 MONTH)
GROUP BY Year, Month;- Output
| Month | Year | Revenue (Last 6 Months) |
|---|---|---|
| 8 | 2020 | $145,869.05 |
| 9 | 2020 | $380,431.52 |
| 10 | 2020 | $245,647.59 |
| 11 | 2020 | $256,701.02 |
| 12 | 2020 | $651,526.44 |
| 1 | 2021 | $513,021.58 |
| 2 | 2021 | $526,266.90 |
- What is the total profit earned by the business?
SELECT SUM((ProductPrice - ProductCost) * Quantity) AS TotalProfit
FROM orders
JOIN products ON orders.ProductID = products.ProductID;- Output
| Total Profit |
|---|
| $32,662,688.38 |
- What are the profit margins across different product subcategories?
SELECT ProductSubcategoryID,
SUM((ProductPrice - ProductCost) * Quantity) / SUM(ProductPrice * Quantity) * 100 AS ProfitMargin
FROM orders
JOIN products ON orders.ProductID = products.ProductID
GROUP BY ProductSubcategoryID;- Output
| Product Subcategory | Profit Margin (%) |
|---|---|
| Monitors | 62.58 |
| Projectors & Screens | 62.57 |
| Digital SLR Cameras | 61.76 |
| Televisions | 61.08 |
| Refrigerators | 61.15 |
| Movie DVD | 60.98 |
| Camcorders | 60.10 |
| Coffee Machines | 59.31 |
| Microwaves | 59.10 |
| Cell Phones Accessories | 58.88 |
| MP4 & MP3 | 58.46 |
| Bluetooth Headphones | 58.45 |
| Printers, Scanners & Fax | 58.42 |
| Digital Cameras | 57.65 |
| Cameras & Camcorders Accessories | 57.88 |
| Lamps | 57.51 |
| Water Heaters | 57.50 |
| Smart Phones & PDAs | 56.69 |
| Laptops | 56.68 |
| Desktops | 56.82 |
| Car Video | 56.43 |
| Recording Pen | 56.55 |
| Home & Office Phones | 56.50 |
| Touch Screen Phones | 56.38 |
| Computers Accessories | 56.18 |
| Boxed Games | 56.48 |
| Washers & Dryers | 56.49 |
| Air Conditioners | 56.49 |
| VCD & DVD | 55.73 |
| Home Theater System | 55.40 |
| Download Games | 54.47 |
| Fans | 53.82 |
- How much revenue is generated by each product category, and what is the quantity sold per category?
SELECT ProductCategoryID,
SUM(ProductPrice * Quantity) AS Revenue,
SUM(Quantity) AS QuantitySold
FROM orders
JOIN products ON orders.ProductID = products.ProductID
GROUP BY ProductCategoryID;- output
| Product Category | Revenue | Quantity Sold |
|---|---|---|
| Computers | $19,301,595.46 | 44,151 |
| Home Appliances | $10,795,478.59 | 18,401 |
| Cameras and Camcorders | $6,520,168.02 | 17,609 |
| Cell Phones | $6,183,791.22 | 31,477 |
| TV and Video | $5,928,982.69 | 11,236 |
| Audio | $3,169,627.74 | 23,490 |
| Music Movies and Audio Books | $3,131,006.44 | 28,802 |
| Games and Toys | $724,829.43 | 22,591 |
- Which product category generates the highest revenue each year, and what is the quantity sold for that category?
SELECT Year, ProductCategory, Revenue, QuantitySold
FROM (
SELECT
YEAR(OrderDate) AS Year,
pc.ProductCategory,
SUM(Quantity * ProductPrice) AS Revenue,
SUM(Quantity) AS QuantitySold,
ROW_NUMBER() OVER (PARTITION BY YEAR(OrderDate) ORDER BY SUM(Quantity * ProductPrice) DESC) AS row_rank
FROM orders
JOIN products p ON orders.ProductID = p.ProductID
JOIN productcategories pc ON p.ProductCategoryID = pc.ProductCategoryID
GROUP BY YEAR(OrderDate), pc.ProductCategory
) AS ranked_categories
WHERE row_rank = 1;- output
| Year | Product Category | Revenue | Quantity Sold |
|---|---|---|---|
| 2016 | Home Appliances | $2,010,476.12 | 3,363 |
| 2017 | Computers | $2,228,431.91 | 5,269 |
| 2018 | Computers | $4,507,043.96 | 10,321 |
| 2019 | Computers | $6,958,430.89 | 16,203 |
| 2020 | Computers | $3,665,122.72 | 7,941 |
| 2021 | Computers | $445,745.65 | 982 |
- What is the average order value for customer purchases?
SELECT AVG(OrderTotal) AS AverageOrderValue
FROM (
SELECT SUM(quantity * productprice) AS OrderTotal
FROM orders
JOIN products ON orders.ProductID = products.ProductID
GROUP BY transactionID
) AS OrderTotals;- output
| Average Order Value |
|---|
| $886.64 |
- How does revenue distribution vary across different stores location?
SELECT StoreCountry AS StoresLocation, SUM(ProductPrice * Quantity) AS Revenue
FROM orders
JOIN products ON orders.ProductID = products.ProductID
JOIN stores ON orders.StoreID = stores.StoreID
GROUP BY StoresLocation;- output
| Store Location | Revenue |
|---|---|
| Online | $11,404,324.63 |
| Australia | $2,099,141.07 |
| Canada | $3,611,561.79 |
| France | $1,229,545.95 |
| Germany | $4,246,279.22 |
| Italy | $2,059,086.81 |
| Netherlands | $1,591,344.48 |
| United Kingdom | $5,749,769.78 |
| United States | $23,764,425.86 |
- Which are the top 5 highest-performing stores, and where are they located (state and country)?
SELECT StoreID, StoreState, StoreCountry, SUM(ProductPrice * Quantity) AS Revenue
FROM orders
JOIN products ON orders.ProductID = products.ProductID
JOIN stores ON orders.StoreID = stores.StoreID
GROUP BY StoreID, StoreState, StoreCountry
ORDER BY Revenue DESC
LIMIT 5;- output
| Store ID | Store State | Store Country | Revenue |
|---|---|---|---|
| 0 | Online | Online | $11,404,324.63 |
| 55 | Nevada | United States | $1,417,885.41 |
| 50 | Kansas | United States | $1,394,738.06 |
| 54 | Nebraska | United States | $1,384,396.24 |
| 9 | Northwest Territories | Canada | $1,336,150.06 |
- Which stores perform the best each year, and where are they located (state and country)?
SELECT Year, StoreID, StoreState, StoreCountry, Revenue
FROM (
SELECT
YEAR(OrderDate) AS Year,
s.StoreID,
s.StoreState,
s.StoreCountry,
SUM(Quantity * ProductPrice) AS Revenue,
ROW_NUMBER() OVER (PARTITION BY YEAR(OrderDate) ORDER BY SUM(Quantity * ProductPrice) DESC) AS store_rank
FROM orders
JOIN stores s ON orders.StoreID = s.StoreID
JOIN products ON orders.ProductID = products.ProductID
GROUP BY YEAR(OrderDate), s.StoreID, s.StoreState, s.StoreCountry
) AS ranked_stores
WHERE store_rank = 1;
- output
| Year | Store ID | Store State | Store Country | Revenue |
|---|---|---|---|---|
| 2016 | 0 | Online | Online | $1,169,315.86 |
| 2017 | 0 | Online | Online | $1,388,497.44 |
| 2018 | 0 | Online | Online | $2,582,287.13 |
| 2019 | 0 | Online | Online | $3,908,495.70 |
| 2020 | 0 | Online | Online | $2,069,950.90 |
| 2021 | 0 | Online | Online | $285,777.60 |
- How many customers place repeat orders each year?
SELECT
Year,
COUNT(DISTINCT CustomerID) AS RepeatCustomers
FROM (
SELECT
CustomerID,
YEAR(OrderDate) AS Year,
COUNT(OrderNumber) AS OrderCount
FROM orders
GROUP BY CustomerID, YEAR(OrderDate)
HAVING OrderCount > 1
) AS RepeatOrders
GROUP BY Year;- output
| Year | Repeat Customers |
|---|---|
| 2016 | 1,776 |
| 2017 | 2,012 |
| 2018 | 3,447 |
| 2019 | 4,903 |
| 2020 | 2,718 |
| 2021 | 319 |
- Total Revenue: The business has generated a significant total revenue of $55,755,479.59, indicating a strong market presence and sustained demand across various product categories.
- Revenue has shown growth year-over-year, with peaks in 2018 ($12.79M) and 2019 ($18.26M), indicating substantial expansion during these periods.
- There are recurring spikes in revenue, particularly in the fourth quarter (Q4) each year. This likely corresponds to holiday season sales, where consumer spending tends to increase. Revenue usually declines in the first quarter (Q1), reflecting a post-holiday slowdown, a common pattern in retail and consumer-oriented industries.
- Revenue drops sharply in early 2020, with revenue dropping to $9.29M, aligning with the onset of the COVID-19 pandemic. Lockdowns, supply chain disruptions, and decreased consumer spending likely contributed to this decline.
- Revenue from the final months of 2020 through early 2021 remained steady, totaling $2.72M, showing consistent customer engagement and resilience during the period. There is a slight increase in early 2021. This increase at the start of 2021 could indicate a positive trend that, if sustained, may lead to improved yearly performance.
- Total Profit: The total profit stands at $32.66M, reflecting strong profit margins across various product offerings.
- Top-Performing Subcategories by Profit Margin: Categories such as Monitors, Projectors & Screens, and Digital Cameras have consistently high profit margins, exceeding 60%, making them highly valuable product segments.
- Poor-Performing Subcategories: Categories like "Fans" (53.82%) have lower margins, indicating they may require strategic review, such as better sourcing or adjusted pricing.
- Computers: The leading revenue generator with $19.30M and the highest sales volume with 44,151 units, indicating its central role in the product portfolio.
- Other Key Categories: Home Appliances ($10.8M) and Cameras and Camcorders ($6.5M) also perform well, highlighting a wide product range that appeals to multiple market segments.
- Each year, Computers or Home Appliances dominated revenue, with Computers taking the lead from 2017 onward, showing sustained demand for tech-related products and home essentials.
- Average Order Value: The average customer purchase is $886.64, giving insight into customer spending habits. This enables tailored promotions and marketing strategies around this spending range.
- Online Sales: Represent the highest revenue source, totaling $11.40M, demonstrating the effectiveness of the online sales channel.
- Top Physical Locations: The United States ($23.76M), followed by the UK ($5.75M) and Germany ($4.25M), underscore strong performance in key geographic regions.
- Online: This channel ranks as the highest-performing store across all years.
- Physical Stores: Locations in the United States (Nevada and Kansas) and Canada (Northwest Territories) show notable performances, making them strategic assets for continued focus.
- Customer loyalty has been increasing over time, with repeat purchases growing from 1,776 in 2016 to 4,903 in 2019. However, a decrease in 2020 suggests an opportunity to enhance customer retention efforts, potentially through targeted campaigns or loyalty programs.
These findings provide a comprehensive view of the business’s financial and operational performance, highlighting:
- Key Product Categories: Technology and home products remain highly profitable and should be prioritized in marketing efforts.
- Sales Channels: The success of online sales supports increased investment in digital channels.
- Customer Engagement: Increasing customer retention rates through loyalty programs and targeted campaigns can drive further revenue growth.
- Capitalize on Q4 Revenue Spikes: The data indicates a consistent increase in revenue during Q4, likely due to holiday season demand. Launching targeted marketing campaigns and promotions during this period can help capture additional sales and maximize revenue.
- Align Inventory with Seasonal Demand: Leveraging insights from quarterly revenue trends allows for strategic inventory management. Adjusting stock levels, staffing, and promotional efforts to align with seasonal peaks and dips can enhance operational efficiency and reduce associated costs.
- Focus on High-Profit Subcategories: With categories like Monitors, Projectors & Screens, and Digital Cameras boasting profit margins over 60%, targeted marketing for these products can amplify profits.
- Leverage Seasonal Promotions: Use seasonal campaigns to boost sales in high-demand periods, especially for Computers and Home Appliances, which are top revenue contributors.
- Strengthen Online Presence: Given that Online Sales generated $11.4M, further investment in e-commerce platforms, social media ads, and SEO can continue driving online traffic and conversions.
- Personalized Online Experience: Implement recommendations and personalization features on the website to increase Average Order Value ($886.64) by suggesting add-on products.
- Expand U.S. and U.K. Marketing Efforts: The United States and United Kingdom are leading revenue sources, suggesting a strong brand presence and high customer base in these markets. Focused campaigns and new product launches in these regions could boost sales.
- Explore Emerging Markets: Consider expanding physical and online offerings in high-potential markets like Germany and Canada, where revenues are consistently strong.
- Introduce Loyalty Programs: A loyalty program can help retain repeat customers, which dropped in 2020. Rewarding repeat purchases with points, discounts, or exclusive offers will foster customer loyalty and improve long-term revenue.
- Engage Through Targeted Campaigns: Use targeted email and social media campaigns to reach past customers with product updates and special offers, particularly for customers in regions with high repeat purchases (e.g., U.S.).
- Online Store Optimization: Since the online store consistently ranks as the top performer, invest in UI/UX improvements, faster loading times, and checkout ease to maximize conversions.
- Enhance Inventory and Promotions for Top Physical Stores: Stores in Nevada and Kansas (U.S.), and Northwest Territories (Canada) can benefit from enhanced stock levels, in-store promotions, and exclusive product releases to maintain high performance.
- Introduce New Models in Top Categories: Adding new models or variants within high-profit categories (e.g., Digital Cameras, Televisions) can attract interest and further increase profitability.
- Cross-Sell Accessories: Boost revenue by promoting related accessories for high-demand items like Computers and Cell Phones.
All DAX codes for Power BI dashboard design.
- Date Dimension Table Creation
DateDim =
VAR StartDate = DATE(2016, 1, 1) -- Adjust Start Date
VAR EndDate = DATE(2021, 2, 28) -- Adjust End Date
RETURN
ADDCOLUMNS(
CALENDAR(StartDate, EndDate),
"Year", YEAR([Date]),
"Month Number", MONTH([Date]),
"Month Short", FORMAT([Date], "MMM"),
"Quarter Number", QUARTER([Date]),
"Quarter Text", "Q" & QUARTER([Date]),
"Day", DAY([Date]),
"Week of Year", WEEKNUM([Date], 2)
)
- Revenue
Revenue =
CALCULATE(
SUMX(
'sales orders',
'sales orders'[Quantity] * RELATED('sales products'[ProductPrice])
)
)
- Total Cost
Total Cost =
CALCULATE(
SUMX(
'sales orders',
'sales orders'[Quantity] * RELATED('sales products'[ProductCost])
)
)
- Total Profit
Total Profit =
[Revenue] - [Total Cost]
- Profit Margin Per subcategory
Total Profit Margin (%) =
DIVIDE(
[Total Profit],
[Revenue],
0
) * 100
- Average Order Value
AverageOrderValue =
DIVIDE(
SUMX('sales orders', 'sales orders'[Quantity] * RELATED('sales products'[ProductPrice])),
DISTINCTCOUNT('sales orders'[TransactionID])
)
