This project demonstrates a complete, reproducible pipeline for generating synthetic e-commerce datasets, ingesting them into a SQLite database, and querying them using multi-table joins. The entire solution — code, structure, and SQL — was built exclusively through agentic prompt engineering inside Cursor Agentic IDE. This README documents the project, the workflow, and (most importantly) the precise prompts used to guide Cursor’s agent to build the system end-to-end.
This project simulates a small e-commerce ecosystem by generating realistic-looking customers, products, orders, order_items, and payments. Each run produces fresh CSV files, which are then loaded into a clean SQLite database. A final SQL query demonstrates multi-table joins and aggregation (e.g., revenue per customer). The goal is to showcase:
- Efficient prompt engineering
- Cursor’s agentic development workflow
- Clean, reproducible data + DB pipelines
- SQL join and reporting logic
- Python 3 (standard library only)
- SQLite for storage
- CSV files for intermediate data
- Cursor IDE for all agentic development
    data/                  # Auto-generated datasets
        customers.csv
        products.csv
        orders.csv
        order_items.csv
        payments.csv
    scripts/
        generate_data.py   # Create synthetic CSV datasets
        ingest_to_db.py    # Build/refresh SQLite DB using the CSVs
        run_queries.py     # Execute SQL join query and print results
    sql/
        queries.sql        # SQL join + aggregation logic
    README.md
    .gitignore
    ecom.db                # Auto-generated SQLite database
1. Generate fresh CSVs:
   `python scripts/generate_data.py`
2. Rebuild the SQLite database from the CSVs:
   `python scripts/ingest_to_db.py`
3. Execute the sample join query and view the results:
   `python scripts/run_queries.py`
- Hey Cursor — You are an expert backend engineer and Python / SQLite developer. I’m doing a timed assignment, so I want to keep things clean and simple. I need a small Python project that generates synthetic e-commerce data, loads it into SQLite, and runs a SQL query that joins multiple tables. Please set up the initial project structure with:
- data/
- scripts/ (generate_data.py, ingest_to_db.py, run_queries.py)
- sql/queries.sql
- README.md (with Overview, How to Run, Prompts Used)
- .gitignore
Just create the skeleton and short comments in each file. No logic yet.
- Cursor created the full folder structure, added placeholder Python files with comments, generated README sections, and initialized .gitignore. Files created: data/, scripts/, sql/, README.md, .gitignore.
- Now implement generate_data.py. I need 5 CSVs: customers.csv, products.csv, orders.csv, order_items.csv, payments.csv.
Use Python standard libraries. Generate around:
- 50 customers
- 20 products
- 150 orders
- 300–400 order_items
- payments for non-cancelled orders
Respect foreign keys, overwrite files each run, and print a summary after writing each CSV.
- Cursor created a complete script with helper functions, random data generation, referential integrity enforcement, CSV writing utilities, and printed row summaries. Running the script generated all 5 CSV files successfully.
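The generated script isn't reproduced in full here, but a minimal sketch of the approach (just two of the five tables, with hypothetical column names) might look like:

```python
import csv
import random
from pathlib import Path

random.seed(42)  # deterministic output for reproducible runs
DATA_DIR = Path("data")
DATA_DIR.mkdir(exist_ok=True)

# Parent table: every order below will reference one of these IDs.
customers = [
    {"customer_id": i, "name": f"Customer {i}", "email": f"user{i}@example.com"}
    for i in range(1, 51)
]

# Child table: customer_id is always drawn from the customers list,
# which is how the script keeps foreign keys valid.
statuses = ["placed", "shipped", "delivered", "cancelled"]
orders = [
    {"order_id": i,
     "customer_id": random.choice(customers)["customer_id"],
     "status": random.choice(statuses)}
    for i in range(1, 151)
]

def write_csv(path, rows):
    """Overwrite `path` with `rows` and print a one-line summary."""
    with open(path, "w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=rows[0].keys())
        writer.writeheader()
        writer.writerows(rows)
    print(f"Wrote {len(rows):>3} rows -> {path}")

write_csv(DATA_DIR / "customers.csv", customers)
write_csv(DATA_DIR / "orders.csv", orders)
```

The real script extends the same pattern to products, order_items, and payments (payments generated only for non-cancelled orders).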
- Implement ingest_to_db.py so it:
  - Creates ecom.db
  - Drops & recreates tables
  - Loads each CSV with the sqlite3 + csv modules
  - Creates indexes on FK columns
  - Prints the row counts inserted

  The script must be rerunnable cleanly.
- Cursor generated the full table DDL, drop/create logic, CSV parsing, bulk inserts, and index creation. Running the script produced a clean SQLite DB with the correct row counts.
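A standalone sketch of the ingestion pattern (one table only, with an inline CSV fixture standing in for the real generated files, and hypothetical column names) could look like:

```python
import csv
import sqlite3
from pathlib import Path

# Inline fixture so this sketch runs standalone; the real script reads
# the five CSVs produced by generate_data.py.
Path("data").mkdir(exist_ok=True)
with open("data/customers.csv", "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(["customer_id", "name", "email"])
    writer.writerows([[1, "Ada", "ada@example.com"],
                      [2, "Grace", "grace@example.com"]])

db = sqlite3.connect("ecom.db")

# Rerunnable: drop and recreate so every run starts from a clean slate.
db.executescript("""
DROP TABLE IF EXISTS customers;
CREATE TABLE customers (
    customer_id INTEGER PRIMARY KEY,
    name        TEXT NOT NULL,
    email       TEXT NOT NULL
);
""")

# Bulk-insert the CSV rows, then report the count. The real script repeats
# this per table and adds CREATE INDEX statements on the FK columns.
with open("data/customers.csv", newline="") as f:
    rows = [(r["customer_id"], r["name"], r["email"]) for r in csv.DictReader(f)]
db.executemany("INSERT INTO customers VALUES (?, ?, ?)", rows)
db.commit()

count = db.execute("SELECT COUNT(*) FROM customers").fetchone()[0]
print(f"customers: {count} rows inserted")
```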
- Write a multi-table join query in sql/queries.sql that computes revenue per customer. Then implement run_queries.py to:
  - read the SQL file
  - execute it
  - print formatted table output
  - warn the user if the DB doesn't exist
- Cursor wrote a LEFT JOIN–based revenue aggregation query and a Python script to read and run the SQL. Running the script displayed a sorted table of customers and their total revenue.
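The exact SQL Cursor produced isn't shown here; a hypothetical query in the same spirit (LEFT JOIN so customers with no orders still appear, with assumed table and column names), demonstrated against a small in-memory fixture, could be:

```python
import sqlite3

# Hypothetical stand-in for sql/queries.sql: revenue per customer.
# COALESCE turns the missing revenue of order-less customers into 0.
REVENUE_SQL = """
SELECT c.name,
       COALESCE(SUM(oi.quantity * oi.unit_price), 0) AS revenue
FROM customers c
LEFT JOIN orders o       ON o.customer_id = c.customer_id
LEFT JOIN order_items oi ON oi.order_id = o.order_id
GROUP BY c.customer_id
ORDER BY revenue DESC;
"""

# In-memory fixture so the sketch runs standalone.
db = sqlite3.connect(":memory:")
db.executescript("""
CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE orders (order_id INTEGER PRIMARY KEY, customer_id INTEGER);
CREATE TABLE order_items (order_id INTEGER, quantity INTEGER, unit_price REAL);
INSERT INTO customers VALUES (1, 'Ada'), (2, 'Grace');
INSERT INTO orders VALUES (10, 1);
INSERT INTO order_items VALUES (10, 2, 9.50);
""")

# Print a simple formatted table, as run_queries.py does.
for name, revenue in db.execute(REVENUE_SQL):
    print(f"{name:<20} {revenue:>10.2f}")
```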
- Update README to look clean and professional. Include sections: Title, Overview, Tech Stack, Project Structure, How to Run, Prompts Used. Simplify tone and improve readability.
This project intentionally highlights:
- How agentic prompting can build a full data pipeline
- How to guide Cursor using clear, high-level instructions
- How to structure iterative prompts for maximum output quality
- How meaningful prompts significantly shorten development time
Each code script was entirely AI-generated using the Cursor Agentic IDE, but human-directed through concise instructions.
    ┌──────────────────────────┐
    │ scripts/generate_data.py │
    │──────────────────────────│
    │ • Creates synthetic      │
    │   e-commerce datasets    │
    │ • Writes 5 CSV files     │
    │   (customers, products,  │
    │   orders, items, pay.)   │
    └─────────────┬────────────┘
                  │
                  ▼
    ┌────────────────────────────────┐
    │ data/                          │
    │────────────────────────────────│
    │ • CSV intermediates used for   │
    │   ingestion into SQLite        │
    └─────────────┬──────────────────┘
                  │
                  ▼
    ┌──────────────────────────────────┐
    │ scripts/ingest_to_db.py          │
    │──────────────────────────────────│
    │ • Drops, recreates SQLite        │
    │   tables                         │
    │ • Loads CSVs with referential    │
    │   integrity                      │
    │ • Builds indexes                 │
    └─────────────┬────────────────────┘
                  │
                  ▼
    ┌──────────────────────────────────────┐
    │ ecom.db                              │
    │──────────────────────────────────────│
    │ SQLite database with:                │
    │ • customers                          │
    │ • products                           │
    │ • orders                             │
    │ • order_items                        │
    │ • payments                           │
    └─────────────┬────────────────────────┘
                  │
                  ▼
    ┌──────────────────────────────────────────┐
    │ sql/queries.sql                          │
    │──────────────────────────────────────────│
    │ • Revenue aggregation query              │
    │ • Multi-table joins                      │
    └─────────────┬────────────────────────────┘
                  │
                  ▼
    ┌────────────────────────────────────────────┐
    │ scripts/run_queries.py                     │
    │────────────────────────────────────────────│
    │ • Loads SQL from file                      │
    │ • Executes query against SQLite DB         │
    │ • Prints formatted output table            │
    └────────────────────────────────────────────┘