java-full-stack-interview-questions/indexes.md at main · sajidhamdard/java-full-stack-interview-questions

✅ What is an Index in SQL?

An index in SQL is like an index in a book — it helps the database find rows faster without scanning the entire table.

📘 Imagine you’re searching for "Spring Boot" in a 1,000-page book:

Without an index: You flip every page one by one.
With an index: You go straight to page 573.

In SQL, the same thing happens with rows in a table.

🔍 Why Use Indexes?

Speed up queries (especially those with WHERE, JOIN, ORDER BY)
Improve performance on large tables
Help databases avoid full table scans

🛠️ How to Create an Index?

🧱 Basic Syntax:

CREATE INDEX index_name ON table_name (column1, column2, ...);

✅ Example:

CREATE INDEX idx_customer_name ON customers(name);

Now, this helps:

SELECT * FROM customers WHERE name = 'John';

go much faster.

🔢 Types of Indexes

Type	Use Case
Single Column Index	Index on one column
Composite Index	Index on multiple columns
Unique Index	Ensures values are unique (like a unique constraint)
Full-text Index	For fast text search (e.g., searching keywords in articles)
Clustered Index	Data rows are physically stored in index order (SQL Server, MySQL InnoDB)
Non-Clustered Index	Index is separate from the actual data
Bitmap Index	Efficient for columns with few distinct values (mainly in data warehousing)
Partial Index	Index on a filtered subset of rows (e.g., only where `status = 'active'`)
Covering Index	Contains all columns needed by the query so table access is avoided

Great! Let’s go through each index type one by one with:

✅ Simple Explanation
🧠 Use Case
🧾 Example

🔹 1. Single Column Index

✅ Index on one column
⚡ Used for filtering or searching on that column

Example:

CREATE INDEX idx_users_email ON users(email);

Use Case: Search users by email quickly:

SELECT * FROM users WHERE email = 'a@example.com';

🔹 2. Composite Index (Multi-column)

✅ Index on multiple columns
⚠️ Order of columns matters

Example:

CREATE INDEX idx_orders_user_date ON orders(user_id, order_date);

Use Case: This index helps with:

-- Uses full index (both user_id and order_date)
SELECT * FROM orders WHERE user_id = 5 AND order_date = '2023-01-01';

-- Uses only leading column (user_id)
SELECT * FROM orders WHERE user_id = 5;

-- ❌ This query won’t use the index
SELECT * FROM orders WHERE order_date = '2023-01-01';

🔹 3. Unique Index

✅ Ensures all values are unique
🔒 Like a UNIQUE constraint, but can be added explicitly

Example:

CREATE UNIQUE INDEX idx_unique_username ON users(username);

Use Case: Prevent duplicate usernames:

INSERT INTO users(username) VALUES ('john_doe');  -- allowed
INSERT INTO users(username) VALUES ('john_doe');  -- ❌ error

🔹 4. Full-text Index

✅ Optimized for text search
🔍 Useful for searching keywords or phrases in large text columns

Example (MySQL):

CREATE FULLTEXT INDEX idx_articles_content ON articles(content);

Use Case:

SELECT * FROM articles
WHERE MATCH(content) AGAINST('database indexing');

🔹 5. Clustered Index

✅ Data is physically stored in index order
📦 Only one clustered index per table
Used by default in:
- MySQL InnoDB (by primary key)
- SQL Server (can define manually)

Example (SQL Server):

CREATE CLUSTERED INDEX idx_orders_id ON orders(order_id);

Use Case: Speeds up retrieval by primary key, since data is already sorted that way.

🔹 6. Non-Clustered Index

✅ Index stored separately from data
📍 Has a pointer to actual row (row ID or primary key)
You can have many non-clustered indexes per table

Example (SQL Server):

CREATE NONCLUSTERED INDEX idx_users_status ON users(status);

Use Case:

SELECT * FROM users WHERE status = 'active';

🔹 7. Bitmap Index

✅ Uses bits instead of row pointers
🔁 Great for low-cardinality columns like gender, status
📊 Used mostly in data warehouses

Example (Oracle):

CREATE BITMAP INDEX idx_users_gender ON users(gender);

Use Case: Efficient for:

SELECT COUNT(*) FROM users WHERE gender = 'M' OR gender = 'F';

🔹 8. Partial Index

✅ Index on filtered rows only
🔍 Reduces index size & improves performance

Example (PostgreSQL):

CREATE INDEX idx_active_users ON users(email) WHERE status = 'active';

Use Case:

SELECT * FROM users WHERE status = 'active' AND email = 'a@example.com';

→ Index is used only for active users.

🔹 9. Covering Index

✅ Index includes all columns needed by the query
🚫 Avoids going to the table → faster

Example (MySQL):

CREATE INDEX idx_orders_user_date_total ON orders(user_id, order_date, total);

Use Case:

SELECT user_id, order_date, total FROM orders
WHERE user_id = 101 AND order_date = '2023-01-01';

→ Since all selected columns are in the index, no table lookup needed.

✅ Summary Table

Type	Best For / When to Use
Single Column	Simple queries filtering on one column
Composite	Queries filtering/sorting by multiple columns (order matters)
Unique	Enforce uniqueness
Full-text	Search in large text data
Clustered	Fast primary key access (data stored in index order)
Non-Clustered	General-purpose filtering/search
Bitmap	Low-cardinality columns (like 'M'/'F', 'Y'/'N')
Partial	Index only where specific conditions apply
Covering	Query can be answered fully from the index

🕵️‍♂️ When Should You Create an Index?

✅ Use index when:

The column is used in WHERE, JOIN, ORDER BY, or GROUP BY
You frequently filter or sort by that column
You want to speed up search on large tables

Example:

-- Frequently searched column
SELECT * FROM orders WHERE customer_id = 5;

➡️ Create index on customer_id

🚫 When NOT to Use an Index

❌ Avoid indexes on:

Columns with highly unique queries on small tables (not much benefit)
Frequently updated columns (index overhead)
Low-cardinality columns (e.g., gender = 'M' or 'F') – not efficient
Temporary or staging tables

⭐ Benefits of Indexes

⚡ Faster SELECT queries
🔄 Speeds up joins and filters
📚 Can reduce disk I/O
🧠 Smart databases can choose the best index via query planner

⚠️ Side Effects (Disadvantages) of Indexes

🐌 Slower INSERT/UPDATE/DELETE – because the index must also be updated
🧠 Takes memory and disk space
❗ Can be unused if your query doesn’t match how the index is structured
🧼 Requires maintenance (stats, rebuild, etc.)

🧠 Tips for Using Indexes Effectively

✅ Good:

CREATE INDEX idx_user_email ON users(email);
SELECT * FROM users WHERE email = 'test@example.com';

❌ Bad:

SELECT * FROM users WHERE LOWER(email) = 'test@example.com';
-- Index may not be used if function is applied to the column

To fix:

-- Some DBs support function-based indexes
CREATE INDEX idx_lower_email ON users(LOWER(email));

📋 How to See and Analyze Indexes?

-- In PostgreSQL
\d tablename

-- In MySQL
SHOW INDEX FROM tablename;

-- Explain plan
EXPLAIN SELECT * FROM table WHERE ...

These show whether an index is being used.

🔄 How to Drop an Index?

DROP INDEX idx_customer_name ON customers;

✅ Final Summary

Topic	Key Takeaway
What	Index = Shortcut to quickly find data
Why	Speeds up SELECTs, JOINs, WHEREs
How	`CREATE INDEX idx_name ON table(column);`
Types	Single, Composite, Unique, Full-text, Clustered, etc.
Use When	Columns are often filtered, joined, or sorted
Avoid When	Small tables, low-cardinality columns, frequent writes
Downsides	Slower writes, more disk space

Great question! Understanding how indexes work internally helps you write better, faster SQL queries. Let’s dive into it.

✅ What Data Structures Do Indexes Use Internally?

🔷 1. B-Tree (Balanced Tree) – Most Common

Used by: PostgreSQL, MySQL (InnoDB), Oracle, SQL Server
Default index type in most RDBMS

💡 What is a B-Tree?

A B-Tree is a self-balancing tree data structure where:

Data is stored in sorted order
Each node contains multiple keys and child pointers
Allows logarithmic time complexity:
- O(log n) for search, insert, delete
It keeps the tree height low, so less disk access is needed.

✅ Why B-Tree is Fast:

Binary search is done at each level
Branching factor is high (like 100s of keys per node)
Disk reads are minimized (fewer levels to traverse)

📍 How It Locates Data:

Let’s say you run:

SELECT * FROM users WHERE email = 'a@example.com';

If email has a B-Tree index:

DB does binary search at the root
Follows the appropriate pointer to the next level
Repeats until it finds the correct leaf node that stores the actual row’s pointer
Fetches the row from the table

🔷 2. Hash Index

Used by: PostgreSQL (manually), MySQL (Memory engine)
Fast for exact match lookups, like WHERE id = 123
Not used for range queries (<, >, BETWEEN, LIKE)

❌ Drawbacks:

Not sorted → can’t do range scans
Rebuilding needed on crash or upgrade in some engines (like older PostgreSQL versions)

🔷 3. Bitmap Index

Used by: Oracle, PostgreSQL (extensions)
Best for low-cardinality columns (like gender: M/F)
Uses bitmaps instead of row pointers:
- Each value gets a bitmap representing which rows contain it
- Efficient for complex OR, AND conditions

🔷 4. GiST / GIN / SP-GiST / R-Tree

Used in PostgreSQL for advanced data types:
- Arrays
- JSON
- Geospatial data
GIN: Generalized Inverted Index – great for full-text search
R-Tree: used in spatial databases (like for maps, bounding boxes)

🚀 Why Are Indexes So Fast?

Here’s the secret:

Avoid full table scans
- Without an index: scan every row (O(n))
- With index: go directly (O(log n) for B-Tree)
Minimized disk I/O
- B-Tree is designed to work well with disk blocks
- Fewer reads = faster performance
Sorted order helps with range queries
- Useful for ORDER BY, BETWEEN, etc.

🧠 Example: B-Tree Visualization

Imagine you have a B-Tree index on age:

             [30, 60]
           /     |     \
        <30     30-59    >=60
      [10,20]  [35,45]   [65,80]

To find age = 45:

Start at root [30, 60]
Go to middle branch (30–59)
Then to leaf [35,45]
Find row with 45 → return result

This is logarithmic and super efficient.

✅ Summary

Topic	Details
Default Structure	B-Tree – fast for equality + range queries
Exact Match Use	Hash Index
Low-Cardinality	Bitmap Index
Full-Text Search	GIN
Geospatial	R-Tree
Speed Reason	Binary search, sorted data, low tree height, minimal disk I/O

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

✅ What is an Index in SQL?

🔍 Why Use Indexes?

🛠️ How to Create an Index?

🧱 Basic Syntax:

✅ Example:

🔢 Types of Indexes

🔹 1. Single Column Index

🔹 2. Composite Index (Multi-column)

🔹 3. Unique Index

🔹 4. Full-text Index

🔹 5. Clustered Index

🔹 6. Non-Clustered Index

🔹 7. Bitmap Index

🔹 8. Partial Index

🔹 9. Covering Index

✅ Summary Table

🕵️‍♂️ When Should You Create an Index?

Example:

🚫 When NOT to Use an Index

⭐ Benefits of Indexes

⚠️ Side Effects (Disadvantages) of Indexes

🧠 Tips for Using Indexes Effectively

📋 How to See and Analyze Indexes?

🔄 How to Drop an Index?

✅ Final Summary

✅ What Data Structures Do Indexes Use Internally?

🔷 1. B-Tree (Balanced Tree) – Most Common

💡 What is a B-Tree?

✅ Why B-Tree is Fast:

📍 How It Locates Data:

🔷 2. Hash Index

❌ Drawbacks:

🔷 3. Bitmap Index

🔷 4. GiST / GIN / SP-GiST / R-Tree

🚀 Why Are Indexes So Fast?

🧠 Example: B-Tree Visualization

✅ Summary

FilesExpand file tree

indexes.md

Latest commit

History

indexes.md

File metadata and controls

✅ What is an Index in SQL?

🔍 Why Use Indexes?

🛠️ How to Create an Index?

🧱 Basic Syntax:

✅ Example:

🔢 Types of Indexes

🔹 1. Single Column Index

🔹 2. Composite Index (Multi-column)

🔹 3. Unique Index

🔹 4. Full-text Index

🔹 5. Clustered Index

🔹 6. Non-Clustered Index

🔹 7. Bitmap Index

🔹 8. Partial Index

🔹 9. Covering Index

✅ Summary Table

🕵️‍♂️ When Should You Create an Index?

Example:

🚫 When NOT to Use an Index

⭐ Benefits of Indexes

⚠️ Side Effects (Disadvantages) of Indexes

🧠 Tips for Using Indexes Effectively

📋 How to See and Analyze Indexes?

🔄 How to Drop an Index?

✅ Final Summary

✅ What Data Structures Do Indexes Use Internally?

🔷 1. B-Tree (Balanced Tree) – Most Common

💡 What is a B-Tree?

✅ Why B-Tree is Fast:

📍 How It Locates Data:

🔷 2. Hash Index

❌ Drawbacks:

🔷 3. Bitmap Index

🔷 4. GiST / GIN / SP-GiST / R-Tree

🚀 Why Are Indexes So Fast?

🧠 Example: B-Tree Visualization

✅ Summary