An intelligent RAG (Retrieval-Augmented Generation) application that generates contextual quiz questions from your documents using vector search and LLM capabilities.
QuizMate is a full-stack application designed to help users create personalized quizzes from their own documents (PDFs and Markdown files). The system uses a RAG pipeline to:
- Ingest documents - Upload and process PDF or Markdown files
- Store embeddings - Convert document chunks into vector embeddings stored in Qdrant
- Generate quizzes - Query relevant content and use OpenAI to create contextual quiz questions
- Interactive testing - Take quizzes with multiple-choice, true/false, and open-ended questions
- Evaluation - Built-in LLM-as-a-judge evaluation system to assess RAG quality
Use Cases:
- Students preparing for exams from textbooks
- Teachers creating assessments from course materials
- Professionals testing knowledge from documentation
- Self-learners validating understanding of technical content
- PDF files - Textbooks, research papers, documentation
- Markdown files - Technical docs, notes, articles
The application uses a hybrid chunking approach that balances semantic coherence with size constraints:
Current Implementation:
- Sentence-aware chunking using Apache OpenNLP
- Max tokens per chunk: 512 tokens (configurable via quizmate.chunking.max-tokens)
- Overlap: 50 tokens between chunks (configurable via quizmate.chunking.overlap-tokens)
- Text cleaning: removes extra whitespace, normalizes characters
How it works:
1. Split text into sentences using OpenNLP sentence detector
2. Group sentences until reaching token limit (512 tokens)
3. Add overlap from previous chunk (50 tokens) for context continuity
4. Store each chunk with source metadata in Qdrant
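The steps above can be sketched in plain Java. This is a minimal illustration, not the project's actual HybridChunkingService: it assumes an OpenNLP sentence model (e.g. en-sent.bin) is available on the classpath and uses OpenNLP's SimpleTokenizer as a stand-in for the real token counting.

```java
import opennlp.tools.sentdetect.SentenceDetectorME;
import opennlp.tools.sentdetect.SentenceModel;
import opennlp.tools.tokenize.SimpleTokenizer;

import java.io.InputStream;
import java.util.ArrayList;
import java.util.List;

public class SentenceAwareChunker {

    private static final int MAX_TOKENS = 512;     // mirrors quizmate.chunking.max-tokens
    private static final int OVERLAP_TOKENS = 50;  // mirrors quizmate.chunking.overlap-tokens

    public List<String> chunk(String text) throws Exception {
        // Assumption: an OpenNLP sentence model is bundled on the classpath.
        try (InputStream in = getClass().getResourceAsStream("/en-sent.bin")) {
            SentenceDetectorME detector = new SentenceDetectorME(new SentenceModel(in));
            String[] sentences = detector.sentDetect(text);

            List<String> chunks = new ArrayList<>();
            List<String> current = new ArrayList<>();
            int currentTokens = 0;

            for (String sentence : sentences) {
                int tokens = SimpleTokenizer.INSTANCE.tokenize(sentence).length;
                // Close the chunk once adding this sentence would exceed the limit.
                if (currentTokens + tokens > MAX_TOKENS && !current.isEmpty()) {
                    chunks.add(String.join(" ", current));
                    current = overlapTail(current);        // carry ~50 tokens forward
                    currentTokens = countTokens(current);
                }
                current.add(sentence);
                currentTokens += tokens;
            }
            if (!current.isEmpty()) {
                chunks.add(String.join(" ", current));
            }
            return chunks;
        }
    }

    // Keep trailing sentences from the previous chunk up to the overlap budget.
    private List<String> overlapTail(List<String> sentences) {
        List<String> tail = new ArrayList<>();
        int tokens = 0;
        for (int i = sentences.size() - 1; i >= 0 && tokens < OVERLAP_TOKENS; i--) {
            tail.add(0, sentences.get(i));
            tokens += SimpleTokenizer.INSTANCE.tokenize(sentences.get(i)).length;
        }
        return tail;
    }

    private int countTokens(List<String> sentences) {
        int total = 0;
        for (String s : sentences) {
            total += SimpleTokenizer.INSTANCE.tokenize(s).length;
        }
        return total;
    }
}
```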
The current chunking strategy is functional but has room for improvement:
- Fixed-size chunking - Doesn't consider semantic boundaries beyond sentences
- No hierarchical structure - Headers, sections, and document structure are lost
- Limited text cleaning - Basic normalization only
- Java 25 - Core application language
- Spring Boot 3.5.6 - Application framework
- Spring Data JPA - Database access
- H2 Database - Lightweight embedded database for source tracking
- Qdrant - Vector database for semantic search
- Deep Java Library (DJL) - ML framework for Java
- sentence-transformers/all-MiniLM-L6-v2 - Embedding model (384 dimensions)
- OpenAI GPT - Quiz generation and evaluation
- GPT-3.5-turbo - Default model for quiz generation
- GPT-4 - Evaluation judge model
- Spring AI PDF Reader - PDF text extraction
- Apache OpenNLP - Sentence detection and tokenization
- Java 25 (or compatible JDK)

  ```bash
  java --version
  ```

- Maven (for building)

  ```bash
  mvn --version
  ```

- Qdrant Vector Database

  Option A: Docker (Recommended)

  ```bash
  docker run -p 6334:6334 -p 6333:6333 \
    -v $(pwd)/qdrant_storage:/qdrant/storage \
    qdrant/qdrant
  ```

  Option B: Local Installation

  - Download from Qdrant releases
  - Follow installation instructions for your OS

- OpenAI API Key

  Get your API key from OpenAI Platform
- Set OpenAI API Key (required)

  ```bash
  # Linux/Mac
  export OPENAI_API_KEY="sk-your-api-key-here"

  # Windows
  set OPENAI_API_KEY=sk-your-api-key-here
  ```

- Configure Application (optional)

  Edit src/main/resources/application.properties:

  ```properties
  # Qdrant Configuration (if not using defaults)
  quizmate.qdrant.host=localhost
  quizmate.qdrant.port=6334
  quizmate.qdrant.collection-name=cs-textbooks

  # LLM Configuration
  quizmate.llm.model=gpt-3.5-turbo
  quizmate.evaluation.judge-model=gpt-4

  # Chunking Strategy
  quizmate.chunking.max-tokens=512
  quizmate.chunking.overlap-tokens=50
  ```
```bash
# If using Docker
docker run -p 6334:6334 qdrant/qdrant
```

```bash
# Clone the repository
git clone <your-repo-url>
cd quizmate

# Build with Maven
./mvnw clean install

# Run the application
./mvnw spring-boot:run
```

Open your browser and navigate to:

http://localhost:8080
Step 1: Upload Documents
- Click "Select File" and choose a PDF or Markdown file
- Enter a source name (e.g., "ReactDocs", "SystemDesignBook")
- Click "Upload and Ingest"
Step 2: Generate Quiz
- Enter a topic/query (e.g., "database normalization")
- Select the source from the dropdown
- Choose number of questions (1-20)
- Click "Generate Quiz"
Step 3: Take the Quiz
- Answer each question (multiple-choice, true/false, or open-ended)
- Click "Submit Quiz"
- View your score and correct answers
```
src/main/java/ai/quizmate/
├── config/                           # Configuration classes
│   ├── QdrantProperties.java         # Qdrant connection settings
│   ├── ChunkingProperties.java       # Chunking strategy config
│   ├── LlmProperties.java            # LLM model settings
│   └── EvaluationProperties.java     # Evaluation system config
│
├── ingestion/                        # Document ingestion pipeline
│   └── IngestionPipeline.java        # Orchestrates PDF/MD processing
│
├── service/                          # Business logic
│   ├── HybridChunkingService.java    # Sentence-aware chunking
│   ├── EmbeddingService.java         # Text → vectors (DJL)
│   ├── RetrievalService.java         # Vector search in Qdrant
│   ├── AugmentationService.java      # Prompt building
│   ├── LlmService.java               # OpenAI API calls
│   └── SourceService.java            # Source management
│
├── repository/                       # Data access
│   ├── QdrantRepository.java         # Vector DB operations
│   └── SourceRepository.java         # H2 database access
│
├── evaluation/                       # RAG quality evaluation
│   ├── EvaluationService.java        # LLM-as-a-judge logic
│   ├── EvaluationController.java     # Evaluation endpoints
│   └── model/                        # Evaluation data models
│
├── facade/                           # REST controllers
│   ├── IngestionRestController.java  # File upload & sources API
│   └── QuizRestController.java       # Quiz generation API
│
└── model/                            # Data models
    ├── entity/Source.java            # Source entity (H2)
    ├── request/                      # API request models
    └── response/                     # API response models
```
IngestionPipeline orchestrates the document ingestion process:

PDF/Markdown → Text Extraction → Cleaning → Chunking → Embedding → Qdrant Storage

HybridChunkingService implements sentence-aware chunking:
- Uses OpenNLP for sentence detection
- Respects token limits
- Maintains overlap between chunks
EmbeddingService converts text to 384-dimensional vectors (sketch below):
- Uses sentence-transformers/all-MiniLM-L6-v2
- Automatically detects vector dimensions
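A minimal DJL sketch of producing an embedding with all-MiniLM-L6-v2. The model URL and translator factory follow DJL's Hugging Face model zoo conventions; the project's EmbeddingService may load the model differently.

```java
import ai.djl.huggingface.translator.TextEmbeddingTranslatorFactory;
import ai.djl.inference.Predictor;
import ai.djl.repository.zoo.Criteria;
import ai.djl.repository.zoo.ZooModel;

public class EmbeddingExample {
    public static void main(String[] args) throws Exception {
        // Load all-MiniLM-L6-v2 from the DJL Hugging Face model zoo (assumed model URL).
        Criteria<String, float[]> criteria = Criteria.builder()
                .setTypes(String.class, float[].class)
                .optModelUrls("djl://ai.djl.huggingface.pytorch/sentence-transformers/all-MiniLM-L6-v2")
                .optEngine("PyTorch")
                .optTranslatorFactory(new TextEmbeddingTranslatorFactory())
                .build();

        try (ZooModel<String, float[]> model = criteria.loadModel();
             Predictor<String, float[]> predictor = model.newPredictor()) {
            float[] vector = predictor.predict("What is database normalization?");
            System.out.println("Dimensions: " + vector.length); // 384 for all-MiniLM-L6-v2
        }
    }
}
```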
RetrievalService performs semantic search (sketch below):
- Embeds query
- Searches Qdrant with source filtering
- Returns top-K most relevant chunks
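A sketch of the retrieval step using the Qdrant Java gRPC client. It assumes chunks were stored with a "source" payload field and uses the cs-textbooks collection from the configuration above; the project's RetrievalService may be structured differently.

```java
import io.qdrant.client.QdrantClient;
import io.qdrant.client.QdrantGrpcClient;
import io.qdrant.client.grpc.Points.Filter;
import io.qdrant.client.grpc.Points.ScoredPoint;
import io.qdrant.client.grpc.Points.SearchPoints;

import java.util.List;

import static io.qdrant.client.ConditionFactory.matchKeyword;
import static io.qdrant.client.WithPayloadSelectorFactory.enable;

public class RetrievalExample {
    public static void main(String[] args) throws Exception {
        QdrantClient client = new QdrantClient(
                QdrantGrpcClient.newBuilder("localhost", 6334, false).build());

        // Query vector produced by the same all-MiniLM-L6-v2 model used at ingestion time.
        List<Float> queryVector = embed("database normalization"); // hypothetical helper

        List<ScoredPoint> hits = client.searchAsync(
                SearchPoints.newBuilder()
                        .setCollectionName("cs-textbooks")
                        .addAllVector(queryVector)
                        .setLimit(5)                                  // top-K chunks
                        .setFilter(Filter.newBuilder()                // restrict to one source
                                .addMust(matchKeyword("source", "SystemDesignBook"))
                                .build())
                        .setWithPayload(enable(true))                 // return stored chunk text
                        .build()).get();

        hits.forEach(p -> System.out.println(p.getScore()));
        client.close();
    }

    private static List<Float> embed(String query) {
        throw new UnsupportedOperationException("see the EmbeddingService sketch above");
    }
}
```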
AugmentationService builds prompts for the LLM (illustrative example below):
- Combines query + retrieved context
- Instructs LLM on quiz format
- Defines question types (multiple-choice, true/false, open-ended)
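A toy version of that prompt assembly is shown below. The wording and JSON shape are illustrative assumptions, not the project's actual template.

```java
import java.util.List;

class PromptExample {
    // Illustrative only: the real AugmentationService template and output format may differ.
    static String buildQuizPrompt(String query, List<String> retrievedChunks, int numQuestions) {
        String context = String.join("\n---\n", retrievedChunks);
        return """
                You are a quiz generator. Using ONLY the context below, write %d questions
                about "%s". Mix multiple-choice, true/false, and open-ended questions.
                Return JSON: [{"type": "...", "question": "...", "options": [...], "answer": "..."}]

                Context:
                %s
                """.formatted(numQuestions, query, context);
    }
}
```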
EvaluationService acts as an LLM-as-a-judge for quality assessment (example judge prompt below):
- Faithfulness: No hallucinations
- Answer Relevance: Addresses the query
- Context Relevance: Good retrieval
- Answer Quality: Well-structured responses
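For illustration, a judge prompt covering these four metrics could look like the sketch below; the actual EvaluationService wording, scale, and output format may differ.

```java
class JudgePromptExample {
    // Hypothetical judge prompt; scores each metric on an assumed 1-5 scale.
    static String buildJudgePrompt(String query, String context, String quizJson) {
        return """
                You are an impartial judge. Rate each metric from 1 (poor) to 5 (excellent)
                and justify briefly:
                - faithfulness: every question is grounded in the context (no hallucinations)
                - answer_relevance: the quiz addresses the user's query
                - context_relevance: the retrieved context is relevant to the query
                - answer_quality: questions are well-structured and unambiguous
                Return JSON: {"faithfulness": n, "answer_relevance": n,
                              "context_relevance": n, "answer_quality": n, "rationale": "..."}

                Query: %s
                Context: %s
                Quiz: %s
                """.formatted(query, context, quizJson);
    }
}
```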
Document Management
- POST /ingestion/upload - Upload and ingest a document
- GET /ingestion/sources - List available sources

Quiz Operations
- POST /api/quiz - Generate a quiz from a query

Evaluation
- POST /api/evaluation/evaluate-query - Evaluate a single query
- POST /api/evaluation/evaluate-batch - Batch evaluation with aggregate stats
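Example calls with curl. The multipart and JSON field names below are assumptions; check the request models under model/request/ for the actual schema.

```bash
# Upload and ingest a document (field names are assumptions)
curl -F "file=@react-docs.pdf" -F "sourceName=ReactDocs" \
     http://localhost:8080/ingestion/upload

# List available sources
curl http://localhost:8080/ingestion/sources

# Generate a quiz (field names are assumptions)
curl -X POST http://localhost:8080/api/quiz \
     -H "Content-Type: application/json" \
     -d '{"query": "database normalization", "source": "SystemDesignBook", "numQuestions": 5}'
```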
Chunking - current limitations:
- Fixed-size chunks ignore document structure
- No semantic boundary detection beyond sentences
- Tables and code blocks are poorly handled
Text cleaning - current: basic whitespace normalization only
Retrieval filtering - current: simple source-based filtering
Query enhancement - current: the user query is sent directly to retrieval
Proposed pipeline:
User Query → LLM Enhancement → Better Retrieval → Better Quiz
Enhancement strategies:
- Query expansion:
  Input: "React hooks"
  Enhanced: "React hooks including useState, useEffect, useContext, custom hooks"
- Hypothetical Document Embeddings (HyDE):
  Input: "What is database normalization?"
  Generate hypothetical answer → Embed → Search with that embedding
- Multi-query generation (see the sketch after this list):
  Input: "Explain microservices"
  Generate: ["What are microservices?", "Benefits of microservices architecture", "Microservices vs monolithic architecture"]
  Search all → Deduplicate results
Evaluation - current: LLM-as-a-judge on 4 metrics