A lightweight Retrieval-Augmented Generation (RAG) demo built with Streamlit. Upload any PDF, ask a question in natural language, and get an answer grounded in the document's content.
- PDF text extraction — reads multi-page PDFs with PyPDF2
- LLM-powered Q&A — sends extracted context + your question to OpenAI (GPT-4o / GPT-4o-mini)
- Grounded answers — the system prompt constrains the model to answer only from the document
- Simple UI — clean Streamlit interface with sidebar settings and expandable extracted text
```bash
# 1. Clone the repo
git clone https://github.com/murtagh27/inkQuery.git
cd inkQuery

# 2. Create a virtual environment
python -m venv .venv
source .venv/bin/activate   # Windows: .venv\Scripts\activate

# 3. Install dependencies
pip install -r requirements.txt

# 4. Run the app
streamlit run app.py
```

Then enter your OpenAI API key in the sidebar and upload a PDF.
```
PDF ──▶ PyPDF2 (text extraction) ──▶ OpenAI Chat API ──▶ Answer
                                            ▲
                                      user question
```
- The uploaded PDF is parsed page-by-page into plain text.
- The full text is injected into the LLM's system prompt as document context.
- The user's question is sent as the user message.
- The model is instructed to answer only from the provided context.
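The four steps above can be sketched in Python. The helper names below are illustrative, not the actual structure of `app.py`:

```python
def extract_text(pdf_path: str) -> str:
    """Step 1: concatenate the text of every page with PyPDF2."""
    from PyPDF2 import PdfReader  # imported lazily; the rest is stdlib-only
    reader = PdfReader(pdf_path)
    return "\n".join(page.extract_text() or "" for page in reader.pages)


def build_messages(document: str, question: str, max_chars: int = 80_000) -> list:
    """Steps 2-3: stuff the (truncated) document into the system prompt
    and send the question as the user message."""
    context = document[:max_chars]  # mirrors the "Max context" setting
    system = (
        "Answer only from the document below. If the answer is not in the "
        "document, say so.\n\n---\n" + context
    )
    return [
        {"role": "system", "content": system},
        {"role": "user", "content": question},
    ]
```

With the messages built, step 4 is a single Chat Completions request, e.g. `client.chat.completions.create(model="gpt-4o-mini", messages=build_messages(text, question))` on an `openai.OpenAI` client.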
Note: This is a context-stuffing approach (the entire document is sent in a single prompt). For production use with very large documents, you would chunk the text, embed the chunks into a vector store, and retrieve only the most relevant chunks for each question.
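A minimal sketch of what that retrieval step could look like (illustrative only; a real deployment would use a vector store such as FAISS or Chroma, with embeddings from an embedding model rather than the toy `embed` callback assumed here):

```python
import math


def chunk_text(text: str, size: int = 1000, overlap: int = 200) -> list:
    """Split a long document into overlapping chunks."""
    chunks, start = [], 0
    while start < len(text):
        chunks.append(text[start:start + size])
        start += size - overlap
    return chunks


def cosine(a, b) -> float:
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def top_k_chunks(chunks, query_vec, embed, k: int = 3) -> list:
    """Rank chunks by similarity to the query embedding and keep the top k;
    only these (not the whole document) would go into the system prompt."""
    return sorted(chunks, key=lambda c: cosine(embed(c), query_vec), reverse=True)[:k]
```

The overlap keeps sentences that straddle a chunk boundary recoverable from at least one chunk, at the cost of some duplicated text.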
```
.
├── app.py              # Streamlit application
├── requirements.txt    # Python dependencies
├── .gitignore
├── LICENSE
└── README.md
```
| Setting | Default | Description |
|---|---|---|
| Model | `gpt-4o-mini` | Which OpenAI model to use |
| Max context | 80,000 chars | Truncation limit for very large PDFs |
MIT