PostgreSQL extension that brings AI-native capabilities directly into the database through IMMUTABLE ai.embed() and ai.classify() functions. Uses local ONNX Runtime for inference—no external API calls, zero cost, full privacy. Enables generated columns with AI operations for schema-level semantic understanding.
For the full story on architectural decisions, supervised AI development, and production-validated performance (92 docs/sec, 10.9ms latency, 117 MiB/backend), read: How and Why a Supervised AI Coded an AI-Native Postgres Extension
Status: Experimental
Implementation Goal: Provide IMMUTABLE embedding and classification functions with generated-column support
- ✅ ONNX Runtime initialization at _PG_init() (once per backend)
- ✅ IMMUTABLE functions (enables generated columns and indexes)
- ✅ Lazy loading (model loaded on first use)
- ✅ BERT WordPiece tokenization (full implementation)
- ✅ ONNX Runtime inference (complete)
- ✅ Input validation and security checks
- ✅ Category embedding cache (10-100× speedup for classification)
- ✅ Comprehensive test suite (20 SQL tests + 7 C unit tests)
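Because ai.embed() is IMMUTABLE, it can back an expression index directly, not only a generated column. A hedged sketch, assuming a docs table with a content column and pgvector installed (index and column names here are illustrative):

```sql
-- Expression HNSW index over on-the-fly embeddings.
-- IMMUTABLE is required: PostgreSQL must be able to recompute
-- ai.embed(content) deterministically at index build/maintenance time.
CREATE INDEX docs_embed_idx
    ON docs
    USING hnsw ((ai.embed(content)) vector_cosine_ops);

-- Nearest-neighbor queries can then use the index without
-- storing an embedding column at all:
SELECT content
FROM docs
ORDER BY ai.embed(content) <=> ai.embed('database systems')
LIMIT 3;
```

Trade-off: a STORED generated column (shown later) pays the embedding cost once at write time and keeps the vector queryable directly, so it is usually the better default; the expression index saves the storage of a 768-dim column.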
Pre-built multi-architecture images (linux/amd64, linux/arm64) are available on GitHub Container Registry:
# Latest PostgreSQL (18)
docker pull ghcr.io/dmonroy/ai-native-pg:dev
# Specific PostgreSQL versions
docker pull ghcr.io/dmonroy/ai-native-pg:pg18-dev
docker pull ghcr.io/dmonroy/ai-native-pg:pg17-dev
docker pull ghcr.io/dmonroy/ai-native-pg:pg16-dev
docker pull ghcr.io/dmonroy/ai-native-pg:pg15-dev
docker pull ghcr.io/dmonroy/ai-native-pg:pg14-dev

All images include:
- PostgreSQL (versions 14-18)
- pgvector extension
- ONNX Runtime 1.24.2
- ai extension with nomic-embed-text-v1.5 model
# Pull and run the latest version (PostgreSQL 18)
docker pull ghcr.io/dmonroy/ai-native-pg:dev
docker run -d \
--name ai-native-pg \
-e POSTGRES_PASSWORD=postgres \
-p 5432:5432 \
ghcr.io/dmonroy/ai-native-pg:dev
# Connect
psql -h localhost -U postgres
# Test extension
SELECT ai.health_check();
# Generate embeddings
SELECT vector_dims(ai.embed('Hello world')); -- Returns: 768
SELECT ai.embed('PostgreSQL') <=> ai.embed('database'); -- Cosine distance
# Classify content
SELECT ai.classify('PostgreSQL database', ARRAY['technology', 'sports', 'cooking']);
-- Returns: 'technology'

PostgreSQL 18
├─ pgvector (vector type, HNSW indexes)
├─ ONNX Runtime 1.24.2 (CPU only)
└─ ai extension
├─ ai.c (C implementation with category cache)
├─ ai--1.0.sql (SQL definitions)
└─ Model: nomic-embed-text-v1.5 (768-dim, ~64MB, MTEB 62.28)
Pre-built images are recommended, but you can build locally:
# Build and test (runs full test suite)
./build-and-test.sh
# Or build manually with specific versions
docker build \
--build-arg PG_MAJOR=18 \
--build-arg ONNX_VERSION=1.24.2 \
-t ai-native-pg:latest .

All versions are parameterized as Docker build arguments (see BUILD.md).
ai-native-pg/ # Repository root
├── Dockerfile # Container definition
├── BUILD.md # Build configuration docs
├── README.md # This file
├── build-and-test.sh # Build and test automation
├── ai_extension/
│ ├── Makefile # Build configuration
│ ├── ai.control # Extension metadata
│ ├── ai--1.0.sql # SQL function definitions
│ └── ai.c # C implementation
└── tests/
├── README.md # Test documentation
├── sql/ # SQL test files (20 tests)
Core Functions:
- ai.embed(text) - Generate 768-dim embeddings (IMMUTABLE)
- ai.classify(text, text[]) - Classify into categories (5 variants)
- ai.health_check() - System health verification
- ai.classify_cache_stats() - Cache performance monitoring
- ai.version() - Extension version
Infrastructure:
- Dockerfile with Postgres 18 + pgvector + ONNX Runtime
- C extension with category embedding cache
- Lazy loading architecture (model loads on first use)
- Input validation and security
- UTF-8 validation
- Null byte detection
- Path traversal protection
- File access validation
- BERT WordPiece tokenization
- ONNX Runtime inference
- Vector conversion (float[] → pgvector format)
- Generated column support (IMMUTABLE functions)
- Comprehensive test suite (20 SQL tests + 7 C unit tests)
- Support multiple embedding models
- Add batching for multiple embeddings
- GPU support via ONNX Runtime
- Model registry for dynamic loading
- Performance optimizations
- Production error handling improvements
-- Generate embeddings (768-dimensional vectors)
SELECT ai.embed('Hello world');
-- Use in semantic search
SELECT content, 1 - (embedding <=> ai.embed('query')) as similarity
FROM docs
ORDER BY embedding <=> ai.embed('query')
LIMIT 10;

-- 1. Basic classification - returns single best match
SELECT ai.classify('PostgreSQL database', ARRAY['technology', 'sports', 'cooking']);
-- Returns: 'technology'
-- 2. With confidence threshold - returns NULL if below threshold
SELECT ai.classify(
'Maybe about tech?',
ARRAY['technology', 'sports'],
0.8 -- threshold
);
-- Returns: 'technology' or NULL (if similarity < 0.8)
-- 3. Multi-label - returns top K matches
SELECT ai.classify(
'Database systems and AI',
ARRAY['technology', 'ai', 'sports', 'cooking'],
2 -- top_k
);
-- Returns: ['technology', 'ai']
-- 4. Combined threshold + top_k
SELECT ai.classify(
'PostgreSQL with AI features',
ARRAY['database', 'ai', 'sports'],
0.6, -- threshold
2 -- top_k
);
-- Returns: ['database', 'ai'] (only those above 0.6 similarity)
-- 5. Type-safe enum classification
CREATE TYPE content_type AS ENUM ('technology', 'sports', 'cooking');
SELECT ai.classify('PostgreSQL', NULL::content_type);
-- Returns: 'technology'::content_type

-- Create table with generated embedding column
CREATE TABLE docs (
id SERIAL PRIMARY KEY,
content TEXT,
embedding vector(768) GENERATED ALWAYS AS (ai.embed(content)) STORED
);
-- Insert data (embeddings auto-generated)
INSERT INTO docs (content) VALUES
('PostgreSQL is a powerful database'),
('AI-native features in databases'),
('Vector similarity search');
-- Create HNSW index
CREATE INDEX ON docs USING hnsw (embedding vector_cosine_ops);
-- Semantic search
SELECT content
FROM docs
ORDER BY embedding <=> ai.embed('database systems')
LIMIT 3;

-- Auto-categorize content on INSERT
CREATE TABLE articles (
id SERIAL PRIMARY KEY,
content TEXT,
category TEXT GENERATED ALWAYS AS (
ai.classify(content, ARRAY['technology', 'sports', 'business', 'science'])
) STORED
);
-- Create index on category
CREATE INDEX ON articles(category);
-- Insert and auto-classify
INSERT INTO articles (content) VALUES
('PostgreSQL adds AI features'),
('Lakers win championship'),
('Stock market reaches new high');
-- Query by category (uses index)
SELECT * FROM articles WHERE category = 'technology';

The extension caches category embeddings to avoid redundant computation:
-- First classification with new categories (cache miss)
SELECT ai.classify('text', ARRAY['cat1', 'cat2', 'cat3']);
-- Time: ~30ms (embeds content + 3 categories)
-- Subsequent classifications with same categories (cache hit)
SELECT ai.classify('different text', ARRAY['cat1', 'cat2', 'cat3']);
-- Time: ~6ms (only embeds content, categories cached)
-- Check cache performance
SELECT * FROM ai.classify_cache_stats();

Cache benefits:
- 10-100× speedup for repeated categories
- Per-backend process (no locking needed)
- Automatic cleanup on connection close
Per backend process:
- ONNX Runtime initialized: ~5MB
- Model NOT loaded: ~5MB
- Model loaded (on first ai.embed() call): ~64MB
- Category cache: ~2KB per cached category
With 10 connections:
- If no one calls ai.embed(): 10 × 5MB = 50MB
- If all call ai.embed(): 10 × 64MB = 640MB
- Plus category cache: ~2MB per connection with 1,000 cached categories (≈20MB across 10 connections)
Use PgBouncer for connection pooling in production!
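Since the ~64MB model cost is paid per backend, capping the number of backends with a pooler directly caps total memory. A minimal PgBouncer sketch; the hostnames, database name, and pool sizes are illustrative assumptions, not tested values:

```ini
; pgbouncer.ini (illustrative)
[databases]
; route "appdb" clients to the ai-native-pg instance
appdb = host=127.0.0.1 port=5432 dbname=postgres

[pgbouncer]
listen_addr = 0.0.0.0
listen_port = 6432
auth_type = md5
auth_file = /etc/pgbouncer/userlist.txt
; transaction pooling keeps the backend count low:
; with default_pool_size = 8, worst-case model memory is
; roughly 8 x 64MB = 512MB regardless of client count
pool_mode = transaction
default_pool_size = 8
max_client_conn = 200
```

A side benefit: the per-backend category embedding cache is shared by whichever clients a pooled backend serves, which tends to raise the cache hit rate compared with one backend per client.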
- Single model only (nomic-embed-text-v1.5)
- No batching (one embedding at a time)
- CPU only (no GPU support)
- No model registry (hardcoded model path)
- Memory per backend (~64MB when model is loaded)
- Experimental status (API may change)
Run the comprehensive test suite:
./build-and-test.sh

This runs 20 SQL tests and 7 C unit tests covering:
SQL Tests:
- Extension installation and health checks
- Basic embedding operations
- NULL and edge case handling
- Semantic similarity calculations
- Query and search operations
- Input validation
- Concurrent operations
- Category embedding cache
- Classification (5 variants: basic, threshold, top-k, combined, enum)
- Real-world scenarios
C Unit Tests:
- Similarity calculations
- Tokenization
- Memory management
This project is licensed under the Apache License 2.0 - see the LICENSE file for details.
Copyright 2026 Darwin Monroy