Embedding Models
Complete guide to embedding models for vector generation in OpenCode Memory.
Embedding models convert text into numerical vectors for similarity search. OpenCode Memory supports both local models (no API required) and external API-based models.
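Under the hood, similarity search compares these vectors, most commonly by cosine similarity. A minimal, self-contained sketch (toy 3-dimensional vectors stand in for real 384-1536-dimensional embeddings):

```python
import math

def cosine_similarity(a, b):
    # Score two vectors by the cosine of the angle between them:
    # 1.0 means identical direction, values near 0 mean unrelated.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Toy "embeddings"; real models emit hundreds of dimensions.
query = [0.2, 0.9, 0.1]
memory_close = [0.25, 0.85, 0.15]  # semantically similar memory
memory_far = [0.9, 0.1, 0.0]       # unrelated memory

# The closer memory scores higher, so it ranks first in search results.
print(cosine_similarity(query, memory_close) > cosine_similarity(query, memory_far))  # True
```

This is why vector dimensions must match across the whole database: cosine similarity is only defined between vectors of the same length, which is what the migration procedure below preserves.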
Local models run entirely on your machine without external API calls.
Advantages:
- No API costs
- Complete privacy
- No internet required
- Consistent performance
- No rate limits
Disadvantages:
- Initial download required
- Uses local compute resources
- Limited to available models
- May be slower than API models
Xenova/nomic-embed-text-v1 (default)
Dimensions: 768
Size: ~140MB
Quality: Excellent
Speed: Fast
Best general-purpose model. Recommended for most users.
Xenova/all-MiniLM-L6-v2
Dimensions: 384
Size: ~23MB
Quality: Good
Speed: Very fast
```json
{
  "embeddingModel": "Xenova/all-MiniLM-L6-v2"
}
```
Lightweight model for resource-constrained environments.
Xenova/all-mpnet-base-v2
Dimensions: 768
Size: ~420MB
Quality: Excellent
Speed: Medium
```json
{
  "embeddingModel": "Xenova/all-mpnet-base-v2"
}
```
High-quality model with strong semantic understanding.
Xenova/bge-small-en-v1.5
Dimensions: 384
Size: ~130MB
Quality: Very good
Speed: Fast
```json
{
  "embeddingModel": "Xenova/bge-small-en-v1.5"
}
```
Efficient model with a good quality-to-size ratio.
Xenova/bge-base-en-v1.5
Dimensions: 768
Size: ~420MB
Quality: Excellent
Speed: Medium
```json
{
  "embeddingModel": "Xenova/bge-base-en-v1.5"
}
```
High-quality model optimized for retrieval tasks.
Xenova/bge-large-en-v1.5
Dimensions: 1024
Size: ~1.2GB
Quality: Outstanding
Speed: Slow
```json
{
  "embeddingModel": "Xenova/bge-large-en-v1.5"
}
```
Best quality, but requires more resources.
For Speed:
- Xenova/all-MiniLM-L6-v2 (384 dimensions)
- Xenova/bge-small-en-v1.5 (384 dimensions)
For Quality:
- Xenova/bge-large-en-v1.5 (1024 dimensions)
- Xenova/all-mpnet-base-v2 (768 dimensions)
For Balance:
- Xenova/nomic-embed-text-v1 (768 dimensions, default)
- Xenova/bge-base-en-v1.5 (768 dimensions)
For Low Memory:
- Xenova/all-MiniLM-L6-v2 (384 dimensions, 23MB)
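The recommendations above can be captured in a small lookup table. This helper is purely illustrative (`pick_model` and its priority keys are not part of OpenCode Memory's API; only the model identifiers are real):

```python
# Hypothetical helper encoding the recommendations above.
RECOMMENDED = {
    "speed": "Xenova/all-MiniLM-L6-v2",
    "quality": "Xenova/bge-large-en-v1.5",
    "balance": "Xenova/nomic-embed-text-v1",
    "low_memory": "Xenova/all-MiniLM-L6-v2",
}

def pick_model(priority: str = "balance") -> str:
    """Return the recommended embedding model for a priority, defaulting to balance."""
    return RECOMMENDED.get(priority, RECOMMENDED["balance"])

print(pick_model("speed"))  # Xenova/all-MiniLM-L6-v2
```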
External API models use cloud services for embedding generation.
Advantages:
- No local compute required
- Access to latest models
- Consistent quality
- No model downloads
Disadvantages:
- API costs
- Requires internet access
- Privacy considerations
- Rate limits
- Added latency
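API models need credentials. A hypothetical sketch of key resolution, assuming a config-file `embeddingApiKey` takes precedence over the `OPENAI_API_KEY` environment variable (the actual precedence inside OpenCode Memory is not documented here):

```python
import os
from typing import Optional

def resolve_api_key(config: dict) -> Optional[str]:
    # Assumed precedence, not verified against the implementation:
    # an explicit embeddingApiKey in the config wins; otherwise fall
    # back to the OPENAI_API_KEY environment variable.
    return config.get("embeddingApiKey") or os.environ.get("OPENAI_API_KEY")

print(resolve_api_key({"embeddingApiKey": "sk-from-config"}))  # sk-from-config
```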
OpenAI text-embedding-3-small
Dimensions: 1536
Cost: $0.02 per 1M tokens
Quality: Excellent
```json
{
  "embeddingModel": "text-embedding-3-small",
  "embeddingApiUrl": "https://api.openai.com/v1",
  "embeddingApiKey": "sk-..."
}
```
Best value for API-based embeddings.
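The request this configuration produces can be sketched as follows. `build_embedding_request` is a hypothetical helper, but the `/v1/embeddings` endpoint, Bearer authorization header, and `model`/`input` body fields match OpenAI's documented API shape:

```python
import json
import urllib.request

def build_embedding_request(text,
                            model="text-embedding-3-small",
                            api_url="https://api.openai.com/v1",
                            api_key="sk-..."):
    # The config keys above (embeddingModel, embeddingApiUrl, embeddingApiKey)
    # map directly onto these fields of an OpenAI-compatible embeddings call.
    body = json.dumps({"model": model, "input": text}).encode()
    return urllib.request.Request(
        f"{api_url}/embeddings",
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

# Build (but do not send) a request for one memory's text.
req = build_embedding_request("remember this note")
print(req.full_url)  # https://api.openai.com/v1/embeddings
```

Swapping `api_url` is what makes the same configuration work for other OpenAI-compatible providers.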
OpenAI text-embedding-3-large
Dimensions: 3072
Cost: $0.13 per 1M tokens
Quality: Outstanding
```json
{
  "embeddingModel": "text-embedding-3-large",
  "embeddingApiUrl": "https://api.openai.com/v1",
  "embeddingApiKey": "sk-..."
}
```
Highest-quality OpenAI model.
OpenAI text-embedding-ada-002
Dimensions: 1536
Cost: $0.10 per 1M tokens
Quality: Very good
```json
{
  "embeddingModel": "text-embedding-ada-002",
  "embeddingApiUrl": "https://api.openai.com/v1",
  "embeddingApiKey": "sk-..."
}
```
Older model; use text-embedding-3-small instead.
Cohere embed-english-v3.0:
```json
{
  "embeddingModel": "embed-english-v3.0",
  "embeddingApiUrl": "https://api.cohere.ai/v1",
  "embeddingApiKey": "..."
}
```
Voyage AI voyage-2:
```json
{
  "embeddingModel": "voyage-2",
  "embeddingApiUrl": "https://api.voyageai.com/v1",
  "embeddingApiKey": "..."
}
```
Local model configuration:
```json
{
  "embeddingModel": "Xenova/nomic-embed-text-v1"
}
```
Dimensions are auto-detected.
API model configuration:
```json
{
  "embeddingModel": "text-embedding-3-small",
  "embeddingApiUrl": "https://api.openai.com/v1",
  "embeddingApiKey": "sk-your-api-key-here"
}
```
Or use an environment variable:
```shell
export OPENAI_API_KEY=sk-your-api-key-here
```
To override dimension auto-detection:
```json
{
  "embeddingModel": "custom-model",
  "embeddingDimensions": 768
}
```
When changing to a model with different dimensions, run a migration:
```
POST /api/migrate
{
  "newModel": "Xenova/all-MiniLM-L6-v2",
  "newDimensions": 384
}
```
Migration steps:
- Backup database
- Configure new model
- Run migration
- Wait for completion
- Verify results
Migration time depends on database size:
- 1,000 memories: ~1 minute
- 10,000 memories: ~10 minutes
- 100,000 memories: ~1-2 hours
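Those estimates can be reproduced as back-of-the-envelope arithmetic. The 1,000-memories-per-minute rate is an assumption read off the figures above, not a measured constant; it varies with hardware and the chosen model:

```python
def estimate_migration_minutes(memories: int, rate_per_minute: int = 1_000) -> float:
    # The guide's figures (1k memories ~ 1 min, 10k ~ 10 min, 100k ~ 1-2 h)
    # imply roughly 1,000 memories re-embedded per minute.
    return memories / rate_per_minute

print(estimate_migration_minutes(100_000) / 60)  # ~1.7 hours
```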
| Model | Speed (texts/sec, approx.) | Dimensions |
|---|---|---|
| Xenova/all-MiniLM-L6-v2 | 100+ | 384 |
| Xenova/nomic-embed-text-v1 | 50-80 | 768 |
| Xenova/bge-base-en-v1.5 | 40-60 | 768 |
| Xenova/bge-large-en-v1.5 | 20-30 | 1024 |
| OpenAI API | 1000+ | 1536 |
| Model | Quality | Use Case |
|---|---|---|
| Xenova/all-MiniLM-L6-v2 | Good | General use |
| Xenova/nomic-embed-text-v1 | Excellent | Recommended |
| Xenova/bge-base-en-v1.5 | Excellent | High quality |
| Xenova/bge-large-en-v1.5 | Outstanding | Best quality |
| OpenAI text-embedding-3-small | Excellent | API option |
| OpenAI text-embedding-3-large | Outstanding | Best API |
| Model | RAM | Disk | CPU |
|---|---|---|---|
| Xenova/all-MiniLM-L6-v2 | 200MB | 23MB | Low |
| Xenova/nomic-embed-text-v1 | 500MB | 140MB | Medium |
| Xenova/bge-base-en-v1.5 | 1GB | 420MB | Medium |
| Xenova/bge-large-en-v1.5 | 2GB | 1.2GB | High |
| OpenAI API | Minimal | None | None |
One-time costs:
- Download bandwidth
- Disk space
Ongoing costs:
- CPU/RAM usage
- Electricity
Total: Effectively free after initial download
OpenAI text-embedding-3-small:
- $0.02 per 1M tokens
- Average memory: ~100 tokens
- 10,000 memories: ~$0.02
- 100,000 memories: ~$0.20
OpenAI text-embedding-3-large:
- $0.13 per 1M tokens
- 10,000 memories: ~$0.13
- 100,000 memories: ~$1.30
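The arithmetic behind these numbers, as a small hypothetical helper (defaults: ~100 tokens per memory, the guide's average, and text-embedding-3-small pricing):

```python
def api_cost_usd(memories: int,
                 tokens_per_memory: int = 100,
                 price_per_million_tokens: float = 0.02) -> float:
    # Total tokens = memories * tokens_per_memory; cost scales linearly
    # with the provider's per-million-token price.
    return memories * tokens_per_memory / 1_000_000 * price_per_million_tokens

print(round(api_cost_usd(100_000), 2))                                 # 0.2
print(round(api_cost_usd(100_000, price_per_million_tokens=0.13), 2))  # 1.3
```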
Start with Default:
Use Xenova/nomic-embed-text-v1 unless you have specific needs.
Optimize for Use Case:
- Speed priority: Xenova/all-MiniLM-L6-v2
- Quality priority: Xenova/bge-large-en-v1.5
- API option: text-embedding-3-small
Test First:
Test new model on small dataset before full migration.
Backup Always:
Always backup database before migration.
Monitor Quality:
Compare search quality before and after migration.
Use for Scale:
API models better for very large databases (100k+ memories).
Monitor Costs:
Track API usage and costs regularly.
Rate Limits:
Be aware of provider rate limits.
Check the model name:
```json
{
  "embeddingModel": "Xenova/nomic-embed-text-v1"
}
```
Clear the model cache:
```shell
rm -rf ~/.cache/huggingface
```
Verify API credentials:
```shell
curl https://api.openai.com/v1/models \
  -H "Authorization: Bearer sk-..."
```
Check the API endpoint:
```json
{
  "embeddingApiUrl": "https://api.openai.com/v1"
}
```
Run a migration:
```
POST /api/migrate
{
  "newModel": "Xenova/all-MiniLM-L6-v2",
  "newDimensions": 384
}
```
- Configuration Guide - All configuration options
- Database Architecture - Vector storage details
- Performance Tuning - Optimization strategies