Local LLM Stack: Your Complete AI Workbench

A comprehensive Docker-based environment for working with Large Language Models locally

Local LLM Stack Banner

Introduction

In today's rapidly evolving AI landscape, having a reliable local environment for experimenting with Large Language Models (LLMs) has become essential for developers, researchers, and AI enthusiasts. The Local LLM Stack provides exactly that: a complete ecosystem for document management, workflow automation, vector search, and LLM inference, all running locally on your machine.

Inspired by setups like Peter Nhan's Ollama-AnythingLLM integration, this stack takes the concept further by incorporating additional services and enhanced configuration options to create a more robust and versatile environment.

What Makes This Stack Special?

The Local LLM Stack integrates several powerful open-source tools that work together seamlessly:

  • Document processing and RAG capabilities with AnythingLLM
  • AI workflow automation with Flowise
  • Direct LLM interaction through Open WebUI
  • Text generation pipeline with Open WebUI Pipelines
  • Advanced workflow automation with n8n
  • Vector storage with Qdrant
  • Seamless integration with Ollama for LLM inference

Unlike simpler setups, this stack is designed with flexibility in mind, allowing you to use either a locally installed Ollama instance or run Ollama as a containerized service. All components are configured to work together out of the box, with sensible defaults that can be customized to suit your specific needs.

The Services: A Closer Look

AnythingLLM

AnythingLLM serves as your document management and AI interaction platform. It allows you to chat with any document, such as PDFs or Word files, using various LLMs. In our setup, AnythingLLM is configured to use:

  • Qdrant as the vector database for efficient document embeddings
  • Ollama for LLM capabilities
  • Port 3002 for web access (http://localhost:3002)

Flowise

Flowise provides a visual programming interface for creating AI workflows:

  • Build complex AI applications without coding
  • Connect to various AI services and data sources
  • Port 3001 for web access (http://localhost:3001)

Open WebUI

Open WebUI offers a clean interface for direct interaction with AI models:

  • Chat with models hosted on Ollama
  • Manage and switch between different models
  • Port 11500 for web access (http://localhost:11500)

Open WebUI Pipelines

Pipelines provides optimized text generation capabilities:

  • High-performance inference server for LLMs
  • OpenAI-compatible API for easy integration
  • Works seamlessly with Open WebUI
  • Accessible through Open WebUI or directly via API

n8n

n8n is a powerful workflow automation platform:

  • Create automated workflows connecting various services
  • Includes automatic import of workflows and credentials from the backup directory
  • Port 5678 for web access (http://localhost:5678)

Qdrant

Qdrant serves as our vector database for AI applications:

  • Stores and retrieves vector embeddings for semantic search
  • Used by AnythingLLM for document embeddings
  • Port 6333 for API access (http://localhost:6333)
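
If you want to verify that Qdrant is reachable and see which collections AnythingLLM has created, you can query its REST API directly. A minimal check from the host (the collection names you see will depend on your AnythingLLM workspaces):

    curl http://localhost:6333/collections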

PostgreSQL

PostgreSQL provides database services for n8n:

  • Reliable and robust SQL database
  • Port 5432 for database connections

Ollama (Optional)

Ollama is our LLM inference engine:

  • Run models like Llama, Mistral, and others locally
  • Can be run natively (default) or as a containerized service
  • Used by AnythingLLM and can be used by Flowise
  • Port 11434 for API access (http://localhost:11434)
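
With a native Ollama install, models are pulled and tested with the Ollama CLI before the rest of the stack uses them. A quick sanity check, using mistral purely as an example model:

    ollama pull mistral                         # download the model
    ollama run mistral "Say hello in one line"  # quick local test
    curl http://localhost:11434/api/tags        # list installed models via the same API the stack calls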

Getting Started

Prerequisites

Before diving in, make sure you have:

  • Docker and Docker Compose installed on your system
  • Basic understanding of Docker containers and networking
  • Ollama installed locally (or optionally run as a container by uncommenting the Ollama service in docker-compose.yml)

Setup Instructions

  1. Clone the repository:

    git clone https://github.com/dalekurt/local-llm-stack.git
    cd local-llm-stack
  2. Create your environment file:

    cp .env.sample .env
  3. Customize your environment: Open the .env file in your favorite editor and modify the credentials and settings to your liking.

  4. Configure Ollama: By default, the setup is configured to use a locally installed Ollama instance. If you prefer to run Ollama as a container:

    • Uncomment the ollama_storage volume in the volumes section of docker-compose.yml
    • Uncomment the entire ollama service definition
    • Update the OLLAMA_BASE_PATH and EMBEDDING_BASE_PATH in .env to use http://ollama:11434
  5. Create the data directory structure:

    mkdir -p ./data/{n8n,postgres,qdrant,openwebui,flowise,anythingllm,pipelines}
  6. Launch the stack:

    docker-compose up -d
  7. Access your services:

    • AnythingLLM: http://localhost:3002
    • Flowise: http://localhost:3001
    • Open WebUI: http://localhost:11500
    • n8n: http://localhost:5678
    • Qdrant: http://localhost:6333
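
Once the stack is launched, a quick way to confirm that every container came up and to watch the startup logs:

    docker-compose ps        # all services should show an "Up" state
    docker-compose logs -f   # follow logs for all services while testing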

Data Persistence

Your data is persisted in the local ./data directory with separate subdirectories for each service:

  • ./data/n8n: n8n data
  • ./data/postgres: PostgreSQL database
  • ./data/qdrant: Qdrant vector database
  • ./data/openwebui: Open WebUI data
  • ./data/flowise: Flowise data
  • ./data/anythingllm: AnythingLLM data
  • ./data/pipelines: Open WebUI Pipelines data
  • ./data/ollama: Ollama data (when using containerized Ollama)

This structure ensures your data persists across container restarts and makes it easy to backup or transfer your data.
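
For example, a simple way to take a snapshot of everything is to stop the stack, archive the data directory, and start it again (the archive name is just an example):

    docker-compose down
    tar czf local-llm-stack-backup.tar.gz ./data
    docker-compose up -d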

Network Configuration

All services are connected to the ai-network Docker network for internal communication, ensuring they can seamlessly work together while remaining isolated from your host network.
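
To see which containers have joined the network and what internal addresses they received, you can inspect it with Docker (depending on your Compose project name, the network may appear with a project prefix):

    docker network ls | grep ai-network
    docker network inspect ai-network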

Customization Options

AnythingLLM Configuration

AnythingLLM is highly configurable through environment variables. In our setup, it's configured to use:

  • Qdrant as the vector database for document embeddings
  • Ollama for LLM capabilities and embeddings
  • Various settings that can be adjusted in the .env file

Open WebUI Pipelines

To connect Open WebUI with the Pipelines service:

  1. In Open WebUI, go to Settings > Connections > OpenAI API
  2. Set the API URL to http://pipelines:9099
  3. Set the API key to 0p3n-w3bu! (or your custom key if modified)
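
Because the Pipelines service speaks the OpenAI-compatible API, you can also query it without Open WebUI. A minimal sketch, assuming port 9099 is published to the host (otherwise run the same request from a container on ai-network against http://pipelines:9099):

    curl http://localhost:9099/v1/models \
      -H "Authorization: Bearer 0p3n-w3bu!"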

Ollama Configuration

Ollama can be configured in two ways:

  1. Native Installation (Default):

    • Install Ollama directly on your host machine
    • The services are configured to access Ollama via host.docker.internal
    • Requires Ollama to be running on your host (ollama serve)
  2. Containerized Installation:

    • Uncomment the Ollama service in docker-compose.yml
    • Uncomment the ollama_storage volume
    • Update the OLLAMA_BASE_PATH and EMBEDDING_BASE_PATH in .env to http://ollama:11434
    • No need to run Ollama separately on your host
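
The main values that change between the two modes are the Ollama endpoints in .env. A sketch of the relevant lines (variable names as used in this stack's .env.sample; the native default points at the host via host.docker.internal):

    # Native Ollama on the host (default)
    OLLAMA_BASE_PATH=http://host.docker.internal:11434
    EMBEDDING_BASE_PATH=http://host.docker.internal:11434

    # Containerized Ollama
    # OLLAMA_BASE_PATH=http://ollama:11434
    # EMBEDDING_BASE_PATH=http://ollama:11434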

Use Cases

Document Analysis with RAG

Upload documents to AnythingLLM and chat with them using the power of Retrieval-Augmented Generation (RAG). The system will:

  1. Process and embed your documents using Ollama's embedding models
  2. Store the embeddings in Qdrant
  3. Retrieve relevant context when you ask questions
  4. Generate responses using the LLM with the retrieved context

Workflow Automation

Use n8n to create workflows that:

  • Monitor folders for new documents
  • Trigger document processing in AnythingLLM
  • Send notifications when processing is complete
  • Extract and process information from documents

AI Application Development

Use Flowise to build AI applications with a visual interface:

  • Create chatbots with specific knowledge domains
  • Build document processing pipelines
  • Develop custom AI assistants for specific tasks

High-Performance Inference

Use Open WebUI Pipelines for optimized text generation:

  • Faster response times compared to the standard Ollama API
  • OpenAI-compatible API for easy integration with various tools
  • Connect directly from Open WebUI for an enhanced experience

Troubleshooting

If you encounter any issues:

  1. Check the logs:

    docker-compose logs [service-name]
  2. Ensure port availability: Make sure all required ports are available on your host system.

  3. Verify environment variables: Check that the .env file contains all necessary variables.

  4. Ollama status: If using native Ollama, ensure it's running on your host machine:

    ollama serve
  5. Connection issues: If you're having connection issues between containers and your host machine, check that the extra_hosts configuration is correctly set in docker-compose.yml. A quick way to test host connectivity from a container is shown after this list.


  6. Permission issues with data directories: If you encounter permission errors with the data directories, you may need to adjust permissions:

    sudo chmod -R 777 ./data  # Use with caution in production environments
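
For the container-to-host connectivity check mentioned in step 5, you can test whether host.docker.internal resolves and reaches your native Ollama instance from an ad-hoc container (requires a Docker Engine recent enough to support host-gateway):

    docker run --rm --add-host=host.docker.internal:host-gateway \
      curlimages/curl -s http://host.docker.internal:11434/api/tags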

Conclusion

The Local LLM Stack provides a powerful, flexible environment for working with AI and LLMs locally. By combining the best open-source tools available, it offers capabilities that were previously only available through cloud services, all while keeping your data private and under your control.

Whether you're a developer looking to build AI applications, a researcher experimenting with document analysis, or an enthusiast exploring the capabilities of LLMs, this stack provides the foundation you need to get started quickly and efficiently.


