Commit 36a2b0c

add blog post for configuring RAG LLM pattern
1 parent a1dfaeb commit 36a2b0c

2 files changed, 98 insertions(+), 7 deletions(-)

content/blog/2025-06-03-rag-llm-azure.adoc

Lines changed: 0 additions & 7 deletions
@@ -14,13 +14,6 @@
 :toc:
 :imagesdir: /images

-[IMPORTANT]
-====
-Currently, the Azure SQL (MSSQL) support and related Azure deployment improvements are available only in the branch https://github.com/dminnear-rh/rag-llm-gitops/tree/use-mssql-db[`use-mssql-db`] of my fork.
-
-You must fork or base your deployment off this branch until the changes in https://github.com/validatedpatterns/rag-llm-gitops/pull/66[PR #66] are merged into the main pattern repository.
-====
-
 == Prerequisites

 Before you start, ensure the following:
Lines changed: 98 additions & 0 deletions
@@ -0,0 +1,98 @@
---
date: 2025-06-10
title: How to Configure the RAG-LLM GitOps Pattern for Your Use Case
summary: Learn how to configure the RAG-LLM GitOps pattern by selecting a vector DB backend, customizing document sources, and choosing embedding and LLM models to match your workload.
author: Drew Minnear
blog_tags:
- patterns
- how-to
- rag-llm-gitops
---
:toc:
:imagesdir: /images

== Overview

The RAG-LLM GitOps pattern provides several configuration options that allow you to tailor your deployment to specific workloads and environments. While nearly every setting exposed in the `values` files is adjustable, in practice, most users will focus on a few core areas:

- Choosing and configuring the RAG database (vector store) backend
- Setting up document sources for populating the RAG DB
- Selecting models for embedding and LLM inference

This post walks through how to configure each of these components using the pattern’s provided Helm chart values.

== Configuring the RAG DB Backend

=== Supported Database Providers

The pattern supports several backend options for the RAG vector database. You can specify which one to use by setting the `global.db.type` field in `values-global.yaml`. Current options include:

- `REDIS`
- `EDB`
- `ELASTIC`
- `MSSQL`
- `AZURESQL`
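
For example, to use the Redis backend you would set the following (a minimal excerpt of `values-global.yaml`; any sibling keys in the file are omitted here):

[source,yaml]
----
global:
  db:
    type: REDIS # one of REDIS, EDB, ELASTIC, MSSQL, AZURESQL
----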

[NOTE]
====
If you're using `AZURESQL`, you must provision the Azure SQL Server instance externally; the pattern does not handle this itself. For more on this setup, refer to our guide on https://validatedpatterns.io/blog/2025-06-03-rag-llm-azure/[RAG-LLM GitOps on Azure].
====

For the other options, the pattern will deploy the necessary database resources during installation.

=== Adding Sources to the RAG DB

You can specify documents to populate your vector DB using the `populateDbJob.repoSources` and `populateDbJob.webSources` fields in `charts/all/rag-llm/values.yaml`.

==== Repository Sources

For Git repository sources, provide a list of `repo` entries with associated glob patterns to select which files to include:

[source,yaml]
----
repoSources:
  - repo: https://github.com/RHEcosystemAppEng/llm-on-openshift.git
    globs:
      - examples/notebooks/langchain/rhods-doc/*.pdf
      - "**/*.txt" # quoted so YAML doesn't parse the leading * as an alias
----

[IMPORTANT]
====
While you *can* include all files with a glob like +**/*+, it's typically better to restrict to file types suited for semantic search (e.g., `.pdf`, `.md`, `.txt`, `.adoc`). Including binaries, source code, or image files adds noise and degrades retrieval quality.
====

==== Web Sources

For web pages, use the `webSources` list to define target URLs:

[source,yaml]
----
webSources:
  - https://ai-on-openshift.io/getting-started/openshift/
  - https://ai-on-openshift.io/getting-started/opendatahub/
----

The contents of these URLs will be fetched and embedded as text documents. PDF URLs are automatically processed using the same logic as Git-sourced PDFs.

== Configuring the Embedding and LLM Models

The models used for embeddings and LLM inference are defined in `values-global.yaml` under:

- `global.model.vllm` – specifies the LLM used by the vLLM inference service
- `global.model.embedding` – specifies the embedding model used for indexing and retrieval

These should be HuggingFace-compatible model names. Be sure to accept any model license terms on HuggingFace prior to use.
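
For illustration, the relevant excerpt might look like the following; the model IDs shown here are placeholders, not the pattern's defaults:

[source,yaml]
----
global:
  model:
    # Hypothetical HuggingFace model IDs; substitute the models you intend to serve.
    vllm: mistralai/Mistral-7B-Instruct-v0.2
    embedding: sentence-transformers/all-MiniLM-L6-v2
----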

[NOTE]
====
Deployments targeting environments like Azure may need to adjust the model choice and serving parameters. For example, the default IBM Granite model requires 24 GiB of VRAM, more than most Azure GPUs provide. See `overrides/values-Azure.yaml` for a working example using an AWQ-quantized model that fits in 16 GiB.
====

You may also need to tweak runtime arguments via `vllmServingRuntime.args` when using quantized or fine-tuned models.
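
As a sketch only, such an override might look like this; the exact flags depend on your model, and `vllmServingRuntime.args` is the only field below taken from the pattern's values:

[source,yaml]
----
vllmServingRuntime:
  args:
    - --quantization=awq   # tell vLLM the weights are AWQ-quantized
    - --max-model-len=4096 # cap context length to reduce GPU memory usage
----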

== Summary

The RAG-LLM GitOps pattern is designed to be flexible, but most use cases require tuning only a handful of key values. Whether adjusting the backend DB, tweaking your data sources, or selecting compatible models, the pattern offers the configuration hooks you need to adapt it to your workload.

For more configuration examples and deployment tips, stay tuned to the https://validatedpatterns.io/blog/[Validated Patterns blog].
