Commit a63677a

Merge pull request #572 from dminnear-rh/blog/rag-llm-on-azure
add blog post about deploying rag-llm-gitops pattern on Azure with Azure SQL Server
2 parents 6fd5883 + e603d90 commit a63677a

1 file changed, with 148 additions and 0 deletions

@@ -0,0 +1,148 @@
---
date: 2025-06-03
title: Deploying the RAG-LLM GitOps Pattern on Azure
summary: How to deploy the RAG-LLM GitOps Validated Pattern on an ARO cluster
author: Drew Minnear
blog_tags:
- patterns
- how-to
- Azure
- Azure SQL Server
- ARO
- rag-llm-gitops
---
:toc:
:imagesdir: /images

[IMPORTANT]
====
Currently, Azure SQL (MSSQL) support and the related Azure deployment improvements are available only in the https://github.com/dminnear-rh/rag-llm-gitops/tree/use-mssql-db[`use-mssql-db`] branch of my fork.

You must fork this branch or base your deployment on it until the changes in https://github.com/validatedpatterns/rag-llm-gitops/pull/66[PR #66] are merged into the main pattern repository.
====

== Prerequisites

Before you start, ensure the following:

* You are logged in to an existing ARO cluster.
* Your Azure subscription has sufficient quota for GPU instances (default: `Standard_NC8as_T4_v3`, which requires a quota of at least 8 vCPUs).
* You've created a token on https://huggingface.co[Hugging Face] and accepted the terms of the model you'll deploy. By default, the pattern uses the https://huggingface.co/solidrust/Mistral-7B-Instruct-v0.3-AWQ[Mistral-7B-Instruct-v0.3-AWQ] model.

TIP: Model and database defaults are defined in `overrides/values-Azure.yaml`. You can override them by editing this file.

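The first two prerequisites can be spot-checked from a terminal. The following is a minimal sketch, assuming the `oc` and `az` CLIs are installed and authenticated. The function names and the default `eastus` region are illustrative and not part of the pattern.

```shell
# Hypothetical helpers; not part of the pattern repository.

# Print the current user and API server, or fail if there is no active login.
check_cluster_login() {
  oc whoami >/dev/null 2>&1 || {
    echo "not logged in: run 'oc login' first" >&2
    return 1
  }
  echo "logged in as $(oc whoami) on $(oc whoami --show-server)"
}

# Show vCPU usage and limits for the T4 VM families in a region.
# The region defaults to eastus, which is only an assumption.
show_t4_quota() {
  local location="${1:-eastus}"
  az vm list-usage --location "$location" --output table | grep -i 't4'
}
```

Call `check_cluster_login` first, then `show_t4_quota <your-region>`, and compare the reported limit against the 8 vCPUs that the default VM size needs.
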
== Database Options

The pattern defaults to using Azure SQL Server. Alternatively, you may deploy a local Redis, PostgreSQL, or Elasticsearch instance within your cluster.

To select your database type, edit `overrides/values-Azure.yaml`:

[source,yaml]
----
global:
  db:
    type: "AZURESQL"  # Options: AZURESQL, REDIS, EDB, ELASTIC
----

WARNING: Choosing Redis, PostgreSQL (EDB), or Elasticsearch (ELASTIC) will deploy local database instances. Ensure your cluster has sufficient resources available.

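If you prefer to script the change instead of editing the file by hand, something like the following could work. It is a sketch under the assumption that the `type:` line in your copy of `overrides/values-Azure.yaml` is formatted as in the snippet above. `set_db_type` is a hypothetical helper, not something the pattern provides.

```shell
# Hypothetical helper (not part of the pattern): switch global.db.type in place.
# Usage: set_db_type REDIS [path/to/values-Azure.yaml]
set_db_type() {
  local new_type="$1" file="${2:-overrides/values-Azure.yaml}" tmp status

  # Accept only the options listed in the values file comment.
  case "$new_type" in
    AZURESQL|REDIS|EDB|ELASTIC) ;;
    *) echo "unsupported database type: $new_type" >&2; return 1 ;;
  esac

  # Assumes a line of the form: type: "AZURESQL"
  # Indentation and any trailing comment are preserved.
  tmp="$(mktemp)" || return 1
  sed -E "s/^([[:space:]]*type:[[:space:]]*)\"(AZURESQL|REDIS|EDB|ELASTIC)\"/\\1\"${new_type}\"/" "$file" > "$tmp" &&
    cat "$tmp" > "$file"
  status=$?
  rm -f "$tmp"
  return "$status"
}
```

Review the result with `git diff overrides/values-Azure.yaml` before installing, since the substitution touches every line that matches the assumed format.
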
== Deploying Azure SQL Server (Optional)

Follow these steps if you plan to use Azure SQL Server:

. Navigate to the Azure portal and create a new SQL Database server.
. Select `Use SQL authentication`.
. Record your `Server name`, `Server admin login`, and `Password` (these will be needed later).
. On the *Networking* tab, set `Allow Azure services and resources to access this server` to `Yes`.
. Click *Review + create*, and then *Create*.

Wait until the server status shows as active before proceeding.

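The portal steps above have a rough Azure CLI equivalent, sketched below for readers who script their infrastructure. It assumes an authenticated `az` session and an existing resource group. `create_azure_sql_server` and its argument order are hypothetical, and the post itself only documents the portal flow.

```shell
# Hypothetical helper: approximate CLI equivalent of the portal steps.
# Arguments: resource group, server name, region, admin login, admin password.
create_azure_sql_server() {
  local rg="$1" server="$2" location="$3" admin="$4" password="$5"

  # Create the logical SQL server with SQL authentication.
  az sql server create \
    --resource-group "$rg" --name "$server" --location "$location" \
    --admin-user "$admin" --admin-password "$password" \
    --output none || return 1

  # A rule from 0.0.0.0 to 0.0.0.0 is how the CLI expresses
  # "Allow Azure services and resources to access this server".
  az sql server firewall-rule create \
    --resource-group "$rg" --server "$server" --name AllowAzureServices \
    --start-ip-address 0.0.0.0 --end-ip-address 0.0.0.0 \
    --output none || return 1

  # Print the fully qualified name, which is the value for the `server` secret.
  az sql server show \
    --resource-group "$rg" --name "$server" \
    --query fullyQualifiedDomainName --output tsv
}
```

Passing the password as an argument leaves it in your shell history, so consider reading it from an environment variable or a prompt instead.
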
== Creating Required Secrets

Before installation, create a secrets YAML file at `~/values-secret-rag-llm-gitops.yaml`. Populate it as follows:

[source,yaml]
----
version: "2.0"

secrets:
  - name: hfmodel
    fields:
      - name: hftoken
        value: hf_your_huggingface_token
  - name: azuresql
    fields:
      - name: user
        value: adminuser
      - name: password
        value: your_password
      - name: server
        value: yourservername.database.windows.net
----

Replace these placeholders with your actual credentials:

* `hftoken`: Your Hugging Face token (you must accept the model's terms).
* `user`: Azure SQL server admin username.
* `password`: Azure SQL admin password.
* `server`: Fully qualified Azure SQL server name.

TIP: If you're not using Azure SQL Server, omit the entire `azuresql` section.

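Because this file holds credentials in plain text, it is worth creating it with owner-only permissions. The sketch below generates the same structure from arguments. `write_secrets_file` is a hypothetical helper, and it assumes none of the values contain double quotes or backslashes, which would need YAML escaping.

```shell
# Hypothetical helper: write the secrets file readable only by its owner.
# Arguments: HF token, SQL admin user, SQL admin password, SQL server FQDN,
#            and an optional output path.
write_secrets_file() {
  local token="$1" user="$2" password="$3" server="$4"
  local out="${5:-$HOME/values-secret-rag-llm-gitops.yaml}"

  (
    # Restrictive umask so a newly created file is never group/world readable.
    umask 077
    cat > "$out" <<EOF
version: "2.0"

secrets:
  - name: hfmodel
    fields:
      - name: hftoken
        value: "${token}"
  - name: azuresql
    fields:
      - name: user
        value: "${user}"
      - name: password
        value: "${password}"
      - name: server
        value: "${server}"
EOF
  ) || return 1

  # Also tighten permissions if the file already existed.
  chmod 600 "$out"
}
```

If you are not using Azure SQL Server, delete the `azuresql` entry from the generated file, as noted in the tip above.
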
== Creating GPU Nodes (MachineSet)

Your cluster requires GPU nodes with a specific taint to host the vLLM inference service:

[source,yaml]
----
- key: odh-notebook
  value: "true"
  effect: NoSchedule
----

=== Creating GPU Nodes Automatically

If no GPU nodes exist, run this command to provision one default GPU node:

[source,shell]
----
./pattern.sh make create-gpu-machineset-azure
----

This creates a single `Standard_NC8as_T4_v3` GPU node.

=== Customizing GPU Node Creation

To control GPU node specifics, provide additional parameters:

[source,shell]
----
./pattern.sh make create-gpu-machineset-azure GPU_REPLICAS=3 OVERRIDE_ZONE=2 GPU_VM_SIZE=Standard_NC16as_T4_v3
----

Parameters available:

* `GPU_REPLICAS`: Number of GPU nodes to provision.
* `OVERRIDE_ZONE`: Availability zone (optional).
* `GPU_VM_SIZE`: Azure VM SKU for GPU nodes.

The script automatically applies the required taint. The NVIDIA GPU Operator installed by the pattern handles CUDA driver installation on the GPU nodes.

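Provisioning a new node can take several minutes. One way to confirm that a GPU node has joined the cluster with the expected taint is sketched below. `list_gpu_tainted_nodes` is a hypothetical helper, and it only checks the taint key `odh-notebook`, not the value or effect.

```shell
# Hypothetical helper: print the names of nodes that carry the odh-notebook taint.
list_gpu_tainted_nodes() {
  # One line per node: <name><TAB><space-separated taint keys>.
  oc get nodes \
    -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.spec.taints[*].key}{"\n"}{end}' |
    awk -F'\t' '$2 ~ /(^| )odh-notebook( |$)/ { print $1 }'
}
```

An empty result means no tainted node is ready yet. In that case, `oc get machinesets -n openshift-machine-api` shows whether the MachineSet has finished scaling.
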
== Installing the Pattern

Ensure you've completed the following steps:

. Logged in to your ARO cluster.
. Created your database (Azure SQL Server) if applicable.
. Prepared the secrets YAML file (`~/values-secret-rag-llm-gitops.yaml`).
. Provisioned GPU nodes with the required taint.

Finally, install the pattern by running:

[source,shell]
----
./pattern.sh make install
----

Your RAG-LLM GitOps Validated Pattern will now deploy to your Azure Red Hat OpenShift cluster.
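
The deployment continues in the background after `make install` returns. Assuming the pattern is rolled out through Argo CD `Application` resources (as the GitOps name suggests, though this post does not spell it out), a sketch like the following lists the applications that have not yet settled. `list_unready_apps` is a hypothetical helper.

```shell
# Hypothetical helper: list Argo CD applications that are not yet Synced and Healthy.
list_unready_apps() {
  # One line per application: <name><TAB><sync status><TAB><health status>.
  oc get applications.argoproj.io --all-namespaces \
    -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.status.sync.status}{"\t"}{.status.health.status}{"\n"}{end}' |
    awk -F'\t' 'NF && ($2 != "Synced" || $3 != "Healthy")'
}
```

Re-run the function until it prints nothing. Large model downloads and GPU driver installation can keep some applications in a progressing state for a while.
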
