### 3. Run the server using Docker Compose with custom parameters
To override default settings, you can provide additional parameters when starting the server. This is a more advanced approach:
@@ -193,7 +159,41 @@ cd vllm-gaudi/.cd/
> [!NOTE]
> When using configuration files, you do not need to set the `MODEL` environment variable, as the model name is specified within the configuration file. However, you must still provide your `HF_TOKEN`.
-### 7. Running the Server Directly with Docker
+### 7. Advanced Options: Pinning CPU Cores for Memory Access Coherence
+
+To improve memory access coherence and free CPUs for other CPU-only workloads, such as vLLM serving with Llama3 8B,
+pin the CPU cores according to their NUMA nodes by using an auto-generated `docker-compose.override.yml` file.
+Validated Xeon processors so far: Intel Xeon 6960P and Intel Xeon Platinum 8568Y+.
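To make the idea of NUMA-based pinning concrete, here is a minimal sketch of how such an override file could be generated. It is not the script shipped with this change: the service name `vllm-server`, the helper names, and the choice of pinning to the lowest-numbered node are all hypothetical. It assumes a Linux host that exposes NUMA topology under `/sys/devices/system/node` and relies on the Compose `cpuset` service attribute.

```python
from pathlib import Path


def parse_cpulist(text: str) -> list[int]:
    """Expand a Linux cpulist string such as "0-3,8-11" into CPU ids."""
    cpus: list[int] = []
    for part in text.strip().split(","):
        if not part:
            continue
        if "-" in part:
            start, end = part.split("-")
            cpus.extend(range(int(start), int(end) + 1))
        else:
            cpus.append(int(part))
    return cpus


def read_numa_nodes(sysfs_root: str = "/sys/devices/system/node") -> dict[int, list[int]]:
    """Map each NUMA node id to its CPU ids, as reported by Linux sysfs."""
    nodes: dict[int, list[int]] = {}
    for cpulist_file in Path(sysfs_root).glob("node[0-9]*/cpulist"):
        node_id = int(cpulist_file.parent.name.removeprefix("node"))
        nodes[node_id] = parse_cpulist(cpulist_file.read_text())
    return nodes


def build_override(service: str, cpus: list[int]) -> str:
    """Render a minimal Compose override that pins one service to the given CPUs."""
    cpuset = ",".join(str(c) for c in sorted(cpus))
    return (
        "services:\n"
        f"  {service}:\n"
        f'    cpuset: "{cpuset}"\n'
    )


if __name__ == "__main__":
    nodes = read_numa_nodes()
    if nodes:
        # Hypothetical policy: pin the server to the CPUs of the lowest-numbered node.
        first_node = min(nodes)
        # "vllm-server" is an assumed service name; use the one from your compose file.
        print(build_override("vllm-server", nodes[first_node]))
```

Redirecting the printed output to `docker-compose.override.yml` next to the main compose file would let Docker Compose pick it up automatically; the actual generated file in this repository may set additional keys.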
+
+A few Python libraries are needed for the Python scripts, so install the required packages using the following command.