- **CPU:** `$env:ELLM_TARGET_DEVICE='cpu'; pip install -e .[cpu]`
- **CUDA:** `$env:ELLM_TARGET_DEVICE='cuda'; pip install -e .[cuda]`
- **IPEX:** `$env:ELLM_TARGET_DEVICE='ipex'; python setup.py develop`
- **OpenVINO:** `$env:ELLM_TARGET_DEVICE='openvino'; pip install -e .[openvino]`

- **With Web UI**:
  - **DirectML:** `$env:ELLM_TARGET_DEVICE='directml'; pip install -e .[directml,webui]`
  - **CPU:** `$env:ELLM_TARGET_DEVICE='cpu'; pip install -e .[cpu,webui]`
  - **CUDA:** `$env:ELLM_TARGET_DEVICE='cuda'; pip install -e .[cuda,webui]`
  - **IPEX:** `$env:ELLM_TARGET_DEVICE='ipex'; python setup.py develop; pip install -r requirements-webui.txt`
  - **OpenVINO:** `$env:ELLM_TARGET_DEVICE='openvino'; pip install -e .[openvino,webui]`
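
To confirm the install succeeded, you can check that the package is visible to pip and importable (a minimal sanity check; assumes the package is named `embeddedllm`, as referenced later in this README):

```powershell
# Verify the editable install registered with pip and that the package imports
pip show embeddedllm
python -c "import embeddedllm; print('embeddedllm import OK')"
```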

- **Linux**
- **CPU:** `ELLM_TARGET_DEVICE='cpu' pip install -e .[cpu]`
- **CUDA:** `ELLM_TARGET_DEVICE='cuda' pip install -e .[cuda]`
- **IPEX:** `ELLM_TARGET_DEVICE='ipex' python setup.py develop`
- **OpenVINO:** `ELLM_TARGET_DEVICE='openvino' pip install -e .[openvino]`

- **With Web UI**:
  - **DirectML:** `ELLM_TARGET_DEVICE='directml' pip install -e .[directml,webui]`
  - **CPU:** `ELLM_TARGET_DEVICE='cpu' pip install -e .[cpu,webui]`
  - **CUDA:** `ELLM_TARGET_DEVICE='cuda' pip install -e .[cuda,webui]`
  - **IPEX:** `ELLM_TARGET_DEVICE='ipex' python setup.py develop; pip install -r requirements-webui.txt`
  - **OpenVINO:** `ELLM_TARGET_DEVICE='openvino' pip install -e .[openvino,webui]`

### Launch OpenAI API Compatible Server
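
The server is launched with the `ellm_server` entry point; the exact flags are shown in the usage examples later in this README. Once it is running, any OpenAI-compatible client can talk to it. A minimal request sketch, assuming the server listens on port 5555 and serves the model name passed via `--served_model_name`:

```powershell
# Send a chat completion request to the locally hosted OpenAI-compatible endpoint
# (port and model name follow the usage examples in this README)
$body = @{
    model    = 'meta-llama_Meta/Llama-3.1-8B-Instruct'
    messages = @(@{ role = 'user'; content = 'Hello!' })
} | ConvertTo-Json -Depth 5

Invoke-RestMethod -Uri 'http://localhost:5555/v1/chat/completions' `
    -Method Post -ContentType 'application/json' -Body $body
```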

## Compile OpenAI API Compatible Server into Windows Executable

**NOTE:** OpenVINO packaging currently uses `torch==2.4.0`. The compiled executable will fail to run because of a missing dependency, `libomp`; make sure to install `libomp` and copy the `libomp-xxxxxxx.dll` into `C:\Windows\System32`.
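
One way to satisfy the `libomp` dependency (a sketch, not an official fix; it assumes conda is available and uses conda-forge's `llvm-openmp` package, which ships the DLL under `$env:CONDA_PREFIX\Library\bin`):

```powershell
# Hypothetical example: install libomp via conda-forge, then copy the DLL into System32
# (run from an elevated PowerShell prompt)
conda install -c conda-forge llvm-openmp
Copy-Item "$env:CONDA_PREFIX\Library\bin\libomp*.dll" -Destination 'C:\Windows\System32'
```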

1. Install `embeddedllm`.
2. Install PyInstaller: `pip install pyinstaller==6.9.0`.
3. Compile the Windows executable: `pyinstaller .\ellm_api_server.spec`.
4. Find the executable in the `dist\ellm_api_server` directory.
5. Use it like `ellm_server`: `.\ellm_api_server.exe --model_path <path/to/model/weight>`.

_PowerShell/Terminal Usage_:

```powershell
ellm_server --model_path <path/to/model/weight>

# DirectML
ellm_server --model_path 'EmbeddedLLM_Phi-3-mini-4k-instruct-062024-onnx\onnx\directml\Phi-3-mini-4k-instruct-062024-int4' --port 5555

# IPEX-LLM
ellm_server --model_path '.\meta-llama_Meta-Llama-3.1-8B-Instruct\' --backend 'ipex' --device 'xpu' --port 5555 --served_model_name 'meta-llama_Meta/Llama-3.1-8B-Instruct'

# OpenVINO
ellm_server --model_path '.\meta-llama_Meta-Llama-3.1-8B-Instruct\' --backend 'openvino' --device 'gpu' --port 5555 --served_model_name 'meta-llama_Meta/Llama-3.1-8B-Instruct'
```

## Prebuilt OpenAI API Compatible Windows Executable (Alpha)

You can find the prebuilt OpenAI API Compatible Windows Executable on the Releases page.

_PowerShell/Terminal Usage (use it like `ellm_server`)_:

```powershell
# IPEX-LLM
.\ellm_api_server.exe --model_path '.\meta-llama_Meta-Llama-3.1-8B-Instruct\' --backend 'ipex' --device 'xpu' --port 5555 --served_model_name 'meta-llama_Meta/Llama-3.1-8B-Instruct'

# OpenVINO
.\ellm_api_server.exe --model_path '.\meta-llama_Meta-Llama-3.1-8B-Instruct\' --backend 'openvino' --device 'gpu' --port 5555 --served_model_name 'meta-llama_Meta/Llama-3.1-8B-Instruct'
```
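
To verify the executable is serving, you can query the standard OpenAI `/v1/models` endpoint (a quick-check sketch; assumes port 5555 as in the examples above):

```powershell
# List the models exposed by the running server
Invoke-RestMethod -Uri 'http://localhost:5555/v1/models'
```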

## Acknowledgements