* update .env.example
* fix CLI name and add python 3.9 to classifiers
* add CLI name
* Update README (WIP)
* Update README
* Add image
* Correctly use cwd for loading .env file (#67)
* Add gifs
* Minor updates
* Add initial paragraph about hayhooks
* Update README.md
Co-authored-by: Bilge Yücel <[email protected]>
* Update deployment guidelines
* Add explanation of why we need a pipeline wrapper
* Update table of contents
* Add support for short and long option names (#68)
* Add support for short and long option names
* Removed unused import
* Update README.md
Co-authored-by: Stefano Fiorucci <[email protected]>
* Add a section to the former way of pipeline deployment
---------
Co-authored-by: Bilge Yücel <[email protected]>
Co-authored-by: Stefano Fiorucci <[email protected]>
`docs/deployment_guidelines.md` (+18 −13)
```diff
@@ -7,9 +7,10 @@ Following are some guidelines about deploying and running Haystack pipelines.
 
 ## TL;DR
 
-- Use a single worker environment if you have mainly I/O operations in your pipeline and/or a low number of concurrent requests.
-- Use a multi-worker environment if you have mainly CPU-bound operations in your pipeline and/or a high number of concurrent requests.
+- Use a **single worker environment** if you have mainly I/O operations in your pipeline and/or a low number of concurrent requests.
+- Use a **multi-worker environment** if you have mainly CPU-bound operations in your pipeline and/or a high number of concurrent requests.
 - In any case, use `HAYHOOKS_PIPELINES_DIR` to share pipeline definitions across workers (if possible).
+- You can pass [any additional supported `uvicorn` environment variable](https://www.uvicorn.org/settings) to the `hayhooks run` command (or put them in a `.env` file).
 
 ## Single worker environment
```
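The TL;DR above mentions putting `uvicorn` settings in a `.env` file. A minimal sketch of what that could look like — `HAYHOOKS_PIPELINES_DIR` comes from these docs, while `UVICORN_LOG_LEVEL` assumes uvicorn's `UVICORN_`-prefixed environment-variable settings, and the pipelines path is a placeholder:

```shell
# Write an illustrative .env into the current working directory, which is
# where hayhooks loads it from (see "Correctly use cwd for loading .env file").
# The directory path is a placeholder; UVICORN_LOG_LEVEL assumes uvicorn's
# UVICORN_-prefixed settings convention.
cat > .env <<'EOF'
HAYHOOKS_PIPELINES_DIR=/shared/pipelines
UVICORN_LOG_LEVEL=debug
EOF

# Show the resulting file
cat .env
```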
````diff
@@ -26,31 +27,35 @@ command (or having a single Docker container running). This will launch a **sing
 You can deploy a pipeline using:
 
 ```bash
-hayhooks deploy
+hayhooks deploy-files  # recommended
+
+# or
+
+hayhooks deploy ...
 ```
 
-command or do a `POST /deploy` request.
+or make `POST /deploy` / `POST /deploy-files` requests.
 
 ### Handling concurrent requests (single worker)
 
-The `run()` method of the pipeline instance is synchronous code, and it's executed using `run_in_threadpool` to avoid blocking the main async event loop.
+The `run()` method of the pipeline instance is _synchronous_ code, and it's executed using `run_in_threadpool` to avoid blocking the main async event loop.
 
 - If your pipeline is doing **mainly I/O operations** (like making HTTP requests, reading/writing files, etc.), the single worker should be able to handle concurrent requests.
 - If your pipeline is doing **mainly CPU-bound operations** (like computing embeddings), the GIL (Global Interpreter Lock) will prevent the worker from handling concurrent requests, so they will be queued.
 
 ## Multiple workers environment
 
-### Single instance with multiple workers
+### Using `uvicorn` with multiple workers
 
-Currently, the `hayhooks run` command does not support multiple `uvicorn` workers. However, you can run multiple instances of the application directly with the `uvicorn` command or with the [FastAPI CLI](https://fastapi.tiangolo.com/fastapi-cli/#fastapi-run) `fastapi run` command.
+Hayhooks supports multiple `uvicorn` workers on a single instance: use the `hayhooks run` command with the `--workers` flag to start the application with the desired number of workers.
 
 For example, if you have enough cores to run 4 workers, you can use the following command:
 
 ```bash
-fastapi run src/hayhooks/server/app.py --workers 4
+hayhooks run --workers 4
 ```
 
-This vertical scaling approach allows you to handle more concurrent requests (depending on available resources).
+This vertical scaling approach allows you to handle more concurrent requests (depending on the environment's available resources).
 
 ### Multiple single-worker instances behind a load balancer
````
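The threadpool behavior described in this hunk can be sketched with plain `asyncio`. This is not Hayhooks code: `asyncio.to_thread` stands in for Starlette's `run_in_threadpool`, and `run_pipeline_sync` is a hypothetical I/O-bound pipeline `run()` (a sleep stands in for an HTTP call):

```python
import asyncio
import time

def run_pipeline_sync() -> str:
    """Stand-in for a pipeline's synchronous run(): mostly I/O waiting."""
    time.sleep(0.2)  # pretend this is an HTTP request or file read
    return "done"

async def handle_requests(n: int) -> float:
    """Serve n 'requests' concurrently off the event loop; return elapsed seconds.

    asyncio.to_thread plays the role of run_in_threadpool: the sync work
    runs in worker threads, so the event loop is never blocked.
    """
    start = time.perf_counter()
    await asyncio.gather(*(asyncio.to_thread(run_pipeline_sync) for _ in range(n)))
    return time.perf_counter() - start

if __name__ == "__main__":
    elapsed = asyncio.run(handle_requests(5))
    # Five 0.2 s I/O waits overlap in threads, so the total stays near
    # 0.2 s rather than 1 s. A CPU-bound run() would not overlap this way,
    # because the GIL serializes the threads.
    print(f"5 concurrent requests took {elapsed:.2f}s")
```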
```diff
@@ -60,12 +65,12 @@ This horizontal scaling approach allows you to handle more concurrent requests.
 
 ### Pipeline deployment (multiple workers)
 
-In both the above scenarios, **it's NOT recommended** to deploy a pipeline using the `hayhooks deploy` command (or `POST /deploy` request) as it will deploy the pipeline only on one of the workers, which is not ideal.
+In both the above scenarios, **it's NOT recommended** to deploy a pipeline using Hayhooks CLI commands (or corresponding API requests) as **it will deploy the pipeline only on one of the workers**, which is not ideal.
 
-Instead, you want to provide the env var `HAYHOOKS_PIPELINES_DIR` pointing to a shared folder where all the workers can read the pipeline definitions at startup and load them. This way, all the workers will have the same pipelines available and there will be no issues when calling the API to run a pipeline.
+Instead, set the environment variable `HAYHOOKS_PIPELINES_DIR` to point to a shared directory accessible by all workers. When Hayhooks starts up, each worker will load pipeline definitions from this shared location, ensuring consistent pipeline availability across all workers when handling API requests.
 
-When having multiple workers and pipelines deployed using `HAYHOOKS_PIPELINES_DIR`, you will be able to handle concurrent requests as each worker will be able to run a pipeline independently. This should be enough to make your application scalable, according to your needs.
+With multiple workers and pipelines deployed via `HAYHOOKS_PIPELINES_DIR`, you will be able to handle concurrent requests, as each worker should be able to run a pipeline independently. This may be enough to make your application scalable, according to your needs.
 
-Note that even in a multiple-workers environment the individual single workers will have the same GIL limitation discussed above, so if your pipeline is mainly CPU-bound, you will need to scale horizontally according to your needs.
+Note that even in a multiple-workers environment, each individual worker will have the same GIL limitation discussed above, so if your pipeline is mainly CPU-bound, you will need to scale horizontally according to your needs.
```
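The shared-directory startup behavior this hunk describes can be sketched as follows. `discover_pipelines` is a hypothetical helper, not Hayhooks' actual loader; the point is only that every worker reading the same `HAYHOOKS_PIPELINES_DIR` sees the same definitions:

```python
import os
import tempfile
from pathlib import Path

def discover_pipelines(pipelines_dir: str) -> list[str]:
    """Return pipeline definition names found in the shared directory.

    Illustrative only: mirrors the idea that each worker scans
    HAYHOOKS_PIPELINES_DIR at startup. The .yml extension is an assumption
    for this sketch, not a documented Hayhooks contract.
    """
    return sorted(p.stem for p in Path(pipelines_dir).glob("*.yml"))

if __name__ == "__main__":
    # Simulate a shared directory that several workers would all read
    with tempfile.TemporaryDirectory() as shared:
        for name in ("chat.yml", "rag.yml"):
            Path(shared, name).write_text("components: {}\n")
        os.environ["HAYHOOKS_PIPELINES_DIR"] = shared
        # Every worker running this discovery sees the same set
        print(discover_pipelines(os.environ["HAYHOOKS_PIPELINES_DIR"]))  # prints ['chat', 'rag']
```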