Add documentation for those looking to eventually run in a production environment #274
Labels: documentation, enhancement, opex
Summary
Questions I have that I don't see covered in the docs yet. If some of this is TBD, that would also be useful to know.
When will Numaflow be considered production-ready (if it isn't already), and is there a timeline for this? Is the plan to indicate this with a 1.0.0 release or a note in the docs?
For ops teams that have experience managing Kafka clusters in production but no experience (yet) managing JetStream clusters, what is the intended path to running a Numaflow isbsvc in production (i.e. zero-downtime version upgrades, disk space management, monitoring, alerting, etc.)? Maybe one of the following? The options use `jetstream`, but apply generally to any non-Kafka isb:
(a) Numaflow's goal is to fully manage JetStream. The ops team is only expected to interact with JetStream through Numaflow (i.e. isbsvc k8s resources).
(b) Numaflow will manage some concerns of JetStream, but the ops team is responsible for others. If so, understanding the separation of concerns that Numaflow is eventually targeting would be useful.
(c) In production, the ops team should become familiar with managing dedicated JetStream clusters and reference one using an `external:` section in the isbsvc CRD (to be added, similar to the `external:` section that already exists for redis).
(d) In production, the ops team can continue managing existing Kafka clusters and reference one using an `external:` section on a new isbsvc kafka type (to be added, similar to the existing jetstream/redis types). Presumably the ops team would provide a dedicated Kafka cluster for Numaflow's use, or maybe use ACLs on an existing cluster to give Numaflow permission to create/delete/update topics under a prefix.
Use Cases
Anyone interested in running Numaflow in a production environment who needs to understand the path to get there.
Message from the maintainers:
If you wish to see this enhancement implemented please add a 👍 reaction to this issue! We often sort issues this way to know what to prioritize.