NumaFlow is a Kubernetes-native platform for running massively parallel data processing and streaming jobs.
A NumaFlow Pipeline is implemented as a Kubernetes custom resource, and consists of one or more source, data processing, and sink vertices.
NumaFlow installs in less than a minute and is easier and cheaper for simple data processing applications than full-featured stream processing platforms.
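To make the pipeline structure concrete, here is a minimal Python sketch that builds a dict shaped like a Pipeline custom resource with one source, one processing, and one sink vertex. The `apiVersion` string, the field names (`vertices`, `edges`, `source`, `udf`, `sink`), and the vertex settings are assumptions for illustration and should be checked against the actual Pipeline spec. The helpers `build_pipeline`, `vertex_kind`, and `dangling_edges` are hypothetical and not part of NumaFlow.

```python
import json


def build_pipeline(name):
    """Return a dict shaped like a Pipeline custom resource.

    All field names and values below are assumptions for illustration.
    """
    return {
        "apiVersion": "numaflow.numaproj.io/v1alpha1",  # assumed API group/version
        "kind": "Pipeline",
        "metadata": {"name": name},
        "spec": {
            "vertices": [
                # Source vertex: produces data (assumed built-in generator).
                {"name": "in", "source": {"generator": {"rpu": 5, "duration": "1s"}}},
                # Data processing vertex (assumed built-in pass-through function).
                {"name": "cat", "udf": {"builtin": {"name": "cat"}}},
                # Sink vertex: consumes data (assumed built-in log sink).
                {"name": "out", "sink": {"log": {}}},
            ],
            # Edges connect vertices by name.
            "edges": [
                {"from": "in", "to": "cat"},
                {"from": "cat", "to": "out"},
            ],
        },
    }


def vertex_kind(vertex):
    """Classify a vertex as 'source', 'udf', or 'sink' by which key it carries."""
    for kind in ("source", "udf", "sink"):
        if kind in vertex:
            return kind
    raise ValueError(f"vertex {vertex.get('name')!r} has no recognized kind")


def dangling_edges(pipeline):
    """Return the edges that reference a vertex name not defined in the pipeline."""
    names = {v["name"] for v in pipeline["spec"]["vertices"]}
    return [
        e for e in pipeline["spec"]["edges"]
        if e["from"] not in names or e["to"] not in names
    ]


if __name__ == "__main__":
    pipeline = build_pipeline("simple-pipeline")
    print(json.dumps(pipeline, indent=2))
```

Serialized as YAML or JSON, a manifest of this shape is the kind of thing one would apply to a cluster with `kubectl apply -f`, the same way as any other Kubernetes resource.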
- Kubernetes-native: If you know Kubernetes, you already know 90% of what you need to use NumaFlow.
- Language agnostic: Use your favorite programming language.
- Exactly-Once semantics: No input element is duplicated or lost even as pods are rescheduled or restarted.
- Auto-scaling with back-pressure: Each vertex automatically scales from zero to whatever is needed.
- Data aggregation (e.g. group-by)
- Check out QUICK START to try it out.
- Take a look at DEVELOPMENT to set up a development environment.
- Refer to CONTRIBUTING to contribute to the project.