NumaFlow

Summary

NumaFlow is a Kubernetes-native platform for running massively parallel data processing and streaming jobs.

A NumaFlow Pipeline is implemented as a Kubernetes custom resource, and consists of one or more source, data processing, and sink vertices.
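
As a rough illustration of that shape, the sketch below models a pipeline as a list of vertices in plain Python. It is not the NumaFlow API or the custom resource schema: the `Vertex` and `Pipeline` classes, the `kind` labels, and the `run` method are all hypothetical names chosen for this example.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, List


@dataclass
class Vertex:
    """Hypothetical stand-in for a pipeline vertex (not a NumaFlow type)."""
    name: str
    kind: str  # illustrative labels: "source", "process", or "sink"
    fn: Callable[..., Any]


@dataclass
class Pipeline:
    """Hypothetical in-memory pipeline; the real one is a Kubernetes custom resource."""
    vertices: List[Vertex] = field(default_factory=list)

    def run(self) -> None:
        sources = [v for v in self.vertices if v.kind == "source"]
        processors = [v for v in self.vertices if v.kind == "process"]
        sinks = [v for v in self.vertices if v.kind == "sink"]
        for src in sources:
            for item in src.fn():          # a source produces items
                for proc in processors:    # each processing vertex transforms the item
                    item = proc.fn(item)
                for sink in sinks:         # every sink receives the final item
                    sink.fn(item)


# Usage: one source, one processing vertex, one sink.
collected = []
pipeline = Pipeline([
    Vertex("in", "source", lambda: range(3)),
    Vertex("double", "process", lambda x: x * 2),
    Vertex("out", "sink", collected.append),
])
pipeline.run()
```

In NumaFlow itself each vertex runs in its own pods rather than as a function call in one process; the sketch only shows how data moves from source through processing to sink.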

NumaFlow installs in less than a minute and is easier and cheaper for simple data processing applications than full-featured stream processing platforms.

Key Features

Kubernetes-native: If you know Kubernetes, you already know 90% of what you need to use NumaFlow.
Language agnostic: Use your favorite programming language.
Exactly-once semantics: No input element is duplicated or lost even as pods are rescheduled or restarted.
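
This README does not say how NumaFlow achieves exactly-once semantics. One common way to get that effect on top of at-least-once delivery is to deduplicate by message ID, so a message redelivered after a restart is not applied twice. The sketch below shows that general technique only; `DedupingProcessor` and its in-memory `seen` set are assumptions made for illustration, and a real system would keep that state durably.

```python
from typing import Any, Callable, Hashable, Set


class DedupingProcessor:
    """Applies `handler` at most once per message ID (illustrative, not NumaFlow code)."""

    def __init__(self, handler: Callable[[Any], None]) -> None:
        self.handler = handler
        self.seen: Set[Hashable] = set()  # assumption: in-memory; would be durable in practice

    def process(self, msg_id: Hashable, payload: Any) -> bool:
        """Return True if the payload was applied, False if it was a duplicate."""
        if msg_id in self.seen:
            return False
        self.handler(payload)
        self.seen.add(msg_id)
        return True


# Usage: messages 2 and 1 are redelivered, as might happen after a pod restart.
results = []
processor = DedupingProcessor(results.append)
deliveries = [(1, "a"), (2, "b"), (2, "b"), (3, "c"), (1, "a")]
applied = [processor.process(msg_id, payload) for msg_id, payload in deliveries]
```

After the run, `results` holds each payload once, in first-delivery order, and `applied` records which deliveries were skipped as duplicates.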

Roadmap

Auto-scaling with back-pressure: Each vertex automatically scales from zero to whatever is needed.
Data aggregation (e.g. group-by)

Resources

Check out QUICK START to try it out.
Take a look at DEVELOPMENT to set up a development environment.
Refer to CONTRIBUTING to contribute to the project.
