Kubernetes-native platform to run massively parallel data/streaming jobs

malinamfong/numaflow

 
 

NumaFlow

Summary

NumaFlow is a Kubernetes-native platform for running massively parallel data processing and streaming jobs.

A NumaFlow Pipeline is implemented as a Kubernetes custom resource, and consists of one or more source, data processing, and sink vertices.
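To make the vertex structure concrete, here is a minimal sketch in Go of a pipeline as a set of named vertices connected by edges. The type names (`Pipeline`, `Vertex`, `Edge`, `VertexKind`) and the validation rules are hypothetical and chosen for illustration only; they are not NumaFlow's actual custom resource schema. The sketch assumes a valid pipeline needs at least one source and one sink.

```go
package main

import (
	"errors"
	"fmt"
)

// VertexKind is a hypothetical enum for the three vertex roles
// mentioned in the summary; it is not part of NumaFlow's API.
type VertexKind string

const (
	Source    VertexKind = "source"
	Processor VertexKind = "processor"
	Sink      VertexKind = "sink"
)

// Vertex is one step of the pipeline (hypothetical type).
type Vertex struct {
	Name string
	Kind VertexKind
}

// Edge connects two vertices by name (hypothetical type).
type Edge struct {
	From, To string
}

// Pipeline groups vertices and the edges between them (hypothetical type).
type Pipeline struct {
	Name     string
	Vertices []Vertex
	Edges    []Edge
}

// Validate checks that vertex names are unique, that every edge refers
// to a declared vertex, and (an assumption of this sketch) that the
// pipeline has at least one source and one sink.
func (p Pipeline) Validate() error {
	kinds := make(map[string]VertexKind, len(p.Vertices))
	var hasSource, hasSink bool
	for _, v := range p.Vertices {
		if _, dup := kinds[v.Name]; dup {
			return fmt.Errorf("duplicate vertex %q", v.Name)
		}
		kinds[v.Name] = v.Kind
		hasSource = hasSource || v.Kind == Source
		hasSink = hasSink || v.Kind == Sink
	}
	for _, e := range p.Edges {
		if _, ok := kinds[e.From]; !ok {
			return fmt.Errorf("edge from unknown vertex %q", e.From)
		}
		if _, ok := kinds[e.To]; !ok {
			return fmt.Errorf("edge to unknown vertex %q", e.To)
		}
	}
	if !hasSource {
		return errors.New("pipeline has no source vertex")
	}
	if !hasSink {
		return errors.New("pipeline has no sink vertex")
	}
	return nil
}

func main() {
	p := Pipeline{
		Name: "demo",
		Vertices: []Vertex{
			{Name: "in", Kind: Source},
			{Name: "transform", Kind: Processor},
			{Name: "out", Kind: Sink},
		},
		Edges: []Edge{{From: "in", To: "transform"}, {From: "transform", To: "out"}},
	}
	if err := p.Validate(); err != nil {
		fmt.Println("invalid pipeline:", err)
		return
	}
	fmt.Println("pipeline is valid")
}
```

In a real deployment the same shape would be declared in a Kubernetes manifest and reconciled by a controller; the struct form above only illustrates the source, processing, and sink relationship.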

NumaFlow installs in less than a minute and, for simple data processing applications, is easier and cheaper to run than a full-featured stream processing platform.

Key Features

  • Kubernetes-native: If you know Kubernetes, you already know 90% of what you need to use NumaFlow.
  • Language agnostic: Use your favorite programming language.
  • Exactly-Once semantics: No input element is duplicated or lost even as pods are rescheduled or restarted.
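
As an illustration of the exactly-once idea in the last bullet, the sketch below shows one common general technique: deduplicating redelivered messages by ID so that each input takes effect only once. This is not a description of how NumaFlow implements the guarantee. The `Message` and `Deduper` types are hypothetical, and the in-memory map is an assumption made for brevity; state that must survive pod restarts would need durable storage.

```go
package main

import "fmt"

// Message is a hypothetical input element with a unique ID.
type Message struct {
	ID      string
	Payload string
}

// Deduper remembers which message IDs have already been handled.
// The map lives in memory only, which is an assumption of this sketch.
type Deduper struct {
	seen map[string]struct{}
}

// NewDeduper returns a Deduper with an empty set of seen IDs.
func NewDeduper() *Deduper {
	return &Deduper{seen: make(map[string]struct{})}
}

// Process calls handle for a message the first time its ID is seen and
// skips it on redelivery. It reports whether handle was called.
func (d *Deduper) Process(m Message, handle func(Message)) bool {
	if _, done := d.seen[m.ID]; done {
		return false
	}
	handle(m)
	d.seen[m.ID] = struct{}{}
	return true
}

func main() {
	d := NewDeduper()
	handled := 0
	count := func(Message) { handled++ }

	// The second delivery of "a" simulates a retry after a restart.
	for _, m := range []Message{{ID: "a", Payload: "x"}, {ID: "a", Payload: "x"}, {ID: "b", Payload: "y"}} {
		d.Process(m, count)
	}
	fmt.Println("handled:", handled) // prints "handled: 2"
}
```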

Roadmap

  • Auto-scaling with back-pressure: Each vertex automatically scales from zero to whatever is needed.
  • Data aggregation (e.g. group-by)

Resources
