The Inference Perf project aims to provide a GenAI inference performance benchmarking tool. It originated in wg-serving and is sponsored by SIG Scalability. See the proposal for more information.
This project is currently in development.
- Set up a virtual environment and install inference-perf

  ```
  pip install .
  ```
- Run the inference-perf CLI with a configuration file

  ```
  inference-perf --config_file config.yml
  ```
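The configuration file describes the benchmark you want to run. The sketch below is illustrative only — the field names are assumptions, not the authoritative schema — so consult the project's bundled example configs for real options.

```yaml
# Hypothetical config.yml sketch — field names below are illustrative
# assumptions, not the project's actual schema; see the example configs
# shipped with the repository for authoritative settings.
api:
  type: completion                  # assumed: which inference API to benchmark
server:
  base_url: http://localhost:8000   # assumed: endpoint of the model server under test
load:
  rate: 10                          # assumed: request rate to generate (req/s)
  duration: 60                      # assumed: benchmark duration in seconds
```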
- See more examples
- Build the container

  ```
  docker build -t inference-perf .
  ```
- Run the container

  ```
  docker run -it --rm -v $(pwd)/config.yml:/workspace/config.yml inference-perf
  ```
Our community meeting is held weekly on Thursdays at 11:30 PDT (Zoom Link, Meeting Notes).
We currently use the #wg-serving Slack channel for communications.
Contributions are welcome; thanks for joining us!
Participation in the Kubernetes community is governed by the Kubernetes Code of Conduct.