Releases: kaito-project/kaito
v0.4.4
v0.4.3
v0.4.3 - 2025-01-30
Changelog
Features 🌈
- e333f2a feat: Add DeepSeek READMEs and Example (#851)
- 31bdf47 feat: Add DeepSeek Model for E2E (#850)
- 0ed89f2 feat: Add Deepseek Model (#848)
- 254dec6 feat: Add DeepSeek Model Plugin (#849)
- 979b739 feat: RAG API Server to use Async/Await (#835)
Bug Fixes 🐞
- 872d1a6 fix: use ghcr image for e2e test during release
- 8d3a1e1 fix: Add DeepSeek Qwen E2E (#852)
- f7ee0ec fix: Prevent blocking healthcheck during Inference (#837)
Code Refactoring 💎
Maintenance 🔧
v0.4.2
v0.4.2 - 2025-01-16
Changelog
Features 🌈
- 03bf1b3 feat: upgrade v1beta1.NodeClaim to v1.NodeClaim (#823)
- c731441 feat: upgrade golangci-lint to v1.63.4 (#821)
- 8ad5146 feat: remove machine and aws/karpenter-core dependency from kaito (#806)
- 5351ff8 feat: Add RAG LLMReranker (#784)
- 1567afa feat: Add Qwen Link
Bug Fixes 🐞
- c0f21fb fix: Ensure `model` provided in vLLM inference (#820)
- dcc0b3b fix: Update CodeCov Badge
- b7d8dff fix: Fix Repo Filepaths (#809)
- 47143e5 fix: Update RAG ChromaDB UTs v0.6.1 (#803)
Continuous Integration 💜
Documentation 📘
Maintenance 🔧
- fd8cead chore: default to NC_A100_v4 series gpu (#825)
- d2b3ccd chore: terraform updates (#812)
- 6a98a01 chore: bump github.com/onsi/ginkgo/v2 from 2.22.1 to 2.22.2 (#800)
- 5f914ad chore: bump github.com/stretchr/testify from 1.9.0 to 1.10.0 (#787)
Testing 💚
v0.4.1
v0.4.1 - 2025-01-03
This release includes these major new features:
- Add support for Qwen2.5 Coder model.
- Add support for using LoRA adapter with vLLM runtime.
- Add support for config file for vLLM runtime.
- Add demoui for openai api service.
Changelog
Features 🌈
- 0c2f4a8 feat: qwen2.5 coder Proposal (#801)
- d9dc364 feat: add inference config api (#791)
- 84f58a3 feat: ensureServices for RAG engine (#776)
- a994f4b feat: update preset images (#785)
- be3620b feat: add qwen preset test (#788)
- d399599 feat: add qwen coder model (#783)
- 42f9ebc feat: support config file for vllm runtime (#780)
- 7da6586 feat: add demoui for openai api (#777)
- 83f25cd feat: support LoRA adapters for vllm runtime (#774)
- b099c66 feat: RAGEngine update and validation (#747)
- c3be988 feat: Add build pipeline for RAG Controller (#772)
- 9a2f8d6 feat: Add build pipeline for RAG Service (#770)
Bug Fixes 🐞
- 82451cb fix: unstable testing order causing flaky test (#799)
- 2c1d5bf fix: don't switch current working git branch when determining model changes (#789)
- 24eb89b fix: machine and nodeclaim can not supported at the same time (#769)
- e11c6d4 fix: Update Ragengine Service Dockerfile
- 857c373 fix: RAG service Dockerfile path patch (#767)
- d6f8602 fix: chart tpl when rendering feature gate flag (#760)
Documentation 📘
- 9dd17a3 docs: Add invite link to the Kaito community slack (#792)
- 5d5e342 docs: add LLMs chat template documentation for end-users (#782)
- d783693 docs: update README for new release (#762)
Maintenance 🔧
- 0d061fa chore: bump github.com/onsi/gomega from 1.34.2 to 1.36.2 (#794)
- d97b290 chore: bump goreleaser/goreleaser-action from 6.0.0 to 6.1.0 (#688)
- d96cdd5 chore: bump step-security/harden-runner from 2.10.1 to 2.10.2 (#729)
- 2b18896 chore: bump actions/setup-go from 5.1.0 to 5.2.0 (#781)
- ea9fed4 chore: bump codecov/codecov-action from 5.1.1 to 5.1.2 (#793)
- c96b10e chore: bump golang.org/x/net to 0.33.0 (#786)
- 53319e0 chore: bump github.com/onsi/ginkgo/v2 from 2.20.2 to 2.22.0 (#717)
- eb2bff2 chore: bump thehanimo/pr-title-checker from 1.4.2 to 1.4.3 (#728)
- 47bdfd8 chore: bump codecov/codecov-action from 5.0.7 to 5.1.1 (#766)
- 59e377f chore: update phi3.5 example resource (#763)
- cbd58d9 chore: switch buildkit image to mcr registry (#761)
Testing 💚
v0.4.0
v0.4.0 - 2024-12-06
This release includes these major new features:
- Add support for Phi3.5 model.
- Add support for using vLLM runtime in inference service.
- Add support for using chat template.
- Bump accelerate to 1.0.0.
Changelog
Features 🌈
- e0f28f0 feat: Handle HF Remote API Call Format (#751)
- 0f9a11d feat: support vllm in controller (#635)
- 5269bd7 feat: bump accelerate to 1.0.0 (#739)
- f5d0958 feat: Update Llama Endpoint (#738)
- 2cb5710 feat: add tuning test to preset test (#741)
- f7e6d66 feat: [SKU modularization] AWS chart changes (#710)
- 391b398 feat: Add flag for running 1ES Public Models (#733)
- 0aea28e feat: Custom Dockerfile update BaseImage (#724)
- 0087e09 feat: add preset test for vllm (#694)
- c25e7e9 feat: RAG service health check (#704)
- f3ef4c8 feat: RAG engine validation (#691)
- 1c6eb2e feat: support adaptive `max_model_len` (#657)
- cafb947 feat: RAG engine deployment creation (#660)
- 2ecfdf1 feat: RAG engine controller revision (#682)
- 79494a2 feat: Dockerfile for Kaito RAG Service (#680)
- 9f5632a feat: Migrate E2E to Self-Hosted Runner (#641)
- 1676c0d feat: Runner Setup Script (#676)
- 71ddc55 feat: Introduce Abstract Class for Integration Testing (#674)
- ad0dde9 feat: Update VectorStore Base class (#673)
- 7bea782 feat: run e2e test in parallel (#667)
- 1709ba0 feat: package vllm runtime into image (#655)
- 6b216fc feat: Add delete and finalizer to RAGEngine (#646)
- 1d09da0 feat: implement inference server by using vllm (#624)
- 8906190 feat: Part 4 (Final) - Introduce Main RAG Service API and its tests (#603)
- 791c175 feat: add printcolumn to RAG Engine (#623)
- 544df3f feat: add Nodeclaim & Machine provision to RAG Engine controller (#622)
- 941170b feat: Part 3 - Introduce Vector Store Manager and Vector Store Class (#633)
- 65b844a Revert "feat: Migrate E2E pipeline to using Self-Hosted Runner" (#642)
- b6694c2 feat: Migrate E2E pipeline to using Self-Hosted Runner (#638)
- 314a80e feat: Revert the refactoring of RAGEngineStatus and WorkspaceStatus (#636)
- 870a93d feat: Part 2 - Add custom LLM inference class (#630)
- 1d99028 feat: Part 1 - Add RAG Embedding Interface (#628)
- 152e683 feat: refact updateStatusConditionIfNotMatch for both RAG and workspace (#626)
- 920ada5 feat: refactor updateObjStatus for both RAG and workspace (#625)
- 1818551 feat: Update RAG Status (#621)
- f613bb4 feat: update of functions related to nodeclaim and machine for RAG engine (#620)
- 38656dd feat: Clusterrole and Webhook update for RAG Engine (#619)
- cccb1cb feat: add WorkerNodes to RAGEngineStatus (#612)
- ba1a62d feat: Add ragengine controller scaffolding code and chart (#600)
- a06cf97 feat: [SKU modularization] remove sku_config from v1alpha1 and implement skuHandler interface (#602)
- 2cdc682 feat: Add RAGEngine CRD (#597)
- f3d6e09 feat: Options for Building and Running Private/Custom Models (#598)
Bug Fixes 🐞
- 6a75817 fix: set gorelease main pkg to workspace
- be067f6 fix: featuregate flag render problem
- 3882218 fix: disable tensor parallel for falcon7b (#755)
- fea2924 fix: Filepath in custom-deployment-template.yaml (#757)
- dba607b fix: Update Dockerfile Path
- ab1fd7e fix: preset tuning test workflow (#742)
- 1b440d5 fix: create the workspace service for custom models (#745)
- 21056a1 fix: secret update patch (#709)
- 3dbb660 fix: Validate workspace name (#726)
- 511dfa1 fix: Update custom-model-integration-guide.md
- 666c5fc fix: Update README.md (#697)
- 9d19e8f fix: Update Benchmark Pull Image Instructions
- 9d72066 fix: Update Runner Labels (#712)
- a146f3c fix: patch of RAGengine dockerfile (#707)
- 1517106 fix: binary search for best context length avoiding oom (#705)
- bac8d34 fix: skip e2e test for mcr publish pipeline (#699)
- 77ed191 fix: remove secret env when trigger e2e flow (#696)
- 5812927 fix: Update Makefile (#692)
- f1db127 fix: Remove empty var (#681)
- 001d148 fix: Add K8s Env Var (#679)
- 9408b12 fix: patch imagePullSecrets validation in e2e test (#666)
- d926c44 fix: Update Dockerfile.reference (#665)
- f64b35a fix: Populate ImagePullSecrets in Adapter Deployment and Add Corresponding Tests (#656)
- fc925de fix: polish liveness health threshold (#659)
- 0abe5d7 fix: NVML unknown error (#639)
- 7a9468c fix: update Dockerfile (#640)
- 2ec45ac fix: ignore instanceType when selecting preferred nodes (#618)
- 2b07c29 fix: Update kaito-e2e.yml (#614)
Code Refactoring 💎
Documentation 📘
- 692a7da docs: update for multi-runtime support (#754)
- 711c858 docs: [SKU modularizastion] Add AWS installation documentation (#711)
- 00056b5 docs: Update installation.md (#736)
- f889920 docs: Add guide for running Kaito on BYO GPU nodes (#732)
- 2139dfe docs: Update helm list command in installation guide to use new namespaces. (#730)
- 64c8ffb docs: update docs with 0.3.2 release (#700)
- 58894ba docs: fix terraform and update readme (#637)
- 6481b76 docs: quick deploy using terraform (#634)
- 6b8bc80 docs: Update README with the new release (#592)
Maintenance 🔧
- 0e80023 chore: switch buildkit image to mcr registry
- 1e4e699 chore: Mark ragengine as WIP for helm installation (#758)
- 3bc450b chore: bump actions/dependency-review-action from 4.3.4 to 4.5.0 (#714)
- 69986b0 chore: bump codecov/codecov-action from 4.6.0 to 5.0.7 (#716)
- 3a68fe8 chore: bump actions/setup-go from 5.0.2 to 5.1.0 (#687)
- ff41d1c chore: add zhuangqh to codeowners (#701)
- fcd5d1c chore: restruct workspace controller code - part 4 (#685)
- a057d70 chore: restruct workspace controller code - part 3 (#684)
- 3c873ec chore: restruct workspace controller code - part 2 (#683)
- e886346 chore: restruct workspace controller code - part 1 (#675)
- 38fae09 chore: bump step-security/harden-runner from 2.9.1 to 2.10.1 (#596)
- 79e425c chore: bump github.com/Azure/karpenter-provider-azure from 0.5.1 to 0.5.4 (#599)
- 1fb9989 chore: bump azure/CLI from 2.0.0 to 2.1.0 (#588)
- b97ab11 chore: refactor to move ragengine to a central package (#671)
- 2ca998e chore: removed Microsoft trademark, updated contributing guidelines, CoC in readme (#672)
- 40d1321 chore: Updated to CNCF CoC, Maintainers file (#670)
- 1248109 chore: clean up build cmds for workspace (#668)
- 8f894bb chore: bump codecov/codecov-action from 4.5.0 to 4.6.0 (#613)
- 00ad1f6 chore: bump azure/login from 2.1.1 to 2.2.0 (#627)
- bf12222 chore: bump actions/checkout from 4.1.7 to 4.2.2 (#647)
- 5f2f649 chore: Renaming to reflect updated repo (#663)
- f35ca31...
v0.3.2
v0.3.1
v0.3.1 - 2024-09-07
Changelog
Features 🌈
- fd83b2a feat: Add Envar for Setting Pytorch Expandable Segments (#584)
- cb7b83f feat: Add error checking to CheckResourceStatus (#583)
- 8785cc5 feat: Tuning job update supporting (#580)
- 5c30038 feat: Update Preset Model Tags for Release (#578)
- 2e967ca feat: Update Preset Model Tags (#574)
- 1afb924 feat: Update controllerRevision and deployment lifecycle (#559)
- 5f2f531 feat: Debug Flag and tuning /metrics endpoint (#544)
- 9d42673 feat: Update karpenter nodeclass (#570)
- 1f06382 feat: Add Adapter Loading Test for E2E Image Preset Workflow (#567)
- 868efc3 feat: Add adapter logs for Inference API (#543)
- 38c9c40 feat: add NodeClass CRDs (#551)
- 08d2800 feat: Add Dataset Image Use for E2E Tests (#548)
- c54af47 feat: Update deployment object when adapter changes are detected (#540)
- 5413920 feat: Update README.md to add Phi-Medium (#537)
- 2ccc93d feat: Add controllerrevision for workspaceController (#524)
- 4d5cd2d feat: Add workspace status to present the tuning job status (#529)
- 0cbb06f feat: [SKU-modularization] Get SKU Handler (#518)
Bug Fixes 🐞
- f083822 fix: Check HTTP Status code for curl download (#586)
- d4db63d fix: set job's backoffLimit to 0 and report job's active/ready status in workspace (#575)
- 8b7aabf fix: Use Projected Volume and Keep Docker Sidecar Alive for Data Retrieval (#552)
- 9ec071e fix: falcon config path in quick start (#553)
- 5bde8f4 fix: Update Phi 3 Requirements (#550)
- 7899717 fix: Use SKU GPU Count in DeploymentSpec (#541)
- b6f13ed fix: Update memory requirement checks using resource.quantity (#539)
- 66f5711 fix: Clear CUDA cache to reduce OOM (#536)
- 4fdecc2 fix: Ensure Finalizer Patch (#530)
- f42a77d fix: Update README.md with additional details on tuning time calculation (#528)
- 137154b fix: add watcher for job and avoid unnecessary reconcile (#527)
- cefdab9 fix: remove extra char from configmap (#526)
- 78544e1 fix: Add Fail logs for Inference w/ Adapter test (#516)
- 7d1dc82 fix: Update qemu image (#512)
Continuous Integration 💜
- 3db011e ci: Add support for Karpenter in the kaito pipelines (#569)
- eaa1516 ci: adding workflow dispatch to helm-chart workflow (#558)
Documentation 📘
- ab3851a docs: Revise README.md for Kaito inference (#581)
- 370007f docs: update readme with new arch figure (#555)
- f259329 docs: revise the README for introducing Kaito tuning (#535)
- 029d11f docs: add doc for inference (#514)
Maintenance 🔧
- c9c2fa5 chore: Update Phi-3 Example YAML
- d346213 chore: bump github.com/onsi/gomega from 1.34.1 to 1.34.2 (#585)
- 5f1ab79 chore: bump github.com/onsi/ginkgo/v2 from 2.20.1 to 2.20.2 (#582)
- 2e39c5e chore: bump k8s.io/kubernetes from 1.30.2 to 1.31.0 (#562)
- bf5bc95 chore: adapter validation e2e (#549)
- 8e8581e chore: bump github.com/onsi/ginkgo/v2 from 2.20.0 to 2.20.1 (#576)
- 78e9e69 chore: Naming nit
- 4e370fc chore: Remove outdated/unused Config folders (#571)
- 4204d93 chore: bump github.com/onsi/ginkgo/v2 from 2.19.0 to 2.20.0 (#561)
- ea86744 chore: bump github.com/Azure/karpenter-provider-azure from 0.5.0 to 0.5.1 (#525)
- e5cb668 chore: bump k8s.io/kubernetes from 1.30.1 to 1.30.2 (#467)
- 32482b7 chore: bump docker/login-action from 3.2.0 to 3.3.0 (#532)
- 83ec859 chore: bump step-security/harden-runner from 2.8.1 to 2.9.1 (#554)
- a4c231c chore: bump github.com/samber/lo from 1.45.0 to 1.47.0 (#556)
- 08066f6 chore: patch for volumeMounts (#546)
- 39eb92f chore: FAQ Question on PreferredNodes addition (#545)
- 531207e chore: Update nodeClaim manifest and add nodeClass (#509)
- cbfaea8 chore: bump github.com/samber/lo from 1.44.0 to 1.45.0 (#521)
- d7b68d1 chore: bump actions/setup-go from 5.0.0 to 5.0.2 (#519)
- 6c17470 chore: bump actions/dependency-review-action from 4.3.3 to 4.3.4 (#520)
- e5991b0 chore: Add Phi-3 Link to README.md (#515)
- 04339c3 chore: edit README to announce new release (#513)
Testing 💚
v0.3.0
v0.3.0 - 2024-07-12
This release includes three major new features:
- Add support for Phi3 model.
- Add support for LoRA fine-tuning, which is specified via the new `Tuning` field in the CRD.
- Add support for using adapters for model inference, which is specified via the new "Adapters" field in the CRD.
Changelog
Features 🌈
- e5e61a7 feat: Add E2E Tests for Phi-3 and Tuning (#476)
- 42bbe92 feat: Bump Tags (#495)
- 766b944 feat: Add cloud provider specific nodeClaim requirement (#496)
- 2e06aff feat: Add Phi-3 Medium Plugin (#494)
- ada4b14 feat: Add Run Build Phi Flag (#493)
- 421bd5f feat: Add Phi-3 Manifests and Custom E2E Run Flag (#491)
- 001df06 feat: Add Target Modules Env Variable (#489)
- f534355 feat: Add Util funcs, updating func names, logs, configs and ensure service requirement (#485)
- f70dcc5 feat: Tuning Resource Validation Check (#484)
- 3ef8e66 feat: Add Phi3 Mini Requirements & Set Active Adapter (#469)
- d38d87b feat: Enhance CI/CD (Build, E2E, Composite Action) with 1ES Migration and Phi-3 Integration (#428)
- 4db58ce feat: Automate the adapters manifests (#463)
- 4cadcd2 feat: add phi-3 to configs.json
- 76a966b feat: Add Min GPU Memory Requirement (#443)
- e24e8b3 feat: Update Default Fine Tuning Params (with comments) (#442)
- 498f92b feat: Fine Tune (Part 10) - Updating fine tuning py (#371)
- 675a245 feat: Add karpenter labels to nodeClaim (#373)
- cb2d964 feat: Support Karpenter NodeClaim API in workspace (#366)
- 52f5d3b feat: Fine tune (Part 9) - Handling image data destination (#367)
- a4bcec4 feat: Add feature gates flag (#368)
- 536f259 feat: Add Karpenter Azure cloud provider NodePool (#369)
- a9af171 feat: Add URL as data source (#365)
- 962e3d6 feat: Add tuning job manifest, image source creation, parameter setup - Part 8 (#363)
- 2d5d1b4 feat: [SKU modularization] adding aws sku handler (#364)
- d9ba35a feat: Add v1beta1 API for Karpenter (#362)
- 5c3fb02 feat: [SKU modularization] adding azure sku handler (#360)
- aa89d20 feat: [SKU modularization] sku handler interface (#357)
- 0b63598 feat: Setup Preset Tuning Util Functions and miscellaneous validation/logging - Part 7 (#358)
- 6be8a0d feat: Simple Configmap Validation Checks - Part 6 (#355)
- bf4acba feat: Update CRD to add Volume and ConfigTemplate (#356)
- e799d62 feat: Add few util functions (#354)
- e2f5e25 feat: Add global client, accessible via webhooks (#353)
- f88b529 feat: Add default configmaps - Part 5 (#346)
- cf7cf94 feat: Add API Docs and Improve Inference Readability (#331)
- 05fb90c feat: Add sample front end helm chart (#320)
- 08dd1f4 feat: Initialize Fine-Tuning Interface and Core Methods - Part 3 (#308)
- 4ba337d feat: Part 2 - Add validation checks for TuningSpec, DataSource, DataDestination (#304)
- ec8a8e2 feat: Part 1 - Add FineTuning API (#201)
- 34f3266 feat: Add force update flag to e2e preset
- af64e89 feat: Add fine tuning spec in CRD (#292)
- 4fef28b feat: Add force run all flag for build images
Bug Fixes 🐞
- e4849a1 fix: Update Mistral Version and Rebuild (#501)
- d601ae8 fix: Bump plugins tags (#500)
- 9e12854 fix: Preset Build Image Naming (#497)
- 5abb030 fix: Remove Phi-3 Small and Update MCR Destination (#492)
- 21d7768 fix: Update e2e-preset-configs.json
- f2b3504 fix: Refactor Naming Conventions, Update Dependencies, Enhance Examples, and Add Volume Validation Check (#470)
- 8f068a3 fix: Adapter Base Image to Include Basic Utilities (#461)
- faa1fc7 fix: Standardize Dataset Input (#460)
- 261877e fix: Minor Tuning Results Filepath Bugfix (#453)
- 0dca449 fix: Update controller-gen version (#429)
- 4ae917b fix: Update dependabot.yml for security updates (#420)
- 64152d4 fix: Add preset nil checks (#347)
- ee9101a fix: Add registry as a pipeline job output (#329)
- d0498a0 fix: Upgrade FastAPI Version (#305)
Continuous Integration 💜
- cdcc7f5 ci: add setup-go GH action and codeQL analysis category (#401)
- 25a2b10 ci: bump codeQL to V3 (#400)
- 2aba9f0 ci: fix codeQL action version (#399)
- 799bda6 ci: fix codeql error due to golang version 1.22 (#397)
- 4bac047 ci: Add Env variable for supported models yaml (#376)
- c49b93a ci: Remove PR trigger from release workflow (#330)
Documentation 📘
- 174f6f5 docs: Add Tuning API Documentation (#480)
- e592eb3 docs: update instructions for llama2 (#475)
- 1812d4a docs: Update namespace patch and dependency for installation.md (#377)
- cc8e02d docs: Update gpu-provisioner installation step (#374)
- 2485c58 docs: Update README.md for announcing v0.2.1 (#307)
Maintenance 🔧
- 2cc97eb chore: remove deprecated goreleaser release actions (#499)
- 7b8416d chore: bump k8s.io/klog/v2 from 2.130.0 to 2.130.1 (#478)
- b4778e7 chore: bump github.com/samber/lo from 1.39.0 to 1.44.0 (#490)
- f613679 chore: e2e test for adapters and validation of adapters (#483)
- 5242169 chore: tuning e2e webhook test (#487)
- ce99b5f chore: bump codecov/codecov-action from 4.4.1 to 4.5.0 (#472)
- 45e75aa chore: bump actions/checkout from 4.1.6 to 4.1.7 (#473)
- 7b874eb chore: increase adapter test coverage (#477)
- d849274 chore: bump k8s.io/klog/v2 from 2.120.1 to 2.130.0 (#474)
- 5f03f40 chore: Update e2e-preset-configs.json
- 06fd57e chore: bump azure/login from 2.1.0 to 2.1.1 (#458)
- 42396db chore: bump goreleaser/goreleaser-action from 5.1.0 to 6.0.0 (#464)
- ffcafbb chore: bump step-security/harden-runner from 2.8.0 to 2.8.1 (#465)
- fb9b7da chore: bump actions/dependency-review-action from 4.3.2 to 4.3.3 (#466)
- 9968c29 chore: bump sigs.k8s.io/controller-runtime from 0.18.3 to 0.18.4 (#462)
- edbbf0e chore: bump docker/login-action from 3.1.0 to 3.2.0 (#457)
- 062ef5d chore: Copy latest CRD to chart (#444)
- 2888461 chore: update crd generated by 0.15 controller-gen (#441)
- 7cfb097 chore: bump sigs.k8s.io/controller-runtime from 0.18.2 to 0.18.3 (#433)
- 31d2f60 chore: bump github.com/onsi/ginkgo/v2 from 2.18.0 to 2.19.0 (#432)
- a2865ae chore: bump azure/login from 2.1.0 to 2.1.1 (#436)
- 6615a02 chore: bump step-security/harden-runner from 2.7.1 to 2.8.0 (#438)
- f3f4cbe chore: bump actions/dependency-review-action from 2.5.1 to 4.3.2 (#437)
- e5344e7 chore: bump github.com/go-logr/logr from 1.4.1 to 1.4.2 (#431)
- b9f2393 chore: bump github.com/onsi/ginkgo/v2 from 2.17.3 to 2.18.0 (#430)
- b65dd2e chore: bump sigs.k8s.io/karpenter from 0.36.1 to 0.36.2 (#426)
- 4a9b6c9 chore: bump step-security/harden-runner from 2.7.0 to 2.7.1 (#424)
- 92d8b71 chore: bump goreleaser/goreleaser-action from 5.0.0 to 5.1.0 (#422)
- a0eb42d chore: bump actions/checkout from 4.1.5 to 4.1.6 (#425)
- 060c91b chore: bump codecov/codecov-action from 4.3.1 to 4.4.1 (#427)
- 422fe20 chore: bump azure/login from 1.6.1 to 2.1.0 (#421)
- da3e63b chore: Update Kubernetes to 1.30.1 (#416)
- ff1d352 chore: bump datasets from 2.16.1 to 2.19.1 in /presets/tuning/tfs (#380)
- 8587d35 chore: Update Python Pip libraries (#412)
- 63fb951 chore: bump azure/login from 1.6.1 to 2.1.0 (#406)
- 19c93d6 chore: bump codecov/codecov-action from 4.1.0 to 4.3.1 (#405)
- 56e297b chore: bump actions/checkout from 3.6.0 to 4.1.5 (#404)
- f904cac chore: bump docker/login-action from 3.0.0 to 3.1.0 (#407)
- 9fe0eef chore: bump azure/CLI from 1.0.9 to 2.0.0 (#408)
- 2c05921 chore: update ginkgo version for e2e test (#403)
- 2f6e1d9 chore: Remove Karpenter noodepool (#402)
- 63f9f72 chore: bump github.com/stretchr/testify from 1.8.4 to 1.9.0 (#393)
- 8fabf20 chore: bump distroless/static from `55c6361` to `e9ac71e` in /docker/kaito (#388)
- 37acb67 chore: bump github.com/onsi/ginkgo/v2 from 2.17.1 to 2.17.3 (#394)
- 2c42bdd chore: remove gpu-provisioner chart from kaito repo (#372)
- 8b7941a chore: Update Golong 1.20 -> 1.22 (#361)
- 65a99da chore: naming nit preset-inferences.go
- 9e2ac24 chore: Add utils functions and move tests into seperate package (#351)
- 77aa95b chore: Factoring out reusable presets logic - Part 4 (#332)
- 2f0323a chore: Add GitHub issue/PR templates (#316)
Security Fix 🛡️
Testing 💚
v0.2.2
v0.2.2 - 2024-04-04
This release reverts a few default inference parameters to the values used in v0.1.0 to avoid user confusion. These parameters significantly impact the inference results.
Changelog
Bug Fixes 🐞
- 98bf904 fix: Update namespace in the helm chart (#337)
- 05b7a46 fix: Add registry as a job output (#328)
- e8ffbd1 fix: Update helm chart to use Release.Namespace (#322)
- 8af6207 fix: Fix typo in Makefile (#315)
- 7f4e683 fix: Update Model Tags (#311)
- f638608 fix: Adjust default model params (#310)
Continuous Integration 💜
Documentation 📘
v0.2.1
v0.2.1 - 2024-03-19
This release includes a critical fix that reverts the default inference max sequence length to 200, as it was in v0.1.0. A commit in v0.2.0 accidentally changed the default max sequence length to 20.
Changelog
Features 🌈
Bug Fixes 🐞
- ed345d6 fix: Protect secret with environment (#300)
- 4c4e803 fix: Update default params and add associated UTs (#294)
- 268675c fix: update manifest and helm charts (#278)
Continuous Integration 💜
- c704f84 ci: fix 1ES pool label name (#301)
- 63ff6cf ci: Update supported_models.yaml (#296)
- ab88635 ci: Add environment for pipelines (#290)
- dd59ef3 ci: Use 1ES runner for kaito workspace workflow jobs that push to ACR (#283)
Documentation 📘
- d58d22f docs: Add gpu-provisioner github repo (#267)
- 13c77e3 docs: update README.md for new models (#279)
Maintenance 🔧
- 9b33f33 chore: bump peter-evans/repository-dispatch from 1 to 3 (#269)
- c54a32b chore: bump azure/setup-helm from 3 to 4 (#270)
- 40a6e03 chore: bump actions/checkout from 3 to 4 (#271)
Security Fix 🛡️
- d5d3c57 Fix protobuf to address CVE-2024-24786
- ebf2f46 Fix fastapi to address CVE-2024-24762