Use rotary kernel from the Hub (#3041)
danieldk authored Feb 21, 2025
1 parent 1cae319 commit 97c5f7e
Showing 8 changed files with 195 additions and 109 deletions.
123 changes: 26 additions & 97 deletions flake.lock

Generated file; diff not rendered by default.

2 changes: 1 addition & 1 deletion flake.nix
@@ -5,7 +5,7 @@
 inputs.nixpkgs.follows = "tgi-nix/nixpkgs";
 };
 nix-filter.url = "github:numtide/nix-filter";
-tgi-nix.url = "github:huggingface/text-generation-inference-nix/flashinfer-0.2.0.post2";
+tgi-nix.url = "github:huggingface/text-generation-inference-nix/hub-rotary";
 nixpkgs.follows = "tgi-nix/nixpkgs";
 flake-utils.url = "github:numtide/flake-utils";
 rust-overlay = {
4 changes: 2 additions & 2 deletions nix/server.nix
@@ -11,7 +11,6 @@
 flashinfer,
 flash-attn,
 flash-attn-layer-norm,
-flash-attn-rotary,
 flash-attn-v1,
 grpc-interceptor,
 grpcio-reflection,
@@ -36,6 +35,7 @@
 pydantic,
 quantization,
 quantization-eetq,
+rotary,
 safetensors,
 tokenizers,
 torch,
@@ -87,7 +87,6 @@ buildPythonPackage {
 flashinfer
 flash-attn
 flash-attn-layer-norm
-flash-attn-rotary
 grpc-interceptor
 grpcio-reflection
 grpcio-status
@@ -111,6 +110,7 @@ buildPythonPackage {
 pydantic
 quantization
 quantization-eetq
+rotary
 safetensors
 sentencepiece
 tokenizers