Commit e9300fb

feat: Add health generate, health_generate test and fix mtp.py
Signed-off-by: Dhruv Singal <[email protected]>
1 parent 3dc1aca commit e9300fb

2 files changed: +4 -1 lines changed

tensorrt_llm/_torch/speculative/mtp.py

Lines changed: 1 addition & 1 deletion
@@ -689,7 +689,7 @@ def sample_and_accept_draft_tokens(
         num_accepted_tokens = torch.ones(batch_size,
                                          dtype=torch.int,
                                          device=logits.device)
-
+
         if self.spec_config.use_relaxed_acceptance_for_thinking:
             mtp_relaxed_delta_pool = spec_metadata.mtp_hidden_states_manager.mtp_relaxed_delta_pool

tests/unittest/llmapi/apps/_test_llm_server.py

Lines changed: 3 additions & 0 deletions
@@ -34,6 +34,9 @@ def test_health(client):
     response = client.get("/health")
     assert response.status_code == 200
 
+def test_health_generate(client):
+    response = client.get("/health_generate")
+    assert response.status_code == 200
 
 def test_generate(client):
     response = client.post("/generate", json={"prompt": "A B C"})
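
Note: the new test only checks that GET /health_generate returns 200. The sketch below is a standalone illustration of how such an endpoint could differ from /health, not the project's server code. It assumes /health_generate runs a tiny generation and reports a failure as 503; that semantics, along with `fake_generate`, `Handler`, `get_status`, and `run_checks`, is hypothetical and uses only the Python standard library.

```python
import json
import threading
import urllib.error
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer


def fake_generate(prompt: str) -> str:
    """Hypothetical stand-in for a model call (not the real backend)."""
    return prompt + " D"


class Handler(BaseHTTPRequestHandler):
    def _send(self, status: int, payload: dict) -> None:
        body = json.dumps(payload).encode()
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def do_GET(self):
        if self.path == "/health":
            # Liveness only: the process is up and answering HTTP.
            self._send(200, {"status": "ok"})
        elif self.path == "/health_generate":
            # Assumed semantics: exercise the generation path once.
            try:
                fake_generate("ping")
                self._send(200, {"status": "ok"})
            except Exception:
                self._send(503, {"status": "unhealthy"})
        else:
            self._send(404, {"error": "not found"})

    def log_message(self, *args):
        pass  # silence per-request logging


def get_status(base_url: str, path: str) -> int:
    # Bypass any proxy settings so the loopback request stays local.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    try:
        with opener.open(base_url + path) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        return err.code


def run_checks() -> dict:
    server = HTTPServer(("127.0.0.1", 0), Handler)  # port 0 picks a free port
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    base = f"http://127.0.0.1:{server.server_port}"
    try:
        paths = ("/health", "/health_generate", "/missing")
        return {path: get_status(base, path) for path in paths}
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    print(run_checks())
```

Splitting the two probes this way lets a liveness check stay cheap while the generate check exercises the inference path; whether the real endpoint does this is not shown in the diff.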
