Commit 21752a0

docs: Rename README to DOC_CI_TEST_README and update parser to skip it
- Rename README.md to DOC_CI_TEST_README.md to clarify it documents the CI test framework
- Update parser.py to skip DOC_CI_TEST_README.md when parsing markdown files
- Remove warning about escaped backticks (no longer needed since file is skipped)
- Add notes that server is NOT restarted between multiple aiperf commands

Addresses PR feedback to have test script ignore this documentation file instead of using escaped backticks in examples.

Signed-off-by: Ganesh Kudleppanavar <[email protected]>
1 parent 9a24382 commit 21752a0

2 files changed: +31 -46 lines changed


tests/ci/test_docs_end_to_end/README.md renamed to tests/ci/test_docs_end_to_end/DOC_CI_TEST_README.md

Lines changed: 27 additions & 46 deletions
@@ -5,16 +5,6 @@ SPDX-License-Identifier: Apache-2.0
 
 # Adding New End-to-End Tests for Documentation Examples
 
-## IMPORTANT: Code Examples in This File
-
-**The bash code examples in this documentation use backslashes (`\`) before the triple backticks** to prevent them from being parsed as actual test commands by the test framework.
-
-**When copying examples from this file, you MUST remove the backslashes (`\`) before using them.**
-
-For example, this file shows examples like `\```bash` but you should write `​```bash` (without the backslash).
-
----
-
 This guide explains how to add new end-to-end tests for server examples in the AIPerf documentation.
 
 ## Overview
@@ -46,16 +36,14 @@ To add tests for a new server, you need to add three types of tagged commands to
 
 Tag the bash command that starts your server:
 
-```markdown
 <!-- setup-myserver-endpoint-server -->
-\```bash
+```bash
 # Start your server
 docker run --gpus all -p 8000:8000 myserver/image:latest \
   --model my-model \
   --host 0.0.0.0 --port 8000
-\```
-<!-- /setup-myserver-endpoint-server -->
 ```
+<!-- /setup-myserver-endpoint-server -->
 
 **Important notes:**
 - The server name (`myserver` in this example) must be consistent across all three tag types
@@ -67,13 +55,11 @@ docker run --gpus all -p 8000:8000 myserver/image:latest \
 
 Tag a bash command that waits for your server to be ready:
 
-```markdown
 <!-- health-check-myserver-endpoint-server -->
-\```bash
+```bash
 timeout 900 bash -c 'while [ "$(curl -s -o /dev/null -w "%{http_code}" localhost:8000/health -H "Content-Type: application/json")" != "200" ]; do sleep 2; done' || { echo "Server not ready after 15min"; exit 1; }
-\```
-<!-- /health-check-myserver-endpoint-server -->
 ```
+<!-- /health-check-myserver-endpoint-server -->
 
 **Important notes:**
 - The health check should poll the server until it responds successfully
@@ -85,9 +71,8 @@ timeout 900 bash -c 'while [ "$(curl -s -o /dev/null -w "%{http_code}" localhost
 
 Tag one or more AIPerf benchmark commands:
 
-```markdown
 <!-- aiperf-run-myserver-endpoint-server -->
-\```bash
+```bash
 aiperf profile \
   --model my-model \
   --endpoint-type chat \
@@ -96,15 +81,13 @@ aiperf profile \
   --streaming \
   --num-prompts 10 \
   --max-tokens 100
-\```
-<!-- /aiperf-run-myserver-endpoint-server -->
 ```
+<!-- /aiperf-run-myserver-endpoint-server -->
 
-You can have multiple `aiperf-run` commands for the same server. Each will be executed sequentially:
+You can have multiple `aiperf-run` commands for the same server. Each will be executed sequentially against the same running server instance (the server is NOT restarted between commands):
 
-```markdown
 <!-- aiperf-run-myserver-endpoint-server -->
-\```bash
+```bash
 # First test: streaming mode
 aiperf profile \
   --model my-model \
@@ -113,58 +96,57 @@ aiperf profile \
   --service-kind openai \
   --streaming \
   --num-prompts 10
-\```
+```
 <!-- /aiperf-run-myserver-endpoint-server -->
 
 <!-- aiperf-run-myserver-endpoint-server -->
-\```bash
+```bash
 # Second test: non-streaming mode
 aiperf profile \
   --model my-model \
   --endpoint-type chat \
   --endpoint /v1/chat/completions \
   --service-kind openai \
   --num-prompts 10
-\```
-<!-- /aiperf-run-myserver-endpoint-server -->
 ```
+<!-- /aiperf-run-myserver-endpoint-server -->
 
 **Important notes:**
 - Do NOT include `--ui-type` flag - the test framework adds `--ui-type simple` automatically
 - Each command is executed inside the AIPerf Docker container
 - Commands should complete in a reasonable time (default timeout: 300 seconds)
 - Use small values for `--num-prompts` and `--max-tokens` to keep tests fast
+- The server is NOT restarted between multiple aiperf commands - all commands run against the same server instance
 
 ## Complete Example
 
 Here's a complete example for a new server called "fastapi":
 
-```markdown
 ### Running FastAPI Server
 
 Start the FastAPI server:
 
 <!-- setup-fastapi-endpoint-server -->
-\```bash
+```bash
 docker run --gpus all -p 8000:8000 mycompany/fastapi-llm:latest \
   --model-name meta-llama/Llama-3.2-1B \
   --host 0.0.0.0 \
   --port 8000
-\```
+```
 <!-- /setup-fastapi-endpoint-server -->
 
 Wait for the server to be ready:
 
 <!-- health-check-fastapi-endpoint-server -->
-\```bash
+```bash
 timeout 600 bash -c 'while [ "$(curl -s -o /dev/null -w "%{http_code}" localhost:8000/v1/models)" != "200" ]; do sleep 2; done' || { echo "FastAPI server not ready after 10min"; exit 1; }
-\```
+```
 <!-- /health-check-fastapi-endpoint-server -->
 
 Profile the model:
 
 <!-- aiperf-run-fastapi-endpoint-server -->
-\```bash
+```bash
 aiperf profile \
   --model meta-llama/Llama-3.2-1B \
   --endpoint-type chat \
@@ -173,9 +155,8 @@ aiperf profile \
   --streaming \
   --num-prompts 20 \
   --max-tokens 50
-\```
-<!-- /aiperf-run-fastapi-endpoint-server -->
 ```
+<!-- /aiperf-run-fastapi-endpoint-server -->
 
 ## Running the Tests
 
@@ -216,30 +197,31 @@ For each server, the test runner:
 1. **Build Phase**: Builds the AIPerf Docker container (once for all tests)
 2. **Setup Phase**: Starts the server in the background
 3. **Health Check Phase**: Waits for server to be ready (runs in parallel with setup)
-4. **Test Phase**: Executes all AIPerf commands sequentially
+4. **Test Phase**: Executes all AIPerf commands sequentially against the same running server instance
 5. **Cleanup Phase**: Gracefully shuts down the server and cleans up Docker resources
 
+**Note**: The server remains running throughout all AIPerf commands. It is only shut down once during the cleanup phase after all tests complete.
+
 ## Common Patterns
 
 ### Pattern: OpenAI-compatible API
 
-```markdown
 <!-- setup-myserver-endpoint-server -->
-\```bash
+```bash
 docker run --gpus all -p 8000:8000 myserver:latest \
   --model model-name \
   --host 0.0.0.0 --port 8000
-\```
+```
 <!-- /setup-myserver-endpoint-server -->
 
 <!-- health-check-myserver-endpoint-server -->
-\```bash
+```bash
 timeout 900 bash -c 'while [ "$(curl -s -o /dev/null -w "%{http_code}" localhost:8000/v1/chat/completions -H "Content-Type: application/json" -d "{\"model\":\"model-name\",\"messages\":[{\"role\":\"user\",\"content\":\"test\"}],\"max_tokens\":1}")" != "200" ]; do sleep 2; done' || { echo "Server not ready"; exit 1; }
-\```
+```
 <!-- /health-check-myserver-endpoint-server -->
 
 <!-- aiperf-run-myserver-endpoint-server -->
-\```bash
+```bash
 aiperf profile \
   --model model-name \
   --endpoint-type chat \
@@ -248,9 +230,8 @@ aiperf profile \
   --streaming \
   --num-prompts 10 \
   --max-tokens 100
-\```
-<!-- /aiperf-run-myserver-endpoint-server -->
 ```
+<!-- /aiperf-run-myserver-endpoint-server -->
 
 ## Troubleshooting
 
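The test-runner phases documented in the renamed file (build the AIPerf container once, then per server: setup, health check, sequential aiperf runs, a single cleanup) can be pictured roughly as follows. This is a minimal illustrative sketch only, assuming a shell-command runner; the function name, signature, and subprocess-based approach are assumptions, not code from this repository.

```python
# Minimal sketch of the documented phase ordering (illustrative only; names
# and the subprocess-based approach are assumptions, not repository code).
# The Build Phase (building the AIPerf container once) would happen before
# this per-server routine is called.
import subprocess


def run_server_tests(setup_cmd: str, health_cmd: str, aiperf_cmds: list[str]) -> None:
    # Setup Phase: start the server in the background.
    server_proc = subprocess.Popen(setup_cmd, shell=True)
    try:
        # Health Check Phase: block until the polling command succeeds or times out.
        subprocess.run(health_cmd, shell=True, check=True)

        # Test Phase: run every aiperf command sequentially against the SAME
        # running server instance; the server is not restarted between commands.
        for cmd in aiperf_cmds:
            subprocess.run(cmd, shell=True, check=True, timeout=300)
    finally:
        # Cleanup Phase: shut the server down only once, after all commands finish.
        server_proc.terminate()
        server_proc.wait()
```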

tests/ci/test_docs_end_to_end/parser.py

Lines changed: 4 additions & 0 deletions
@@ -35,6 +35,10 @@ def parse_directory(self, directory: str) -> dict[str, Server]:
         logger.info(f"Parsing markdown files in {directory}")
 
         for file_path in Path(directory).rglob("*.md"):
+            # Skip the documentation file for this test framework
+            if file_path.name == "DOC_CI_TEST_README.md":
+                logger.info(f"Skipping documentation file: {file_path}")
+                continue
             logger.info(f"Parsing file: {file_path}")
             self._parse_file(str(file_path))
 
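With the rename, DOC_CI_TEST_README.md now contains plain, unescaped ```bash examples, so the parser has to skip it or those tagged examples would be picked up as real test commands. For illustration, here is a minimal sketch of the kind of tag extraction the skip guards against; the regex and helper name are assumptions, not the repository's actual `_parse_file` logic.

```python
# Illustrative sketch only: the regex and helper below are assumptions, not
# the repository's _parse_file implementation.
import re
from pathlib import Path

# Matches blocks like:
#   <!-- setup-myserver-endpoint-server -->
#   ```bash
#   ...command...
#   ```
TAG_BLOCK = re.compile(
    r"<!--\s*(?P<kind>setup|health-check|aiperf-run)-(?P<server>[\w-]+?)-endpoint-server\s*-->\s*"
    r"```bash\n(?P<command>.*?)\n```",
    re.DOTALL,
)


def extract_tagged_commands(path: Path) -> list[tuple[str, str, str]]:
    """Return (kind, server, bash_command) triples found in one markdown file."""
    text = path.read_text()
    return [
        (m.group("kind"), m.group("server"), m.group("command").strip())
        for m in TAG_BLOCK.finditer(text)
    ]
```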
