@@ -176,19 +176,23 @@ At this point, a workload job (map-only MapReduce job) can be launched, e.g.:
```
./bin/start-workload.sh
-Dauditreplay.input-path=hdfs:///dyno/audit_logs/
+ -Dauditreplay.output-path=hdfs:///dyno/results/
-Dauditreplay.num-threads=50
-nn_uri hdfs://namenode_address:port/
-start_time_offset 5m
-mapper_class_name AuditReplayMapper
```
The type of workload generation is configurable; AuditReplayMapper replays an audit log trace as discussed previously.
- The AuditReplayMapper is configured via configurations; `auditreplay.input-path` and `auditreplay.num-threads` are
- required to specify the input path for audit log files and the number of threads per map task. A number of map tasks
- equal to the number of files in `input-path` will be launched; each task will read in one of these input files and
- use `num-threads` threads to replay the events contained within that file. A best effort is made to faithfully replay
- the audit log events at the same pace at which they originally occurred (optionally, this can be adjusted by
- specifying `auditreplay.rate-factor` which is a multiplicative factor towards the rate of replay, e.g. use 2.0 to
- replay the events at twice the original speed).
+ The AuditReplayMapper is configured via configurations; `auditreplay.input-path`, `auditreplay.output-path` and
+ `auditreplay.num-threads` are required to specify the input path for audit log files, the output path for the results,
+ and the number of threads per map task. A number of map tasks equal to the number of files in `input-path` will be
+ launched; each task will read in one of these input files and use `num-threads` threads to replay the events contained
+ within that file. A best effort is made to faithfully replay the audit log events at the same pace at which they
+ originally occurred (optionally, this can be adjusted by specifying `auditreplay.rate-factor` which is a multiplicative
+ factor towards the rate of replay, e.g. use 2.0 to replay the events at twice the original speed).
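
As a rough illustration of the rate factor described above, the sketch below rescales the time offsets of audit events relative to the first event in a trace. The helper name `scaled_offsets` and the division-based scaling are assumptions made for illustration; this is not the AuditReplayMapper implementation, only one way to read "a multiplicative factor towards the rate of replay".

```python
def scaled_offsets(timestamps_ms, rate_factor=1.0):
    """Return each event's replay offset (ms) from the start of the replay.

    Hypothetical helper: dividing the original offset by rate_factor means
    a factor of 2.0 replays twice as fast and 0.5 replays at half speed.
    """
    if rate_factor <= 0:
        raise ValueError("rate_factor must be positive")
    if not timestamps_ms:
        return []
    start = timestamps_ms[0]
    return [(ts - start) / rate_factor for ts in timestamps_ms]


# Three events originally 500 ms and 2000 ms after the first one.
offsets = scaled_offsets([1000, 1500, 3000], rate_factor=2.0)
print(offsets)  # [0.0, 250.0, 1000.0]
```

With a factor of 2.0, an event that originally occurred 2000 ms into the trace would be issued 1000 ms into the replay under this reading.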
+
+ The AuditReplayMapper will output the benchmark results to a file `part-r-00000` in the output directory in CSV format.
+ Each line is in the format `user,type,operation,numops,cumulativelatency`, e.g. `hdfs,WRITE,MKDIRS,2,150`.

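
To make the result format concrete, here is a minimal sketch that reads lines in the `user,type,operation,numops,cumulativelatency` layout and computes a mean latency per operation. The function name `mean_latency_by_operation`, the assumption that `numops` and `cumulativelatency` are integers, and the second sample row (user `alice`) are all illustrative and not taken from the tool.

```python
import csv
import io
from collections import defaultdict


def mean_latency_by_operation(lines):
    """Aggregate rows of user,type,operation,numops,cumulativelatency.

    Returns {(type, operation): mean latency per op}, summed across users.
    Assumes numops and cumulativelatency parse as integers.
    """
    ops = defaultdict(int)
    latency = defaultdict(int)
    for row in csv.reader(lines):
        if not row:
            continue  # skip blank lines
        _user, op_type, operation, numops, cumulative = row
        key = (op_type, operation)
        ops[key] += int(numops)
        latency[key] += int(cumulative)
    return {key: latency[key] / ops[key] for key in ops if ops[key] > 0}


# First row is the example from the text; the second is made up.
sample = io.StringIO("hdfs,WRITE,MKDIRS,2,150\nalice,WRITE,MKDIRS,1,30\n")
print(mean_latency_by_operation(sample))  # {('WRITE', 'MKDIRS'): 60.0}
```

In practice the input would be the `part-r-00000` file from the output directory, for example after copying it to local disk with `hdfs dfs -get` and opening it with `open(...)`.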
#### Integrated Workload Launch
@@ -203,6 +207,7 @@ launch an integrated application with the same parameters as were used above, th
-block_list_path hdfs:///dyno/blocks
-workload_replay_enable
-workload_input_path hdfs:///dyno/audit_logs/
+ -workload_output_path hdfs:///dyno/results/
-workload_threads_per_mapper 50
-workload_start_delay 5m
```