diff --git a/docs/development/inference_performance_optimization.md b/docs/development/inference_performance_optimization.md index e60b4728a9f..74d2aa78f74 100644 --- a/docs/development/inference_performance_optimization.md +++ b/docs/development/inference_performance_optimization.md @@ -12,7 +12,7 @@ memory consumption compare to Python. DJL `Predictor` is not designed to be thread-safe (although some implementation is), we recommend creating a new [Predictor](https://javadoc.io/doc/ai.djl/api/latest/ai/djl/inference/Predictor.html) for each thread. -For a reference implementation, see [Multi-threaded Benchmark](https://github.com/deepjavalibrary/djl/blob/master/extensions/benchmark/src/main/java/ai/djl/benchmark/MultithreadedBenchmark.java). +For a reference implementation, see [Multi-threaded Benchmark](https://github.com/deepjavalibrary/djl-serving/blob/master/benchmark/src/main/java/ai/djl/benchmark/MultithreadedBenchmark.java). you need to set corresponding configuration based on the engine you want to use. @@ -111,10 +111,11 @@ This should only be disabled when you do not have the time to "warmup" a model w #### Multithreading Inference You can follow the same steps as other engines for running multithreading inference using TensorFlow engine. It's recommended to use one `Predictor` for each thread and avoid using a new `Predictor` for each inference call. -You can refer to our [Multithreading Benchmark](https://github.com/deepjavalibrary/djl/blob/master/extensions/benchmark/src/main/java/ai/djl/benchmark/MultithreadedBenchmark.java) as an example, +You can refer to our [Multithreading Benchmark](https://github.com/deepjavalibrary/djl-serving/blob/master/benchmark/src/main/java/ai/djl/benchmark/MultithreadedBenchmark.java) as an example, here is how to run it using TensorFlow engine. 
```bash +cd djl-serving ./gradlew benchmark --args='-e TensorFlow -c 100 -t -1 -u djl://ai.djl.tensorflow/resnet/0.0.1/resnet50 -s 1,224,224,3' ``` diff --git a/docs/mkdocs.yml b/docs/mkdocs.yml index c54afb223bf..fdaedd33340 100644 --- a/docs/mkdocs.yml +++ b/docs/mkdocs.yml @@ -81,7 +81,7 @@ nav: - 'docs/development/configure_logging.md' - 'docs/how_to_collect_metrics.md' - 'docs/development/inference_performance_optimization.md' - - 'extensions/benchmark/README.md' + - 'docs/serving/benchmark/README.md' - 'docs/development/profiler.md' - 'docs/development/cache_management.md' - 'docs/development/memory_management.md' diff --git a/extensions/benchmark/README.md b/extensions/benchmark/README.md index ab3a8ab0f2f..9e32483c872 100644 --- a/extensions/benchmark/README.md +++ b/extensions/benchmark/README.md @@ -13,310 +13,5 @@ With djl-bench, you can easily compare your model's behavior in different use ca - running with different engines - running with different version of the engine -djl-bench currently support benchmark the following type of models: -- PyTorch TorchScript model -- TensorFlow SavedModel bundle -- Apache MXNet model -- ONNX model -- PaddlePaddle model -- TFLite model -- TensorRT model -- XGBoost model -- Python script model -- Neo DLR (TVM) model - -You can build djl-bench from source if you need to benchmark fastText/BlazingText/Sentencepiece models. - -## Installation - -For macOS - -``` -brew install cask djl-bench -``` - -For Ubuntu - -- Install using snap - -``` -sudo snap install djlbench --classic -sudo snap alias djlbench djl-bench -``` - -- Or download .deb package from S3 - -``` -curl -O https://publish.djl.ai/djl-bench/0.17.0/djl-bench_0.17.0-1_all.deb -sudo dpkg -i djl-bench_0.17.0-1_all.deb -``` - -For centOS or Amazon Linux 2 - -You can download djl-bench zip file from [here](https://publish.djl.ai/djl-bench/0.17.0/benchmark-0.17.0.zip). 
- -``` -curl -O https://publish.djl.ai/djl-bench/0.17.0/benchmark-0.17.0.zip -unzip benchmark-0.17.0.zip -rm benchmark-0.17.0.zip -sudo ln -s $PWD/benchmark-0.17.0/bin/benchmark /usr/bin/djl-bench -``` - -For Windows - -We are considering to create a `chocolatey` package for Windows. For the time being, you can -download djl-bench zip file from [here](https://publish.djl.ai/djl-bench/0.17.0/benchmark-0.17.0.zip). - -Or you can run benchmark using gradle: - -``` -cd djl - -gradlew benchmark --args="--help" -``` - -## Prerequisite - -Please ensure Java 8+ is installed and you are using an OS that DJL supported with. - -After that, you need to clone the djl project and `cd` into the folder. - -DJL supported OS: - -- Ubuntu 18.04 and above -- Amazon Linux 2 and above -- MacOS latest version -- Windows 10 (Windows Server 2016+) - -If you are trying to use GPU, please ensure the CUDA driver is installed. You can verify that through: - -``` -nvcc -V -``` - -to checkout the version. For different Deep Learning engine you are trying to run the benchmark, -they have different CUDA version to support. Please check the individual Engine documentation to ensure your CUDA version is supported. - -## Sample benchmark script - -Here is a few sample benchmark script for you to refer. You can also skip this and directly follow -the 4-step instructions for your own model. 
- -Benchmark on a Tensorflow model from [tfhub](https://tfhub.dev/) url with all-zeros NDArray input for 10 times: - -``` -djl-bench -e TensorFlow -u https://tfhub.dev/tensorflow/resnet_50/classification/1 -c 10 -s 1,224,224,3 -``` - -Similarly, this is for PyTorch - -``` -djl-bench -e PyTorch -u https://alpha-djl-demos.s3.amazonaws.com/model/djl-blockrunner/pytorch_resnet18.zip -n traced_resnet18 -c 10 -s 1,3,224,224 -``` - -Benchmark a model from [ONNX Model Zoo](https://github.com/onnx/models) - -``` -djl-bench -e OnnxRuntime -u https://s3.amazonaws.com/onnx-model-zoo/resnet/resnet18v1/resnet18v1.tar.gz -s 1,3,224,224 -n resnet18v1/resnet18v1 -c 10 -``` - -### Benchmark from ModelZoo - -#### MXNet - -Resnet50 image classification model: - -``` -djl-bench -c 2 -s 1,3,224,224 -u djl://ai.djl.mxnet/resnet/0.0.1/resnet50_v2 -``` - -#### PyTorch - -SSD object detection model: - -``` -djl-bench -e PyTorch -c 2 -s 1,3,300,300 -u djl://ai.djl.pytorch/ssd/0.0.1/ssd_300_resnet50 -``` - -## Configuration of Benchmark script - -To start your benchmarking, we need to make sure we provide the following information. - -- The Deep Learning Engine -- The source of the model -- How many runs you would like to make -- Sample input for the model -- (Optional) Multi-thread benchmark - -The benchmark script located [here](https://github.com/deepjavalibrary/djl/blob/master/benchmark/src/main/java/ai/djl/benchmark/Benchmark.java). - -Just do the following: - -``` -djl-bench --help -``` - -This will print out the possible arguments to pass in: - -``` -usage: djl-bench [-p MODEL-PATH] -s INPUT-SHAPES [OPTIONS] - -c,--iteration Number of total iterations. - -d,--duration Duration of the test in minutes. - -e,--engine Choose an Engine for the benchmark. - -g,--gpus Number of GPUS to run multithreading inference. - -h,--help Print this help. - -l,--delay Delay of incremental threads. - --model-arguments Specify model loading arguments. - --model-options Specify model loading options. 
- -n,--model-name Specify model file name. - --neuron-cores Number of neuron cores to run multithreading inference, See - https://awsdocs-neuron.readthedocs-hosted.com. - -o,--output-dir Directory for output logs. - -p,--model-path Model directory file path. - -s,--input-shapes Input data shapes for the model. - -t,--threads Number of inference threads. - -u,--model-url Model archive file URL. -``` - -### Step 1: Pick your deep engine - -By default, the above script will use MXNet as the default Engine, but you can always change that by adding the followings: - -``` --e TensorFlow # TensorFlow --e PyTorch # PyTorch --e MXNet # Apache MXNet --e PaddlePaddle # PaddlePaddle --e OnnxRuntime # pytorch --e TFLite # TFLite --e TensorRT # TensorRT --e DLR # Neo DLR --e XGBoost # XGBoost --e Python # Python script -``` - -### Step 2: Identify the source of your model - -DJL accept variety of models came from different places. - -#### Remote location - -Use `--model-url` option to load a model from a URL. The URL must point to an archive file. - -The following is a pytorch model - -``` --u https://alpha-djl-demos.s3.amazonaws.com/model/djl-blockrunner/pytorch_resnet18.zip -``` -We would recommend to make model files in a zip for better file tracking. - -#### Local directory - -Use `--model-path` option to load model from a local directory or an archive file. - -Mac/Linux - -``` --p /home/ubuntu/models/pytorch_resnet18 -or --p /home/ubuntu/models/pytorch_resnet18.zip -``` - -Windows - -``` --p C:\models\pytorch_resnet18 -or --p C:\models\pytorch_resnet18.zip -``` - -If the model file name is different from the parent folder name (or the archive file name), you need -to specify `--model-name` in the `--args`: - -``` --n traced_resnet18 -``` - -### Step 3: Define how many runs you would like to make - -add `-c` inside with a number - -``` --c 1000 -``` - -This will run 1000 times inference. 
- -### Step 4: Define your model inputs - -The benchmark script uses dummy NDArray inputs. -It will make fake NDArrays (like `NDArray.ones`) to feed in the model for inference. - -If we would like to fake an image: - -``` --s 1,3,224,224 -``` - -This will create a NDArray (DataType FLOAT32) of shape(1, 3, 224, 224). - -If your model requires multiple inputs like three NDArrays with shape 1, 384 and 384. You can do the followings: - -``` --s (1),(384),(384) -``` - -If you input `DataType` is not FLOAT32, you can specify the data type with suffix: - -- f: FLOAT32, this is default and is optional -- s: FLOAT16 (short float) -- d: FLOAT64 (double) -- u: UINT8 (unsigned byte) -- b: INT8 (byte) -- i: INT32 (int) -- l: INT64 (long) -- B: BOOLEAN (boolean) - -For example: - -``` --s (1)i,(384)f,(384) -``` - -### Optional Step: multithreading inference - -You can also do multi-threading inference with DJL. For example, if you would like to run the inference with 10 threads: - -``` --t 10 -``` - -Best thread number for your system: The same number of cores your system have or double of the total cores. - -You can also add `-l` to simulate the increment load for your inference server. It will add threads with the delay of time. - -``` --t 10 -l 100 -``` - -The above code will create 10 threads with the wait time of 100ms. - -## Advanced use cases - -For different purposes, we designed different mode you can play with. Such as the following arg: - -``` --d 86400 -``` - -This will ask the benchmark script repeatedly running the designed task for 86400 seconds (24 hour). -If you would like to make sure DJL is stable in the long run, you can do that. - -You can also keep monitoring the DJL memory usages by enable the following flag: - -``` -export BENCHMARK_OPTS="-Dcollect-memory=true" -``` - -The memory report will be made available in `build/memory.log`. 
+**This module has been moved to [deepjavalibrary/djl-serving/benchmark](https://github.com/deepjavalibrary/djl-serving/tree/master/benchmark).** diff --git a/extensions/benchmark/build.gradle b/extensions/benchmark/build.gradle deleted file mode 100644 index fe3a54aa332..00000000000 --- a/extensions/benchmark/build.gradle +++ /dev/null @@ -1,151 +0,0 @@ -plugins { - id 'application' - id "nebula.ospackage" version "9.0.0" -} - -boolean isRelease = project.hasProperty("release") || project.hasProperty("staging") - -dependencies { - implementation "commons-cli:commons-cli:${commons_cli_version}" - implementation "org.apache.logging.log4j:log4j-slf4j-impl:${log4j_slf4j_version}" - if (isRelease) { - implementation platform("ai.djl:bom:${djl_version}") - - implementation "ai.djl:model-zoo" - runtimeOnly "ai.djl.pytorch:pytorch-model-zoo" - runtimeOnly "ai.djl.tensorflow:tensorflow-model-zoo" - runtimeOnly "ai.djl.mxnet:mxnet-model-zoo" - runtimeOnly "ai.djl.paddlepaddle:paddlepaddle-model-zoo" - runtimeOnly "ai.djl.onnxruntime:onnxruntime-engine" - runtimeOnly "ai.djl.tflite:tflite-engine" - runtimeOnly "ai.djl.dlr:dlr-engine" - runtimeOnly "ai.djl.ml.xgboost:xgboost" - runtimeOnly "ai.djl.python:python" - runtimeOnly "ai.djl.tensorrt:tensorrt" - } else { - implementation project(":model-zoo") - - runtimeOnly project(":engines:pytorch:pytorch-model-zoo") - runtimeOnly project(":engines:tensorflow:tensorflow-model-zoo") - runtimeOnly project(":engines:mxnet:mxnet-model-zoo") - runtimeOnly project(":engines:paddlepaddle:paddlepaddle-model-zoo") - - runtimeOnly project(":engines:tflite:tflite-engine") - runtimeOnly project(":engines:tensorrt") - ProcessBuilder pb = new ProcessBuilder("nvidia-smi", "-L") - def hasGPU = false; - try { - Process process = pb.start() - hasGPU = process.waitFor() == 0 - } catch (IOException ignore) { - } - - if (hasGPU) { - runtimeOnly(project(":engines:onnxruntime:onnxruntime-engine")) { - exclude group: "com.microsoft.onnxruntime", module: 
"onnxruntime" - } - runtimeOnly "com.microsoft.onnxruntime:onnxruntime_gpu:${onnxruntime_version}" - } else { - runtimeOnly project(":engines:onnxruntime:onnxruntime-engine") - } - - runtimeOnly project(":engines:dlr:dlr-engine") - runtimeOnly(project(":engines:ml:xgboost")) { - exclude group: "ml.dmlc", module: "xgboost4j_2.12" - } - } - - testImplementation("org.testng:testng:${testng_version}") { - exclude group: "junit", module: "junit" - } -} - -application { - mainClass = System.getProperty("main", "ai.djl.benchmark.Benchmark") -} - -run { - environment("TF_CPP_MIN_LOG_LEVEL", "1") // turn off TensorFlow print out - systemProperties System.getProperties() - systemProperties.remove("user.dir") - systemProperty("file.encoding", "UTF-8") -} - -task benchmark(type: JavaExec) { - environment("TF_CPP_MIN_LOG_LEVEL", "1") // turn off TensorFlow print out - List arguments = gradle.startParameter["taskRequests"]["args"].getAt(0) - for (String argument : arguments) { - if (argument.trim().startsWith("--args")) { - String[] line = argument.split("=", 2) - if (line.length == 2) { - line = line[1].split(" ") - if (line.contains("-t")) { - if (System.getProperty("ai.djl.default_engine") == "TensorFlow") { - environment("OMP_NUM_THREADS", "1") - environment("TF_NUM_INTRAOP_THREADS", "1") - } else { - environment("MXNET_ENGINE_TYPE", "NaiveEngine") - environment("OMP_NUM_THREADS", "1") - } - } - break - } - } - } - - systemProperties System.getProperties() - systemProperties.remove("user.dir") - systemProperty("file.encoding", "UTF-8") - classpath = sourceSets.main.runtimeClasspath - // restrict the jvm heap size for better monitoring benchmark - jvmArgs = ["-Xmx2g"] - if (Boolean.getBoolean("loggc")) { - if (JavaVersion.current() == JavaVersion.VERSION_1_8) { - jvmArgs += ["-XX:+PrintGCTimeStamps", "-Xloggc:build/gc.log"] - } else { - jvmArgs += ["-Xlog:gc*=debug:file=build/gc.log"] - } - } - mainClass = "ai.djl.benchmark.Benchmark" -} - -task createDeb(type: Deb, 
dependsOn: distTar) { - doFirst { - exec { - commandLine "tar", "xvf", "${project.buildDir}/distributions/benchmark-${project.version}.tar", "-C", "${project.buildDir}" - } - } - - packageName = "djl-bench" - archiveVersion = "${djl_version}" - release = 1 - maintainer = "Deep Java Library " - summary = "djl-bench is a command line tool that allows you to benchmark the\n" + - " model on all different platforms for single-thread/multi-thread\n" + - " inference performance." - - from("${project.buildDir}/benchmark-${project.version}") { - into "/usr/local/djl-bench-${djl_version}" - } - link("/usr/bin/djl-bench", "/usr/local/djl-bench-${djl_version}/bin/benchmark") -} - -startScripts { - defaultJvmOpts = [] - doLast { - String replacement = 'CLASSPATH=\\$APP_HOME/lib/*\n\n' + - 'if [[ "\\$*" == *-t* || "\\$*" == *--threads* ]]\n' + - 'then\n' + - ' export TF_CPP_MIN_LOG_LEVEL=1\n' + - ' export MXNET_ENGINE_TYPE=NaiveEngine\n' + - ' export OMP_NUM_THREADS=1\n' + - ' export TF_NUM_INTRAOP_THREADS=1\n' + - 'fi' - - String text = unixScript.text.replaceAll('CLASSPATH=\\$APP_HOME/lib/.*', replacement) - text = text.replaceAll("/usr/bin/env sh", "/usr/bin/env bash") - text = text.replaceAll("#!/bin/sh", "#!/bin/bash") - - unixScript.text = text - } -} diff --git a/extensions/benchmark/gradle b/extensions/benchmark/gradle deleted file mode 120000 index 1ce6c4c1ed0..00000000000 --- a/extensions/benchmark/gradle +++ /dev/null @@ -1 +0,0 @@ -../../gradle \ No newline at end of file diff --git a/extensions/benchmark/gradlew b/extensions/benchmark/gradlew deleted file mode 120000 index 343e0d2caa4..00000000000 --- a/extensions/benchmark/gradlew +++ /dev/null @@ -1 +0,0 @@ -../../gradlew \ No newline at end of file diff --git a/extensions/benchmark/snapcraft/snapcraft.yaml b/extensions/benchmark/snapcraft/snapcraft.yaml deleted file mode 100644 index afa85c747db..00000000000 --- a/extensions/benchmark/snapcraft/snapcraft.yaml +++ /dev/null @@ -1,44 +0,0 @@ -name: djlbench 
-version: '0.17.0' -title: DJL Benhmark -license: Apache-2.0 -summary: A machine learning benchmarking toolkit -description: | - djlbench is a command line tool that allows you to benchmark the - model on all different platforms for single-thread/multi-thread - inference performance. - - Currently djlbench support the models from the following framework: - - PyTorch - - TensorFlow - - Apachmark MXNet - - PaddlePaddle - - ONNXRuntime - - TensorRT - - TensorFlow Lite - - Neo DLR - - XGBoost - - Python - -base: core18 -grade: stable -confinement: classic - -apps: - djlbench: - command: benchmark-$SNAPCRAFT_PROJECT_VERSION/bin/benchmark - environment: - JAVA_HOME: "$SNAP/usr/lib/jvm/java-11-openjdk-amd64" - PATH: "$SNAP/bin:$PATH:$SNAP/usr/lib/jvm/java-11-openjdk-amd64/bin" - -parts: - djlbench: - plugin: gradle - source: https://github.com/deepjavalibrary/djl.git - source-tag: v$SNAPCRAFT_PROJECT_VERSION - gradle-output-dir: extensions/benchmark/build/libs - gradle-options: [ -Pstaging, ':extensions:benchmark:dT' ] - override-build: | - snapcraftctl build - tar xvf $SNAPCRAFT_PART_BUILD/extensions/benchmark/build/distributions/benchmark-*.tar -C $SNAPCRAFT_PART_INSTALL/ - rm -rf $SNAPCRAFT_PART_INSTALL/jar diff --git a/extensions/benchmark/src/main/java/ai/djl/benchmark/AbstractBenchmark.java b/extensions/benchmark/src/main/java/ai/djl/benchmark/AbstractBenchmark.java deleted file mode 100644 index 8ef27cb1fe7..00000000000 --- a/extensions/benchmark/src/main/java/ai/djl/benchmark/AbstractBenchmark.java +++ /dev/null @@ -1,299 +0,0 @@ -/* - * Copyright 2019 Amazon.com, Inc. or its affiliates. All Rights Reserved. - * - * Licensed under the Apache License, Version 2.0 (the "License"). You may not use this file except in compliance - * with the License. A copy of the License is located at - * - * http://aws.amazon.com/apache2.0/ - * - * or in the "license" file accompanying this file. 
This file is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES - * OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions - * and limitations under the License. - */ -package ai.djl.benchmark; - -import ai.djl.Device; -import ai.djl.ModelException; -import ai.djl.engine.Engine; -import ai.djl.metric.Metrics; -import ai.djl.metric.Unit; -import ai.djl.ndarray.NDList; -import ai.djl.ndarray.types.DataType; -import ai.djl.ndarray.types.Shape; -import ai.djl.repository.zoo.Criteria; -import ai.djl.repository.zoo.ZooModel; -import ai.djl.training.listener.MemoryTrainingListener; -import ai.djl.training.util.ProgressBar; -import ai.djl.translate.NoBatchifyTranslator; -import ai.djl.translate.TranslateException; -import ai.djl.translate.TranslatorContext; -import ai.djl.util.Pair; -import ai.djl.util.PairList; - -import org.apache.commons.cli.CommandLine; -import org.apache.commons.cli.DefaultParser; -import org.apache.commons.cli.Options; -import org.apache.commons.cli.ParseException; -import org.slf4j.Logger; -import org.slf4j.LoggerFactory; - -import java.io.IOException; -import java.nio.FloatBuffer; -import java.time.Duration; - -/** Abstract benchmark class. */ -public abstract class AbstractBenchmark { - - private static final Logger logger = LoggerFactory.getLogger(AbstractBenchmark.class); - - protected ProgressBar progressBar; - - /** - * Abstract predict method that must be implemented by sub class. - * - * @param arguments command line arguments - * @param metrics {@link Metrics} to collect statistic information - * @param iteration number of prediction iteration to run - * @return prediction result - * @throws IOException if io error occurs when loading model. 
- * @throws ModelException if specified model not found or there is a parameter error - * @throws TranslateException if error occurs when processing input or output - * @throws ClassNotFoundException if input or output class cannot be loaded - */ - protected abstract float[] predict(Arguments arguments, Metrics metrics, int iteration) - throws IOException, ModelException, TranslateException, ClassNotFoundException; - - /** - * Execute benchmark. - * - * @param args input raw arguments - * @return if example execution complete successfully - */ - public final boolean runBenchmark(String[] args) { - Options options = Arguments.getOptions(); - try { - if (Arguments.hasHelp(args)) { - Arguments.printHelp( - "usage: djl-bench [-p MODEL-PATH] -s INPUT-SHAPES [OPTIONS]", options); - return true; - } - DefaultParser parser = new DefaultParser(); - CommandLine cmd = parser.parse(options, args, null, false); - Arguments arguments = new Arguments(cmd); - String engineName = arguments.getEngine(); - Engine engine = Engine.getEngine(engineName); - - long init = System.nanoTime(); - String version = engine.getVersion(); - long loaded = System.nanoTime(); - logger.info( - String.format( - "Load %s (%s) in %.3f ms.", - engineName, version, (loaded - init) / 1_000_000f)); - Duration duration = Duration.ofSeconds(arguments.getDuration()); - Object devices; - if (this instanceof MultithreadedBenchmark) { - devices = engine.getDevices(arguments.getMaxGpus()); - } else { - devices = engine.defaultDevice(); - } - - if (arguments.getDuration() != 0) { - logger.info( - "Running {} on: {}, duration: {} minutes.", - getClass().getSimpleName(), - devices, - duration.toMinutes()); - } else { - logger.info("Running {} on: {}.", getClass().getSimpleName(), devices); - } - int numOfThreads = arguments.getThreads(); - int iteration = arguments.getIteration(); - if (this instanceof MultithreadedBenchmark) { - int expected = 10 * numOfThreads; - if (iteration < expected) { - iteration = expected; - 
logger.info( - "Iteration is too small for multi-threading benchmark. Adjust to: {}", - iteration); - } - } - while (!duration.isNegative()) { - Metrics metrics = new Metrics(); // Reset Metrics for each test loop. - progressBar = new ProgressBar("Iteration", iteration); - float[] lastResult = predict(arguments, metrics, iteration); - if (lastResult == null) { - return false; - } - - long begin = metrics.getMetric("start").get(0).getValue().longValue(); - long end = metrics.getMetric("end").get(0).getValue().longValue(); - long totalTime = end - begin; - - if (lastResult.length > 3) { - logger.info( - "Inference result: [{}, {}, {} ...]", - lastResult[0], - lastResult[1], - lastResult[2]); - } else { - logger.info("Inference result: {}", lastResult); - } - - String throughput = String.format("%.2f", iteration * 1000d / totalTime); - logger.info( - "Throughput: {}, completed {} iteration in {} ms.", - throughput, - iteration, - totalTime); - - if (metrics.hasMetric("LoadModel")) { - long loadModelTime = - metrics.getMetric("LoadModel").get(0).getValue().longValue(); - logger.info( - "Model loading time: {} ms.", - String.format("%.3f", loadModelTime / 1000f)); - } - - if (metrics.hasMetric("Inference") && iteration > 1) { - float totalP50 = metrics.percentile("Total", 50).getValue().longValue() / 1000f; - float totalP90 = metrics.percentile("Total", 90).getValue().longValue() / 1000f; - float totalP99 = metrics.percentile("Total", 99).getValue().longValue() / 1000f; - float p50 = metrics.percentile("Inference", 50).getValue().longValue() / 1000f; - float p90 = metrics.percentile("Inference", 90).getValue().longValue() / 1000f; - float p99 = metrics.percentile("Inference", 99).getValue().longValue() / 1000f; - float preP50 = - metrics.percentile("Preprocess", 50).getValue().longValue() / 1000f; - float preP90 = - metrics.percentile("Preprocess", 90).getValue().longValue() / 1000f; - float preP99 = - metrics.percentile("Preprocess", 99).getValue().longValue() / 1000f; 
- float postP50 = - metrics.percentile("Postprocess", 50).getValue().longValue() / 1000f; - float postP90 = - metrics.percentile("Postprocess", 90).getValue().longValue() / 1000f; - float postP99 = - metrics.percentile("Postprocess", 99).getValue().longValue() / 1000f; - logger.info( - String.format( - "total P50: %.3f ms, P90: %.3f ms, P99: %.3f ms", - totalP50, totalP90, totalP99)); - logger.info( - String.format( - "inference P50: %.3f ms, P90: %.3f ms, P99: %.3f ms", - p50, p90, p99)); - logger.info( - String.format( - "preprocess P50: %.3f ms, P90: %.3f ms, P99: %.3f ms", - preP50, preP90, preP99)); - logger.info( - String.format( - "postprocess P50: %.3f ms, P90: %.3f ms, P99: %.3f ms", - postP50, postP90, postP99)); - - if (Boolean.getBoolean("collect-memory")) { - float heapBeforeModel = - metrics.getMetric("Heap").get(0).getValue().longValue(); - float heapBeforeInference = - metrics.getMetric("Heap").get(1).getValue().longValue(); - float heap = metrics.percentile("Heap", 90).getValue().longValue(); - float nonHeap = metrics.percentile("NonHeap", 90).getValue().longValue(); - int mb = 1024 * 1024; - logger.info(String.format("heap (base): %.3f MB", heapBeforeModel / mb)); - logger.info( - String.format("heap (model): %.3f MB", heapBeforeInference / mb)); - logger.info(String.format("heap P90: %.3f MB", heap / mb)); - logger.info(String.format("nonHeap P90: %.3f MB", nonHeap / mb)); - - if (!System.getProperty("os.name").startsWith("Win")) { - float rssBeforeModel = - metrics.getMetric("rss").get(0).getValue().longValue(); - float rssBeforeInference = - metrics.getMetric("rss").get(1).getValue().longValue(); - float rss = metrics.percentile("rss", 90).getValue().longValue(); - float cpu = metrics.percentile("cpu", 90).getValue().longValue(); - logger.info(String.format("cpu P90: %.3f %%", cpu)); - logger.info(String.format("rss (base): %.3f MB", rssBeforeModel / mb)); - logger.info( - String.format("rss (model): %.3f MB", rssBeforeInference / mb)); - 
logger.info(String.format("rss P90: %.3f MB", rss / mb)); - } - } - } - MemoryTrainingListener.dumpMemoryInfo(metrics, arguments.getOutputDir()); - long delta = System.currentTimeMillis() - begin; - duration = duration.minus(Duration.ofMillis(delta)); - if (!duration.isNegative()) { - logger.info(duration.toMinutes() + " minutes left"); - } - } - return true; - } catch (ParseException e) { - Arguments.printHelp(e.getMessage(), options); - } catch (TranslateException | ModelException | IOException | ClassNotFoundException t) { - logger.error("Unexpected error", t); - } - return false; - } - - protected ZooModel loadModel(Arguments arguments, Metrics metrics, Device device) - throws ModelException, IOException { - long begin = System.nanoTime(); - PairList shapes = arguments.getInputShapes(); - BenchmarkTranslator translator = new BenchmarkTranslator(shapes); - - Criteria criteria = - Criteria.builder() - .setTypes(Void.class, float[].class) - .optModelUrls(arguments.getModelUrl()) - .optModelName(arguments.getModelName()) - .optEngine(arguments.getEngine()) - .optOptions(arguments.getModelOptions()) - .optArguments(arguments.getModelArguments()) - .optDevice(device) - .optTranslator(translator) - .optProgress(new ProgressBar()) - .build(); - - ZooModel model = criteria.loadModel(); - if (device == Device.cpu() || device == Device.gpu()) { - long delta = (System.nanoTime() - begin) / 1000; - logger.info( - "Model {} loaded in: {} ms.", - model.getName(), - String.format("%.3f", delta / 1000f)); - metrics.addMetric("LoadModel", delta, Unit.MICROSECONDS); - } - return model; - } - - private static final class BenchmarkTranslator implements NoBatchifyTranslator { - - private PairList shapes; - - public BenchmarkTranslator(PairList shapes) { - this.shapes = shapes; - } - - /** {@inheritDoc} */ - @Override - public NDList processInput(TranslatorContext ctx, Void input) { - NDList list = new NDList(); - for (Pair pair : shapes) { - DataType dataType = pair.getKey(); - 
Shape shape = pair.getValue(); - list.add(ctx.getNDManager().zeros(shape, dataType)); - } - return list; - } - - /** {@inheritDoc} */ - @Override - public float[] processOutput(TranslatorContext ctx, NDList list) { - FloatBuffer fb = list.get(0).toByteBuffer().asFloatBuffer(); - float[] ret = new float[fb.remaining()]; - fb.get(ret); - return ret; - } - } -} diff --git a/extensions/benchmark/src/main/java/ai/djl/benchmark/Arguments.java b/extensions/benchmark/src/main/java/ai/djl/benchmark/Arguments.java deleted file mode 100644 index 4a96e934fe5..00000000000 --- a/extensions/benchmark/src/main/java/ai/djl/benchmark/Arguments.java +++ /dev/null @@ -1,340 +0,0 @@ -/* - * Copyright 2019 Amazon.com, Inc. or its affiliates. All Rights Reserved. - * - * Licensed under the Apache License, Version 2.0 (the "License"). You may not use this file except in compliance - * with the License. A copy of the License is located at - * - * http://aws.amazon.com/apache2.0/ - * - * or in the "license" file accompanying this file. This file is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES - * OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions - * and limitations under the License. - */ -package ai.djl.benchmark; - -import ai.djl.Device; -import ai.djl.engine.Engine; -import ai.djl.ndarray.types.DataType; -import ai.djl.ndarray.types.Shape; -import ai.djl.util.PairList; - -import org.apache.commons.cli.CommandLine; -import org.apache.commons.cli.HelpFormatter; -import org.apache.commons.cli.Option; -import org.apache.commons.cli.OptionGroup; -import org.apache.commons.cli.Options; -import org.slf4j.Logger; -import org.slf4j.LoggerFactory; - -import java.io.IOException; -import java.nio.file.Path; -import java.nio.file.Paths; -import java.util.Arrays; -import java.util.List; -import java.util.Map; -import java.util.concurrent.ConcurrentHashMap; - -/** A class represents parsed command line arguments. 
*/ -public class Arguments { - - private static final Logger logger = LoggerFactory.getLogger(Arguments.class); - - private String modelUrl; - private String modelName; - private String engine; - private String modelOptions; - private String modelArguments; - private String outputDir; - private int duration; - private int iteration; - private int threads; - private int maxGpus; - private int neuronCores; - private int delay; - private PairList inputShapes; - - /** - * Constructs a {@code Arguments} instance. - * - * @param cmd command line options - */ - Arguments(CommandLine cmd) { - if (cmd.hasOption("model-path")) { - String modelPath = cmd.getOptionValue("model-path"); - Path path = Paths.get(modelPath); - try { - modelUrl = path.toUri().toURL().toExternalForm(); - } catch (IOException e) { - throw new IllegalArgumentException("Invalid model-path: " + modelPath, e); - } - } else if (cmd.hasOption("model-url")) { - modelUrl = cmd.getOptionValue("model-url"); - } - - modelName = cmd.getOptionValue("model-name"); - modelOptions = cmd.getOptionValue("model-options"); - modelArguments = cmd.getOptionValue("model-arguments"); - outputDir = cmd.getOptionValue("output-dir"); - - if (cmd.hasOption("engine")) { - engine = cmd.getOptionValue("engine"); - } else { - engine = Engine.getDefaultEngineName(); - } - - if (cmd.hasOption("duration")) { - duration = Integer.parseInt(cmd.getOptionValue("duration")); - } - iteration = 1; - if (cmd.hasOption("iteration")) { - iteration = Integer.parseInt(cmd.getOptionValue("iteration")); - } - if (cmd.hasOption("gpus")) { - maxGpus = Integer.parseInt(cmd.getOptionValue("gpus")); - if (maxGpus < 0) { - maxGpus = Integer.MAX_VALUE; - } - } else { - maxGpus = Integer.MAX_VALUE; - } - if (cmd.hasOption("neuron-cores")) { - neuronCores = Integer.parseInt(cmd.getOptionValue("neuron-cores")); - } - if (cmd.hasOption("threads")) { - threads = Integer.parseInt(cmd.getOptionValue("threads")); - Engine eng = Engine.getEngine(engine); - Device[] 
devices = eng.getDevices(maxGpus);
-            if (devices[0].isGpu()) {
-                // one thread per GPU
-                if (threads <= 0) {
-                    threads = devices.length;
-                } else if (threads < devices.length) {
-                    threads = devices.length;
-                    logger.warn(
-                            "Number of threads is less than GPU count, adjust to: {}",
-                            devices.length);
-                } else if ("MXNet".equals(engine) && threads > devices.length) {
-                    threads = devices.length;
-                    logger.warn("MXNet inference can only have one worker per GPU.");
-                } else if (threads % devices.length != 0) {
-                    threads = threads / devices.length * devices.length;
-                    logger.warn("threads should be multiple of GPU count, change to: {}", threads);
-                }
-            } else if (threads <= 0) {
-                threads = Runtime.getRuntime().availableProcessors();
-            }
-        }
-        if (cmd.hasOption("delay")) {
-            delay = Integer.parseInt(cmd.getOptionValue("delay"));
-        }
-
-        String shape = cmd.getOptionValue("input-shapes");
-        inputShapes = NDListGenerator.parseShape(shape);
-    }
-
-    static Options getOptions() {
-        Options options = new Options();
-        options.addOption(
-                Option.builder("h").longOpt("help").hasArg(false).desc("Print this help.").build());
-        OptionGroup artifactGroup = new OptionGroup();
-        artifactGroup.setRequired(true);
-        artifactGroup.addOption(
-                Option.builder("p")
-                        .longOpt("model-path")
-                        .hasArg()
-                        .argName("MODEL-PATH")
-                        .desc("Model directory file path.")
-                        .build());
-        artifactGroup.addOption(
-                Option.builder("u")
-                        .longOpt("model-url")
-                        .hasArg()
-                        .argName("MODEL-URL")
-                        .desc("Model archive file URL.")
-                        .build());
-        options.addOptionGroup(artifactGroup);
-        options.addOption(
-                Option.builder("n")
-                        .longOpt("model-name")
-                        .hasArg()
-                        .argName("MODEL-NAME")
-                        .desc("Specify model file name.")
-                        .build());
-        options.addOption(
-                Option.builder()
-                        .longOpt("model-options")
-                        .hasArg()
-                        .argName("MODEL-OPTIONS")
-                        .desc("Specify model loading options.")
-                        .build());
-        options.addOption(
-                Option.builder()
-                        .longOpt("model-arguments")
-                        .hasArg()
-                        .argName("MODEL-ARGUMENTS")
-
.desc("Specify model loading arguments.")
-                        .build());
-        options.addOption(
-                Option.builder("e")
-                        .longOpt("engine")
-                        .hasArg()
-                        .argName("ENGINE-NAME")
-                        .desc("Choose an Engine for the benchmark.")
-                        .build());
-        options.addOption(
-                Option.builder("s")
-                        .required()
-                        .longOpt("input-shapes")
-                        .hasArg()
-                        .argName("INPUT-SHAPES")
-                        .desc("Input data shapes for the model.")
-                        .build());
-        options.addOption(
-                Option.builder("d")
-                        .longOpt("duration")
-                        .hasArg()
-                        .argName("DURATION")
-                        .desc("Duration of the test in minutes.")
-                        .build());
-        options.addOption(
-                Option.builder("c")
-                        .longOpt("iteration")
-                        .hasArg()
-                        .argName("ITERATION")
-                        .desc("Number of total iterations.")
-                        .build());
-        options.addOption(
-                Option.builder("t")
-                        .longOpt("threads")
-                        .hasArg()
-                        .argName("NUMBER_THREADS")
-                        .desc("Number of inference threads.")
-                        .build());
-        OptionGroup deviceGroup = new OptionGroup();
-        deviceGroup.addOption(
-                Option.builder("g")
-                        .longOpt("gpus")
-                        .hasArg()
-                        .argName("NUMBER_GPUS")
-                        .desc("Number of GPUS to run multithreading inference.")
-                        .build());
-        deviceGroup.addOption(
-                Option.builder()
-                        .longOpt("neuron-cores")
-                        .hasArg()
-                        .argName("NEURON-CORES")
-                        .desc(
-                                "Number of neuron cores to run multithreading inference, See"
-                                        + " https://awsdocs-neuron.readthedocs-hosted.com.")
-                        .build());
-        options.addOptionGroup(deviceGroup);
-        options.addOption(
-                Option.builder("l")
-                        .longOpt("delay")
-                        .hasArg()
-                        .argName("DELAY")
-                        .desc("Delay of incremental threads.")
-                        .build());
-        options.addOption(
-                Option.builder("o")
-                        .longOpt("output-dir")
-                        .hasArg()
-                        .argName("OUTPUT-DIR")
-                        .desc("Directory for output logs.")
-                        .build());
-        return options;
-    }
-
-    static boolean hasHelp(String[] args) {
-        List<String> list = Arrays.asList(args);
-        return list.contains("-h") || list.contains("--help");
-    }
-
-    static void printHelp(String msg, Options options) {
-        HelpFormatter formatter = new HelpFormatter();
-        formatter.setSyntaxPrefix("");
-
formatter.setLeftPadding(1);
-        formatter.setWidth(120);
-        formatter.printHelp(msg, options);
-    }
-
-    int getDuration() {
-        return duration;
-    }
-
-    String getEngine() {
-        return engine;
-    }
-
-    String getModelUrl() {
-        return modelUrl;
-    }
-
-    String getModelName() {
-        return modelName;
-    }
-
-    Map getModelOptions() {
-        if (modelOptions == null) {
-            return null;
-        }
-        Map map = new ConcurrentHashMap<>();
-        for (String option : modelOptions.split(",")) {
-            String[] tokens = option.split("=", 2);
-            if (tokens.length == 2) {
-                map.put(tokens[0].trim(), tokens[1].trim());
-            } else {
-                map.put(tokens[0].trim(), "");
-            }
-        }
-        return map;
-    }
-
-    Map getModelArguments() {
-        if (modelArguments == null) {
-            return null;
-        }
-
-        Map map = new ConcurrentHashMap<>();
-        for (String option : modelArguments.split(",")) {
-            String[] tokens = option.split("=", 2);
-            if (tokens.length == 2) {
-                map.put(tokens[0].trim(), tokens[1].trim());
-            } else {
-                map.put(tokens[0].trim(), "");
-            }
-        }
-        return map;
-    }
-
-    int getIteration() {
-        return iteration;
-    }
-
-    int getThreads() {
-        return threads;
-    }
-
-    int getMaxGpus() {
-        return maxGpus;
-    }
-
-    int getNeuronCores() {
-        return neuronCores;
-    }
-
-    String getOutputDir() {
-        if (outputDir == null) {
-            outputDir = "build";
-        }
-        return outputDir;
-    }
-
-    int getDelay() {
-        return delay;
-    }
-
-    PairList<DataType, Shape> getInputShapes() {
-        return inputShapes;
-    }
-}
diff --git a/extensions/benchmark/src/main/java/ai/djl/benchmark/Benchmark.java b/extensions/benchmark/src/main/java/ai/djl/benchmark/Benchmark.java
deleted file mode 100644
index 3fad6c37798..00000000000
--- a/extensions/benchmark/src/main/java/ai/djl/benchmark/Benchmark.java
+++ /dev/null
@@ -1,119 +0,0 @@
-/*
- * Copyright 2019 Amazon.com, Inc. or its affiliates. All Rights Reserved.
- *
- * Licensed under the Apache License, Version 2.0 (the "License"). You may not use this file except in compliance
- * with the License.
A copy of the License is located at
- *
- * http://aws.amazon.com/apache2.0/
- *
- * or in the "license" file accompanying this file. This file is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES
- * OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions
- * and limitations under the License.
- */
-package ai.djl.benchmark;
-
-import ai.djl.Device;
-import ai.djl.ModelException;
-import ai.djl.engine.Engine;
-import ai.djl.engine.EngineException;
-import ai.djl.inference.Predictor;
-import ai.djl.metric.Metrics;
-import ai.djl.metric.Unit;
-import ai.djl.repository.zoo.ZooModel;
-import ai.djl.training.listener.MemoryTrainingListener;
-import ai.djl.translate.TranslateException;
-
-import org.slf4j.Logger;
-import org.slf4j.LoggerFactory;
-
-import java.io.IOException;
-import java.util.Arrays;
-import java.util.List;
-
-/** A class runs single threaded benchmark. */
-public final class Benchmark extends AbstractBenchmark {
-
-    private static final Logger logger = LoggerFactory.getLogger(Benchmark.class);
-
-    /**
-     * Main entry point.
-     *
-     * @param args command line arguments
-     */
-    public static void main(String[] args) {
-        List<String> list = Arrays.asList(args);
-        try {
-            boolean success;
-            if (!list.isEmpty() && "ndlist-gen".equals(list.get(0))) {
-                success = NDListGenerator.generate(Arrays.copyOfRange(args, 1, args.length));
-            } else {
-                boolean multithreading = list.contains("-t") || list.contains("--threads");
-                configEngines(multithreading);
-                if (multithreading) {
-                    success = new MultithreadedBenchmark().runBenchmark(args);
-                } else {
-                    success = new Benchmark().runBenchmark(args);
-                }
-            }
-            if (!success) {
-                System.exit(-1); // NOPMD
-            }
-        } catch (EngineException e) {
-            String osName = System.getProperty("os.name");
-            String arch = System.getProperty("os.arch");
-            logger.warn("Engine is not supported on {}:{}.", osName, arch);
-            logger.debug("Failed to load engine", e);
-        }
-    }
-
-    /** {@inheritDoc} */
-    @Override
-    public float[] predict(Arguments arguments, Metrics metrics, int iteration)
-            throws IOException, ModelException, TranslateException {
-        Device device = Engine.getEngine(arguments.getEngine()).defaultDevice();
-        try (ZooModel<Void, float[]> model = loadModel(arguments, metrics, device)) {
-            float[] predictResult = null;
-
-            try (Predictor<Void, float[]> predictor = model.newPredictor()) {
-                predictor.predict(null); // warmup
-
-                predictor.setMetrics(metrics); // Let predictor collect metrics
-                metrics.addMetric("start", System.currentTimeMillis(), Unit.MILLISECONDS);
-                for (int i = 0; i < iteration; ++i) {
-                    predictResult = predictor.predict(null);
-
-                    progressBar.update(i);
-                    MemoryTrainingListener.collectMemoryInfo(metrics);
-                }
-                metrics.addMetric("end", System.currentTimeMillis(), Unit.MILLISECONDS);
-            }
-            return predictResult;
-        }
-    }
-
-    private static void configEngines(boolean multithreading) {
-        if (multithreading) {
-            if (System.getProperty("ai.djl.pytorch.num_interop_threads") == null) {
-                System.setProperty("ai.djl.pytorch.num_interop_threads", "1");
-            }
-            if
(System.getProperty("ai.djl.pytorch.num_threads") == null) {
-                System.setProperty("ai.djl.pytorch.num_threads", "1");
-            }
-        }
-        if (System.getProperty("ai.djl.tflite.disable_alternative") == null) {
-            System.setProperty("ai.djl.tflite.disable_alternative", "true");
-        }
-        if (System.getProperty("ai.djl.dlr.disable_alternative") == null) {
-            System.setProperty("ai.djl.dlr.disable_alternative", "true");
-        }
-        if (System.getProperty("ai.djl.paddlepaddle.disable_alternative") == null) {
-            System.setProperty("ai.djl.paddlepaddle.disable_alternative", "true");
-        }
-        if (System.getProperty("ai.djl.onnx.disable_alternative") == null) {
-            System.setProperty("ai.djl.onnx.disable_alternative", "true");
-        }
-        if (System.getProperty("ai.djl.tensorrt.disable_alternative") == null) {
-            System.setProperty("ai.djl.tensorrt.disable_alternative", "true");
-        }
-    }
-}
diff --git a/extensions/benchmark/src/main/java/ai/djl/benchmark/MultithreadedBenchmark.java b/extensions/benchmark/src/main/java/ai/djl/benchmark/MultithreadedBenchmark.java
deleted file mode 100644
index 438b176d4ef..00000000000
--- a/extensions/benchmark/src/main/java/ai/djl/benchmark/MultithreadedBenchmark.java
+++ /dev/null
@@ -1,194 +0,0 @@
-/*
- * Copyright 2019 Amazon.com, Inc. or its affiliates. All Rights Reserved.
- *
- * Licensed under the Apache License, Version 2.0 (the "License"). You may not use this file except in compliance
- * with the License. A copy of the License is located at
- *
- * http://aws.amazon.com/apache2.0/
- *
- * or in the "license" file accompanying this file. This file is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES
- * OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions
- * and limitations under the License.
- */
-package ai.djl.benchmark;
-
-import ai.djl.Device;
-import ai.djl.ModelException;
-import ai.djl.engine.Engine;
-import ai.djl.inference.Predictor;
-import ai.djl.metric.Metrics;
-import ai.djl.metric.Unit;
-import ai.djl.repository.zoo.ZooModel;
-import ai.djl.training.listener.MemoryTrainingListener;
-import ai.djl.translate.TranslateException;
-
-import org.slf4j.Logger;
-import org.slf4j.LoggerFactory;
-
-import java.io.IOException;
-import java.util.ArrayList;
-import java.util.Arrays;
-import java.util.List;
-import java.util.concurrent.Callable;
-import java.util.concurrent.ExecutionException;
-import java.util.concurrent.ExecutorService;
-import java.util.concurrent.Executors;
-import java.util.concurrent.Future;
-import java.util.concurrent.atomic.AtomicInteger;
-
-/** A class runs single threaded benchmark. */
-public class MultithreadedBenchmark extends AbstractBenchmark {
-
-    private static final Logger logger = LoggerFactory.getLogger(MultithreadedBenchmark.class);
-
-    /** {@inheritDoc} */
-    @Override
-    public float[] predict(Arguments arguments, Metrics metrics, int iteration)
-            throws IOException, ModelException, TranslateException {
-
-        MemoryTrainingListener.collectMemoryInfo(metrics); // Measure memory before loading model
-
-        Engine engine = Engine.getEngine(arguments.getEngine());
-        Device[] devices = engine.getDevices(arguments.getMaxGpus());
-        int numOfThreads = arguments.getThreads();
-        int neuronCores = arguments.getNeuronCores();
-        if (neuronCores > 0) {
-            devices = new Device[neuronCores];
-            Arrays.fill(devices, Device.cpu());
-            if (numOfThreads > 1) {
-                numOfThreads = 2 * neuronCores;
-            }
-        }
-
-        int delay = arguments.getDelay();
-        AtomicInteger counter = new AtomicInteger(iteration);
-        logger.info("Multithreading inference with {} threads.", numOfThreads);
-
-        List<ZooModel<Void, float[]>> models = new ArrayList<>(devices.length);
-        List<PredictorCallable> callables = new ArrayList<>(numOfThreads);
-        for (Device device : devices) {
-            ZooModel<Void, float[]> model = loadModel(arguments,
metrics, device);
-            models.add(model);
-
-            for (int i = 0; i < numOfThreads / devices.length; ++i) {
-                callables.add(new PredictorCallable(model, metrics, counter, i, i == 0));
-            }
-        }
-
-        float[] result = null;
-        ExecutorService executorService = Executors.newFixedThreadPool(numOfThreads);
-
-        MemoryTrainingListener.collectMemoryInfo(metrics); // Measure memory before worker kickoff
-
-        int successThreads = 0;
-        try {
-            for (PredictorCallable callable : callables) {
-                callable.warmup();
-            }
-
-            metrics.addMetric("start", System.currentTimeMillis(), Unit.MILLISECONDS);
-            try {
-                List<Future<float[]>> futures;
-                if (delay > 0) {
-                    futures = new ArrayList<>();
-                    for (PredictorCallable callable : callables) {
-                        futures.add(executorService.submit(callable));
-                        Thread.sleep(delay);
-                    }
-                } else {
-                    futures = executorService.invokeAll(callables);
-                }
-
-                for (Future<float[]> future : futures) {
-                    result = future.get();
-                    if (result != null) {
-                        ++successThreads;
-                    }
-                }
-            } catch (InterruptedException | ExecutionException e) {
-                logger.error("", e);
-            }
-            metrics.addMetric("end", System.currentTimeMillis(), Unit.MILLISECONDS);
-            for (PredictorCallable callable : callables) {
-                callable.close();
-            }
-        } finally {
-            executorService.shutdown();
-        }
-
-        models.forEach(ZooModel::close);
-        if (successThreads != numOfThreads) {
-            logger.error("Only {}/{} threads finished.", successThreads, numOfThreads);
-            return null;
-        }
-
-        return result;
-    }
-
-    private static class PredictorCallable implements Callable<float[]> {
-
-        private Predictor<Void, float[]> predictor;
-
-        private Metrics metrics;
-        private String workerId;
-        private boolean collectMemory;
-        private AtomicInteger counter;
-        private int total;
-        private int steps;
-
-        public PredictorCallable(
-                ZooModel<Void, float[]> model,
-                Metrics metrics,
-                AtomicInteger counter,
-                int workerId,
-                boolean collectMemory) {
-            this.predictor = model.newPredictor();
-            this.metrics = metrics;
-            this.counter = counter;
-            this.workerId = String.format("%02d", workerId);
-            this.collectMemory =
collectMemory;
-            predictor.setMetrics(metrics);
-            total = counter.get();
-            if (total < 10) {
-                steps = 1;
-            } else {
-                steps = (int) Math.pow(10, (int) Math.log10(total));
-            }
-        }
-
-        /** {@inheritDoc} */
-        @Override
-        public float[] call() throws Exception {
-            float[] result = null;
-            int count = 0;
-            int remaining;
-            while ((remaining = counter.decrementAndGet()) > 0 || result == null) {
-                try {
-                    result = predictor.predict(null);
-                } catch (Exception e) {
-                    // stop immediately when we find any exception
-                    counter.set(0);
-                    throw e;
-                }
-                if (collectMemory) {
-                    MemoryTrainingListener.collectMemoryInfo(metrics);
-                }
-                int processed = total - remaining + 1;
-                logger.trace("Worker-{}: {} iteration finished.", workerId, ++count);
-                if (processed % steps == 0 || processed == total) {
-                    logger.info("Completed {} requests", processed);
-                }
-            }
-            logger.debug("Worker-{}: finished.", workerId);
-            return result;
-        }
-
-        public void warmup() throws TranslateException {
-            predictor.predict(null);
-        }
-
-        public void close() {
-            predictor.close();
-        }
-    }
-}
diff --git a/extensions/benchmark/src/main/java/ai/djl/benchmark/NDListGenerator.java b/extensions/benchmark/src/main/java/ai/djl/benchmark/NDListGenerator.java
deleted file mode 100644
index eb52178b6b5..00000000000
--- a/extensions/benchmark/src/main/java/ai/djl/benchmark/NDListGenerator.java
+++ /dev/null
@@ -1,171 +0,0 @@
-/*
- * Copyright 2021 Amazon.com, Inc. or its affiliates. All Rights Reserved.
- *
- * Licensed under the Apache License, Version 2.0 (the "License"). You may not use this file except in compliance
- * with the License. A copy of the License is located at
- *
- * http://aws.amazon.com/apache2.0/
- *
- * or in the "license" file accompanying this file. This file is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES
- * OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions
- * and limitations under the License.
- */
-package ai.djl.benchmark;
-
-import ai.djl.Device;
-import ai.djl.ndarray.NDList;
-import ai.djl.ndarray.NDManager;
-import ai.djl.ndarray.types.DataType;
-import ai.djl.ndarray.types.Shape;
-import ai.djl.util.Pair;
-import ai.djl.util.PairList;
-
-import org.apache.commons.cli.CommandLine;
-import org.apache.commons.cli.DefaultParser;
-import org.apache.commons.cli.Option;
-import org.apache.commons.cli.Options;
-import org.apache.commons.cli.ParseException;
-import org.slf4j.Logger;
-import org.slf4j.LoggerFactory;
-
-import java.io.BufferedOutputStream;
-import java.io.OutputStream;
-import java.nio.file.Files;
-import java.nio.file.Path;
-import java.nio.file.Paths;
-import java.util.Arrays;
-import java.util.regex.Matcher;
-import java.util.regex.Pattern;
-
-/** A class generates NDList files. */
-final class NDListGenerator {
-
-    private static final Logger logger = LoggerFactory.getLogger(NDListGenerator.class);
-
-    private NDListGenerator() {}
-
-    static boolean generate(String[] args) {
-        Options options = getOptions();
-        try {
-            if (Arguments.hasHelp(args)) {
-                Arguments.printHelp(
-                        "usage: djl-bench ndlist-gen -s INPUT-SHAPES -o OUTPUT_FILE", options);
-                return true;
-            }
-            DefaultParser parser = new DefaultParser();
-            CommandLine cmd = parser.parse(options, args, null, false);
-            String inputShapes = cmd.getOptionValue("input-shapes");
-            String output = cmd.getOptionValue("output-file");
-            boolean ones = cmd.hasOption("ones");
-            Path path = Paths.get(output);
-
-            try (NDManager manager = NDManager.newBaseManager(Device.cpu(), "PyTorch")) {
-                NDList list = new NDList();
-                for (Pair<DataType, Shape> pair : parseShape(inputShapes)) {
-                    DataType dataType = pair.getKey();
-                    Shape shape = pair.getValue();
-                    if (ones) {
-                        list.add(manager.ones(shape, dataType));
-                    } else {
-                        list.add(manager.zeros(shape, dataType));
-                    }
-                }
-                try (OutputStream os = new BufferedOutputStream(Files.newOutputStream(path))) {
-                    list.encode(os);
-                }
-            }
-            logger.info("NDList file created: {}",
path.toAbsolutePath());
-            return true;
-        } catch (ParseException e) {
-            Arguments.printHelp(e.getMessage(), options);
-        } catch (Throwable t) {
-            logger.error("Unexpected error", t);
-        }
-        return false;
-    }
-
-    static PairList<DataType, Shape> parseShape(String shape) {
-        PairList<DataType, Shape> inputShapes = new PairList<>();
-        if (shape != null) {
-            if (shape.contains("(")) {
-                Pattern pattern =
-                        Pattern.compile("\\((\\s*(\\d+)([,\\s]+\\d+)*\\s*)\\)([sdubilBfS]?)");
-                Matcher matcher = pattern.matcher(shape);
-                while (matcher.find()) {
-                    String[] tokens = matcher.group(1).split(",");
-                    long[] array = Arrays.stream(tokens).mapToLong(Long::parseLong).toArray();
-                    DataType dataType;
-                    String dataTypeStr = matcher.group(4);
-                    if (dataTypeStr == null || dataTypeStr.isEmpty()) {
-                        dataType = DataType.FLOAT32;
-                    } else {
-                        switch (dataTypeStr) {
-                            case "s":
-                                dataType = DataType.FLOAT16;
-                                break;
-                            case "d":
-                                dataType = DataType.FLOAT64;
-                                break;
-                            case "u":
-                                dataType = DataType.UINT8;
-                                break;
-                            case "b":
-                                dataType = DataType.INT8;
-                                break;
-                            case "i":
-                                dataType = DataType.INT32;
-                                break;
-                            case "l":
-                                dataType = DataType.INT64;
-                                break;
-                            case "B":
-                                dataType = DataType.BOOLEAN;
-                                break;
-                            case "f":
-                                dataType = DataType.FLOAT32;
-                                break;
-                            default:
-                                throw new IllegalArgumentException("Invalid input-shape: " + shape);
-                        }
-                    }
-                    inputShapes.add(dataType, new Shape(array));
-                }
-            } else {
-                String[] tokens = shape.split(",");
-                long[] shapes = Arrays.stream(tokens).mapToLong(Long::parseLong).toArray();
-                inputShapes.add(DataType.FLOAT32, new Shape(shapes));
-            }
-        }
-        return inputShapes;
-    }
-
-    private static Options getOptions() {
-        Options options = new Options();
-        options.addOption(
-                Option.builder("h").longOpt("help").hasArg(false).desc("Print this help.").build());
-        options.addOption(
-                Option.builder("s")
-                        .required()
-                        .longOpt("input-shapes")
-                        .hasArg()
-                        .argName("INPUT-SHAPES")
-                        .desc("Input data shapes for the model.")
-                        .build());
-        options.addOption(
-                Option.builder("o")
-
.required()
-                        .longOpt("output-file")
-                        .hasArg()
-                        .argName("OUTPUT-FILE")
-                        .desc("Write output NDList to file.")
-                        .build());
-        options.addOption(
-                Option.builder("1")
-                        .longOpt("ones")
-                        .hasArg(false)
-                        .argName("ones")
-                        .desc("Use all ones instead of zeros.")
-                        .build());
-        return options;
-    }
-}
diff --git a/extensions/benchmark/src/main/java/ai/djl/benchmark/package-info.java b/extensions/benchmark/src/main/java/ai/djl/benchmark/package-info.java
deleted file mode 100644
index 6436a24fe15..00000000000
--- a/extensions/benchmark/src/main/java/ai/djl/benchmark/package-info.java
+++ /dev/null
@@ -1,15 +0,0 @@
-/*
- * Copyright 2019 Amazon.com, Inc. or its affiliates. All Rights Reserved.
- *
- * Licensed under the Apache License, Version 2.0 (the "License"). You may not use this file except in compliance
- * with the License. A copy of the License is located at
- *
- * http://aws.amazon.com/apache2.0/
- *
- * or in the "license" file accompanying this file. This file is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES
- * OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions
- * and limitations under the License.
- */
-
-/** Contains benchmarking utility classes. */
-package ai.djl.benchmark;
diff --git a/extensions/benchmark/src/main/resources/log4j2.xml b/extensions/benchmark/src/main/resources/log4j2.xml
deleted file mode 100644
index cb6d3c6fbee..00000000000
--- a/extensions/benchmark/src/main/resources/log4j2.xml
+++ /dev/null
@@ -1,20 +0,0 @@
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
diff --git a/extensions/benchmark/src/test/java/ai/djl/benchmark/BenchmarkTest.java b/extensions/benchmark/src/test/java/ai/djl/benchmark/BenchmarkTest.java
deleted file mode 100644
index 3b53e6fc8dc..00000000000
--- a/extensions/benchmark/src/test/java/ai/djl/benchmark/BenchmarkTest.java
+++ /dev/null
@@ -1,123 +0,0 @@
-/*
- * Copyright 2020 Amazon.com, Inc. or its affiliates. All Rights Reserved.
- *
- * Licensed under the Apache License, Version 2.0 (the "License"). You may not use this file except in compliance
- * with the License. A copy of the License is located at
- *
- * http://aws.amazon.com/apache2.0/
- *
- * or in the "license" file accompanying this file. This file is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES
- * OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions
- * and limitations under the License.
- */
-package ai.djl.benchmark;
-
-import ai.djl.ndarray.types.DataType;
-
-import org.apache.commons.cli.CommandLine;
-import org.apache.commons.cli.DefaultParser;
-import org.apache.commons.cli.Options;
-import org.apache.commons.cli.ParseException;
-import org.testng.Assert;
-import org.testng.annotations.Test;
-
-import java.net.MalformedURLException;
-import java.nio.file.Paths;
-import java.util.Map;
-
-public class BenchmarkTest {
-
-    @Test
-    public void testHelp() {
-        String[] args = {"-h"};
-        Benchmark.main(args);
-    }
-
-    @Test
-    public void testArguments() throws ParseException, MalformedURLException {
-        Options options = Arguments.getOptions();
-        DefaultParser parser = new DefaultParser();
-
-        String[] args = {
-            "-p",
-            "/opt/ml/resnet18_v1",
-            "-s",
-            "(1)s,(1)d,(1)u,(1)b,(1)i,(1)l,(1)B,(1)",
-            "--model-options",
-            "fp16,dlaCore=1",
-            "--model-arguments",
-            "width=28"
-        };
-        CommandLine cmd = parser.parse(options, args, null, false);
-        Arguments arguments = new Arguments(cmd);
-        String expected = Paths.get("/opt/ml/resnet18_v1").toUri().toURL().toString();
-        Assert.assertEquals(arguments.getModelUrl(), expected);
-        DataType[] types = arguments.getInputShapes().keyArray(new DataType[0]);
-        Assert.assertEquals(types[0], DataType.FLOAT16);
-        Assert.assertEquals(types[1], DataType.FLOAT64);
-        Assert.assertEquals(types[2], DataType.UINT8);
-        Assert.assertEquals(types[3], DataType.INT8);
-        Assert.assertEquals(types[4], DataType.INT32);
-        Assert.assertEquals(types[5],
DataType.INT64);
-        Assert.assertEquals(types[6], DataType.BOOLEAN);
-        Assert.assertEquals(types[7], DataType.FLOAT32);
-
-        Assert.assertThrows(
-                IllegalArgumentException.class,
-                () -> {
-                    String[] arg = {"-p", "/opt/ml/resnet18_v1", "-s", "(1)S"};
-                    CommandLine commandLine = parser.parse(options, arg, null, false);
-                    new Arguments(commandLine);
-                });
-
-        Map map = arguments.getModelOptions();
-        Assert.assertEquals(map.get("dlaCore"), "1");
-        Assert.assertTrue(map.containsKey("fp16"));
-
-        Map modelArguments = arguments.getModelArguments();
-        Assert.assertEquals(modelArguments.get("width"), "28");
-    }
-
-    @Test
-    public void testBenchmark() {
-        String[] args = {
-            "-e",
-            "PyTorch",
-            "-u",
-            "djl://ai.djl.pytorch/resnet/0.0.1/traced_resnet18",
-            "-s",
-            "1,3,224,224",
-            "-c",
-            "2"
-        };
-        new Benchmark().runBenchmark(args);
-    }
-
-    @Test
-    public void testMultithreadedBenchmark() {
-        System.setProperty("collect-memory", "true");
-        try {
-            String[] args = {
-                "-e",
-                "PyTorch",
-                "-u",
-                "djl://ai.djl.pytorch/resnet/0.0.1/traced_resnet18",
-                "-s",
-                "(1,3,224,224)f",
-                "-d",
-                "1",
-                "-l",
-                "1",
-                "-c",
-                "2",
-                "-t",
-                "-1",
-                "-g",
-                "-1"
-            };
-            Benchmark.main(args);
-        } finally {
-            System.clearProperty("collect-memory");
-        }
-    }
-}
diff --git a/extensions/benchmark/src/test/java/ai/djl/benchmark/NDListGeneratorTest.java b/extensions/benchmark/src/test/java/ai/djl/benchmark/NDListGeneratorTest.java
deleted file mode 100644
index 50964a9a1f5..00000000000
--- a/extensions/benchmark/src/test/java/ai/djl/benchmark/NDListGeneratorTest.java
+++ /dev/null
@@ -1,46 +0,0 @@
-/*
- * Copyright 2021 Amazon.com, Inc. or its affiliates. All Rights Reserved.
- *
- * Licensed under the Apache License, Version 2.0 (the "License"). You may not use this file except in compliance
- * with the License. A copy of the License is located at
- *
- * http://aws.amazon.com/apache2.0/
- *
- * or in the "license" file accompanying this file.
This file is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES
- * OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions
- * and limitations under the License.
- */
-package ai.djl.benchmark;
-
-import org.testng.Assert;
-import org.testng.annotations.Test;
-
-public class NDListGeneratorTest {
-
-    @Test
-    public void testHelp() {
-        String[] args = {"ndlist-gen", "-h"};
-        Benchmark.main(args);
-    }
-
-    @Test
-    public void testMissingOptions() {
-        String[] args = {"ndlist-gen", "-s"};
-        boolean success = NDListGenerator.generate(args);
-        Assert.assertFalse(success);
-    }
-
-    @Test
-    public void testOnes() {
-        String[] args = {"ndlist-gen", "-s", "1", "-o", "build/ones.ndlist", "-1"};
-        boolean success = NDListGenerator.generate(args);
-        Assert.assertTrue(success);
-    }
-
-    @Test
-    public void testZeros() {
-        String[] args = {"ndlist-gen", "-s", "1", "-o", "build/ones.ndlist"};
-        boolean success = NDListGenerator.generate(args);
-        Assert.assertTrue(success);
-    }
-}
diff --git a/extensions/benchmark/src/test/java/ai/djl/benchmark/package-info.java b/extensions/benchmark/src/test/java/ai/djl/benchmark/package-info.java
deleted file mode 100644
index fd842219c53..00000000000
--- a/extensions/benchmark/src/test/java/ai/djl/benchmark/package-info.java
+++ /dev/null
@@ -1,15 +0,0 @@
-/*
- * Copyright 2021 Amazon.com, Inc. or its affiliates. All Rights Reserved.
- *
- * Licensed under the Apache License, Version 2.0 (the "License"). You may not use this file except in compliance
- * with the License. A copy of the License is located at
- *
- * http://aws.amazon.com/apache2.0/
- *
- * or in the "license" file accompanying this file. This file is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES
- * OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions
- * and limitations under the License.
- */
-
-/** Contains tests for the benchmark module.
*/
-package ai.djl.benchmark;
diff --git a/settings.gradle b/settings.gradle
index 859c5362eff..412daf4fee3 100644
--- a/settings.gradle
+++ b/settings.gradle
@@ -27,7 +27,6 @@ include ':engines:tflite:tflite-native'
 include ':examples'
 include 'extensions:audio'
 include ':extensions:aws-ai'
-include ':extensions:benchmark'
 include ':extensions:fasttext'
 include ':extensions:hadoop'
 include ':extensions:opencv'
diff --git a/tools/gradle/release.gradle b/tools/gradle/release.gradle
index 5645ad5fe0d..85662d8b1f1 100644
--- a/tools/gradle/release.gradle
+++ b/tools/gradle/release.gradle
@@ -59,7 +59,6 @@ task increaseFinalVersion {
     collection += fileTree(".").filter {
         it.name.endsWith(".md") || it.name.endsWith("overview.html")
     }
-    collection += file("extensions/benchmark/snapcraft/snapcraft.yaml")
     collection.each { File file ->
         file.text = file.text.replaceAll("/${previousVersion}/", "/${djl_version}/")