This repository has been archived by the owner on Dec 9, 2024. It is now read-only.

tf_cnn_benchmarks: High performance benchmarks

Note: tf_cnn_benchmarks is no longer maintained.

tf_cnn_benchmarks contains TensorFlow 1 implementations of several popular convolutional models, and is designed to be as fast as possible. It can run on a single machine or in distributed mode across multiple hosts.

Although tf_cnn_benchmarks will run with TensorFlow 2, it was written and optimized for TensorFlow 1, and it has not been maintained since TensorFlow 2 was released. For clean and easy-to-read TensorFlow 2 models, please see the TensorFlow Official Models.

Getting Started

To run ResNet50 with synthetic data and no distortions on a single GPU, run:

python tf_cnn_benchmarks.py --num_gpus=1 --batch_size=32 --model=resnet50 --variable_update=parameter_server

Note that the master branch of tf_cnn_benchmarks occasionally requires the latest nightly version of TensorFlow. You can install the nightly version by running pip install tf-nightly-gpu in a clean environment, or by installing TensorFlow from source. We sometimes create a branch of tf_cnn_benchmarks, named cnn_tf_vX.Y_compatible, that is compatible with TensorFlow version X.Y. For example, branch cnn_tf_v1.9_compatible works with TensorFlow 1.9. However, because tf_cnn_benchmarks is no longer maintained, new branches are unlikely to be created.
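The branch naming scheme above can be sketched as a small helper. This is only an illustration: compatible_branch is a hypothetical function, not part of tf_cnn_benchmarks, and it only formats a name. It does not check whether a branch for that version was ever created.

```python
def compatible_branch(tf_version):
    """Format the cnn_tf_vX.Y_compatible branch name for a TensorFlow version.

    Hypothetical helper: it only builds the name and does not verify
    that such a branch exists in the repository.
    """
    # Keep only the major and minor components, e.g. "1.9.0" -> "1", "9".
    major, minor = tf_version.split(".")[:2]
    return f"cnn_tf_v{major}.{minor}_compatible"


print(compatible_branch("1.9"))
```

A patch version such as "1.9.0" maps to the same branch name as "1.9", since the branches are named by major and minor version only.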

Some important flags are:

  • model: Model to use, e.g. resnet50, inception3, vgg16, and alexnet.
  • num_gpus: Number of GPUs to use.
  • data_dir: Path to data to process. If not set, synthetic data is used. To use ImageNet data, use these instructions as a starting point.
  • batch_size: Batch size for each GPU.
  • variable_update: The method for managing variables: parameter_server, replicated, distributed_replicated, or independent.
  • local_parameter_device: Device to use as parameter server: cpu or gpu.

To see the full list of flags, run python tf_cnn_benchmarks.py --help.
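When several flag combinations are benchmarked, it can help to assemble the command line from a dictionary. The following is a minimal sketch: build_command is a hypothetical helper, not part of tf_cnn_benchmarks, and it assumes every flag is passed in --name=value form, as in the single-GPU example above.

```python
import shlex


def build_command(flags, script="tf_cnn_benchmarks.py"):
    """Build an argv list from a dict mapping flag names to values.

    Hypothetical helper: assumes each flag is passed as --name=value.
    """
    argv = ["python", script]
    for name, value in flags.items():
        argv.append(f"--{name}={value}")
    return argv


# The flags from the single-GPU ResNet50 example.
flags = {
    "num_gpus": 1,
    "batch_size": 32,
    "model": "resnet50",
    "variable_update": "parameter_server",
}
command = build_command(flags)

# Print a shell-ready command line; pass `command` to subprocess.run to execute it.
print(shlex.join(command))
```

Keeping the command as a list avoids shell quoting problems when it is handed to subprocess.run.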

To run ResNet50 with real data with 8 GPUs, run:

python tf_cnn_benchmarks.py --data_format=NCHW --batch_size=256 \
--model=resnet50 --optimizer=momentum --variable_update=replicated \
--nodistortions --gradient_repacking=8 --num_gpus=8 \
--num_epochs=90 --weight_decay=1e-4 --data_dir=${DATA_DIR} --use_fp16 \
--train_dir=${CKPT_DIR}

This trains a ResNet-50 model on ImageNet with a total batch size of 2048 (256 per GPU across 8 GPUs). The model should train to around 76% accuracy.
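Because batch_size is the batch size for each GPU, the total batch size is the per-GPU value multiplied by num_gpus. The arithmetic can be sketched as follows, assuming a single host as in the command above; global_batch_size is a hypothetical helper, not a tf_cnn_benchmarks function.

```python
def global_batch_size(per_gpu_batch_size, num_gpus):
    """Total batch size per step on one host (hypothetical helper).

    Assumes --batch_size is applied per GPU, as described in the flag list.
    """
    return per_gpu_batch_size * num_gpus


# Values from the 8-GPU command: --batch_size=256 --num_gpus=8.
print(global_batch_size(256, 8))
```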

Running the tests

To run the tests, run

pip install portpicker
python run_tests.py && python run_tests.py --run_distributed_tests

Note that the tests require portpicker, which the first command above installs.

The command above runs a subset of tests that is both fast and fairly comprehensive. Alternatively, all the tests can be run, but this will take a long time:

python run_tests.py --full_tests && python run_tests.py --full_tests --run_distributed_tests

We run all tests on every PR before merging it, so you do not need to pass --full_tests when running the tests yourself.

To run an individual test, such as method testParameterServer of test class TfCnnBenchmarksTest of module benchmark_cnn_test, run

python -m unittest -v benchmark_cnn_test.TfCnnBenchmarksTest.testParameterServer
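The test target above follows the standard unittest dotted form module.Class.method. A minimal sketch of building that command for any test is shown below; unittest_command is a hypothetical helper, and the names passed to it are the ones from the example above.

```python
def unittest_command(module, test_class, method):
    """Build the argv for running one test method with python -m unittest.

    Hypothetical helper: it only formats the command and does not run it.
    """
    target = f"{module}.{test_class}.{method}"
    return ["python", "-m", "unittest", "-v", target]


command = unittest_command(
    "benchmark_cnn_test", "TfCnnBenchmarksTest", "testParameterServer"
)
print(" ".join(command))
```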