diff --git a/README.md b/README.md
index d47b7c8f..b0064510 100644
--- a/README.md
+++ b/README.md
@@ -8,11 +8,23 @@
 The pre-trained models of backbone networks can be found here:
 - [SE-ResNet50](https://github.com/HiKapok/TF_Se_ResNe_t)
 - [SE-ResNeXt50](https://github.com/HiKapok/TF_Se_ResNe_t)
+## Introduction
+
 The main goal of this competition is to detect the keypoints of clothing images collected from Alibaba's e-commerce platforms. There are tens of thousands of images in five categories: blouse, outwear, trousers, skirt, dress. The keypoints for each category are defined as follows.
 ![](demos/outline.jpg "The Keypoints for Each Category")
-All the codes was writen by myself and tested under TensorFlow 1.6, Python 3.5, Ubuntu 16.04. I tried to use the latest possible TensorFlow's best practice paradigm, like [tf.estimator](https://www.tensorflow.org/api_docs/python/tf/estimator) and [tf.layers](https://www.tensorflow.org/api_docs/python/tf/layers). Almost none py_func was used in my codes to maximize the performance. Augumentations like flip, rotate, random crop, color distort were used to reduce overfit. The current performance of the model is ~0.4% in Normalized Error and got to ~20th-place in the second stage of the competition.
+Almost all the code was written by myself and tested under TensorFlow 1.6, Python 3.5, Ubuntu 16.04. I tried to follow the latest TensorFlow best-practice paradigms, like [tf.estimator](https://www.tensorflow.org/api_docs/python/tf/estimator) and [tf.layers](https://www.tensorflow.org/api_docs/python/tf/layers). Almost no py_func is used in the code, to maximize performance. Augmentations like flipping, rotation, random cropping and color distortion were used to reduce overfitting. The current model reaches ~0.4% Normalized Error, which placed ~20th in the second stage of the competition.
+
+About the model:
+
+- DetNet is better, performing almost the same as SEResNeXt, while SEResNet showed little improvement over ResNet
+- Forcing the loss of invisible keypoints to zero gave better performance
+- OHKM (online hard keypoint mining) is useful
+- Gaussian blur on the predicted heatmaps hurts, but Gaussian blur on the target heatmaps of the lower-level predictions helps
+- Ensembling the heatmaps of flipped images is worse than ensembling the final predictions of flipped images, and applying a one-quarter offset correction to the merged predictions is also useful (see the sketch after this list)
+- Cascaded prediction over the whole network can eliminate the need for a separate clothes-detection network as well as for larger input images
+- The native hourglass model performed worst here but still has great potential; see the top solutions [here](http://human-pose.mpi-inf.mpg.de/#results)
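+
+To make the flip-ensemble note concrete, here is a minimal NumPy sketch of how the predictions from the original and the left-right flipped pass can be merged with the one-quarter offset correction (shapes and names are illustrative only; the in-graph TensorFlow version lives in eval_all_cpn_onepass.py):
+
+```python
+import numpy as np
+
+def merge_flipped_predictions(pred_xy, pred_xy_flipped, eps=1e-3):
+    """Combine keypoints from the original and the flipped test pass.
+
+    Both arrays have shape [num_joints, 2]; the flipped predictions are
+    assumed to be already mirrored back into the original image frame
+    (x -> width - 1 - x plus left/right joint remapping). Instead of
+    averaging heatmaps, each keypoint is moved a fixed quarter step from
+    the first prediction towards the second one.
+    """
+    delta = pred_xy_flipped - pred_xy
+    dist = np.sqrt((delta ** 2).sum(axis=-1, keepdims=True))
+    # If the two passes already agree (dist ~ 0), keep the first prediction.
+    step = np.where(dist < eps, 0.0, 0.25 / np.maximum(dist, eps))
+    return pred_xy + delta * step
+```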
 There are still other ways to further improve the performance but I didn't try them in this competition because of their limitations in applications, for example:
@@ -25,10 +37,56 @@
 If you find it useful for your research or competitions, any contribution or star to this repo is welcome.
-By the way, I'm looking for one computer vision related job recently. I'm very looking forward to your contact if you are interested in.
+## Usage
+- Download the [fashionAI Dataset](https://tianchi.aliyun.com/competition/information.htm?spm=5176.11165261.5678.2.34b72ec5iFguTn&raceId=231648&_lang=en_US) and reorganize the directory as follows:
+ ```
+ DATA_DIR/
+ |->train_0/
+ | |->Annotations/
+ | | |->annotations.csv
+ | |->Images/
+ | | |->blouse
+ | | |->...
+ |->train_1/
+ | |->Annotations/
+ | | |->annotations.csv
+ | |->Images/
+ | | |->blouse
+ | | |->...
+ |->...
+ |->test_0/
+ | |->test.csv
+ | |->Images/
+ | | |->blouse
+ | | |->...
+ ```
+ DATA_DIR is the root path of your local copy of the fashionAI Dataset.
+ - train_0 -> [update] warm_up_train_20180222.tar
+ - train_1 -> fashionAI_key_points_train_20180227.tar.gz
+ - train_2 -> fashionAI_key_points_test_a_20180227.tar
+ - train_3 -> fashionAI_key_points_test_b_20180418.tgz
+ - test_0 -> round2_fashionAI_key_points_test_a_20180426.tar
+ - test_1 -> round2_fashionAI_key_points_test_b_20180601.tar
+
+- Set your local dataset path in [config.py](https://github.com/HiKapok/tf.fashionAI/blob/e90c5b0072338fa638c56ae788f7146d3f36cb1f/config.py#L20).
+- Create one folder named 'model' under the root path of your codes, download all the pre-trained weights of the backbone networks and put them into sub-folders named 'resnet50', 'seresnet50' and 'seresnext50'. Then start training (set RECORDS_DATA_DIR and TEST_RECORDS_DATA_DIR according to your [config.py](https://github.com/HiKapok/tf.fashionAI/blob/e90c5b0072338fa638c56ae788f7146d3f36cb1f/config.py#L20)):
+ ```sh
+ python train_detxt_cpn_onebyone.py --run_on_cloud=False --data_dir=RECORDS_DATA_DIR
+ python eval_all_cpn_onepass.py --run_on_cloud=False --backbone=detnext50_cpn --data_dir=TEST_RECORDS_DATA_DIR
+ ```
+ Submitting the generated 'detnext50_cpn_sub.csv' will give you ~0.0427.
+ ```sh
+ python train_senet_cpn_onebyone.py --run_on_cloud=False --data_dir=RECORDS_DATA_DIR
+ python eval_all_cpn_onepass.py --run_on_cloud=False --backbone=seresnext50_cpn --data_dir=TEST_RECORDS_DATA_DIR
+ ```
+ Submitting the generated 'seresnext50_cpn_sub.csv' will give you ~0.0424.
+
+ Copy both 'detnext50_cpn_sub.csv' and 'seresnext50_cpn_sub.csv' to a new folder and modify the path and filename in [ensemble_from_csv.py](https://github.com/HiKapok/tf.fashionAI/blob/e90c5b0072338fa638c56ae788f7146d3f36cb1f/ensemble_from_csv.py#L27), then run 'python ensemble_from_csv.py'; submitting the generated 'ensmeble.csv' will give you ~0.0407 (a distilled sketch of this averaging follows this list).
+- Training deeper backbone networks will give better results (+0.001).
+- Training the hourglass model follows almost the same procedure as above but gave inferior performance.
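+
+The ensembling step itself boils down to averaging the two submissions' keypoint coordinates one by one. Below is a distilled pandas sketch of that idea, assuming the fashionAI submission layout (an image id and a category column followed by one 'x_y_v' cell per keypoint, with v == -1 marking an absent keypoint) and that both files list images in the same order; the actual script is ensemble_from_csv.py:
+
+```python
+import pandas as pd
+
+def average_cell(a, b):
+    # Each cell looks like 'x_y_v'; v == -1 marks a keypoint that was not predicted.
+    ax, ay, av = (int(t) for t in a.split('_'))
+    bx, by, bv = (int(t) for t in b.split('_'))
+    if av == -1 or bv == -1:
+        return a if av != -1 else b  # keep whichever prediction exists
+    return '{}_{}_{}'.format((ax + bx) // 2, (ay + by) // 2, av)
+
+df_a = pd.read_csv('detnext50_cpn_sub.csv', encoding='utf-8')
+df_b = pd.read_csv('seresnext50_cpn_sub.csv', encoding='utf-8')
+
+merged = df_a.copy()
+for col in df_a.columns[2:]:  # keypoint columns follow image id and category
+    merged[col] = [average_cell(a, b) for a, b in zip(df_a[col], df_b[col])]
+merged.to_csv('ensmeble.csv', encoding='utf-8', index=False)  # name as stated above
+```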
 ## ##
-Some Detection Results:
+Some Detection Results (stage one):
 - Cascaded Pyramid Network:
diff --git a/ensemble_from_csv.py b/ensemble_from_csv.py
index b003dc99..a44a001e 100644
--- a/ensemble_from_csv.py
+++ b/ensemble_from_csv.py
@@ -32,7 +32,7 @@
 # 'sub_2_hg_4_256_64-half_epoch.csv',
 # 'sub_2_hg_8_256_64_v1-half_epoch.csv']#['cpn_2_320_160_1e-3.csv', 'sub_2_hg_4_256_64.csv', 'sub_2_cpn_320_100_1e-3.csv', 'sub_2_hg_8_256_64.csv']
-ensemble_subs = ['sext_cpn_flip.csv', 'detxt_cpn_flip.csv']
+ensemble_subs = ['large_seresnext_cpn_sub.csv', 'large_detnext_cpn_sub.csv']

 def parse_comma_list(args):
diff --git a/eval_all_cpn_onepass.py b/eval_all_cpn_onepass.py
index a3eec0d9..8ed922fe 100644
--- a/eval_all_cpn_onepass.py
+++ b/eval_all_cpn_onepass.py
@@ -112,9 +112,9 @@
     'seresnet50_cpn': {'backbone': seresnet_cpn.cascaded_pyramid_net, 'logs_sub_dir': 'logs_se_cpn'},
     'seresnext50_cpn': {'backbone': seresnet_cpn.xt_cascaded_pyramid_net, 'logs_sub_dir': 'logs_sext_cpn'},
     'detnext50_cpn': {'backbone': detxt_cpn.cascaded_pyramid_net, 'logs_sub_dir': 'logs_detxt_cpn'},
-    'large_seresnext_cpn': {'backbone': lambda inputs, output_channals, heatmap_size, istraining, data_format : seresnet_cpn.xt_cascaded_pyramid_net(inputs, output_channals, heatmap_size, istraining, data_format, net_depth=50),
+    'large_seresnext_cpn': {'backbone': lambda inputs, output_channals, heatmap_size, istraining, data_format : seresnet_cpn.xt_cascaded_pyramid_net(inputs, output_channals, heatmap_size, istraining, data_format, net_depth=101),
                             'logs_sub_dir': 'logs_large_sext_cpn'},
-    'large_detnext_cpn': {'backbone': lambda inputs, output_channals, heatmap_size, istraining, data_format : detxt_cpn.cascaded_pyramid_net(inputs, output_channals, heatmap_size, istraining, data_format, net_depth=50),
+    'large_detnext_cpn': {'backbone': lambda inputs, output_channals, heatmap_size, istraining, data_format : detxt_cpn.cascaded_pyramid_net(inputs, output_channals, heatmap_size, istraining, data_format, net_depth=101),
                           'logs_sub_dir': 'logs_large_detxt_cpn'},
     'head_seresnext50_cpn': {'backbone': seresnet_cpn.head_xt_cascaded_pyramid_net, 'logs_sub_dir': 'logs_head_sext_cpn'},
     }
@@ -164,7 +164,7 @@ def save_image_with_heatmap(image, height, width, heatmap_size, heatmap, predict
         imsave(os.path.join(config.EVAL_DEBUG_DIR, file_name), img.astype(np.uint8))
     return save_image_with_heatmap.counter

-def get_keypoint(image, predictions, heatmap_size, height, width, category, clip_at_zero=True, data_format='channels_last', name=None):
+def get_keypoint(image, predictions, heatmap_size, height, width, category, clip_at_zero=False, data_format='channels_last', name=None):
     # expand_border = 10
     # pad_pred = tf.pad(predictions, tf.constant([[0, 0], [0, 0], [expand_border, expand_border], [expand_border, expand_border]]),
     #                   mode='CONSTANT', name='pred_padding', constant_values=0)
@@ -242,7 +242,7 @@ def keypoint_model_fn(features, labels, mode, params):
     if params['data_format'] == 'channels_last':
         pred_outputs = [tf.transpose(pred_outputs[ind], [0, 3, 1, 2], name='outputs_trans_{}'.format(ind)) for ind in list(range(len(pred_outputs)))]

-    pred_x_first_stage, pred_y_first_stage = get_keypoint(image, pred_outputs[-1], params['heatmap_size'], shape[0][0], shape[0][1], (params['model_scope'] if 'all' not in params['model_scope'] else '*'), clip_at_zero=True, data_format=params['data_format'])
+    pred_x_first_stage, pred_y_first_stage = get_keypoint(image,
pred_outputs[-1], params['heatmap_size'], shape[0][0], shape[0][1], (params['model_scope'] if 'all' not in params['model_scope'] else '*'), clip_at_zero=False, data_format=params['data_format']) else: # test augumentation on the fly if params['data_format'] == 'channels_last': @@ -270,8 +270,8 @@ def cond_flip(heatmap_ind): pred_outputs = [tf.split(_, 2) for _ in pred_outputs] pred_outputs_1 = [_[0] for _ in pred_outputs] pred_outputs_2 = [_[1] for _ in pred_outputs] - pred_x_first_stage1, pred_y_first_stage1 = get_keypoint(image, pred_outputs_1[-1], params['heatmap_size'], shape[0][0], shape[0][1], (params['model_scope'] if 'all' not in params['model_scope'] else '*'), clip_at_zero=True, data_format=params['data_format']) - pred_x_first_stage2, pred_y_first_stage2 = get_keypoint(image, pred_outputs_2[-1], params['heatmap_size'], shape[0][0], shape[0][1], (params['model_scope'] if 'all' not in params['model_scope'] else '*'), clip_at_zero=True, data_format=params['data_format']) + pred_x_first_stage1, pred_y_first_stage1 = get_keypoint(image, pred_outputs_1[-1], params['heatmap_size'], shape[0][0], shape[0][1], (params['model_scope'] if 'all' not in params['model_scope'] else '*'), clip_at_zero=False, data_format=params['data_format']) + pred_x_first_stage2, pred_y_first_stage2 = get_keypoint(image, pred_outputs_2[-1], params['heatmap_size'], shape[0][0], shape[0][1], (params['model_scope'] if 'all' not in params['model_scope'] else '*'), clip_at_zero=False, data_format=params['data_format']) dist = tf.pow(tf.pow(pred_x_first_stage1 - pred_x_first_stage2, 2.) + tf.pow(pred_y_first_stage1 - pred_y_first_stage2, 2.), .5) @@ -318,7 +318,7 @@ def cond_flip(heatmap_ind): if params['data_format'] == 'channels_last': pred_outputs = [tf.transpose(pred_outputs[ind], [0, 3, 1, 2], name='outputs_trans_{}'.format(ind)) for ind in list(range(len(pred_outputs)))] - pred_x, pred_y = get_keypoint(image, pred_outputs[-1], params['heatmap_size'], shape[0][0], shape[0][1], (params['model_scope'] if 'all' not in params['model_scope'] else '*'), clip_at_zero=True, data_format=params['data_format']) + pred_x, pred_y = get_keypoint(image, pred_outputs[-1], params['heatmap_size'], shape[0][0], shape[0][1], (params['model_scope'] if 'all' not in params['model_scope'] else '*'), clip_at_zero=False, data_format=params['data_format']) else: # test augumentation on the fly with tf.name_scope("refine_prediction"): @@ -347,8 +347,9 @@ def cond_flip(heatmap_ind): pred_outputs = [tf.split(_, 2) for _ in pred_outputs] pred_outputs_1 = [_[0] for _ in pred_outputs] pred_outputs_2 = [_[1] for _ in pred_outputs] - pred_x_first_stage1, pred_y_first_stage1 = get_keypoint(image, pred_outputs_1[-1], params['heatmap_size'], shape[0][0], shape[0][1], (params['model_scope'] if 'all' not in params['model_scope'] else '*'), clip_at_zero=True, data_format=params['data_format']) - pred_x_first_stage2, pred_y_first_stage2 = get_keypoint(image, pred_outputs_2[-1], params['heatmap_size'], shape[0][0], shape[0][1], (params['model_scope'] if 'all' not in params['model_scope'] else '*'), clip_at_zero=True, data_format=params['data_format']) + #pred_outputs_1[-1] = tf.Print(pred_outputs_1[-1], [pred_outputs_1[-1]], summarize=10000) + pred_x_first_stage1, pred_y_first_stage1 = get_keypoint(image, pred_outputs_1[-1], params['heatmap_size'], shape[0][0], shape[0][1], (params['model_scope'] if 'all' not in params['model_scope'] else '*'), clip_at_zero=False, data_format=params['data_format']) + pred_x_first_stage2, pred_y_first_stage2 = 
get_keypoint(image, pred_outputs_2[-1], params['heatmap_size'], shape[0][0], shape[0][1], (params['model_scope'] if 'all' not in params['model_scope'] else '*'), clip_at_zero=False, data_format=params['data_format']) dist = tf.pow(tf.pow(pred_x_first_stage1 - pred_x_first_stage2, 2.) + tf.pow(pred_y_first_stage1 - pred_y_first_stage2, 2.), .5) @@ -435,17 +436,17 @@ def main(_): #Images/blouse/ab669925e96490ec698af976586f0b2f.jpg df.loc[cur_record] = [filename, m] + temp_list cur_record = cur_record + 1 - df.to_csv('./{}.csv'.format(m), encoding='utf-8', index=False) + df.to_csv('./{}_{}.csv'.format(FLAGS.backbone.strip(), m), encoding='utf-8', index=False) # merge dataframe - df_list = [pd.read_csv('./{}.csv'.format(model_to_eval[0]), encoding='utf-8')] + df_list = [pd.read_csv('./{}_{}.csv'.format(FLAGS.backbone.strip(), model_to_eval[0]), encoding='utf-8')] for m in model_to_eval[1:]: if m == '': continue - df_list.append(pd.read_csv('./{}.csv'.format(m), encoding='utf-8')) - pd.concat(df_list, ignore_index=True).to_csv('./sub.csv', encoding='utf-8', index=False) + df_list.append(pd.read_csv('./{}_{}.csv'.format(FLAGS.backbone.strip(), m), encoding='utf-8')) + pd.concat(df_list, ignore_index=True).to_csv('./{}_sub.csv'.format(FLAGS.backbone.strip()), encoding='utf-8', index=False) if FLAGS.run_on_cloud: - tf.gfile.Copy('./sub.csv', os.path.join(full_model_dir, 'sub.csv'), overwrite=True) + tf.gfile.Copy('./{}_sub.csv'.format(FLAGS.backbone.strip()), os.path.join(full_model_dir, '{}_sub.csv'.format(FLAGS.backbone.strip())), overwrite=True) if __name__ == '__main__': tf.logging.set_verbosity(tf.logging.INFO) diff --git a/eval_hg_subnet.py b/eval_hg_subnet.py index 77a21cce..a2c125ca 100644 --- a/eval_hg_subnet.py +++ b/eval_hg_subnet.py @@ -44,7 +44,7 @@ 'gpu_memory_fraction', 1., 'GPU memory fraction to use.') # scaffold related configuration tf.app.flags.DEFINE_string( - 'data_dir', '../Datasets/tfrecords_test',# tfrecords_test_stage1_b tfrecords_test + 'data_dir', '../Datasets/tfrecords_test_stage1_b',# tfrecords_test_stage1_b tfrecords_test 'The directory where the dataset input data is stored.') tf.app.flags.DEFINE_string( 'dataset_name', '{}_*.tfrecord', 'The pattern of the dataset name to load.') @@ -85,9 +85,6 @@ tf.app.flags.DEFINE_string( 'checkpoint_path', None, 'The path to a checkpoint from which to fine-tune.') -tf.app.flags.DEFINE_string( - 'coarse_pred_path', None, - 'The path to a pred csv file from which to crop the input image for finer prediction.') tf.app.flags.DEFINE_boolean( 'flip_on_test', False, 'Wether we will average predictions of left-right fliped image.') @@ -105,53 +102,11 @@ #--model_scope=blouse --checkpoint_path=./logs/blouse FLAGS = tf.app.flags.FLAGS -def preprocessing_fn(org_image, file_name, shape): - pd_df = None - if FLAGS.coarse_pred_path is not None: - if tf.gfile.Exists(FLAGS.coarse_pred_path): - tf.logging.info('Finetuning Prediction From {}.'.format(FLAGS.coarse_pred_path)) - tf.gfile.Copy(FLAGS.coarse_pred_path, './__coarse_pred.csv', overwrite=True) - pd_df = pd.read_csv('./__coarse_pred.csv', encoding='utf-8') - - all_filenames = [] - all_xmin = [] - all_ymin = [] - all_xmax = [] - all_ymax = [] - - all_values = pd_df.values.tolist() - for records in all_values: - all_filenames.append(records[0].encode('utf8')) - xmin = 2000 - ymin = 2000 - xmax = -1 - ymax = -1 - for kp in records[2:]: - keypoint_info = kp.strip().split('_') - if int(keypoint_info[2]) == -1: - continue - xmin = min(xmin, int(keypoint_info[0])) - ymin = min(ymin, 
int(keypoint_info[1])) - xmax = max(xmax, int(keypoint_info[0])) - ymax = max(ymax, int(keypoint_info[1])) - all_xmin.append(xmin) - all_ymin.append(ymin) - all_xmax.append(xmax) - all_ymax.append(ymax) - #print(all_filenames, all_xmin, all_ymin, all_xmax, all_ymax) - xmin_table = tf.contrib.lookup.HashTable(tf.contrib.lookup.KeyValueTensorInitializer(tf.constant(all_filenames, dtype=tf.string), tf.constant(all_xmin, dtype=tf.int64)), -1) - ymin_table = tf.contrib.lookup.HashTable(tf.contrib.lookup.KeyValueTensorInitializer(tf.constant(all_filenames, dtype=tf.string), tf.constant(all_ymin, dtype=tf.int64)), -1) - xmax_table = tf.contrib.lookup.HashTable(tf.contrib.lookup.KeyValueTensorInitializer(tf.constant(all_filenames, dtype=tf.string), tf.constant(all_xmax, dtype=tf.int64)), -1) - ymax_table = tf.contrib.lookup.HashTable(tf.contrib.lookup.KeyValueTensorInitializer(tf.constant(all_filenames, dtype=tf.string), tf.constant(all_ymax, dtype=tf.int64)), -1) - pd_df = [xmin_table, ymin_table, xmax_table, ymax_table] - #pred_item['file_name'].encode('utf8') - - #lnorm_table = tf.contrib.lookup.HashTable(tf.contrib.lookup.KeyValueTensorInitializer(tf.constant(config.global_norm_key, dtype=tf.int64), tf.constant(config.global_norm_lvalues, dtype=tf.int64)), 0) - return preprocessing.preprocess_for_test(org_image, file_name, shape, FLAGS.train_image_size, FLAGS.train_image_size, data_format=('NCHW' if FLAGS.data_format=='channels_first' else 'NHWC'), bbox_border=FLAGS.bbox_border, heatmap_sigma=FLAGS.heatmap_sigma, heatmap_size=FLAGS.heatmap_size, pred_df=pd_df) def input_pipeline(model_scope=FLAGS.model_scope): #preprocessing_fn = lambda org_image, shape: preprocessing.preprocess_for_test(org_image, shape, FLAGS.train_image_size, FLAGS.train_image_size, data_format=('NCHW' if FLAGS.data_format=='channels_first' else 'NHWC'), bbox_border=FLAGS.bbox_border, heatmap_sigma=FLAGS.heatmap_sigma, heatmap_size=FLAGS.heatmap_size) + preprocessing_fn = lambda org_image, file_name, shape: preprocessing.preprocess_for_test_raw_output(org_image, file_name, shape, FLAGS.train_image_size, FLAGS.train_image_size, data_format=('NCHW' if FLAGS.data_format=='channels_first' else 'NHWC'), bbox_border=FLAGS.bbox_border, heatmap_sigma=FLAGS.heatmap_sigma, heatmap_size=FLAGS.heatmap_size) - images, shape, file_name, classid, offsets = dataset.slim_test_get_split(FLAGS.data_dir, preprocessing_fn, FLAGS.num_readers, FLAGS.num_preprocessing_threads, file_pattern=FLAGS.dataset_name, category=(model_scope if 'all' not in model_scope else '*'), reader=None) + images, shape, file_name, classid, offsets = dataset.slim_test_get_split(FLAGS.data_dir, None, FLAGS.num_readers, FLAGS.num_preprocessing_threads, file_pattern=FLAGS.dataset_name, category=(model_scope if 'all' not in model_scope else '*'), reader=None, dynamic_pad=True) return {'images': images, 'shape': shape, 'classid': classid, 'file_name': file_name, 'pred_offsets': offsets} @@ -316,49 +271,138 @@ def keypoint_model_fn(features, labels, mode, params): file_name = tf.identity(file_name, name='current_file') + image = preprocessing.preprocess_for_test_raw_output(features, params['train_image_size'], params['train_image_size'], data_format=('NCHW' if FLAGS.data_format=='channels_first' else 'NHWC'), scope='first_stage') + if not params['flip_on_test']: - with tf.variable_scope(params['model_scope'], default_name=None, values=[features], reuse=tf.AUTO_REUSE): - pred_outputs = hg.create_model(features, params['num_stacks'], params['feats_channals'], + with 
tf.variable_scope(params['model_scope'], default_name=None, values=[image], reuse=tf.AUTO_REUSE): + pred_outputs = hg.create_model(image, params['num_stacks'], params['feats_channals'], config.class_num_joints[(params['model_scope'] if 'all' not in params['model_scope'] else '*')], params['num_modules'], (mode == tf.estimator.ModeKeys.TRAIN), params['data_format']) if params['data_format'] == 'channels_last': pred_outputs = [tf.transpose(pred_outputs[ind], [0, 3, 1, 2], name='outputs_trans_{}'.format(ind)) for ind in list(range(len(pred_outputs)))] + + pred_x_first_stage, pred_y_first_stage = get_keypoint(image, pred_outputs[-1], params['heatmap_size'], shape[0][0], shape[0][1], (params['model_scope'] if 'all' not in params['model_scope'] else '*'), clip_at_zero=True, data_format=params['data_format']) else: # test augumentation on the fly if params['data_format'] == 'channels_last': - double_features = tf.reshape(tf.stack([features, tf.map_fn(tf.image.flip_left_right, features, back_prop=False)], axis = 1), [-1, params['train_image_size'], params['train_image_size'], 3]) + double_features = tf.reshape(tf.stack([image, tf.map_fn(tf.image.flip_left_right, image, back_prop=False)], axis = 1), [-1, params['train_image_size'], params['train_image_size'], 3]) else: - double_features = tf.reshape(tf.stack([features, tf.transpose(tf.map_fn(tf.image.flip_left_right, tf.transpose(features, [0, 2, 3, 1], name='nchw2nhwc'), back_prop=False), [0, 3, 1, 2], name='nhwc2nchw')], axis = 1), [-1, 3, params['train_image_size'], params['train_image_size']]) + double_features = tf.reshape(tf.stack([image, tf.transpose(tf.map_fn(tf.image.flip_left_right, tf.transpose(image, [0, 2, 3, 1], name='nchw2nhwc'), back_prop=False), [0, 3, 1, 2], name='nhwc2nchw')], axis = 1), [-1, 3, params['train_image_size'], params['train_image_size']]) num_joints = config.class_num_joints[(params['model_scope'] if 'all' not in params['model_scope'] else '*')] with tf.variable_scope(params['model_scope'], default_name=None, values=[double_features], reuse=tf.AUTO_REUSE): pred_outputs = hg.create_model(double_features, params['num_stacks'], params['feats_channals'], - num_joints, params['num_modules'], + config.class_num_joints[(params['model_scope'] if 'all' not in params['model_scope'] else '*')], params['num_modules'], (mode == tf.estimator.ModeKeys.TRAIN), params['data_format']) if params['data_format'] == 'channels_last': pred_outputs = [tf.transpose(pred_outputs[ind], [0, 3, 1, 2], name='outputs_trans_{}'.format(ind)) for ind in list(range(len(pred_outputs)))] - # [[0, 0, 0, ..], [1, 1, 1, ...], ...] - row_indices = tf.tile(tf.reshape(tf.range(tf.shape(double_features)[0]), [-1, 1]), [1, num_joints]) - # [[0, 1, 2, ...], [1, 0, 2, ...], [0, 1, 2], [1, 0, 2], ...] - col_indices = tf.reshape(tf.tile(tf.reshape(tf.stack([tf.range(num_joints), tf.constant(config.left_right_remap[(params['model_scope'] if 'all' not in params['model_scope'] else '*')])], axis=0), [-1]), [tf.shape(features)[0]]), [-1, num_joints]) - # [[[0, 0], [0, 1], [0, 2], ...], [[1, 1], [1, 0], [1, 2], ...], [[2, 0], [2, 1], [2, 2], ...], ...] 
+ row_indices = tf.tile(tf.reshape(tf.stack([tf.range(0, tf.shape(double_features)[0], delta=2), tf.range(1, tf.shape(double_features)[0], delta=2)], axis=0), [-1, 1]), [1, num_joints]) + col_indices = tf.reshape(tf.tile(tf.reshape(tf.stack([tf.range(num_joints), tf.constant(config.left_right_remap[(params['model_scope'] if 'all' not in params['model_scope'] else '*')])], axis=0), [2, -1]), [1, tf.shape(features)[0]]), [-1, num_joints]) flip_indices=tf.stack([row_indices, col_indices], axis=-1) #flip_indices = tf.Print(flip_indices, [flip_indices], summarize=500) pred_outputs = [tf.gather_nd(pred_outputs[ind], flip_indices, name='gather_nd_{}'.format(ind)) for ind in list(range(len(pred_outputs)))] def cond_flip(heatmap_ind): - return tf.cond(heatmap_ind[1] < 1, lambda : heatmap_ind[0], lambda : tf.transpose(tf.image.flip_left_right(tf.transpose(heatmap_ind[0], [1, 2, 0], name='pred_nchw2nhwc')), [2, 0, 1], name='pred_nhwc2nchw')) - # all the heatmap of the fliped image should also be fliped back, a little complicated - pred_outputs = [tf.map_fn(cond_flip, [pred_outputs[ind], tf.tile(tf.reshape(tf.range(2), [-1]), [tf.shape(features)[0]])], dtype=tf.float32, parallel_iterations=10, back_prop=True, swap_memory=False, infer_shape=True, name='map_fn_{}'.format(ind)) for ind in list(range(len(pred_outputs)))] - # average predictions of left_reight_fliped image - segment_indices = tf.reshape(tf.tile(tf.reshape(tf.range(tf.shape(features)[0]), [-1, 1]), [1, 2]), [-1]) - pred_outputs = [tf.segment_mean(pred_outputs[ind], segment_indices, name='segment_mean_{}'.format(ind)) for ind in list(range(len(pred_outputs)))] + return tf.cond(heatmap_ind[1] < tf.shape(features)[0], lambda : heatmap_ind[0], lambda : tf.transpose(tf.image.flip_left_right(tf.transpose(heatmap_ind[0], [1, 2, 0], name='pred_nchw2nhwc')), [2, 0, 1], name='pred_nhwc2nchw')) + # all the heatmap of the fliped image should also be fliped back + pred_outputs = [tf.map_fn(cond_flip, [pred_outputs[ind], tf.range(tf.shape(double_features)[0])], dtype=tf.float32, parallel_iterations=10, back_prop=True, swap_memory=False, infer_shape=True, name='map_fn_{}'.format(ind)) for ind in list(range(len(pred_outputs)))] + pred_outputs = [tf.split(_, 2) for _ in pred_outputs] + pred_outputs_1 = [_[0] for _ in pred_outputs] + pred_outputs_2 = [_[1] for _ in pred_outputs] + pred_x_first_stage1, pred_y_first_stage1 = get_keypoint(image, pred_outputs_1[-1], params['heatmap_size'], shape[0][0], shape[0][1], (params['model_scope'] if 'all' not in params['model_scope'] else '*'), clip_at_zero=True, data_format=params['data_format']) + pred_x_first_stage2, pred_y_first_stage2 = get_keypoint(image, pred_outputs_2[-1], params['heatmap_size'], shape[0][0], shape[0][1], (params['model_scope'] if 'all' not in params['model_scope'] else '*'), clip_at_zero=True, data_format=params['data_format']) + + dist = tf.pow(tf.pow(pred_x_first_stage1 - pred_x_first_stage2, 2.) 
+ tf.pow(pred_y_first_stage1 - pred_y_first_stage2, 2.), .5) + + pred_x_first_stage = tf.where(dist < 1e-3, pred_x_first_stage1, pred_x_first_stage1 + (pred_x_first_stage2 - pred_x_first_stage1) * 0.25 / dist) + pred_y_first_stage = tf.where(dist < 1e-3, pred_y_first_stage1, pred_y_first_stage1 + (pred_y_first_stage2 - pred_y_first_stage1) * 0.25 / dist) + + xmin = tf.cast(tf.reduce_min(pred_x_first_stage), tf.int64) + xmax = tf.cast(tf.reduce_max(pred_x_first_stage), tf.int64) + ymin = tf.cast(tf.reduce_min(pred_y_first_stage), tf.int64) + ymax = tf.cast(tf.reduce_max(pred_y_first_stage), tf.int64) + + xmin, ymin, xmax, ymax = xmin - 100, ymin - 80, xmax + 100, ymax + 80 + + xmin = tf.clip_by_value(xmin, 0, shape[0][1][0]-1) + ymin = tf.clip_by_value(ymin, 0, shape[0][0][0]-1) + xmax = tf.clip_by_value(xmax, 0, shape[0][1][0]-1) + ymax = tf.clip_by_value(ymax, 0, shape[0][0][0]-1) + + bbox_h = ymax - ymin + bbox_w = xmax - xmin + areas = bbox_h * bbox_w + + offsets=tf.stack([xmin, ymin], axis=0) + crop_shape = tf.stack([bbox_h, bbox_w, shape[0][2][0]], axis=0) + + ymin, xmin, bbox_h, bbox_w = tf.cast(ymin, tf.int32), tf.cast(xmin, tf.int32), tf.cast(bbox_h, tf.int32), tf.cast(bbox_w, tf.int32) + + single_image = tf.squeeze(features, [0]) + crop_image = tf.image.crop_to_bounding_box(single_image, ymin, xmin, bbox_h, bbox_w) + crop_image = tf.expand_dims(crop_image, 0) - pred_x, pred_y = get_keypoint(features, pred_outputs[-1], params['heatmap_size'], shape[0][0], shape[0][1], (params['model_scope'] if 'all' not in params['model_scope'] else '*'), clip_at_zero=True, data_format=params['data_format']) + image, shape, offsets = tf.cond(areas > 0, lambda : (crop_image, crop_shape, offsets), + lambda : (features, shape, tf.constant([0, 0], tf.int64))) + offsets.set_shape([2]) + offsets = tf.to_float(offsets) + shape = tf.reshape(shape, [1, 3]) - predictions = {'pred_x': pred_x + pred_offsets[:, 0], 'pred_y': pred_y + pred_offsets[:, 1], 'file_name': file_name} + image = preprocessing.preprocess_for_test_raw_output(image, params['train_image_size'], params['train_image_size'], data_format=('NCHW' if FLAGS.data_format=='channels_first' else 'NHWC'), scope='second_stage') + + if not params['flip_on_test']: + with tf.variable_scope(params['model_scope'], default_name=None, values=[image], reuse=True): + pred_outputs = hg.create_model(image, params['num_stacks'], params['feats_channals'], + config.class_num_joints[(params['model_scope'] if 'all' not in params['model_scope'] else '*')], params['num_modules'], + (mode == tf.estimator.ModeKeys.TRAIN), params['data_format']) + with tf.name_scope("refine_prediction"): + if params['data_format'] == 'channels_last': + pred_outputs = [tf.transpose(pred_outputs[ind], [0, 3, 1, 2], name='outputs_trans_{}'.format(ind)) for ind in list(range(len(pred_outputs)))] + + pred_x, pred_y = get_keypoint(image, pred_outputs[-1], params['heatmap_size'], shape[0][0], shape[0][1], (params['model_scope'] if 'all' not in params['model_scope'] else '*'), clip_at_zero=True, data_format=params['data_format']) + else: + # test augumentation on the fly + with tf.name_scope("refine_prediction"): + if params['data_format'] == 'channels_last': + double_features = tf.reshape(tf.stack([image, tf.map_fn(tf.image.flip_left_right, image, back_prop=False)], axis = 1), [-1, params['train_image_size'], params['train_image_size'], 3]) + else: + double_features = tf.reshape(tf.stack([image, tf.transpose(tf.map_fn(tf.image.flip_left_right, tf.transpose(image, [0, 2, 3, 1], name='nchw2nhwc'), 
back_prop=False), [0, 3, 1, 2], name='nhwc2nchw')], axis = 1), [-1, 3, params['train_image_size'], params['train_image_size']]) + + num_joints = config.class_num_joints[(params['model_scope'] if 'all' not in params['model_scope'] else '*')] + with tf.variable_scope(params['model_scope'], default_name=None, values=[double_features], reuse=True): + pred_outputs = hg.create_model(double_features, params['num_stacks'], params['feats_channals'], + config.class_num_joints[(params['model_scope'] if 'all' not in params['model_scope'] else '*')], params['num_modules'], + (mode == tf.estimator.ModeKeys.TRAIN), params['data_format']) + with tf.name_scope("refine_prediction"): + if params['data_format'] == 'channels_last': + pred_outputs = [tf.transpose(pred_outputs[ind], [0, 3, 1, 2], name='outputs_trans_{}'.format(ind)) for ind in list(range(len(pred_outputs)))] + row_indices = tf.tile(tf.reshape(tf.stack([tf.range(0, tf.shape(double_features)[0], delta=2), tf.range(1, tf.shape(double_features)[0], delta=2)], axis=0), [-1, 1]), [1, num_joints]) + col_indices = tf.reshape(tf.tile(tf.reshape(tf.stack([tf.range(num_joints), tf.constant(config.left_right_remap[(params['model_scope'] if 'all' not in params['model_scope'] else '*')])], axis=0), [2, -1]), [1, tf.shape(features)[0]]), [-1, num_joints]) + flip_indices=tf.stack([row_indices, col_indices], axis=-1) + + #flip_indices = tf.Print(flip_indices, [flip_indices], summarize=500) + pred_outputs = [tf.gather_nd(pred_outputs[ind], flip_indices, name='gather_nd_{}'.format(ind)) for ind in list(range(len(pred_outputs)))] + + def cond_flip(heatmap_ind): + return tf.cond(heatmap_ind[1] < tf.shape(features)[0], lambda : heatmap_ind[0], lambda : tf.transpose(tf.image.flip_left_right(tf.transpose(heatmap_ind[0], [1, 2, 0], name='pred_nchw2nhwc')), [2, 0, 1], name='pred_nhwc2nchw')) + # all the heatmap of the fliped image should also be fliped back + pred_outputs = [tf.map_fn(cond_flip, [pred_outputs[ind], tf.range(tf.shape(double_features)[0])], dtype=tf.float32, parallel_iterations=10, back_prop=True, swap_memory=False, infer_shape=True, name='map_fn_{}'.format(ind)) for ind in list(range(len(pred_outputs)))] + pred_outputs = [tf.split(_, 2) for _ in pred_outputs] + pred_outputs_1 = [_[0] for _ in pred_outputs] + pred_outputs_2 = [_[1] for _ in pred_outputs] + pred_x_first_stage1, pred_y_first_stage1 = get_keypoint(image, pred_outputs_1[-1], params['heatmap_size'], shape[0][0], shape[0][1], (params['model_scope'] if 'all' not in params['model_scope'] else '*'), clip_at_zero=True, data_format=params['data_format']) + pred_x_first_stage2, pred_y_first_stage2 = get_keypoint(image, pred_outputs_2[-1], params['heatmap_size'], shape[0][0], shape[0][1], (params['model_scope'] if 'all' not in params['model_scope'] else '*'), clip_at_zero=True, data_format=params['data_format']) + + dist = tf.pow(tf.pow(pred_x_first_stage1 - pred_x_first_stage2, 2.) 
+ tf.pow(pred_y_first_stage1 - pred_y_first_stage2, 2.), .5) + + pred_x = tf.where(dist < 1e-3, pred_x_first_stage1, pred_x_first_stage1 + (pred_x_first_stage2 - pred_x_first_stage1) * 0.25 / dist) + pred_y = tf.where(dist < 1e-3, pred_y_first_stage1, pred_y_first_stage1 + (pred_y_first_stage2 - pred_y_first_stage1) * 0.25 / dist) + # for var in tf.get_collection(tf.GraphKeys.GLOBAL_VARIABLES):#TRAINABLE_VARIABLES): + # print(var.op.name) + + predictions = {'pred_x': pred_x + offsets[0], 'pred_y': pred_y + offsets[1], 'file_name': file_name} if mode == tf.estimator.ModeKeys.PREDICT: return tf.estimator.EstimatorSpec( diff --git a/eval_script.sh b/eval_script.sh deleted file mode 100644 index b3f75370..00000000 --- a/eval_script.sh +++ /dev/null @@ -1,16 +0,0 @@ -#! /bin/bash - -# export CUDA_VISIBLE_DEVICES='0' -# source /home/kapok/pyenv35/bin/activate -# cd /media/rs/0E06CD1706CD0127/Kapok/Chi/fashionAI/Codes -python eval_all_cpn_onepass.py --run_on_cloud=False --backbone=seresnext50_cpn -python eval_all_cpn_onepass.py --run_on_cloud=False --backbone=detnext50_cpn -python eval_all_cpn_onepass.py --run_on_cloud=False --backbone=large_seresnext_cpn --train_image_size=512 --heatmap_size=128 -python eval_all_cpn_onepass.py --run_on_cloud=False --backbone=large_detnext_cpn --train_image_size=512 --heatmap_size=128 - -# for training -python train_senet_cpn_onebyone.py --run_on_cloud=False -python train_detxt_cpn_onebyone.py --run_on_cloud=False -python train_large_xt_cpn_onebyone.py --run_on_cloud=False --backbone=detxt -python train_large_xt_cpn_onebyone.py --run_on_cloud=False --backbone=sext - diff --git a/preprocessing/dataset.py b/preprocessing/dataset.py index d8c490db..553f8bee 100644 --- a/preprocessing/dataset.py +++ b/preprocessing/dataset.py @@ -28,7 +28,7 @@ # blouse_0000.tfrecord # {}_????_val.tfrecord #category = * -def slim_get_split(dataset_dir, image_preprocessing_fn, batch_size, num_readers, num_preprocessing_threads, num_epochs=None, is_training=True, category='blouse', file_pattern='{}_????', reader=None): +def slim_get_split(dataset_dir, image_preprocessing_fn, batch_size, num_readers, num_preprocessing_threads, num_epochs=None, is_training=True, category='blouse', file_pattern='{}_????', reader=None, return_keypoints=False): # Allowing None in the signature so that dataset_factory can use the default. 
if reader is None: reader = tf.TFRecordReader @@ -97,10 +97,15 @@ def slim_get_split(dataset_dir, image_preprocessing_fn, batch_size, num_readers, key_x, key_y, key_v, key_id, key_gid = tf.gather(key_x, gather_ind), tf.gather(key_y, gather_ind), tf.gather(key_v, gather_ind), tf.gather(key_id, gather_ind), tf.gather(key_gid, gather_ind) shape = tf.stack([height, width, channels], axis=0) - image, targets, new_key_v, isvalid, norm_value = image_preprocessing_fn(org_image, classid, shape, key_x, key_y, key_v) - batch_input = tf.train.batch([image, shape, - classid, targets, new_key_v, isvalid, norm_value], + if not return_keypoints: + image, targets, new_key_v, isvalid, norm_value = image_preprocessing_fn(org_image, classid, shape, key_x, key_y, key_v) + batch_list = [image, shape, classid, targets, new_key_v, isvalid, norm_value] + else: + image, targets, new_key_x, new_key_y, new_key_v, isvalid, norm_value = image_preprocessing_fn(org_image, classid, shape, key_x, key_y, key_v) + batch_list = [image, shape, classid, targets, new_key_x, new_key_y, new_key_v, isvalid, norm_value] + + batch_input = tf.train.batch(batch_list, #classid, key_x, key_y, key_v, key_id, key_gid], dynamic_pad=False,#(not is_training), batch_size = batch_size, diff --git a/preprocessing/preprocessing.py b/preprocessing/preprocessing.py index d4533fc7..b34caacc 100644 --- a/preprocessing/preprocessing.py +++ b/preprocessing/preprocessing.py @@ -766,6 +766,7 @@ def preprocess_for_train(image, data_format, category, bbox_border, heatmap_sigma, heatmap_size, + return_keypoints=False, resize_side_min=_RESIZE_SIDE_MIN, resize_side_max=_RESIZE_SIDE_MAX, fast_mode=False, @@ -894,7 +895,10 @@ def preprocess_for_train(image, if data_format == 'NCHW': distorted_image = tf.transpose(distorted_image, perm=(2, 0, 1)) - return distorted_image, targets, new_key_v, isvalid, norm_value + if not return_keypoints: + return distorted_image, targets, new_key_v, isvalid, norm_value + else: + return distorted_image, targets, new_key_x, new_key_y, new_key_v, isvalid, norm_value def preprocess_for_train_v0(image, @@ -906,6 +910,7 @@ def preprocess_for_train_v0(image, data_format, category, bbox_border, heatmap_sigma, heatmap_size, + return_keypoints=False, resize_side_min=_RESIZE_SIDE_MIN, resize_side_max=_RESIZE_SIDE_MAX, fast_mode=True, @@ -1208,6 +1213,7 @@ def preprocess_image(image, classid, shape, output_height, output_width, data_format='NCHW', category='*', bbox_border=25., heatmap_sigma=1., heatmap_size=64, + return_keypoints=False, resize_side_min=_RESIZE_SIDE_MIN, resize_side_max=_RESIZE_SIDE_MAX): """Preprocesses the given image. 
@@ -1231,7 +1237,7 @@ def preprocess_image(image, classid, shape, output_height, output_width, """ if is_training: return preprocess_for_train(image, classid, shape, output_height, output_width, key_x, key_y, key_v, norm_table, data_format, - category, bbox_border, heatmap_sigma, heatmap_size, resize_side_min, resize_side_max) + category, bbox_border, heatmap_sigma, heatmap_size, return_keypoints, resize_side_min, resize_side_max) else: return preprocess_for_eval(image, classid, shape, output_height, output_width, key_x, key_y, key_v, norm_table, data_format, category, bbox_border, heatmap_sigma, heatmap_size, min(output_height, output_width)) diff --git a/tf_replicate_model_fn.py b/tf_replicate_model_fn.py index 129004a3..c9153c93 100644 --- a/tf_replicate_model_fn.py +++ b/tf_replicate_model_fn.py @@ -780,7 +780,7 @@ def _predict_spec(tower_specs, aggregation_device): def _concat_tensor_dicts(*tensor_dicts): return { name: array_ops.concat(tensors, axis=0, name=name) - for name, tensors in six.iteritems(_dict_concat(*tensor_dicts))tf_replicate_model_fn.py + for name, tensors in six.iteritems(_dict_concat(*tensor_dicts)) } diff --git a/train_cpn_onebyone.py b/train_cpn_onebyone.py index 1b59d4ad..424b9407 100644 --- a/train_cpn_onebyone.py +++ b/train_cpn_onebyone.py @@ -559,8 +559,8 @@ def main(_): detail_params = { 'blouse': { 'model_dir' : os.path.join(FLAGS.model_dir, 'blouse'), - 'train_epochs': 40, - 'epochs_per_eval': 15, + 'train_epochs': 28, + 'epochs_per_eval': 7, 'lr_decay_factors': '1, 0.5, 0.1', 'decay_boundaries': '10, 20', 'model_scope': 'blouse', @@ -571,8 +571,8 @@ def main(_): }, 'dress': { 'model_dir' : os.path.join(FLAGS.model_dir, 'dress'), - 'train_epochs': 40, - 'epochs_per_eval': 15, + 'train_epochs': 28, + 'epochs_per_eval': 7, 'lr_decay_factors': '1, 0.5, 0.1', 'decay_boundaries': '10, 20', 'model_scope': 'dress', @@ -583,8 +583,8 @@ def main(_): }, 'outwear': { 'model_dir' : os.path.join(FLAGS.model_dir, 'outwear'), - 'train_epochs': 40, - 'epochs_per_eval': 15, + 'train_epochs': 28, + 'epochs_per_eval': 7, 'lr_decay_factors': '1, 0.5, 0.1', 'decay_boundaries': '10, 20', 'model_scope': 'outwear', @@ -595,8 +595,8 @@ def main(_): }, 'skirt': { 'model_dir' : os.path.join(FLAGS.model_dir, 'skirt'), - 'train_epochs': 40, - 'epochs_per_eval': 15, + 'train_epochs': 28, + 'epochs_per_eval': 7, 'lr_decay_factors': '1, 0.5, 0.1', 'decay_boundaries': '10, 20', 'model_scope': 'skirt', @@ -607,8 +607,8 @@ def main(_): }, 'trousers': { 'model_dir' : os.path.join(FLAGS.model_dir, 'trousers'), - 'train_epochs': 40, - 'epochs_per_eval': 15, + 'train_epochs': 28, + 'epochs_per_eval': 7, 'lr_decay_factors': '1, 0.5, 0.1', 'decay_boundaries': '10, 20', 'model_scope': 'trousers', diff --git a/train_detnet_cpn_onebyone.py b/train_detnet_cpn_onebyone.py index ae9b315d..d355f8f0 100644 --- a/train_detnet_cpn_onebyone.py +++ b/train_detnet_cpn_onebyone.py @@ -559,8 +559,8 @@ def main(_): detail_params = { 'blouse': { 'model_dir' : os.path.join(FLAGS.model_dir, 'blouse'), - 'train_epochs': 40, - 'epochs_per_eval': 16, + 'train_epochs': 28, + 'epochs_per_eval': 7, 'lr_decay_factors': '1, 0.5, 0.1', 'decay_boundaries': '10, 20', 'model_scope': 'blouse', @@ -571,8 +571,8 @@ def main(_): }, 'dress': { 'model_dir' : os.path.join(FLAGS.model_dir, 'dress'), - 'train_epochs': 40, - 'epochs_per_eval': 16, + 'train_epochs': 28, + 'epochs_per_eval': 7, 'lr_decay_factors': '1, 0.5, 0.1', 'decay_boundaries': '10, 20', 'model_scope': 'dress', @@ -583,8 +583,8 @@ def main(_): }, 'outwear': { 
'model_dir' : os.path.join(FLAGS.model_dir, 'outwear'), - 'train_epochs': 40, - 'epochs_per_eval': 16, + 'train_epochs': 28, + 'epochs_per_eval': 7, 'lr_decay_factors': '1, 0.5, 0.1', 'decay_boundaries': '10, 20', 'model_scope': 'outwear', @@ -595,8 +595,8 @@ def main(_): }, 'skirt': { 'model_dir' : os.path.join(FLAGS.model_dir, 'skirt'), - 'train_epochs': 40, - 'epochs_per_eval': 16, + 'train_epochs': 28, + 'epochs_per_eval': 7, 'lr_decay_factors': '1, 0.5, 0.1', 'decay_boundaries': '10, 20', 'model_scope': 'skirt', @@ -607,8 +607,8 @@ def main(_): }, 'trousers': { 'model_dir' : os.path.join(FLAGS.model_dir, 'trousers'), - 'train_epochs': 40, - 'epochs_per_eval': 16, + 'train_epochs': 28, + 'epochs_per_eval': 7, 'lr_decay_factors': '1, 0.5, 0.1', 'decay_boundaries': '10, 20', 'model_scope': 'trousers', diff --git a/train_detxt_cpn_onebyone.py b/train_detxt_cpn_onebyone.py index 91859b50..04d8a7fa 100644 --- a/train_detxt_cpn_onebyone.py +++ b/train_detxt_cpn_onebyone.py @@ -559,8 +559,8 @@ def main(_): detail_params = { 'blouse': { 'model_dir' : os.path.join(FLAGS.model_dir, 'blouse'), - 'train_epochs': 40, - 'epochs_per_eval': 16, + 'train_epochs': 28, + 'epochs_per_eval': 7, 'lr_decay_factors': '1, 0.5, 0.1', 'decay_boundaries': '10, 20', 'model_scope': 'blouse', @@ -571,8 +571,8 @@ def main(_): }, 'dress': { 'model_dir' : os.path.join(FLAGS.model_dir, 'dress'), - 'train_epochs': 40, - 'epochs_per_eval': 16, + 'train_epochs': 28, + 'epochs_per_eval': 7, 'lr_decay_factors': '1, 0.5, 0.1', 'decay_boundaries': '10, 20', 'model_scope': 'dress', @@ -583,8 +583,8 @@ def main(_): }, 'outwear': { 'model_dir' : os.path.join(FLAGS.model_dir, 'outwear'), - 'train_epochs': 40, - 'epochs_per_eval': 16, + 'train_epochs': 28, + 'epochs_per_eval': 7, 'lr_decay_factors': '1, 0.5, 0.1', 'decay_boundaries': '10, 20', 'model_scope': 'outwear', @@ -595,8 +595,8 @@ def main(_): }, 'skirt': { 'model_dir' : os.path.join(FLAGS.model_dir, 'skirt'), - 'train_epochs': 40, - 'epochs_per_eval': 16, + 'train_epochs': 28, + 'epochs_per_eval': 7, 'lr_decay_factors': '1, 0.5, 0.1', 'decay_boundaries': '10, 20', 'model_scope': 'skirt', @@ -607,8 +607,8 @@ def main(_): }, 'trousers': { 'model_dir' : os.path.join(FLAGS.model_dir, 'trousers'), - 'train_epochs': 40, - 'epochs_per_eval': 16, + 'train_epochs': 28, + 'epochs_per_eval': 7, 'lr_decay_factors': '1, 0.5, 0.1', 'decay_boundaries': '10, 20', 'model_scope': 'trousers', @@ -620,13 +620,6 @@ def main(_): } model_to_train = [s.strip() for s in FLAGS.model_to_train.split(',')] - # import datetime - # import time - # while True: - # time.sleep(1600) - # if '8' in datetime.datetime.now().time().strftime('%H'): - # break - for m in model_to_train: sub_loop(keypoint_model_fn, m, detail_params[m]['model_dir'], run_config, detail_params[m]['train_epochs'], detail_params[m]['epochs_per_eval'], detail_params[m]['lr_decay_factors'], detail_params[m]['decay_boundaries'], detail_params[m]['checkpoint_path'], detail_params[m]['checkpoint_exclude_scopes'], detail_params[m]['checkpoint_model_scope'], detail_params[m]['ignore_missing_vars']) diff --git a/train_head_senet_cpn_onebyone.py b/train_head_senet_cpn_onebyone.py deleted file mode 100644 index 56fe8b0e..00000000 --- a/train_head_senet_cpn_onebyone.py +++ /dev/null @@ -1,640 +0,0 @@ -# Copyright 2018 Changan Wang - -# Licensed under the Apache License, Version 2.0 (the "License"); -# you may not use this file except in compliance with the License. 
-# You may obtain a copy of the License at - -# http://www.apache.org/licenses/LICENSE-2.0 - -# Unless required by applicable law or agreed to in writing, software -# distributed under the License is distributed on an "AS IS" BASIS, -# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. -# See the License for the specific language governing permissions and -# limitations under the License. -# ============================================================================= -from __future__ import absolute_import -from __future__ import division -from __future__ import print_function - -import os -import sys -import numpy as np -#from scipy.misc import imread, imsave, imshow, imresize -import tensorflow as tf - -from net import seresnet_cpn as cpn -from utility import train_helper -from utility import mertric - -from preprocessing import preprocessing -from preprocessing import dataset -import config - -# hardware related configuration -tf.app.flags.DEFINE_integer( - 'num_readers', 16,#16 - 'The number of parallel readers that read data from the dataset.') -tf.app.flags.DEFINE_integer( - 'num_preprocessing_threads', 48,#48 - 'The number of threads used to create the batches.') -tf.app.flags.DEFINE_integer( - 'num_cpu_threads', 0, - 'The number of cpu cores used to train.') -tf.app.flags.DEFINE_float( - 'gpu_memory_fraction', 1., 'GPU memory fraction to use.') -# scaffold related configuration -tf.app.flags.DEFINE_string( - 'data_dir', '../Datasets/tfrecords',#'/media/rs/0E06CD1706CD0127/Kapok/Chi/Datasets/tfrecords', - 'The directory where the dataset input data is stored.') -tf.app.flags.DEFINE_string( - 'dataset_name', '{}_????', 'The pattern of the dataset name to load.') -tf.app.flags.DEFINE_string( - 'model_dir', './logs_head_sext_cpn/', - 'The parent directory where the model will be stored.') -tf.app.flags.DEFINE_integer( - 'log_every_n_steps', 10, - 'The frequency with which logs are print.') -tf.app.flags.DEFINE_integer( - 'save_summary_steps', 100, - 'The frequency with which summaries are saved, in seconds.') -tf.app.flags.DEFINE_integer( - 'save_checkpoints_secs', 3600, - 'The frequency with which the model is saved, in seconds.') -# model related configuration -tf.app.flags.DEFINE_integer( - 'train_image_size', 384, - 'The size of the input image for the model to use.') -tf.app.flags.DEFINE_integer( - 'heatmap_size', 96, - 'The size of the output heatmap of the model.') -tf.app.flags.DEFINE_float( - 'heatmap_sigma', 1., - 'The sigma of Gaussian which generate the target heatmap.') -tf.app.flags.DEFINE_float( - 'bbox_border', 25., - 'The nearest distance of the crop border to al keypoints.') -tf.app.flags.DEFINE_integer( - 'train_epochs', 50, - 'The number of epochs to use for training.') -tf.app.flags.DEFINE_integer( - 'epochs_per_eval', 20, - 'The number of training epochs to run between evaluations.') -tf.app.flags.DEFINE_integer( - 'batch_size', 10, - 'Batch size for training and evaluation.') -tf.app.flags.DEFINE_integer( - 'xt_batch_size', 10, - 'Batch size for training and evaluation.') -tf.app.flags.DEFINE_boolean( - 'use_ohkm', True, - 'Wether we will use the ohkm for hard keypoints.') -tf.app.flags.DEFINE_string( - 'data_format', 'channels_first', # 'channels_first' or 'channels_last' - 'A flag to override the data format used in the model. channels_first ' - 'provides a performance boost on GPU but is not always compatible ' - 'with CPU. 
If left unspecified, the data format will be chosen ' - 'automatically based on whether TensorFlow was built for CPU or GPU.') -# optimizer related configuration -tf.app.flags.DEFINE_integer( - 'tf_random_seed', 20180417, 'Random seed for TensorFlow initializers.') -tf.app.flags.DEFINE_float( - 'weight_decay', 1e-5, 'The weight decay on the model weights.') -tf.app.flags.DEFINE_float( - 'mse_weight', 1., 'The weight decay on the model weights.') -tf.app.flags.DEFINE_float( - 'momentum', 0.9, - 'The momentum for the MomentumOptimizer and RMSPropOptimizer.') -tf.app.flags.DEFINE_float('learning_rate', 1e-4, 'Initial learning rate.')#1e-3 -tf.app.flags.DEFINE_float( - 'end_learning_rate', 0.000001, - 'The minimal end learning rate used by a polynomial decay learning rate.') -tf.app.flags.DEFINE_float( - 'warmup_learning_rate', 0.00001, - 'The start warm-up learning rate to avoid NAN.') -tf.app.flags.DEFINE_integer( - 'warmup_steps', 100, - 'The total steps to warm-up.') -# for learning rate piecewise_constant decay -tf.app.flags.DEFINE_string( - 'decay_boundaries', '2, 3', - 'Learning rate decay boundaries by global_step (comma-separated list).') -tf.app.flags.DEFINE_string( - 'lr_decay_factors', '1, 0.5, 0.1', - 'The values of learning_rate decay factor for each segment between boundaries (comma-separated list).') -# checkpoint related configuration -tf.app.flags.DEFINE_string( - 'checkpoint_path', './model', - 'The path to a checkpoint from which to fine-tune.') -tf.app.flags.DEFINE_string( - 'checkpoint_model_scope', '', - 'Model scope in the checkpoint. None if the same as the trained model.') -tf.app.flags.DEFINE_string( - #'blouse', 'dress', 'outwear', 'skirt', 'trousers', 'all' - 'model_scope', None, - 'Model scope name used to replace the name_scope in checkpoint.') -tf.app.flags.DEFINE_string( - 'checkpoint_exclude_scopes', None, - 'Comma-separated list of scopes of variables to exclude when restoring from a checkpoint.') -tf.app.flags.DEFINE_boolean( - 'ignore_missing_vars', True, - 'When restoring a checkpoint would ignore missing variables.') -tf.app.flags.DEFINE_boolean( - 'run_on_cloud', True, - 'Wether we will train on cloud.') -tf.app.flags.DEFINE_boolean( - 'seq_train', False, - 'Wether we will train a sequence model.') -tf.app.flags.DEFINE_string( - 'model_to_train', 'blouse, dress, outwear, skirt, trousers', #'all, blouse, dress, outwear, skirt, trousers', 'skirt, dress, outwear, trousers', - 'The sub-model to train (comma-separated list).') - -FLAGS = tf.app.flags.FLAGS -#--model_scope=blouse --checkpoint_path=./logs/all --data_format=channels_last --batch_size=1 -def input_pipeline(is_training=True, model_scope=FLAGS.model_scope, num_epochs=FLAGS.epochs_per_eval): - if 'all' in model_scope: - lnorm_table = tf.contrib.lookup.HashTable(tf.contrib.lookup.KeyValueTensorInitializer(tf.constant(config.global_norm_key, dtype=tf.int64), - tf.constant(config.global_norm_lvalues, dtype=tf.int64)), 0) - rnorm_table = tf.contrib.lookup.HashTable(tf.contrib.lookup.KeyValueTensorInitializer(tf.constant(config.global_norm_key, dtype=tf.int64), - tf.constant(config.global_norm_rvalues, dtype=tf.int64)), 1) - else: - lnorm_table = tf.contrib.lookup.HashTable(tf.contrib.lookup.KeyValueTensorInitializer(tf.constant(config.local_norm_key, dtype=tf.int64), - tf.constant(config.local_norm_lvalues, dtype=tf.int64)), 0) - rnorm_table = tf.contrib.lookup.HashTable(tf.contrib.lookup.KeyValueTensorInitializer(tf.constant(config.local_norm_key, dtype=tf.int64), - 
tf.constant(config.local_norm_rvalues, dtype=tf.int64)), 1) - - preprocessing_fn = lambda org_image, classid, shape, key_x, key_y, key_v: preprocessing.preprocess_image(org_image, classid, shape, FLAGS.train_image_size, FLAGS.train_image_size, key_x, key_y, key_v, (lnorm_table, rnorm_table), is_training=is_training, data_format=('NCHW' if FLAGS.data_format=='channels_first' else 'NHWC'), category=(model_scope if 'all' not in model_scope else '*'), bbox_border=FLAGS.bbox_border, heatmap_sigma=FLAGS.heatmap_sigma, heatmap_size=FLAGS.heatmap_size) - - images, shape, classid, targets, key_v, isvalid, norm_value = dataset.slim_get_split(FLAGS.data_dir, preprocessing_fn, FLAGS.batch_size, FLAGS.num_readers, FLAGS.num_preprocessing_threads, num_epochs=num_epochs, is_training=is_training, file_pattern=FLAGS.dataset_name, category=(model_scope if 'all' not in model_scope else '*'), reader=None) - - return images, {'targets': targets, 'key_v': key_v, 'shape': shape, 'classid': classid, 'isvalid': isvalid, 'norm_value': norm_value} - -if config.PRED_DEBUG: - from scipy.misc import imread, imsave, imshow, imresize - def save_image_with_heatmap(image, height, width, heatmap_size, targets, pred_heatmap, indR, indG, indB): - if not hasattr(save_image_with_heatmap, "counter"): - save_image_with_heatmap.counter = 0 # it doesn't exist yet, so initialize it - save_image_with_heatmap.counter += 1 - - img_to_save = np.array(image.tolist()) + 128 - #print(img_to_save.shape) - - img_to_save = img_to_save.astype(np.uint8) - - heatmap0 = np.sum(targets[indR, ...], axis=0).astype(np.uint8) - heatmap1 = np.sum(targets[indG, ...], axis=0).astype(np.uint8) - heatmap2 = np.sum(targets[indB, ...], axis=0).astype(np.uint8) if len(indB) > 0 else np.zeros((heatmap_size, heatmap_size), dtype=np.float32) - - img_to_save = imresize(img_to_save, (height, width), interp='lanczos') - heatmap0 = imresize(heatmap0, (height, width), interp='lanczos') - heatmap1 = imresize(heatmap1, (height, width), interp='lanczos') - heatmap2 = imresize(heatmap2, (height, width), interp='lanczos') - - img_to_save = img_to_save/2 - img_to_save[:,:,0] = np.clip((img_to_save[:,:,0] + heatmap0 + heatmap2), 0, 255) - img_to_save[:,:,1] = np.clip((img_to_save[:,:,1] + heatmap1 + heatmap2), 0, 255) - #img_to_save[:,:,2] = np.clip((img_to_save[:,:,2]/4. 
+ heatmap2), 0, 255) - file_name = 'targets_{}.jpg'.format(save_image_with_heatmap.counter) - imsave(os.path.join(config.DEBUG_DIR, file_name), img_to_save.astype(np.uint8)) - - pred_heatmap = np.array(pred_heatmap.tolist()) - #print(pred_heatmap.shape) - for ind in range(pred_heatmap.shape[0]): - img = pred_heatmap[ind] - img = img - img.min() - img *= 255.0/img.max() - file_name = 'heatmap_{}_{}.jpg'.format(save_image_with_heatmap.counter, ind) - imsave(os.path.join(config.DEBUG_DIR, file_name), img.astype(np.uint8)) - return save_image_with_heatmap.counter - -def get_keypoint(image, targets, predictions, heatmap_size, height, width, category, clip_at_zero=True, data_format='channels_last', name=None): - predictions = tf.reshape(predictions, [1, -1, heatmap_size*heatmap_size]) - - pred_max = tf.reduce_max(predictions, axis=-1) - pred_indices = tf.argmax(predictions, axis=-1) - pred_x, pred_y = tf.cast(tf.floormod(pred_indices, heatmap_size), tf.float32), tf.cast(tf.floordiv(pred_indices, heatmap_size), tf.float32) - - width, height = tf.cast(width, tf.float32), tf.cast(height, tf.float32) - pred_x, pred_y = pred_x * width / tf.cast(heatmap_size, tf.float32), pred_y * height / tf.cast(heatmap_size, tf.float32) - - if clip_at_zero: - pred_x, pred_y = pred_x * tf.cast(pred_max>0, tf.float32), pred_y * tf.cast(pred_max>0, tf.float32) - pred_x = pred_x * tf.cast(pred_max>0, tf.float32) + tf.cast(pred_max<=0, tf.float32) * (width / 2.) - pred_y = pred_y * tf.cast(pred_max>0, tf.float32) + tf.cast(pred_max<=0, tf.float32) * (height / 2.) - - if config.PRED_DEBUG: - pred_indices_ = tf.squeeze(pred_indices) - image_ = tf.squeeze(image) * 255. - pred_heatmap = tf.one_hot(pred_indices_, heatmap_size*heatmap_size, on_value=1., off_value=0., axis=-1, dtype=tf.float32) - - pred_heatmap = tf.reshape(pred_heatmap, [-1, heatmap_size, heatmap_size]) - if data_format == 'channels_first': - image_ = tf.transpose(image_, perm=(1, 2, 0)) - save_image_op = tf.py_func(save_image_with_heatmap, - [image_, height, width, - heatmap_size, - tf.reshape(pred_heatmap * 255., [-1, heatmap_size, heatmap_size]), - tf.reshape(predictions, [-1, heatmap_size, heatmap_size]), - config.left_right_group_map[category][0], - config.left_right_group_map[category][1], - config.left_right_group_map[category][2]], - tf.int64, stateful=True) - with tf.control_dependencies([save_image_op]): - pred_x, pred_y = pred_x * 1., pred_y * 1. - return pred_x, pred_y - -def gaussian_blur(inputs, inputs_filters, sigma, data_format, name=None): - with tf.name_scope(name, "gaussian_blur", [inputs]): - data_format_ = 'NHWC' if data_format=='channels_last' else 'NCHW' - if data_format_ == 'NHWC': - inputs = tf.transpose(inputs, [0, 2, 3, 1]) - ksize = int(6 * sigma + 1.) - x = tf.expand_dims(tf.range(ksize, delta=1, dtype=tf.float32), axis=1) - y = tf.transpose(x, [1, 0]) - kernel_matrix = tf.exp(- ((x - ksize/2.) ** 2 + (y - ksize/2.) 
** 2) / (2 * sigma ** 2)) - #print(kernel_matrix) - kernel_filter = tf.reshape(kernel_matrix, [ksize, ksize, 1, 1]) - kernel_filter = tf.tile(kernel_filter, [1, 1, inputs_filters, 1]) - #kernel_filter = tf.transpose(kernel_filter, [1, 0, 2, 3]) - outputs = tf.nn.depthwise_conv2d(inputs, kernel_filter, strides=[1, 1, 1, 1], padding='SAME', data_format=data_format_, name='blur') - if data_format_ == 'NHWC': - outputs = tf.transpose(outputs, [0, 3, 1, 2]) - return outputs - -def keypoint_model_fn(features, labels, mode, params): - targets = labels['targets'] - shape = labels['shape'] - classid = labels['classid'] - key_v = labels['key_v'] - isvalid = labels['isvalid'] - norm_value = labels['norm_value'] - - cur_batch_size = tf.shape(features)[0] - #features= tf.ones_like(features) - - with tf.variable_scope(params['model_scope'], default_name=None, values=[features], reuse=tf.AUTO_REUSE): - pred_outputs = cpn.head_xt_cascaded_pyramid_net(features, config.class_num_joints[(params['model_scope'] if 'all' not in params['model_scope'] else '*')], params['heatmap_size'], (mode == tf.estimator.ModeKeys.TRAIN), params['data_format']) - - if params['data_format'] == 'channels_last': - pred_outputs = [tf.transpose(pred_outputs[ind], [0, 3, 1, 2], name='outputs_trans_{}'.format(ind)) for ind in list(range(len(pred_outputs)))] - - score_map = pred_outputs[-1] - - pred_x, pred_y = get_keypoint(features, targets, score_map, params['heatmap_size'], params['train_image_size'], params['train_image_size'], (params['model_scope'] if 'all' not in params['model_scope'] else '*'), clip_at_zero=True, data_format=params['data_format']) - - # this is important!!! - targets = 255. * targets - blur_list = [1., 1.37, 1.73, 2.4, None]#[1., 1.5, 2., 3., None] - #blur_list = [None, None, None, None, None] - - targets_list = [] - for sigma in blur_list: - if sigma is None: - targets_list.append(targets) - else: - # always channels first foe targets - targets_list.append(gaussian_blur(targets, config.class_num_joints[(params['model_scope'] if 'all' not in params['model_scope'] else '*')], sigma, params['data_format'], 'blur_{}'.format(sigma))) - - # print(key_v) - #targets = tf.reshape(255.*tf.one_hot(tf.ones_like(key_v,tf.int64)*(params['heatmap_size']*params['heatmap_size']//2+params['heatmap_size']), params['heatmap_size']*params['heatmap_size']), [cur_batch_size,-1,params['heatmap_size'],params['heatmap_size']]) - #norm_value = tf.ones_like(norm_value) - # score_map = tf.reshape(tf.one_hot(tf.ones_like(key_v,tf.int64)*(31*64+31), params['heatmap_size']*params['heatmap_size']), [cur_batch_size,-1,params['heatmap_size'],params['heatmap_size']]) - - #with tf.control_dependencies([pred_x, pred_y]): - ne_mertric = mertric.normalized_error(targets, score_map, norm_value, key_v, isvalid, - cur_batch_size, - config.class_num_joints[(params['model_scope'] if 'all' not in params['model_scope'] else '*')], - params['heatmap_size'], - params['train_image_size']) - - # last_pred_mse = tf.metrics.mean_squared_error(score_map, targets, - # weights=1.0 / tf.cast(cur_batch_size, tf.float32), - # name='last_pred_mse') - # filter all invisible keypoint maybe better for this task - # all_visible = tf.logical_and(key_v>0, isvalid>0) - # targets_list = [tf.boolean_mask(targets_list[ind], all_visible) for ind in list(range(len(targets_list)))] - # pred_outputs = [tf.boolean_mask(pred_outputs[ind], all_visible, name='boolean_mask_{}'.format(ind)) for ind in list(range(len(pred_outputs)))] - all_visible = 
-def keypoint_model_fn(features, labels, mode, params):
-    targets = labels['targets']
-    shape = labels['shape']
-    classid = labels['classid']
-    key_v = labels['key_v']
-    isvalid = labels['isvalid']
-    norm_value = labels['norm_value']
-
-    cur_batch_size = tf.shape(features)[0]
-    #features= tf.ones_like(features)
-
-    with tf.variable_scope(params['model_scope'], default_name=None, values=[features], reuse=tf.AUTO_REUSE):
-        pred_outputs = cpn.head_xt_cascaded_pyramid_net(features, config.class_num_joints[(params['model_scope'] if 'all' not in params['model_scope'] else '*')], params['heatmap_size'], (mode == tf.estimator.ModeKeys.TRAIN), params['data_format'])
-
-    if params['data_format'] == 'channels_last':
-        pred_outputs = [tf.transpose(pred_outputs[ind], [0, 3, 1, 2], name='outputs_trans_{}'.format(ind)) for ind in list(range(len(pred_outputs)))]
-
-    score_map = pred_outputs[-1]
-
-    pred_x, pred_y = get_keypoint(features, targets, score_map, params['heatmap_size'], params['train_image_size'], params['train_image_size'], (params['model_scope'] if 'all' not in params['model_scope'] else '*'), clip_at_zero=True, data_format=params['data_format'])
-
-    # this is important!!!
-    targets = 255. * targets
-    blur_list = [1., 1.37, 1.73, 2.4, None]#[1., 1.5, 2., 3., None]
-    #blur_list = [None, None, None, None, None]
-
-    targets_list = []
-    for sigma in blur_list:
-        if sigma is None:
-            targets_list.append(targets)
-        else:
-            # always channels first foe targets
-            targets_list.append(gaussian_blur(targets, config.class_num_joints[(params['model_scope'] if 'all' not in params['model_scope'] else '*')], sigma, params['data_format'], 'blur_{}'.format(sigma)))
-
-    # print(key_v)
-    #targets = tf.reshape(255.*tf.one_hot(tf.ones_like(key_v,tf.int64)*(params['heatmap_size']*params['heatmap_size']//2+params['heatmap_size']), params['heatmap_size']*params['heatmap_size']), [cur_batch_size,-1,params['heatmap_size'],params['heatmap_size']])
-    #norm_value = tf.ones_like(norm_value)
-    # score_map = tf.reshape(tf.one_hot(tf.ones_like(key_v,tf.int64)*(31*64+31), params['heatmap_size']*params['heatmap_size']), [cur_batch_size,-1,params['heatmap_size'],params['heatmap_size']])
-
-    #with tf.control_dependencies([pred_x, pred_y]):
-    ne_mertric = mertric.normalized_error(targets, score_map, norm_value, key_v, isvalid,
-                                          cur_batch_size,
-                                          config.class_num_joints[(params['model_scope'] if 'all' not in params['model_scope'] else '*')],
-                                          params['heatmap_size'],
-                                          params['train_image_size'])
-
-    # last_pred_mse = tf.metrics.mean_squared_error(score_map, targets,
-    #                                               weights=1.0 / tf.cast(cur_batch_size, tf.float32),
-    #                                               name='last_pred_mse')
-    # filter all invisible keypoint maybe better for this task
-    # all_visible = tf.logical_and(key_v>0, isvalid>0)
-    # targets_list = [tf.boolean_mask(targets_list[ind], all_visible) for ind in list(range(len(targets_list)))]
-    # pred_outputs = [tf.boolean_mask(pred_outputs[ind], all_visible, name='boolean_mask_{}'.format(ind)) for ind in list(range(len(pred_outputs)))]
-    all_visible = tf.expand_dims(tf.expand_dims(tf.cast(tf.logical_and(key_v>0, isvalid>0), tf.float32), axis=-1), axis=-1)
-    targets_list = [targets_list[ind] * all_visible for ind in list(range(len(targets_list)))]
-    pred_outputs = [pred_outputs[ind] * all_visible for ind in list(range(len(pred_outputs)))]
-
-    sq_diff = tf.reduce_sum(tf.squared_difference(targets, pred_outputs[-1]), axis=-1)
-    last_pred_mse = tf.metrics.mean_absolute_error(sq_diff, tf.zeros_like(sq_diff), name='last_pred_mse')
-
-    metrics = {'normalized_error': ne_mertric, 'last_pred_mse':last_pred_mse}
-    predictions = {'normalized_error': ne_mertric[1]}
-    ne_mertric = tf.identity(ne_mertric[1], name='ne_mertric')
-
-    base_learning_rate = params['learning_rate']
-    mse_loss_list = []
-    if params['use_ohkm']:
-        base_learning_rate = 1. * base_learning_rate
-        for pred_ind in list(range(len(pred_outputs) - 1)):
-            mse_loss_list.append(0.5 * tf.losses.mean_squared_error(targets_list[pred_ind], pred_outputs[pred_ind],
-                                                                    weights=1.0 / tf.cast(cur_batch_size, tf.float32),
-                                                                    scope='loss_{}'.format(pred_ind),
-                                                                    loss_collection=None,#tf.GraphKeys.LOSSES,
-                                                                    # mean all elements of all pixels in all batch
-                                                                    reduction=tf.losses.Reduction.MEAN))# SUM, SUM_OVER_BATCH_SIZE, default mean by all elements
-
-        temp_loss = tf.reduce_mean(tf.reshape(tf.losses.mean_squared_error(targets_list[-1], pred_outputs[-1], weights=1.0, loss_collection=None, reduction=tf.losses.Reduction.NONE), [cur_batch_size, config.class_num_joints[(params['model_scope'] if 'all' not in params['model_scope'] else '*')], -1]), axis=-1)
-
-        num_topk = config.class_num_joints[(params['model_scope'] if 'all' not in params['model_scope'] else '*')] // 2
-        gather_col = tf.nn.top_k(temp_loss, k=num_topk, sorted=True)[1]
-        gather_row = tf.reshape(tf.tile(tf.reshape(tf.range(cur_batch_size), [-1, 1]), [1, num_topk]), [-1, 1])
-        gather_indcies = tf.stop_gradient(tf.stack([gather_row, tf.reshape(gather_col, [-1, 1])], axis=-1))
-
-        select_targets = tf.gather_nd(targets_list[-1], gather_indcies)
-        select_heatmap = tf.gather_nd(pred_outputs[-1], gather_indcies)
-
-        mse_loss_list.append(tf.losses.mean_squared_error(select_targets, select_heatmap,
-                                                          weights=1.0 / tf.cast(cur_batch_size, tf.float32),
-                                                          scope='loss_{}'.format(len(pred_outputs) - 1),
-                                                          loss_collection=None,#tf.GraphKeys.LOSSES,
-                                                          # mean all elements of all pixels in all batch
-                                                          reduction=tf.losses.Reduction.MEAN))
-    else:
-        for pred_ind in list(range(len(pred_outputs))):
-            mse_loss_list.append(tf.losses.mean_squared_error(targets_list[pred_ind], pred_outputs[pred_ind],
-                                                              weights=1.0 / tf.cast(cur_batch_size, tf.float32),
-                                                              scope='loss_{}'.format(pred_ind),
-                                                              loss_collection=None,#tf.GraphKeys.LOSSES,
-                                                              # mean all elements of all pixels in all batch
-                                                              reduction=tf.losses.Reduction.MEAN))# SUM, SUM_OVER_BATCH_SIZE, default mean by all elements
-
-    mse_loss = tf.multiply(params['mse_weight'], tf.add_n(mse_loss_list), name='mse_loss')
-    tf.summary.scalar('mse', mse_loss)
-    tf.losses.add_loss(mse_loss)
-
-    # bce_loss_list = []
-    # for pred_ind in list(range(len(pred_outputs))):
-    #     bce_loss_list.append(tf.reduce_mean(tf.nn.sigmoid_cross_entropy_with_logits(logits=pred_outputs[pred_ind], labels=targets_list[pred_ind]/255., name='loss_{}'.format(pred_ind)), name='loss_mean_{}'.format(pred_ind)))
-
-    # mse_loss = tf.multiply(params['mse_weight'] / params['num_stacks'], tf.add_n(bce_loss_list), name='mse_loss')
-    # tf.summary.scalar('mse', mse_loss)
-    # tf.losses.add_loss(mse_loss)
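The `use_ohkm` branch above is online hard keypoint mining: the lower pyramid stages are trained on every joint (at half weight), while the final stage averages the loss over only the `num_joints // 2` joints with the largest per-joint MSE in each sample. A minimal NumPy sketch of that selection rule, ignoring the `stop_gradient`/`gather_nd` bookkeeping of the TF version:

```python
import numpy as np

def ohkm_loss(pred, target, k):
    # pred, target: (batch, num_joints, heatmap_pixels) final-stage heatmaps.
    per_joint = np.mean((pred - target) ** 2, axis=-1)   # (batch, num_joints) MSE
    hardest = np.sort(per_joint, axis=-1)[:, -k:]        # k largest losses per sample
    return hardest.mean()

# e.g. k = num_joints // 2, matching num_topk above.
```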
-    # Add weight decay to the loss. We exclude the batch norm variables because
-    # doing so leads to a small improvement in accuracy.
-    loss = mse_loss + params['weight_decay'] * tf.add_n([tf.nn.l2_loss(v) for v in tf.trainable_variables() if 'batch_normalization' not in v.name])
-    total_loss = tf.identity(loss, name='total_loss')
-    tf.summary.scalar('loss', total_loss)
-
-    if mode == tf.estimator.ModeKeys.EVAL:
-        return tf.estimator.EstimatorSpec(mode=mode, loss=loss, predictions=predictions, eval_metric_ops=metrics)
-
-    if mode == tf.estimator.ModeKeys.TRAIN:
-        global_step = tf.train.get_or_create_global_step()
-
-        lr_values = [params['warmup_learning_rate']] + [base_learning_rate * decay for decay in params['lr_decay_factors']]
-        learning_rate = tf.train.piecewise_constant(tf.cast(global_step, tf.int32),
-                                                    [params['warmup_steps']] + [int(float(ep)*params['steps_per_epoch']) for ep in params['decay_boundaries']],
-                                                    lr_values)
-        truncated_learning_rate = tf.maximum(learning_rate, tf.constant(params['end_learning_rate'], dtype=learning_rate.dtype), name='learning_rate')
-        tf.summary.scalar('lr', truncated_learning_rate)
-
-        optimizer = tf.train.MomentumOptimizer(learning_rate=truncated_learning_rate,
-                                               momentum=params['momentum'])
-
-        # Batch norm requires update_ops to be added as a train_op dependency.
-        update_ops = tf.get_collection(tf.GraphKeys.UPDATE_OPS)
-        with tf.control_dependencies(update_ops):
-            train_op = optimizer.minimize(loss, global_step)
-    else:
-        train_op = None
-
-    return tf.estimator.EstimatorSpec(
-        mode=mode,
-        predictions=predictions,
-        loss=loss,
-        train_op=train_op,
-        eval_metric_ops=metrics,
-        scaffold=tf.train.Scaffold(init_fn=train_helper.get_init_fn_for_scaffold_(params['checkpoint_path'], params['model_dir'], params['checkpoint_exclude_scopes'], params['model_scope'], params['checkpoint_model_scope'], params['ignore_missing_vars'])))
-
-def parse_comma_list(args):
-    return [float(s.strip()) for s in args.split(',')]
-
-def sub_loop(model_fn, model_scope, model_dir, run_config, train_epochs, epochs_per_eval, lr_decay_factors, decay_boundaries, checkpoint_path=None, checkpoint_exclude_scopes='', checkpoint_model_scope='', ignore_missing_vars=True):
-    steps_per_epoch = config.split_size[(model_scope if 'all' not in model_scope else '*')]['train'] // FLAGS.batch_size
-    fashionAI = tf.estimator.Estimator(
-        model_fn=model_fn, model_dir=model_dir, config=run_config,
-        params={
-            'checkpoint_path': checkpoint_path,
-            'model_dir': model_dir,
-            'checkpoint_exclude_scopes': checkpoint_exclude_scopes,
-            'model_scope': model_scope,
-            'checkpoint_model_scope': checkpoint_model_scope,
-            'ignore_missing_vars': ignore_missing_vars,
-            'train_image_size': FLAGS.train_image_size,
-            'heatmap_size': FLAGS.heatmap_size,
-            'data_format': FLAGS.data_format,
-            'steps_per_epoch': steps_per_epoch,
-            'use_ohkm': FLAGS.use_ohkm,
-            'batch_size': FLAGS.batch_size,
-            'weight_decay': FLAGS.weight_decay,
-            'mse_weight': FLAGS.mse_weight,
-            'momentum': FLAGS.momentum,
-            'learning_rate': FLAGS.learning_rate,
-            'end_learning_rate': FLAGS.end_learning_rate,
-            'warmup_learning_rate': FLAGS.warmup_learning_rate,
-            'warmup_steps': FLAGS.warmup_steps,
-            'decay_boundaries': parse_comma_list(decay_boundaries),
-            'lr_decay_factors': parse_comma_list(lr_decay_factors),
-        })
-
-    tf.gfile.MakeDirs(model_dir)
-    tf.logging.info('Starting to train model {}.'.format(model_scope))
-    for _ in range(train_epochs // epochs_per_eval):
-        tensors_to_log = {
-            'lr': 'learning_rate',
-            'loss': 'total_loss',
-            'mse': 'mse_loss',
-            'ne': 'ne_mertric',
-        }
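The TRAIN branch above builds a piecewise-constant schedule with a warm-up phase: `tf.train.piecewise_constant` takes n boundaries and n+1 values, the epoch boundaries are converted to global steps via `steps_per_epoch`, and the result is clipped from below at `end_learning_rate`. A plain-Python sketch of the same lookup, with names of my own choosing:

```python
def lr_at_step(step, boundaries, values, end_lr):
    # values has exactly one more entry than boundaries, as in
    # tf.train.piecewise_constant: values[i] applies while step < boundaries[i].
    for boundary, value in zip(boundaries, values):
        if step < boundary:
            return max(value, end_lr)
    return max(values[-1], end_lr)

# With lr_decay_factors = '1, 0.5, 0.1' and decay_boundaries = '10, 20':
#   boundaries = [warmup_steps, 10 * steps_per_epoch, 20 * steps_per_epoch]
#   values     = [warmup_lr, base_lr, 0.5 * base_lr, 0.1 * base_lr]
```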
-
-        logging_hook = tf.train.LoggingTensorHook(tensors=tensors_to_log, every_n_iter=FLAGS.log_every_n_steps, formatter=lambda dicts: '{}:'.format(model_scope) + (', '.join(['%s=%.6f' % (k, v) for k, v in dicts.items()])))
-
-        tf.logging.info('Starting a training cycle.')
-        fashionAI.train(input_fn=lambda : input_pipeline(True, model_scope, epochs_per_eval), hooks=[logging_hook], max_steps=(steps_per_epoch*train_epochs))
-
-        tf.logging.info('Starting to evaluate.')
-        eval_results = fashionAI.evaluate(input_fn=lambda : input_pipeline(False, model_scope, 1))
-        tf.logging.info(eval_results)
-    tf.logging.info('Finished model {}.'.format(model_scope))
-
-def main(_):
-    # Using the Winograd non-fused algorithms provides a small performance boost.
-    os.environ['TF_ENABLE_WINOGRAD_NONFUSED'] = '1'
-
-    gpu_options = tf.GPUOptions(per_process_gpu_memory_fraction = FLAGS.gpu_memory_fraction)
-    sess_config = tf.ConfigProto(allow_soft_placement = True, log_device_placement = False, intra_op_parallelism_threads = FLAGS.num_cpu_threads, inter_op_parallelism_threads = FLAGS.num_cpu_threads, gpu_options = gpu_options)
-
-    # Set up a RunConfig to only save checkpoints once per training cycle.
-    run_config = tf.estimator.RunConfig().replace(
-        save_checkpoints_secs=FLAGS.save_checkpoints_secs).replace(
-        save_checkpoints_steps=None).replace(
-        save_summary_steps=FLAGS.save_summary_steps).replace(
-        keep_checkpoint_max=5).replace(
-        tf_random_seed=FLAGS.tf_random_seed).replace(
-        log_step_count_steps=FLAGS.log_every_n_steps).replace(
-        session_config=sess_config)
-
-    if FLAGS.seq_train:
-        detail_params = {
-            'all': {
-                'model_dir' : os.path.join(FLAGS.model_dir, 'all'),
-                'train_epochs': 6,
-                'epochs_per_eval': 4,
-                'lr_decay_factors': '1, 0.5, 0.1',
-                'decay_boundaries': '3, 4',
-                'model_scope': 'all',
-                'checkpoint_path': None,
-                'checkpoint_model_scope': '',
-                'checkpoint_exclude_scopes': '',
-                'ignore_missing_vars': True,
-            },
-            'blouse': {
-                'model_dir' : os.path.join(FLAGS.model_dir, 'blouse'),
-                'train_epochs': 50,
-                'epochs_per_eval': 30,
-                'lr_decay_factors': '1, 0.5, 0.1',
-                'decay_boundaries': '15, 30',
-                'model_scope': 'blouse',
-                'checkpoint_path': os.path.join(FLAGS.model_dir, 'all'),
-                'checkpoint_model_scope': 'all',
-                'checkpoint_exclude_scopes': 'blouse/feature_pyramid/conv_heatmap, blouse/global_net/conv_heatmap',
-                'ignore_missing_vars': True,
-            },
-            'dress': {
-                'model_dir' : os.path.join(FLAGS.model_dir, 'dress'),
-                'train_epochs': 50,
-                'epochs_per_eval': 30,
-                'lr_decay_factors': '1, 0.5, 0.1',
-                'decay_boundaries': '15, 30',
-                'model_scope': 'dress',
-                'checkpoint_path': os.path.join(FLAGS.model_dir, 'all'),
-                'checkpoint_model_scope': 'all',
-                'checkpoint_exclude_scopes': 'dress/feature_pyramid/conv_heatmap, dress/global_net/conv_heatmap',
-                'ignore_missing_vars': True,
-            },
-            'outwear': {
-                'model_dir' : os.path.join(FLAGS.model_dir, 'outwear'),
-                'train_epochs': 50,
-                'epochs_per_eval': 30,
-                'lr_decay_factors': '1, 0.5, 0.1',
-                'decay_boundaries': '15, 30',
-                'model_scope': 'outwear',
-                'checkpoint_path': os.path.join(FLAGS.model_dir, 'all'),
-                'checkpoint_model_scope': 'all',
-                'checkpoint_exclude_scopes': 'outwear/feature_pyramid/conv_heatmap, outwear/global_net/conv_heatmap',
-                'ignore_missing_vars': True,
-            },
-            'skirt': {
-                'model_dir' : os.path.join(FLAGS.model_dir, 'skirt'),
-                'train_epochs': 50,
-                'epochs_per_eval': 30,
-                'lr_decay_factors': '1, 0.5, 0.1',
-                'decay_boundaries': '15, 30',
-                'model_scope': 'skirt',
-                'checkpoint_path': os.path.join(FLAGS.model_dir, 'all'),
-                'checkpoint_model_scope': 'all',
-                'checkpoint_exclude_scopes': 'skirt/feature_pyramid/conv_heatmap, skirt/global_net/conv_heatmap',
-                'ignore_missing_vars': True,
-            },
-            'trousers': {
-                'model_dir' : os.path.join(FLAGS.model_dir, 'trousers'),
-                'train_epochs': 50,
-                'epochs_per_eval': 30,
-                'lr_decay_factors': '1, 0.5, 0.1',
-                'decay_boundaries': '15, 30',
-                'model_scope': 'trousers',
-                'checkpoint_path': os.path.join(FLAGS.model_dir, 'all'),
-                'checkpoint_model_scope': 'all',
-                'checkpoint_exclude_scopes': 'trousers/feature_pyramid/conv_heatmap, trousers/global_net/conv_heatmap',
-                'ignore_missing_vars': True,
-            },
-        }
-    else:
-        detail_params = {
-            'blouse': {
-                'model_dir' : os.path.join(FLAGS.model_dir, 'blouse'),
-                'train_epochs': 40,
-                'epochs_per_eval': 15,
-                'lr_decay_factors': '1, 0.5, 0.1',
-                'decay_boundaries': '10, 20',
-                'model_scope': 'blouse',
-                'checkpoint_path': os.path.join(FLAGS.data_dir, 'seresnext50') if FLAGS.run_on_cloud else os.path.join(FLAGS.checkpoint_path, 'seresnext50'),
-                'checkpoint_model_scope': '',
-                'checkpoint_exclude_scopes': 'blouse/feature_pyramid, blouse/global_net',
-                'ignore_missing_vars': True,
-            },
-            'dress': {
-                'model_dir' : os.path.join(FLAGS.model_dir, 'dress'),
-                'train_epochs': 40,
-                'epochs_per_eval': 15,
-                'lr_decay_factors': '1, 0.5, 0.1',
-                'decay_boundaries': '10, 20',
-                'model_scope': 'dress',
-                'checkpoint_path': os.path.join(FLAGS.data_dir, 'seresnext50') if FLAGS.run_on_cloud else os.path.join(FLAGS.checkpoint_path, 'seresnext50'),
-                'checkpoint_model_scope': '',
-                'checkpoint_exclude_scopes': 'dress/feature_pyramid, dress/global_net',
-                'ignore_missing_vars': True,
-            },
-            'outwear': {
-                'model_dir' : os.path.join(FLAGS.model_dir, 'outwear'),
-                'train_epochs': 40,
-                'epochs_per_eval': 15,
-                'lr_decay_factors': '1, 0.5, 0.1',
-                'decay_boundaries': '10, 20',
-                'model_scope': 'outwear',
-                'checkpoint_path': os.path.join(FLAGS.data_dir, 'seresnext50') if FLAGS.run_on_cloud else os.path.join(FLAGS.checkpoint_path, 'seresnext50'),
-                'checkpoint_model_scope': '',
-                'checkpoint_exclude_scopes': 'outwear/feature_pyramid, outwear/global_net',
-                'ignore_missing_vars': True,
-            },
-            'skirt': {
-                'model_dir' : os.path.join(FLAGS.model_dir, 'skirt'),
-                'train_epochs': 40,
-                'epochs_per_eval': 15,
-                'lr_decay_factors': '1, 0.5, 0.1',
-                'decay_boundaries': '10, 20',
-                'model_scope': 'skirt',
-                'checkpoint_path': os.path.join(FLAGS.data_dir, 'seresnext50') if FLAGS.run_on_cloud else os.path.join(FLAGS.checkpoint_path, 'seresnext50'),
-                'checkpoint_model_scope': '',
-                'checkpoint_exclude_scopes': 'skirt/feature_pyramid, skirt/global_net',
-                'ignore_missing_vars': True,
-            },
-            'trousers': {
-                'model_dir' : os.path.join(FLAGS.model_dir, 'trousers'),
-                'train_epochs': 40,
-                'epochs_per_eval': 15,
-                'lr_decay_factors': '1, 0.5, 0.1',
-                'decay_boundaries': '10, 20',
-                'model_scope': 'trousers',
-                'checkpoint_path': os.path.join(FLAGS.data_dir, 'seresnext50') if FLAGS.run_on_cloud else os.path.join(FLAGS.checkpoint_path, 'seresnext50'),
-                'checkpoint_model_scope': '',
-                'checkpoint_exclude_scopes': 'trousers/feature_pyramid, trousers/global_net',
-                'ignore_missing_vars': True,
-            },
-        }
-    model_to_train = [s.strip() for s in FLAGS.model_to_train.split(',')]
-
-    # import datetime
-    # import time
-    # while True:
-    #     time.sleep(1600)
-    #     if '8' in datetime.datetime.now().time().strftime('%H'):
-    #         break
-    for m in model_to_train:
-        sub_loop(keypoint_model_fn, m, detail_params[m]['model_dir'], run_config, detail_params[m]['train_epochs'], detail_params[m]['epochs_per_eval'], detail_params[m]['lr_decay_factors'], detail_params[m]['decay_boundaries'], detail_params[m]['checkpoint_path'], detail_params[m]['checkpoint_exclude_scopes'], detail_params[m]['checkpoint_model_scope'], detail_params[m]['ignore_missing_vars'])
-
-if __name__ == '__main__':
-    tf.logging.set_verbosity(tf.logging.INFO)
-    tf.app.run()
-
-# 0.045620328343892416
-# blouse: 0.04301484169892338
-# dress: 0.04210286934923448
-# outwear: 0.04589965355198962
-# skirt: 0.056256085847705986
-# trousers: 0.05055153938503116
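Throughout these scripts the per-category models are warm-started from either a backbone checkpoint or the 'all' model, with `checkpoint_exclude_scopes` keeping the heatmap heads (whose joint counts differ per category) out of the restore. I have not reproduced `train_helper.get_init_fn_for_scaffold_` here, but the scope filtering it presumably performs looks roughly like this hypothetical helper, for illustration only:

```python
def vars_to_restore(model_variables, checkpoint_exclude_scopes):
    # Drop any variable whose name falls under an excluded scope so that
    # the per-category heads are re-initialised rather than restored.
    exclusions = [s.strip() for s in checkpoint_exclude_scopes.split(',') if s.strip()]
    return [v for v in model_variables
            if not any(v.name.startswith(scope) for scope in exclusions)]

# e.g. excluding 'blouse/feature_pyramid/conv_heatmap' keeps the backbone and
# pyramid weights restorable while the blouse head starts from scratch.
```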
""" + if not FLAGS.multi_gpu: + return 0 + from tensorflow.python.client import device_lib local_device_protos = device_lib.list_local_devices() num_gpus = sum([1 for d in local_device_protos if d.device_type == 'GPU']) if not num_gpus: raise ValueError('Multi-GPU mode was specified, but no GPUs ' - 'were found. To use CPU, run without --multi_gpu.') + 'were found. To use CPU, run without --multi_gpu=False.') remainder = batch_size % num_gpus if remainder: @@ -180,7 +183,7 @@ def validate_batch_size_for_multi_gpu(batch_size): raise ValueError(err) return num_gpus -def input_pipeline(is_training=True, model_scope=FLAGS.model_scope, num_epochs=FLAGS.epochs_per_eval): +def input_pipeline(is_training=True, model_scope=FLAGS.model_scope, num_epochs=None): if 'all' in model_scope: lnorm_table = tf.contrib.lookup.HashTable(tf.contrib.lookup.KeyValueTensorInitializer(tf.constant(config.global_norm_key, dtype=tf.int64), tf.constant(config.global_norm_lvalues, dtype=tf.int64)), 0) @@ -306,8 +309,6 @@ def keypoint_model_fn(features, labels, mode, params): with tf.variable_scope(params['model_scope'], default_name=None, values=[features], reuse=tf.AUTO_REUSE): pred_outputs = backbone_(features, config.class_num_joints[(params['model_scope'] if 'all' not in params['model_scope'] else '*')], params['heatmap_size'], (mode == tf.estimator.ModeKeys.TRAIN), params['data_format'], net_depth=params['net_depth']) - #print(pred_outputs) - if params['data_format'] == 'channels_last': pred_outputs = [tf.transpose(pred_outputs[ind], [0, 3, 1, 2], name='outputs_trans_{}'.format(ind)) for ind in list(range(len(pred_outputs)))] @@ -455,7 +456,7 @@ def sub_loop(model_fn, model_scope, model_dir, run_config, train_epochs, epochs_ _replicate_model_fn = tf_replicate_model_fn.replicate_model_fn(model_fn, loss_reduction=tf.losses.Reduction.MEAN) fashionAI = tf.estimator.Estimator( - model_fn=_replicate_model_fn, model_dir=model_dir, config=run_config, + model_fn=_replicate_model_fn, model_dir=model_dir, config=run_config.replace(save_checkpoints_steps=2*steps_per_epoch), params={ 'checkpoint_path': checkpoint_path, 'model_dir': model_dir, @@ -510,8 +511,8 @@ def main(_): # Set up a RunConfig to only save checkpoints once per training cycle. 
@@ -510,8 +511,8 @@ def main(_):
     # Set up a RunConfig to only save checkpoints once per training cycle.
     run_config = tf.estimator.RunConfig().replace(
-        save_checkpoints_secs=FLAGS.save_checkpoints_secs).replace(
-        save_checkpoints_steps=None).replace(
+        save_checkpoints_secs=None).replace(
+        save_checkpoints_steps=FLAGS.save_checkpoints_steps).replace(
         save_summary_steps=FLAGS.save_summary_steps).replace(
         keep_checkpoint_max=5).replace(
         tf_random_seed=FLAGS.tf_random_seed).replace(
         log_step_count_steps=FLAGS.log_every_n_steps).replace(
         session_config=sess_config)
@@ -524,10 +525,10 @@ def main(_):
     detail_params = {
         'blouse': {
             'model_dir' : os.path.join(full_model_dir, 'blouse'),
-            'train_epochs': 40,
-            'epochs_per_eval': 16,
+            'train_epochs': 25,
+            'epochs_per_eval': 5,
             'lr_decay_factors': '1, 0.5, 0.1',
-            'decay_boundaries': '10, 20',
+            'decay_boundaries': '15, 20',
             'model_scope': 'blouse',
             'checkpoint_path': os.path.join(FLAGS.data_dir, FLAGS.cloud_checkpoint_path.format(FLAGS.net_depth)) if FLAGS.run_on_cloud else FLAGS.checkpoint_path.format(FLAGS.net_depth),
             'checkpoint_model_scope': '',
@@ -536,10 +537,10 @@ def main(_):
         },
         'dress': {
             'model_dir' : os.path.join(full_model_dir, 'dress'),
-            'train_epochs': 40,
-            'epochs_per_eval': 16,
+            'train_epochs': 25,
+            'epochs_per_eval': 5,
             'lr_decay_factors': '1, 0.5, 0.1',
-            'decay_boundaries': '10, 20',
+            'decay_boundaries': '15, 20',
             'model_scope': 'dress',
             'checkpoint_path': os.path.join(FLAGS.data_dir, FLAGS.cloud_checkpoint_path.format(FLAGS.net_depth)) if FLAGS.run_on_cloud else FLAGS.checkpoint_path.format(FLAGS.net_depth),
             'checkpoint_model_scope': '',
@@ -548,10 +549,10 @@ def main(_):
         },
         'outwear': {
             'model_dir' : os.path.join(full_model_dir, 'outwear'),
-            'train_epochs': 40,
-            'epochs_per_eval': 16,
+            'train_epochs': 25,
+            'epochs_per_eval': 5,
             'lr_decay_factors': '1, 0.5, 0.1',
-            'decay_boundaries': '10, 20',
+            'decay_boundaries': '15, 20',
             'model_scope': 'outwear',
             'checkpoint_path': os.path.join(FLAGS.data_dir, FLAGS.cloud_checkpoint_path.format(FLAGS.net_depth)) if FLAGS.run_on_cloud else FLAGS.checkpoint_path.format(FLAGS.net_depth),
             'checkpoint_model_scope': '',
@@ -560,10 +561,10 @@ def main(_):
         },
         'skirt': {
             'model_dir' : os.path.join(full_model_dir, 'skirt'),
-            'train_epochs': 40,
-            'epochs_per_eval': 16,
+            'train_epochs': 25,
+            'epochs_per_eval': 5,
             'lr_decay_factors': '1, 0.5, 0.1',
-            'decay_boundaries': '10, 20',
+            'decay_boundaries': '15, 20',
             'model_scope': 'skirt',
             'checkpoint_path': os.path.join(FLAGS.data_dir, FLAGS.cloud_checkpoint_path.format(FLAGS.net_depth)) if FLAGS.run_on_cloud else FLAGS.checkpoint_path.format(FLAGS.net_depth),
             'checkpoint_model_scope': '',
@@ -572,10 +573,10 @@ def main(_):
         },
         'trousers': {
             'model_dir' : os.path.join(full_model_dir, 'trousers'),
-            'train_epochs': 40,
-            'epochs_per_eval': 16,
+            'train_epochs': 25,
+            'epochs_per_eval': 5,
             'lr_decay_factors': '1, 0.5, 0.1',
-            'decay_boundaries': '10, 20',
+            'decay_boundaries': '15, 20',
             'model_scope': 'trousers',
             'checkpoint_path': os.path.join(FLAGS.data_dir, FLAGS.cloud_checkpoint_path.format(FLAGS.net_depth)) if FLAGS.run_on_cloud else FLAGS.checkpoint_path.format(FLAGS.net_depth),
             'checkpoint_model_scope': '',
diff --git a/train_senet_cpn_onebyone.py b/train_senet_cpn_onebyone.py
index c92177eb..7b00339f 100644
--- a/train_senet_cpn_onebyone.py
+++ b/train_senet_cpn_onebyone.py
@@ -564,8 +564,8 @@ def main(_):
     detail_params = {
         'blouse': {
             'model_dir' : os.path.join(FLAGS.model_dir, 'blouse'),
-            'train_epochs': 40,
-            'epochs_per_eval': 15,
+            'train_epochs': 28,
+            'epochs_per_eval': 7,
             'lr_decay_factors': '1, 0.5, 0.1',
             'decay_boundaries': '10, 20',
             'model_scope': 'blouse',
@@ -576,8 +576,8 @@ def main(_):
         },
         'dress': {
             'model_dir' : os.path.join(FLAGS.model_dir, 'dress'),
-            'train_epochs': 40,
-            'epochs_per_eval': 15,
+            'train_epochs': 28,
+            'epochs_per_eval': 7,
             'lr_decay_factors': '1, 0.5, 0.1',
             'decay_boundaries': '10, 20',
             'model_scope': 'dress',
@@ -588,8 +588,8 @@ def main(_):
         },
         'outwear': {
             'model_dir' : os.path.join(FLAGS.model_dir, 'outwear'),
-            'train_epochs': 40,
-            'epochs_per_eval': 15,
+            'train_epochs': 28,
+            'epochs_per_eval': 7,
             'lr_decay_factors': '1, 0.5, 0.1',
             'decay_boundaries': '10, 20',
             'model_scope': 'outwear',
@@ -600,8 +600,8 @@ def main(_):
         },
         'skirt': {
             'model_dir' : os.path.join(FLAGS.model_dir, 'skirt'),
-            'train_epochs': 40,
-            'epochs_per_eval': 15,
+            'train_epochs': 28,
+            'epochs_per_eval': 7,
             'lr_decay_factors': '1, 0.5, 0.1',
             'decay_boundaries': '10, 20',
             'model_scope': 'skirt',
@@ -612,8 +612,8 @@ def main(_):
         },
         'trousers': {
             'model_dir' : os.path.join(FLAGS.model_dir, 'trousers'),
-            'train_epochs': 40,
-            'epochs_per_eval': 15,
+            'train_epochs': 28,
+            'epochs_per_eval': 7,
             'lr_decay_factors': '1, 0.5, 0.1',
             'decay_boundaries': '10, 20',
             'model_scope': 'trousers',
@@ -625,13 +625,6 @@ def main(_):
         }
     model_to_train = [s.strip() for s in FLAGS.model_to_train.split(',')]
 
-    # import datetime
-    # import time
-    # while True:
-    #     time.sleep(1600)
-    #     if '8' in datetime.datetime.now().time().strftime('%H'):
-    #         break
-
     for m in model_to_train:
         sub_loop(keypoint_model_fn, m, detail_params[m]['model_dir'], run_config, detail_params[m]['train_epochs'], detail_params[m]['epochs_per_eval'], detail_params[m]['lr_decay_factors'], detail_params[m]['decay_boundaries'], detail_params[m]['checkpoint_path'], detail_params[m]['checkpoint_exclude_scopes'], detail_params[m]['checkpoint_model_scope'], detail_params[m]['ignore_missing_vars'])
diff --git a/utility/mertric.py b/utility/mertric.py
index 06309e12..aa6b4e0c 100644
--- a/utility/mertric.py
+++ b/utility/mertric.py
@@ -89,8 +89,8 @@ def normalized_error(targets, predictions, norm_value, visible, isvalid,
     dist = tf.boolean_mask(dist, tf.logical_and(visible>0, isvalid>0))
     #dist = dist * tf.cast(tf.logical_and(visible>0, isvalid>0), tf.float32)
 
-    update_total_op = state_ops.assign(total, math_ops.reduce_sum(dist))#assign_add
-    update_count_op = state_ops.assign(count, tf.cast(tf.shape(dist)[0], tf.float32))#assign_add
+    update_total_op = state_ops.assign(total, math_ops.reduce_sum(dist))#assign_add #assign
+    update_count_op = state_ops.assign(count, tf.cast(tf.shape(dist)[0], tf.float32))#assign_add #assign
 
     mean_t = _safe_div(total, count, 'value')
     update_op = _safe_div(update_total_op, update_count_op, 'update_op')
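Finally, `normalized_error` in utility/mertric.py is the competition metric: keypoint distances divided by a per-image normalisation length, averaged over keypoints that are both visible and valid. (Note the update ops use `assign` rather than `assign_add`, so the reported value reflects the most recent batch rather than an accumulated total.) A NumPy sketch of the computation, assuming `(batch, num_joints, 2)` coordinate arrays:

```python
import numpy as np

def normalized_error(pred_xy, gt_xy, norm_value, visible, isvalid):
    # pred_xy, gt_xy: (batch, num_joints, 2); norm_value: (batch,) array;
    # visible, isvalid: (batch, num_joints) flags.
    dist = np.linalg.norm(pred_xy - gt_xy, axis=-1) / norm_value[:, None]
    mask = (visible > 0) & (isvalid > 0)          # only scored keypoints count
    return dist[mask].sum() / max(mask.sum(), 1)  # mean over counted keypoints
```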