Questions regarding changes to yolo plugin #416

philipp-schmidt · 2021-05-26T16:27:36Z

Hi, I'm trying to update the yolo plugin in https://github.com/isarsoft/yolov4-triton-tensorrt to a more recent version that includes your batch size fixes, compatibility updates, etc.

Could you briefly explain a few changes that you made to the plugin in comparison to the original from tensorrtx?

1. Input for the yolo layer plugin originally was "all output conv" layers, but now seems to be just one conv layer? So for example the full yolov4 network now needs three instances of the plugin instead of one? And also the output is three marked blobs instead of just one, correct?
2. The original implementation had MAX_OUTPUT_BBOX_COUNT. How did you handle this change regarding output size etc.?
3. What's input_multiplier for? I found input_multiplier = w // yolo_whs[i][0]. So this is just to not pass input_width and input_height? It could be calculated if the plugin knew those values?
4. You reduced int mThreadCount = 256; to int mThreadCount = 64;, is there a performance reason for this?

Thanks already for all those great fixes to the plugin, really helpful.
Could you very briefly explain other changes that you made to the plugin that you think would be significant enough to mention?


jkjung-avt · 2021-05-27T09:10:43Z

> Input for the yolo layer plugin originally was "all output conv" layers, but now seems to be just one conv layer? So for example the full yolov4 network now needs three instances of the plugin instead of one? And also the output is three marked blobs instead of just one, correct?

Yes. I would add 2 or 3 (or more) yolo plugins into the network depending on how many output conv layers there are. The code is here:

tensorrt_demos/yolo/plugins.py, lines 82 to 83 in af87896:

    def add_yolo_plugins(network, model_name, logger):
        """Add yolo plugins into a TensorRT network."""

> The original implementation had MAX_OUTPUT_BBOX_COUNT. How did you handle this change regarding output size etc.?

I don't set an upper limit on output bbox count. All detection boxes with "scores" higher than the threshold would be kept. They would go through NMS before the final detection results are generated. The relevant code is here:

tensorrt_demos/utils/yolo_with_plugins.py, lines 100 to 101 in af87896:

    def _postprocess_yolo(trt_outputs, img_w, img_h, conf_th, nms_threshold,
                          input_shape, letter_box=False):
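The "keep everything above the threshold, then run NMS" flow described above can be sketched roughly as follows. This is a minimal pure-Python illustration, not the code in `_postprocess_yolo`; the function names `iou` and `filter_and_nms` and the `(x1, y1, x2, y2, score, class_id)` tuple layout are assumptions made for this example.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def filter_and_nms(dets, conf_th, nms_threshold):
    """dets: list of (x1, y1, x2, y2, score, class_id) tuples (assumed layout)."""
    # Step 1: keep every detection scoring above the threshold; no count cap.
    kept = [d for d in dets if d[4] > conf_th]
    # Step 2: greedy per-class NMS, visiting boxes from highest score down.
    kept.sort(key=lambda d: d[4], reverse=True)
    result = []
    for d in kept:
        overlaps = any(
            d[5] == r[5] and iou(d[:4], r[:4]) > nms_threshold for r in result
        )
        if not overlaps:
            result.append(d)
    return result
```

Because there is no fixed cap, the number of surviving boxes depends only on `conf_th` and `nms_threshold`, which is the trade-off against a MAX_OUTPUT_BBOX_COUNT style output buffer.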

> What's input_multiplier for? I found input_multiplier = w // yolo_whs[i][0]. So this is just to not pass input_width and input_height? It could be calculated if the plugin knew those values?

Yes, I pass the particular information to the plugin as 1 single value instead of 2. This is to avoid a problem as described here: NVIDIA/TensorRT#238. TensorRT plugin code seems to handle pluginField values incorrectly if there are too many of them.
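As a rough illustration of the single-value encoding (a sketch, not the repository's code): if the ratio between the network input size and a yolo grid size is the same for width and height, one integer per yolo layer is enough for the plugin to recover both input dimensions. The helper name `input_multipliers` and the 416x416 input with 52, 26 and 13 grids are assumptions chosen for the example.

```python
def input_multipliers(input_w, yolo_whs):
    """Return one integer per yolo output: input width // grid width."""
    return [input_w // yolo_w for (yolo_w, _yolo_h) in yolo_whs]


# Hypothetical example: a 416x416 input with three yolo grids.
input_w, input_h = 416, 416
yolo_whs = [(52, 52), (26, 26), (13, 13)]
multipliers = input_multipliers(input_w, yolo_whs)

# On the plugin side, the input size can be rebuilt from the grid size.
for (yolo_w, yolo_h), m in zip(yolo_whs, multipliers):
    assert (yolo_w * m, yolo_h * m) == (input_w, input_h)
```

This only works when width and height share the same downsampling factor, which is the assumption behind passing one value instead of two.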

> You reduced int mThreadCount = 256; to int mThreadCount = 64;, is there a performance reason for this?

Lower-end Jetson SoCs, such as TX2 and Nano, have at most 256 GPU cores in total. I don't want the yolo plugin to occupy all GPU cores in such systems (i.e. trying to keep some GPU cores available for processing other TensorRT OPs/kernels in parallel). However, based on my own tests, it doesn't seem to make much difference (between 64 and 256).
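To make the 64 versus 256 choice concrete, here is the usual ceiling-division arithmetic for a CUDA launch. It assumes the kernel uses one thread per grid cell and anchor, which is an assumption about the plugin and not something stated in this thread; `num_blocks` and the 52x52x3 workload are hypothetical.

```python
def num_blocks(num_elements, threads_per_block):
    """Ceiling division: blocks needed so every element gets one thread."""
    return (num_elements + threads_per_block - 1) // threads_per_block


# Hypothetical workload: one 52x52 yolo grid with 3 anchors per cell.
num_elements = 52 * 52 * 3
for threads in (64, 256):
    print(threads, "threads/block ->", num_blocks(num_elements, threads), "blocks")
```

Under that assumption the total number of threads is nearly the same either way; only the block granularity changes, which would be consistent with the small measured difference.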

philipp-schmidt · 2021-05-27T12:37:07Z

Thanks for the answers, they make perfect sense.

philipp-schmidt · 2021-05-29T11:30:22Z

Hi @jkjung-avt
Have to bother you again, sorry. I only now realised that your plugin was different from the one in tensorrtx from the very beginning in that it does not compute Logistic Activation. Can you confirm that I have to do Logistic Activation in the last Convolutional Layer before the Yolo Layer myself? Is there anything else that I have to apply to the inputs?

jkjung-avt · 2021-05-29T11:51:43Z

For "yolov4-tiny" and "yolov4" models, the conv layers preceding the yolo layers use "linear" activation, e.g. in yolov4.cfg:

```
[convolutional]
size=1
stride=1
pad=1
filters=255
activation=linear
```

In this case, my yolo plugin would calculate "sigmoid" on the input values. Refer to code here:

tensorrt_demos/plugins/yolo_layer.cu, lines 230 to 231 in 58ad1e9:

    float max_cls_prob = sigmoidGPU(max_cls_logit);
    float box_prob = sigmoidGPU(*(cur_input + 4 * total_grids));

In contrast, "yolov4-csp" and "yolov4x-mish" models have those conv layers with "logistic" activation, e.g. in yolov4x-mish.cfg:

```
[convolutional]
size=1
stride=1
pad=1
filters=255
activation=logistic
```

So in this latter case, I don't need to calculate "sigmoid" in the plugin code:

tensorrt_demos/plugins/yolo_layer.cu, line 288 in 58ad1e9:

    float box_prob = *(cur_input + 4 * total_grids);
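The two cases can be summarized in a small host-side sketch. This is not the plugin's CUDA code; `box_probability` and its string argument are hypothetical names used only to show where the sigmoid is applied.

```python
import math


def sigmoid(x):
    """Logistic function, the same math as a "logistic" conv activation."""
    return 1.0 / (1.0 + math.exp(-x))


def box_probability(raw_value, conv_activation):
    """Turn the conv output for the objectness channel into a probability."""
    if conv_activation == "linear":
        # The conv layer emitted a raw logit, so the sigmoid is applied here.
        return sigmoid(raw_value)
    if conv_activation == "logistic":
        # The conv layer already applied the sigmoid; use the value as-is.
        return raw_value
    raise ValueError("unexpected activation: " + conv_activation)
```

Either way the sigmoid is applied exactly once; applying it in both the conv layer and the yolo layer would squash the probabilities.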

In case you are interested in reading the code which handles different types of conv activations, check here:

tensorrt_demos/yolo/yolo_to_onnx.py, line 699 in 58ad1e9:

    if layer_dict['activation'] == 'leaky':

philipp-schmidt · 2021-05-29T12:07:47Z

Yes, I only now realised that sigmoid and logistic are practically the same thing. I'm using the TensorRT Layer API (as in tensorrtx), but for converting weights I have to use the Python ScaledYOLOv4 repo. And there seem to be minor differences between the cfg files of the Python implementation and Darknet (and consequently in the weights), which is VERY annoying.

Starting with this:
WongKinYiu/ScaledYOLOv4#202 (comment)

And also this:
WongKinYiu/ScaledYOLOv4#202 (comment)

And furthermore this (route layer versus route_lhalf):
WongKinYiu/ScaledYOLOv4#165

I guess I'll have to check all implementations for differences...

philipp-schmidt · 2021-05-29T12:11:42Z

I'm currently trying to implement yolov4-tiny from here: https://github.com/tjuskyzhang/Scaled-YOLOv4-TensorRT/tree/master/yolov4-tiny-tensorrt

But using your plugin. With little success.

philipp-schmidt · 2021-05-29T12:22:19Z

I believe I need to add a Sigmoid Activation function and use new_coords. I will check that out. Thanks for the help jkjung.

philipp-schmidt · 2021-05-29T13:47:34Z

Never mind... yolov4-tiny does not use anchor 0.
hunglc007/tensorflow-yolov4-tflite#111 (comment)

That fixed it...
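For readers hitting the same problem, here is a small sketch of how the `mask` entries of a `[yolo]` section select anchors. The anchor and mask values below are what I recall from Darknet's yolov4-tiny.cfg and should be treated as an assumption to verify against your own cfg; `select_anchors` is a hypothetical helper.

```python
def select_anchors(anchors, mask):
    """Pick the (w, h) anchor pairs named by a [yolo] layer's mask."""
    pairs = list(zip(anchors[0::2], anchors[1::2]))
    return [pairs[i] for i in mask]


# Assumed yolov4-tiny values; check them against the cfg you actually use.
anchors = [10, 14, 23, 27, 37, 58, 81, 82, 135, 169, 344, 319]
masks = [[3, 4, 5], [1, 2, 3]]

for mask in masks:
    print(mask, select_anchors(anchors, mask))

# Index 0, the (10, 14) anchor, is not referenced by either mask.
used = {i for mask in masks for i in mask}
```

Hard-coding masks as 0,1,2 and 3,4,5 would silently pick the wrong anchors for the second yolo layer.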

philipp-schmidt closed this as completed May 27, 2021
