Object detection by Tensorflow Google Colab environment Part(3) | 初心者向けWEB技術と機械学習のBlog

Contents

Introduction
YoLo Roughly understanding
Yolo Architecture diagram
YoLo Sample implementation
YoLov3 Execution of object detection
Conclusion

Introduction

Hello, this is swim-lover. I’m working on object detection with Python and PyTorch. I’ve only just started with Python, and I’m studying with a “learn by using” approach. This time, because of the sample implementation I chose, I ended up using TensorFlow.

In Part (1) and Part (2), I performed object detection with SSD. This time, I’d like to try YoLo, which is a different algorithm from SSD.

YoLo Roughly understanding

YoLo stands for You Only Look Once, and it is often compared with SSD.

Like SSD, YoLo is an end-to-end object detector: every stage from input to result output is handled by deep learning.
Let’s get a rough understanding from the YoLo paper.

Step1）Resize the input image to 448 x 448.

Step2）Run a single convolutional network on the image from Step1).

Step3）Threshold the detections from Step2) by the model’s confidence.

In Step2), the single convolutional network predicts multiple bounding boxes (rectangular frames) and classifies the objects at the same time.

As I understand it, bounding box prediction is the process of finding where an object is likely to be.

The key point is that doing both at the same time makes processing very fast. This looks promising.
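To make the confidence step concrete, here is a minimal sketch of filtering detections by a confidence threshold. The record layout (label, confidence, box), the function name, and the sample values are my own assumptions for illustration; they are not taken from the paper or from any library.

```python
# Hypothetical detection records: (label, confidence, box), box = (x, y, w, h).
def filter_by_confidence(detections, threshold=0.5):
    """Keep only the detections whose confidence is at least `threshold`."""
    return [d for d in detections if d[1] >= threshold]


# Made-up sample values, only to show the filtering.
detections = [
    ("dog", 0.92, (0.30, 0.60, 0.25, 0.40)),
    ("bicycle", 0.81, (0.50, 0.50, 0.60, 0.55)),
    ("car", 0.12, (0.80, 0.20, 0.20, 0.15)),
]

kept = filter_by_confidence(detections, threshold=0.5)
print([label for label, _, _ in kept])  # ['dog', 'bicycle']
```

Raising the threshold keeps fewer, more certain boxes; lowering it keeps more boxes at the risk of false detections.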

Step1）Divide the input image into an S x S grid. In the above example, S = 7.

Step2）For each grid cell, predict B bounding boxes and their confidence scores. In the upper center figure, there are a few thick black frames.

Step3）Create a class probability map for the C classes. In the lower center figure, the magenta cells show the car map, the yellow cells show the bicycle map, and the blue cells show the dog map.

Step4）Finally, an S x S x (B * 5 + C) tensor is output. In the paper, S = 7, B = 2, and C = 20 (the number of classes), so a 7 x 7 x 30 tensor (the output data) is obtained.
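The arithmetic in Step4) can be checked with a few lines of Python. This is only a sketch: the function names are my own, and the grid cell helper assumes coordinates normalized to the range 0 to 1, which is a choice made for this example. The factor 5 is the five values predicted per box (x, y, w, h, and confidence).

```python
def yolo_output_shape(S, B, C):
    """Shape of the output tensor: S x S x (B * 5 + C)."""
    return (S, S, B * 5 + C)


def grid_cell(cx, cy, S):
    """Row and column of the grid cell containing the point (cx, cy).

    Assumes cx and cy are normalized to [0, 1]; a value of exactly 1.0
    is clamped into the last cell.
    """
    col = min(int(cx * S), S - 1)
    row = min(int(cy * S), S - 1)
    return row, col


print(yolo_output_shape(7, 2, 20))  # (7, 7, 30)
print(grid_cell(0.5, 0.5, 7))       # (3, 3)
```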

Yolo Architecture diagram

This is the architecture diagram in YoLo’s paper.

This diagram, from the SSD paper, compares SSD and YoLo.

I’d like to take some time to understand this again, but YoLo seems to have a simpler diagram.

YoLo Sample implementation

As before, we will use the Colab environment.

Mount My Drive in Google Drive in your Colab environment.

from google.colab import drive
drive.mount('/content/drive')

Create the directory ‘test_yolo_v3’ in My Drive and move into it.

!mkdir -p /content/drive/MyDrive/test_yolo_v3
%cd /content/drive/MyDrive/test_yolo_v3

Just in case, check the version of TensorFlow. The advantage of the Colab environment is that it saves you the time of setting up an environment.

import tensorflow as tf
tf.__version__

2.8.0
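Rather than reading the version off by eye, the check could be made explicit. The helper below is a small sketch with a name of my own choosing; it only parses the version string, so the literal "2.8.0" stands in for tf.__version__ here.

```python
def major_version(version):
    """Return the major version number from a string such as '2.8.0'."""
    return int(version.split(".")[0])


version = "2.8.0"  # in Colab this would be tf.__version__
assert major_version(version) >= 2, "this walkthrough expects TensorFlow 2.x"
print(major_version(version))  # 2
```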

Clone a Keras implementation of YoLov3 from GitHub. Since the installed TensorFlow is 2.0 or higher, you need a YoLov3 implementation that supports 2.0 or higher.

!git clone https://github.com/zzh8829/yolov3-tf2
%cd yolov3-tf2

Then download the trained model.

!wget https://pjreddie.com/media/files/yolov3.weights -o ./data/yolov3.weights

Convert the trained model (yolov3.weights) to Keras format.

!python convert.py --weights ./data/yolov3.weights --output ./checkpoints/yolov3.tf

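However, I got an error: the reshape fails because the file holds fewer values than the layer expects. A likely cause is the wget command above. Lowercase -o sets wget’s log file, not the download destination (that is uppercase -O), so ./data/yolov3.weights probably contains the download log rather than the weights.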

Traceback (most recent call last):
  File "convert.py", line 39, in <module>
    app.run(main)
  File "/usr/local/lib/python3.7/dist-packages/absl/app.py", line 312, in run
    _run_main(main, args)
  File "/usr/local/lib/python3.7/dist-packages/absl/app.py", line 258, in _run_main
    sys.exit(main(argv))
  File "convert.py", line 26, in main
    load_darknet_weights(yolo, FLAGS.weights, FLAGS.tiny)
  File "/content/drive/MyDrive/test_yolo_v3/yolov3-tf2/yolov3_tf2/utils.py", line 66, in load_darknet_weights
    conv_shape).transpose([2, 3, 1, 0])
ValueError: cannot reshape array of size 51527 into shape (128,64,3,3)
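The numbers in the error message can be checked by hand. The sketch below only restates that arithmetic, using the shape and size printed in the traceback; the variable names are my own.

```python
from functools import reduce
from operator import mul

conv_shape = (128, 64, 3, 3)  # target shape from the error message
available = 51527             # number of values actually read, from the error message

# Number of values the layer needs: the product of the shape dimensions.
required = reduce(mul, conv_shape, 1)

print(required)              # 73728
print(required - available)  # 22201
```

In other words, the weights file ran out of data partway through this layer, which suggests the file on disk is shorter than the model expects.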

I didn’t want to spend time tracking this down, so I tried another pretrained weights file, yolov3-tiny.

!python convert.py --weights ./data/yolov3-tiny.weights --output ./checkpoints/yolov3-tiny.tf --tiny

This time it worked successfully. Now I can move forward.

I0227 05:41:37.598141 140114706884480 convert.py:24] model created
I0227 05:41:37.603230 140114706884480 utils.py:45] yolo_darknet/conv2d bn
I0227 05:41:37.606598 140114706884480 utils.py:45] yolo_darknet/conv2d_1 bn
I0227 05:41:37.609891 140114706884480 utils.py:45] yolo_darknet/conv2d_2 bn
I0227 05:41:37.614180 140114706884480 utils.py:45] yolo_darknet/conv2d_3 bn
I0227 05:41:37.620603 140114706884480 utils.py:45] yolo_darknet/conv2d_4 bn
I0227 05:41:37.627354 140114706884480 utils.py:45] yolo_darknet/conv2d_5 bn
I0227 05:41:37.642364 140114706884480 utils.py:45] yolo_darknet/conv2d_6 bn
I0227 05:41:37.696121 140114706884480 utils.py:45] yolo_conv_0/conv2d_7 bn
I0227 05:41:37.702009 140114706884480 utils.py:45] yolo_output_0/conv2d_8 bn
I0227 05:41:37.715446 140114706884480 utils.py:45] yolo_output_0/conv2d_9 bias
I0227 05:41:37.718433 140114706884480 utils.py:45] yolo_conv_1/conv2d_10 bn
I0227 05:41:37.721494 140114706884480 utils.py:45] yolo_output_1/conv2d_11 bn
I0227 05:41:37.731293 140114706884480 utils.py:45] yolo_output_1/conv2d_12 bias
I0227 05:41:37.733283 140114706884480 convert.py:27] weights loaded
I0227 05:41:39.185976 140114706884480 convert.py:31] sanity check passed
I0227 05:41:39.391645 140114706884480 convert.py:34] weights saved

YoLov3 Execution of object detection

Run object detection on my own image file.

!python3 detect.py --weights ./checkpoints/yolov3-tiny.tf --tiny --image ./data/sample.jpg

It detected two bicycles whose front and rear wheels overlap. I tried the same image with the SSD from Part (2), but the SSD recognized them as a single bike.

I tried another image, and four bicycles were detected.

Conclusion

This time, I tried object detection with YoLov3-tiny, using a sample TensorFlow implementation of YoLo. An error occurred when converting the weights to Keras format, so this post took a long time to publish. I’d like to continue with object detection next time.

swim-lover-us

I’m an embedded software engineer. I had avoided front-end technology until now, but I’ve started studying it to pick up skills in a different field.
My hobbies are swimming, road cycling, running, and mountaineering.
I post about embedded technology, the front-end technology I’m studying, and occasionally my hobbies.