Introduction
Hello, this is swim-lover. In this series I do object detection with Python and PyTorch. I have only just started with Python, but I am studying with a “learn while using” approach. This time, because of the sample implementation I chose, I ended up using TensorFlow.
In Part (1) and Part (2), I performed object detection with SSD. This time, I would like to try YoLo, which is a different algorithm from SSD.
A rough understanding of YoLo
YoLo stands for You Only Look Once, and it seems to be often compared with SSD.
Like SSD, YoLo is an end-to-end object detector: every stage from input to result output is handled by deep learning.
Let’s try to understand roughly from YoLo’s paper.
Step1) Resize the input image to 448 x 448.
Step2) Run a single convolutional network on the image from Step1).
Step3) Thin out the results of Step2) by thresholding on the model’s confidence.
In Step2), the single convolutional network predicts multiple bounding boxes (rectangular frames) and classifies the objects at the same time.
By the way, as I understand it, predicting multiple bounding boxes means finding the places where an object is likely to be.
The key point is that performing the two processes at the same time makes very fast processing possible. This looks promising.
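The thinning in Step3) can be sketched as a simple threshold on confidence. This is only a minimal illustration: the function name `filter_by_confidence`, the threshold of 0.5, and the sample boxes and scores are all made up for this example and do not come from the paper or from any library.

```python
import numpy as np

def filter_by_confidence(boxes, scores, threshold=0.5):
    """Keep only the detections whose confidence is at least `threshold`."""
    boxes = np.asarray(boxes, dtype=np.float32)
    scores = np.asarray(scores, dtype=np.float32)
    keep = scores >= threshold          # boolean mask, one entry per detection
    return boxes[keep], scores[keep]

# Three made-up detections as (x1, y1, x2, y2) with made-up confidences.
boxes = [[10, 20, 110, 220], [15, 25, 120, 210], [300, 40, 380, 160]]
scores = [0.92, 0.30, 0.75]

kept_boxes, kept_scores = filter_by_confidence(boxes, scores, threshold=0.5)
print(kept_boxes.shape)   # (2, 4)
```

The detection with confidence 0.30 is dropped and the other two are kept.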
Step1) Divide the input image into an S x S grid. In the above example, S = 7.
Step2) For each grid cell, predict B bounding boxes and confidence scores. In the upper center figure, a few of the boxes are drawn with thick black frames.
Step3) Create C class probability maps. In the lower center figure, the magenta cells show the car map, the yellow cells show the bicycle map, and the blue cells show the dog map.
Step4) Finally, an S x S x (B * 5 + C) tensor is output. In the paper, S = 7, B = 2, C = 20 (number of classes), so a 7 x 7 x 30 tensor (output data) is obtained.
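The arithmetic in Step4) can be checked with a few lines of NumPy. This is a rough sketch, not the paper's code: the tensor is filled with random numbers instead of real predictions, and the channel layout (the B boxes first, then the C class probabilities) is an assumption I made for illustration.

```python
import numpy as np

S, B, C = 7, 2, 20            # values used in the paper
depth = B * 5 + C             # 5 numbers per box: x, y, w, h, confidence
print((S, S, depth))          # (7, 7, 30)

# Random stand-in for a network output (not real predictions).
rng = np.random.default_rng(0)
output = rng.random((S, S, depth), dtype=np.float32)

# Assumed layout per cell: B boxes of 5 values first, then C class probabilities.
box_part = output[..., :B * 5].reshape(S, S, B, 5)
class_probs = output[..., B * 5:]            # shape (S, S, C)
confidence = box_part[..., 4]                # shape (S, S, B)

# Class-specific score per box: box confidence x class probability.
class_scores = confidence[..., None] * class_probs[:, :, None, :]
print(class_scores.shape)     # (7, 7, 2, 20)
```

Multiplying each box confidence by the class probabilities of its cell gives one score per box and class, which is the value that would then be thresholded.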
YoLo architecture diagram
This is the architecture diagram in YoLo’s paper.
This diagram, from the SSD paper, compares SSD and YoLo.
I’d like to take some time to understand this properly later, but YoLo’s architecture looks simpler in the diagram.
YoLo Sample implementation
As before, we will use the Colab environment.
Mount My Drive in Google Drive in your Colab environment.
from google.colab import drive
drive.mount('/content/drive')
Create the directory ‘test_yolo_v3’ in My Drive.
mkdir test_yolo_v3
Just in case, check the version of TensorFlow. The advantage of the Colab environment is that it saves you the time of setting up the environment.
import tensorflow as tf
tf.__version__
2.8.0
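If you want to turn this check into a condition, a small helper like the one below could be used. The function names are hypothetical ones I made up for this sketch; in Colab you would pass `tf.__version__` instead of the literal string.

```python
def major_version(version: str) -> int:
    """Return the major part of a version string such as '2.8.0'."""
    return int(version.split(".")[0])

def is_tf2_or_later(version: str) -> bool:
    """True when the version string belongs to TensorFlow 2.x or newer."""
    return major_version(version) >= 2

# '2.8.0' is the version reported above.
print(is_tf2_or_later("2.8.0"))   # True
```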
Download the Keras version of yolov3 from GitHub. Since the installed TensorFlow is version 2.0 or higher, you need a yolov3 implementation that supports TensorFlow 2.0 or higher.
!git clone https://github.com/zzh8829/yolov3-tf2
Then download the trained model.
!wget https://pjreddie.com/media/files/yolov3.weights -O ./data/yolov3.weights
Convert the trained model (yolov3.weights) to Keras format.
!python convert.py --weights ./data/yolov3.weights --output ./checkpoints/yolov3.tf
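However, I got an error: the reshape step received fewer values from the weights file than the layer shape requires. A mismatch like this suggests that the weights file may be incomplete or not the expected file. Note that with wget, uppercase -O sets the output file, while lowercase -o writes wget's log to that path.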
Traceback (most recent call last):
  File "convert.py", line 39, in <module>
    app.run(main)
  File "/usr/local/lib/python3.7/dist-packages/absl/app.py", line 312, in run
    _run_main(main, args)
  File "/usr/local/lib/python3.7/dist-packages/absl/app.py", line 258, in _run_main
    sys.exit(main(argv))
  File "convert.py", line 26, in main
    load_darknet_weights(yolo, FLAGS.weights, FLAGS.tiny)
  File "/content/drive/MyDrive/test_yolo_v3/yolov3-tf2/yolov3_tf2/utils.py", line 66, in load_darknet_weights
    conv_shape).transpose([2, 3, 1, 0])
ValueError: cannot reshape array of size 51527 into shape (128,64,3,3)
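The message says that only 51527 values were left in the file when 128 x 64 x 3 x 3 were needed. One way to look into this kind of error is to count how many float32 values a weights file can hold. The sketch below is only a rough check under my own assumptions: the helper name `count_float32_values` is made up, and the header size of five 32-bit integers is an assumption about the file format, not something confirmed here. The demo uses a temporary dummy file, not real weights.

```python
import os
import tempfile
import numpy as np

HEADER_BYTES = 5 * 4   # assumption: the file starts with five 32-bit integers

def count_float32_values(path, header_bytes=HEADER_BYTES):
    """Number of 4-byte float values that fit in the file after the header."""
    size = os.path.getsize(path)
    return max(size - header_bytes, 0) // 4

# Values the failing layer needs, taken from the shape in the error message.
needed = int(np.prod((128, 64, 3, 3)))
print(needed)          # 73728

# Demo on a dummy file: a 5-integer header followed by 100 float32 zeros.
with tempfile.NamedTemporaryFile(suffix=".weights", delete=False) as f:
    f.write(np.zeros(5, dtype=np.int32).tobytes())
    f.write(np.zeros(100, dtype=np.float32).tobytes())
    demo_path = f.name

demo_count = count_float32_values(demo_path)
print(demo_count)      # 100
os.remove(demo_path)
```

If the count for the downloaded file is far smaller than expected for the full model, the file itself is the first thing to suspect, before the conversion script.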
I didn’t want to spend time fixing the script, so I tried another set of trained weights called yolov3-tiny.
!python convert.py --weights ./data/yolov3-tiny.weights --output ./checkpoints/yolov3-tiny.tf --tiny
This time it worked successfully. Now I can move forward.
I0227 05:41:37.598141 140114706884480 convert.py:24] model created
I0227 05:41:37.603230 140114706884480 utils.py:45] yolo_darknet/conv2d bn
I0227 05:41:37.606598 140114706884480 utils.py:45] yolo_darknet/conv2d_1 bn
I0227 05:41:37.609891 140114706884480 utils.py:45] yolo_darknet/conv2d_2 bn
I0227 05:41:37.614180 140114706884480 utils.py:45] yolo_darknet/conv2d_3 bn
I0227 05:41:37.620603 140114706884480 utils.py:45] yolo_darknet/conv2d_4 bn
I0227 05:41:37.627354 140114706884480 utils.py:45] yolo_darknet/conv2d_5 bn
I0227 05:41:37.642364 140114706884480 utils.py:45] yolo_darknet/conv2d_6 bn
I0227 05:41:37.696121 140114706884480 utils.py:45] yolo_conv_0/conv2d_7 bn
I0227 05:41:37.702009 140114706884480 utils.py:45] yolo_output_0/conv2d_8 bn
I0227 05:41:37.715446 140114706884480 utils.py:45] yolo_output_0/conv2d_9 bias
I0227 05:41:37.718433 140114706884480 utils.py:45] yolo_conv_1/conv2d_10 bn
I0227 05:41:37.721494 140114706884480 utils.py:45] yolo_output_1/conv2d_11 bn
I0227 05:41:37.731293 140114706884480 utils.py:45] yolo_output_1/conv2d_12 bias
I0227 05:41:37.733283 140114706884480 convert.py:27] weights loaded
I0227 05:41:39.185976 140114706884480 convert.py:31] sanity check passed
I0227 05:41:39.391645 140114706884480 convert.py:34] weights saved
Running object detection with YoLov3
Perform object detection on my own image file.
!python3 detect.py --weights ./checkpoints/yolov3-tiny.tf --tiny --image ./data/sample.jpg
It detected two bicycles whose front and rear wheels overlap. I tried the same image with the SSD from Part (2), but the SSD recognized them as a single bike.
I tried another image. Four bicycles were detected.
Conclusion
This time, I tried object detection with YoLov3-tiny, using a sample TensorFlow implementation of YoLo. An error occurred when converting the weights to Keras format, so it took a long time to update this blog. I would like to continue with object detection from next time onwards.
I’m an embedded software engineer. I have avoided front-end technology so far, but I started studying to acquire technology in a different field.
My hobbies are swimming, road cycling, running and mountaineering.
I will post about embedded technology, the front-end technology I have studied, and occasionally my hobbies.