RetinaFace face detection algorithm

I have been studying face detection algorithms recently and have tried several face detection frameworks along the way, so I am sharing what I learned here.

RetinaFace works much like an ordinary object detection algorithm. A set of prior boxes is pre-set on the image and distributed over the whole image. The network judges whether each prior box contains a face, assigns it a confidence score, and adjusts its position. For each face, RetinaFace produces not only the face position but also five facial key points.

In short, the RetinaFace pipeline is: pre-set the prior boxes on the image, use the network's predictions to decide whether each prior box contains a face, then adjust the prior boxes to obtain the predicted boxes and the five facial key points.

Backbone feature extraction network
MobileNet

The MobileNet network was proposed by the Google team in 2017. It is a lightweight CNN aimed at mobile and embedded devices. It greatly reduces the number of model parameters and the amount of computation at the cost of only a slight drop in accuracy.

Strengthened feature extraction network: FPN and SSH

FPN builds feature maps for fusion: each feature map is upsampled and merged with the effective feature layer of the previous level.

The idea of SSH is simple: it uses three parallel branches and stacks 3 x 3 convolutions to achieve the effect of 5 x 5 and 7 x 7 convolutions.

Retina head

The backbone network outputs grids of different sizes for detecting targets of different sizes. The default number of prior boxes per grid cell is 2. These prior boxes are used to detect targets, and the target bounding boxes are obtained by adjusting them.
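To make the grid and prior box layout concrete, here is a minimal sketch of how such prior boxes could be generated. The strides (8, 16, 32), the prior sizes, the 640 x 640 input and the function name `generate_priors` are assumptions chosen for illustration, not values stated in this post. The only detail taken from the text is the two prior boxes per grid cell.

```python
from itertools import product
from math import ceil


def generate_priors(image_size, strides, min_sizes):
    """Return prior boxes as (cx, cy, w, h), normalised to the image size.

    Hypothetical helper: one grid per stride, and one square prior per
    entry in the matching min_sizes list for every grid cell.
    """
    height, width = image_size
    priors = []
    for stride, sizes in zip(strides, min_sizes):
        rows, cols = ceil(height / stride), ceil(width / stride)
        for row, col in product(range(rows), range(cols)):
            # Centre of the grid cell, as a fraction of the image size.
            cx = (col + 0.5) * stride / width
            cy = (row + 0.5) * stride / height
            for size in sizes:
                priors.append((cx, cy, size / width, size / height))
    return priors


# Assumed configuration: three grids, two prior boxes per cell.
STRIDES = [8, 16, 32]
MIN_SIZES = [[16, 32], [64, 128], [256, 512]]

priors = generate_priors((640, 640), STRIDES, MIN_SIZES)
print(len(priors))  # 2 * (80*80 + 40*40 + 20*20) = 16800
```

Under these assumptions the finest grid (80 x 80 cells) carries the smallest prior boxes and the coarsest grid (20 x 20 cells) carries the largest ones, which is how grids of different sizes end up responsible for targets of different sizes.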
FPN
SSH
Prior box adjustment

Depthwise separable convolution

The advantage of depthwise separable convolution is that it reduces the number of parameters and therefore the computational cost. It often appears in lightweight network structures suited to mobile or embedded devices. A depthwise separable convolution is composed of a DW (depthwise) part and a PW (pointwise) part. Here we explain how it reduces parameters by comparing it with an ordinary convolution.
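The parameter saving can be checked with simple arithmetic. The sketch below counts weights only (biases are ignored), and the layer sizes (3 x 3 kernel, 32 input channels, 64 output channels) are an arbitrary example rather than values from any particular network. The function names are illustrative.

```python
def standard_conv_params(kernel_size, in_channels, out_channels):
    # Each of the out_channels kernels spans every input channel.
    return kernel_size * kernel_size * in_channels * out_channels


def depthwise_separable_params(kernel_size, in_channels, out_channels):
    # DW: one single-channel k x k kernel per input channel.
    depthwise = kernel_size * kernel_size * in_channels
    # PW: out_channels kernels of size 1 x 1 and depth in_channels.
    pointwise = in_channels * out_channels
    return depthwise + pointwise


standard = standard_conv_params(3, 32, 64)
separable = depthwise_separable_params(3, 32, 64)
ratio = separable / standard

print(standard)   # 18432
print(separable)  # 2336
print(round(ratio, 4))
```

In general the ratio works out to 1 / out_channels + 1 / kernel_size^2, so with 3 x 3 kernels the separable version needs roughly one eighth to one ninth of the weights.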
DW (Depthwise Conv)

Let's first look at the DW part in the figure. In this part, each convolution kernel has a single channel and is applied to exactly one input channel. The number of output channels therefore equals both the number of convolution kernels and the number of input channels. To sum up briefly, there are two points:

1. Each convolution kernel has 1 channel.
2. Number of input channels = number of convolution kernels = number of output channels.
PW (Pointwise Conv)

The PW convolution kernel is like an ordinary convolution kernel, except that its size is 1 x 1. Its depth equals the number of input channels, and the number of kernels equals the number of output channels.
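The two steps can be sketched in plain Python on a toy input. This is an illustrative forward pass only (stride 1, no padding, no bias, nested lists instead of tensors). The function names and the example values are made up for the demonstration.

```python
def depthwise_conv(x, kernels):
    """DW step. x: [C][H][W]; kernels: [C][k][k].

    Each single-channel kernel is applied to exactly one input channel,
    so the output has the same number of channels as the input.
    """
    assert len(x) == len(kernels)
    out = []
    for channel, kernel in zip(x, kernels):
        k = len(kernel)
        out_h = len(channel) - k + 1
        out_w = len(channel[0]) - k + 1
        out.append([
            [
                sum(channel[i + di][j + dj] * kernel[di][dj]
                    for di in range(k) for dj in range(k))
                for j in range(out_w)
            ]
            for i in range(out_h)
        ])
    return out


def pointwise_conv(x, weights):
    """PW step. x: [C_in][H][W]; weights: [C_out][C_in].

    Each row of weights is one 1 x 1 kernel whose depth equals C_in.
    """
    h, w = len(x[0]), len(x[0][0])
    return [
        [[sum(wc * x[c][i][j] for c, wc in enumerate(kernel))
          for j in range(w)]
         for i in range(h)]
        for kernel in weights
    ]


# Toy input: 2 channels of size 3 x 3.
x = [
    [[1, 2, 3], [4, 5, 6], [7, 8, 9]],
    [[1, 0, 1], [0, 1, 0], [1, 0, 1]],
]
dw_kernels = [
    [[1, 0], [0, 1]],  # kernel for channel 0
    [[1, 1], [1, 1]],  # kernel for channel 1
]
pw_weights = [[1, 1], [1, -1], [0, 2]]  # 3 output channels

dw_out = depthwise_conv(x, dw_kernels)
pw_out = pointwise_conv(dw_out, pw_weights)

print(len(dw_out), len(pw_out))  # 2 3
print(pw_out[0])                 # [[8, 10], [14, 16]]
```

The DW step keeps the channel count at 2, and only the PW step changes it to 3, which matches the channel relationships described above.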