Object detection and object segmentation algorithms
Difference between object classification, object localization, object detection, and object segmentation
RCNN
R-CNN has a simple architecture, but its performance is low compared with the later models in this post. It uses the selective search algorithm (SSA) to propose about 2000 candidate boxes per image. Each proposed box is cropped, warped to a fixed size, and passed through a pretrained CNN to get a feature vector. A linear SVM per class (the default classifier) uses those features to predict the class of the proposed region, and a separate regressor predicts the offset values that refine the bounding box. R-CNN does not use a softmax layer for the final classification; that arrives with Fast R-CNN.
Drawbacks: it is very slow, because selective search proposes about 2000 boxes per image and the CNN runs on every one of them separately. Selective search is also a fixed algorithm, so no learning happens at the proposal stage.
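The offset values mentioned for the bounding box can be made concrete with a small sketch. It assumes the parameterization described in the R-CNN paper, where the centre is shifted by a fraction of the box size and the width and height are scaled in log space. The function name `apply_box_offsets` and the sample numbers are made up for illustration.

```python
import math

def apply_box_offsets(proposal, offsets):
    """Refine a proposed box with predicted offsets.

    proposal: (cx, cy, w, h) centre and size of the proposed box.
    offsets:  (tx, ty, tw, th) values predicted by the regressor.
    """
    cx, cy, w, h = proposal
    tx, ty, tw, th = offsets
    new_cx = cx + w * tx        # shift the centre by a fraction of the width
    new_cy = cy + h * ty        # shift the centre by a fraction of the height
    new_w = w * math.exp(tw)    # scale the width in log space
    new_h = h * math.exp(th)    # scale the height in log space
    return (new_cx, new_cy, new_w, new_h)

# Hypothetical proposal and offsets, just to show the call.
refined = apply_box_offsets((100.0, 80.0, 40.0, 20.0), (0.1, -0.5, 0.0, 0.0))
print(refined)
```

Offsets of all zeros leave the proposal unchanged, which is why the regressor only has to learn small corrections.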
FAST RCNN
Fast R-CNN first runs the pretrained CNN once on the whole image to extract a feature map. Selective search still proposes the boxes (about 2000 per image) from the input image; each proposal is then projected onto the shared feature map and pooled to a fixed size by an RoI pooling layer. Because the convolutional computation is shared instead of repeated for every box, inference time is much lower than with R-CNN. In place of the SVMs, a softmax layer predicts the class of each proposed region, and a sibling layer predicts the offset values for the bounding box. The remaining disadvantage is the same as in R-CNN: selective search is a fixed algorithm, so no learning happens at the proposal stage, and it becomes the bottleneck.
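Fast R-CNN extracts the feature map once and then needs a fixed-size feature for every proposed box, whatever the size of the box. In the Fast R-CNN paper this is done with RoI pooling. Below is a minimal single-channel sketch of that idea in plain Python; the function name `roi_max_pool`, the integer bin edges, and the toy feature map are assumptions for illustration, not the paper's exact implementation.

```python
def roi_max_pool(feature_map, roi, out_size=2):
    """Max-pool one region of a 2D feature map into an out_size x out_size grid.

    feature_map: list of rows (a single channel, for simplicity).
    roi: (x1, y1, x2, y2) in feature-map cells, with x2 and y2 exclusive.
    """
    x1, y1, x2, y2 = roi
    height, width = y2 - y1, x2 - x1
    pooled = []
    for i in range(out_size):
        # Integer bin edges; every bin covers at least one cell.
        r0 = y1 + (i * height) // out_size
        r1 = max(r0 + 1, y1 + ((i + 1) * height) // out_size)
        row = []
        for j in range(out_size):
            c0 = x1 + (j * width) // out_size
            c1 = max(c0 + 1, x1 + ((j + 1) * width) // out_size)
            row.append(max(feature_map[r][c]
                           for r in range(r0, r1)
                           for c in range(c0, c1)))
        pooled.append(row)
    return pooled

# Toy 4x4 feature map with values 0..15.
fm = [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9, 10, 11], [12, 13, 14, 15]]
print(roi_max_pool(fm, (0, 0, 4, 4)))  # whole map  -> [[5, 7], [13, 15]]
print(roi_max_pool(fm, (1, 1, 4, 3)))  # smaller box -> [[5, 7], [9, 11]]
```

Both calls return a 2 x 2 grid even though the boxes differ in size, which is what lets the later fully connected layers accept any proposal.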
FASTER RCNN
Faster R-CNN works on the same principle as Fast R-CNN, but uses a Region Proposal Network (RPN) instead of the selective search algorithm to propose boxes. This was a breakthrough because the network itself learns the region proposals, and the RPN shares the convolutional feature map with the detector. The result was a jump in both accuracy and fps, and Faster R-CNN was state of the art in the object detection family when it was published.
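In the Faster R-CNN paper, the Region Proposal Network does not predict boxes from scratch: at every feature-map location it scores and adjusts a fixed set of reference boxes called anchors (3 scales x 3 aspect ratios = 9 per location). The sketch below only generates those anchors; the learned part of the RPN is not shown. The stride of 16 and the function name `make_anchors` are assumptions for illustration.

```python
import math

def make_anchors(feat_h, feat_w, stride=16,
                 scales=(128, 256, 512), ratios=(0.5, 1.0, 2.0)):
    """Return anchors as (x1, y1, x2, y2) boxes in image pixels."""
    anchors = []
    for row in range(feat_h):
        for col in range(feat_w):
            # Centre of this feature-map cell in image coordinates.
            cx = (col + 0.5) * stride
            cy = (row + 0.5) * stride
            for scale in scales:
                for ratio in ratios:          # ratio = height / width
                    w = scale / math.sqrt(ratio)
                    h = scale * math.sqrt(ratio)
                    anchors.append((cx - w / 2, cy - h / 2,
                                    cx + w / 2, cy + h / 2))
    return anchors

anchors = make_anchors(feat_h=2, feat_w=3)
print(len(anchors))  # 2 * 3 cells * 9 anchors = 54
```

Each anchor keeps an area of roughly scale squared, so changing the ratio only changes the shape of the box, not its size.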
YOLO
YOLO v1, v2, v3, and v4 are available. YOLO is very different from the region-based models above: a single convolutional network, with a pretrained model as the feature extractor, predicts the bounding boxes and the class probabilities for those boxes in one pass. The image is divided into an S x S grid, and each grid cell predicts m bounding boxes. For each bounding box, the network outputs a class probability and the values that describe the box. Boxes whose class probability is above a threshold are selected, non-max suppression based on IoU (intersection over union) removes overlapping duplicates, and the remaining boxes locate the objects in the image. A limitation of YOLO is that it struggles with small objects in the image; the SSD algorithm addresses this limitation. In the original paper, YOLO runs at about 45 fps and the smaller Fast YOLO at about 155 fps.
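IoU and non-max suppression, mentioned above, are short enough to write out. This is a minimal pure-Python sketch, assuming boxes in (x1, y1, x2, y2) form and a greedy suppression rule; the thresholds and sample boxes are made up for illustration and are not YOLO's actual settings.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

def non_max_suppression(boxes, scores, score_thresh=0.5, iou_thresh=0.5):
    """Return the indices of the kept boxes, highest score first."""
    # Drop low-score boxes, then visit the rest from best to worst.
    order = sorted((i for i, s in enumerate(scores) if s >= score_thresh),
                   key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        # Keep a box only if it does not overlap a better box too much.
        if all(iou(boxes[i], boxes[k]) <= iou_thresh for k in keep):
            keep.append(i)
    return keep

boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30), (0, 0, 5, 5)]
scores = [0.9, 0.8, 0.7, 0.3]
print(non_max_suppression(boxes, scores))  # [0, 2]
```

Box 1 overlaps box 0 with an IoU of about 0.68 and is suppressed, box 3 falls below the score threshold, and box 2 survives because it overlaps nothing.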
SSD
SSD gives a good balance between fps and mAP. It uses VGG16 as the base network and adds extra convolution layers on top, then predicts from 6 feature maps taken at different depths so that features are extracted at several scales. Each of these feature maps generates a number of boxes at every location. SSD MobileNet is a very light model: fps is high, but accuracy drops. SSD300 has higher fps than SSD512, while SSD512 has higher mAP.
Comparison between YOLO and SSD
Both were state-of-the-art object detection models. SSD overcomes a drawback of YOLO: it can detect objects of different sizes, but it proposes many more boxes than YOLO. In the original papers, SSD reports higher fps and mAP than YOLO v1. SSD adds extra convolution layers (feature maps) on top of the base network and proposes 8732 boxes per class. Because it predicts boxes with a confidence score at every one of these layers, it can detect objects of almost any size in the image. This also leads to many false (background) boxes, so during training SSD keeps the ratio of negative to positive boxes at no more than 3:1 (hard negative mining).
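The 8732 figure can be checked by hand. The sketch below assumes the SSD300 configuration from the SSD paper: six feature maps of side 38, 19, 10, 5, 3, and 1, with 4 or 6 default boxes per location. The names `SSD300_LAYERS` and `count_default_boxes` are made up for illustration.

```python
# (feature map side, default boxes per location) for SSD300, per the SSD paper.
SSD300_LAYERS = [(38, 4), (19, 6), (10, 6), (5, 6), (3, 4), (1, 4)]

def count_default_boxes(layers):
    """Total number of default boxes over all feature maps."""
    return sum(side * side * boxes_per_cell for side, boxes_per_cell in layers)

for side, boxes_per_cell in SSD300_LAYERS:
    print(side, "x", side, "map:", side * side * boxes_per_cell, "boxes")
print("total:", count_default_boxes(SSD300_LAYERS))  # total: 8732
```

The large 38 x 38 map alone contributes 5776 boxes, which are the small boxes that help with small objects.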
Select the model based on the requirement. For a stable model where accuracy matters more than speed, use Faster R-CNN (lower fps, higher accuracy); SSD512 is also worth considering. When high fps is required, use SSD300. YOLACT++ is also a good model.
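MASK RCNN
Mask R-CNN performs both object detection and segmentation: each detected object is segmented and shown in its own color. It builds on Faster R-CNN and adds a parallel branch that assigns pixels to each detected object. It uses RoIAlign in place of RoI pooling, which keeps the features aligned with the image well enough for pixel-by-pixel segmentation, and a fully convolutional network (FCN) generates the mask.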
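The key difference in RoIAlign, as described in the Mask R-CNN paper, is that it does not round box coordinates to whole feature-map cells. Instead it reads the feature map at fractional positions using bilinear interpolation. The sketch below shows only that sampling step on a single channel; the function name `bilinear_sample` and the toy values are assumptions for illustration.

```python
import math

def bilinear_sample(feature_map, y, x):
    """Read a 2D feature map at a fractional (y, x) position."""
    h, w = len(feature_map), len(feature_map[0])
    # Clamp to the valid range so the four neighbours always exist.
    y = min(max(y, 0.0), h - 1)
    x = min(max(x, 0.0), w - 1)
    y0, x0 = int(math.floor(y)), int(math.floor(x))
    y1, x1 = min(y0 + 1, h - 1), min(x0 + 1, w - 1)
    dy, dx = y - y0, x - x0
    # Blend along x on the two rows, then blend the rows along y.
    top = feature_map[y0][x0] * (1 - dx) + feature_map[y0][x1] * dx
    bottom = feature_map[y1][x0] * (1 - dx) + feature_map[y1][x1] * dx
    return top * (1 - dy) + bottom * dy

fm = [[0.0, 10.0], [20.0, 30.0]]
print(bilinear_sample(fm, 0.5, 0.5))  # 15.0, the average of the four cells
```

Rounding (0.5, 0.5) to a whole cell would have returned 0.0 or 30.0 depending on the rounding rule; that kind of shift is what blurs mask boundaries.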
3 types of segmentation:
- Semantic segmentation: classifies every pixel in the image; one color belongs to one class, so objects of the same class share a color.
- Instance segmentation: segments only the detected objects, and each object gets its own color, even when the objects belong to the same class.
- Panoptic segmentation: classifies every pixel in the image and also gives each object of the same class its own color.
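A toy example makes the three types above easier to tell apart. The sketch assumes a made-up 2 x 4 image with two cats on a grass background and builds the label map that each type would produce; all names and values are hypothetical.

```python
# Toy 2x4 image: two cats, one at each edge, on a grass background.
H, W = 2, 4
instances = [
    ("cat", {(0, 0), (1, 0)}),   # instance 1, left column
    ("cat", {(0, 3), (1, 3)}),   # instance 2, right column
]
BACKGROUND = "grass"

semantic = [[BACKGROUND] * W for _ in range(H)]          # class for every pixel
instance = [[None] * W for _ in range(H)]                # id for object pixels only
panoptic = [[(BACKGROUND, None)] * W for _ in range(H)]  # class and id for every pixel

for inst_id, (cls, pixels) in enumerate(instances, start=1):
    for r, c in pixels:
        semantic[r][c] = cls
        instance[r][c] = inst_id
        panoptic[r][c] = (cls, inst_id)

print(semantic[0])  # ['cat', 'grass', 'grass', 'cat']
print(instance[0])  # [1, None, None, 2]
print(panoptic[0])  # [('cat', 1), ('grass', None), ('grass', None), ('cat', 2)]
```

The semantic map cannot tell the two cats apart, the instance map ignores the grass, and the panoptic map keeps both pieces of information.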
YOLACT++ can also do object detection and segmentation, and it has both high fps and good accuracy.
Detectron2
Detectron2 uses the PyTorch framework, whose core operations are implemented in C++, which helps keep latency low. It is the successor to the original Detectron and improves on it. Detectron2 can do object detection, mask prediction (instance segmentation), pose estimation, DensePose, and semantic segmentation. It has high fps and good accuracy as well.
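As a rough sketch of how a pretrained Detectron2 model is usually loaded, the function below follows the pattern from the official Detectron2 tutorial (`get_cfg`, `model_zoo`, `DefaultPredictor`). It assumes Detectron2 is installed, picks a Mask R-CNN config from the model zoo as an example, and uses a made-up function name, `build_predictor`. Treat it as a starting point rather than a tested recipe.

```python
def build_predictor(score_thresh=0.5):
    """Build a Mask R-CNN predictor from the Detectron2 model zoo."""
    # Imported here so this file can be loaded without Detectron2 installed.
    from detectron2 import model_zoo
    from detectron2.config import get_cfg
    from detectron2.engine import DefaultPredictor

    # Example config; other model zoo configs can be swapped in.
    config_name = "COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x.yaml"
    cfg = get_cfg()
    cfg.merge_from_file(model_zoo.get_config_file(config_name))
    cfg.MODEL.WEIGHTS = model_zoo.get_checkpoint_url(config_name)
    cfg.MODEL.ROI_HEADS.SCORE_THRESH_TEST = score_thresh  # drop weak detections
    # On a machine without a GPU, also set: cfg.MODEL.DEVICE = "cpu"
    return DefaultPredictor(cfg)

# Usage (needs Detectron2 and a BGR image array, e.g. from cv2.imread):
# predictor = build_predictor()
# outputs = predictor(image)
# found = outputs["instances"]  # pred_boxes, pred_classes, scores, pred_masks
```

The same function works for plain object detection or keypoint models by changing `config_name` to another entry in the model zoo.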
References for pretrained weights:
- https://github.com/facebookresearch/detectron2
- https://github.com/tensorflow/models
- https://github.com/tensorflow/models/blob/master/research/object_detection/g3doc/tf1_detection_zoo.md
Link to my previous blog:
Difference between 5 common convolutional neural networks
If you're new to the field and want more sites where data is available, the life cycle of data science, and data science competition websites, feel free to check my GitHub link.
If you like my work, please buy me a coffee; it motivates me -> https://www.buymeacoffee.com/achuthasubhash?new=1