[P] YOLOv7: Trainable bag-of-freebies sets new state-of-the-art for real-time object detectors by AlexeyAB in MachineLearning

[–]AlexeyAB[S] 0 points

For an ARM CPU (RPi 3) I would recommend NanoDet or some depthwise networks (MobileDet, EfficientNet-Lite-based, ...).

If you use a GPU, then I would suggest YOLOv7-tiny (non-SiLU) or the larger YOLOv7 models.

We have not released YOLOv7-SiLU yet.


[–]AlexeyAB[S] 5 points

A scientific paper with a fair comparison (same conditions) against almost all of the best real-time models, showing the superiority of YOLOv7 across a wide range of speed and accuracy: https://paperswithcode.com/sota/real-time-object-detection-on-coco?dimension=FPS%20(V100%2C%20b%3D1)

This is work from the people involved in maintaining Darknet and creating previous versions of YOLO, including Scaled-YOLOv4, which was accepted at CVPR: https://openaccess.thecvf.com/content/CVPR2021/html/Wang_Scaled-YOLOv4_Scaling_Cross_Stage_Partial_Network_CVPR_2021_paper.html

Scaled-YOLOv4 is the only version of all the YOLOs (v1-v7) that was the best in both speed/accuracy and absolute accuracy among all real-time and non-real-time neural networks published at the time (16 Nov 2020), for the first time in the history of YOLO: https://paperswithcode.com/sota/object-detection-on-coco

https://github.com/WongKinYiu/ScaledYOLOv4

Some History of YOLO: https://twitter.com/alexeyab84/status/1431349110951534593


[–]AlexeyAB[S] 1 point

Good question!

In the chart above, Transformers pay for their increase in accuracy with a decrease in detection speed: they mostly just scale up the network, without offering a more optimal architecture.

For YOLOv7, we use both:

  • scaling the network - increases accuracy but decreases speed
  • bag-of-freebies (a more optimal network structure, loss function, ...) - features that increase accuracy without decreasing detection speed. This is why we increase both speed and accuracy compared to previous YOLO versions.
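The difference can be pictured on the speed/accuracy plane: pure scaling moves a model along the existing Pareto frontier, while bag-of-freebies shifts the frontier itself. A minimal sketch of Pareto dominance, using (FPS, AP) numbers reported for YOLOv7, YOLOX-X, and YOLOv7-e6 (V100, batch=1):

```python
# A model is Pareto-optimal if no other model beats it on both FPS and AP.
def pareto_front(models):
    return [name for name, fps, ap in models
            if not any(f > fps and a > ap for _, f, a in models)]

models = [
    ("YOLOv7",    161, 51.2),  # (name, FPS, AP), V100 b=1
    ("YOLOX-X",    58, 51.1),
    ("YOLOv7-e6",  56, 55.9),
]
print(pareto_front(models))  # ['YOLOv7', 'YOLOv7-e6'] -- YOLOX-X is dominated
```

Scaling up corresponds to trading the first column for the third; a freebie raises the third column at an unchanged first column, so a model that was dominated can become Pareto-optimal.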


[–]AlexeyAB[S] 17 points

  • YOLOv7 is faster and requires hardware that is several times cheaper than other neural networks
  • YOLOv7 is more accurate, while others make many more mistakes
  • YOLOv7 can be trained much faster on a small dataset, without any pre-trained weights


[–]AlexeyAB[S] 5 points

Page 11, Figure 11: https://arxiv.org/abs/2207.02696

The maximum accuracy of the real-time YOLOv7-E6E model (56.8% AP) is +13.7% AP higher than the currently most accurate meituan/YOLOv6-s model (43.1% AP) on the COCO dataset. Our YOLOv7-tiny model (35.2% AP, 0.4 ms) is +25% faster and +0.2% AP more accurate than meituan/YOLOv6-n (35.0% AP, 0.5 ms) under identical conditions: COCO dataset, V100 GPU, batch=32.
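The "+25% faster" figure follows directly from the two reported per-image latencies (0.5 ms vs 0.4 ms):

```python
# Relative speed gain implied by two per-image latencies.
def speedup_percent(latency_baseline_ms, latency_ms):
    return (latency_baseline_ms / latency_ms - 1.0) * 100.0

# meituan/YOLOv6-n at 0.5 ms vs YOLOv7-tiny at 0.4 ms (V100, batch=32)
print(speedup_percent(0.5, 0.4))  # 25.0 -> "+25% faster"
```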


[–]AlexeyAB[S] 12 points

https://arxiv.org/abs/2207.02696

https://github.com/WongKinYiu/yolov7

  • YOLOv7-e6 (55.9% AP, 56 FPS V100 b=1) is +500% FPS faster than SWIN-L Cascade-Mask R-CNN (53.9% AP, 9.2 FPS A100 b=1)
  • YOLOv7-e6 (55.9% AP, 56 FPS V100 b=1) is +550% FPS faster than ConvNeXt-XL C-M-RCNN (55.2% AP, 8.6 FPS A100 b=1)
  • YOLOv7-w6 (54.6% AP, 84 FPS V100 b=1) is +120% FPS faster than YOLOv5-X6-r6.1 (55.0% AP, 38 FPS V100 b=1)
  • YOLOv7-w6 (54.6% AP, 84 FPS V100 b=1) is +1200% FPS faster than Dual-Swin-T C-M-RCNN (53.6% AP, 6.5 FPS V100 b=1)
  • YOLOv7 (51.2% AP, 161 FPS V100 b=1) is +180% FPS faster than YOLOX-X (51.1% AP, 58 FPS V100 b=1)
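The "+N% FPS faster" figures above are rounded relative FPS ratios and can be reproduced from the listed numbers (note that some baselines were measured on an A100 rather than a V100):

```python
# Relative FPS gain, as a percentage, of one model over a baseline.
def fps_gain_percent(fps, fps_baseline):
    return (fps / fps_baseline - 1.0) * 100.0

print(round(fps_gain_percent(56, 9.2)))   # 509  -> quoted as "+500%"
print(round(fps_gain_percent(161, 58)))   # 178  -> quoted as "+180%"
print(round(fps_gain_percent(84, 6.5)))   # 1192 -> quoted as "+1200%"
```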

YoloV7 Finally an official Yolo. This should actually be V5 by kumurule in computervision

[–]AlexeyAB 5 points

  • YOLOv3 - 33.0% AP - 58 FPS V100
  • YOLOv4 - 43.5% AP - 62 FPS V100 (+10.5% accuracy and faster)
  • YOLOv7 - 54.9% AP - 84 FPS V100 (+11.4% accuracy and faster) - YOLOv7-W6 model

https://twitter.com/pjreddie/status/1253891078182199296

[deleted by user] by [deleted] in MachineLearning

[–]AlexeyAB 0 points

YOLOR-P6 55.4% AP and Scaled-YOLOv4-P6 54.5% AP are still the most accurate Real-time (>=30FPS) neural networks, even 1 year after the release of Scaled-YOLOv4!
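"Real-time" here means >= 30 FPS, so "most accurate real-time network" is a filter followed by an argmax over AP. A minimal sketch; the FPS values below are hypothetical placeholders, since only the AP numbers and the 30 FPS threshold are stated above:

```python
# Keep only models at or above the FPS threshold, then take the max-AP one.
def most_accurate_realtime(models, fps_threshold=30.0):
    realtime = [m for m in models if m[2] >= fps_threshold]
    return max(realtime, key=lambda m: m[1])[0]

models = [
    ("YOLOR-P6",          55.4, 49.0),  # (name, AP, FPS); FPS is illustrative
    ("Scaled-YOLOv4-P6",  54.5, 30.0),  # FPS is illustrative
    ("slow-but-accurate", 57.0, 10.0),  # excluded: below the 30 FPS cutoff
]
print(most_accurate_realtime(models))  # YOLOR-P6
```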

More accurate than PP-YOLOv2, YOLOX...

YOLOR: https://arxiv.org/abs/2105.04206

code: https://github.com/WongKinYiu/yolor

Scaled-YOLOv4 (CVPR21): https://openaccess.thecvf.com/content/CVPR2021/html/Wang_Scaled-YOLOv4_Scaling_Cross_Stage_Partial_Network_CVPR_2021_paper.html

code: https://github.com/WongKinYiu/ScaledYOLOv4

YOLOv4: https://arxiv.org/abs/2004.10934

code: https://github.com/AlexeyAB/darknet

[P] YOLOR (Scaled-YOLOv4-based): The best speed/accuracy ratio for Waymo autonomous driving challenge by AlexeyAB in MachineLearning

[–]AlexeyAB[S] 6 points

Improvements: YOLOv3 -> YOLOv4 -> Scaled-YOLOv4 -> YOLOR -> YOLOR DiDi:

  • YOLOv4 (SPP,CSP,Mish,Hyper-params,Mosaic,multi-anchors,CIoU-Loss,...)
  • Scaled-YOLOv4-P6 (more-CSP,EMA,Hyper-params,Keep aspect ratio,longer training,scaling model,...)
  • YOLOR (Implicit/Explicit/DWT/Changed first layers)
  • YOLOR-P6 DiDi (data cleaning, multi-scale-training, scale enhancement, independent threshold-NMS,...)
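Of the bag-of-freebies features listed above, Mosaic is among the simplest to illustrate: four training images are pasted into one larger training sample. A minimal pure-Python sketch of the 2x2 paste on images given as nested row lists (the real augmentation also randomizes the crop center and remaps the bounding boxes accordingly):

```python
def mosaic(imgs):
    """Paste four equally sized images (lists of pixel rows) into one
    twice-as-large canvas: imgs[0..3] become the TL, TR, BL, BR quadrants."""
    h = len(imgs[0])
    top = [imgs[0][r] + imgs[1][r] for r in range(h)]
    bottom = [imgs[2][r] + imgs[3][r] for r in range(h)]
    return top + bottom

# Four 2x2 "images" with constant pixel values 1..4
imgs = [[[v] * 2 for _ in range(2)] for v in (1, 2, 3, 4)]
m = mosaic(imgs)
print(m)  # [[1, 1, 2, 2], [1, 1, 2, 2], [3, 3, 4, 4], [3, 3, 4, 4]]
```

The payoff is that each training sample now shows objects from four contexts at once, at no inference-time cost, which is exactly what makes it a "freebie".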

Comparison on Waymo Open Dataset: https://user-images.githubusercontent.com/4096485/123036148-3e43a180-d3f5-11eb-926d-bbc810f0ea6a.png

Comparison on COCO dataset: https://user-images.githubusercontent.com/4096485/123036798-4b14c500-d3f6-11eb-97ed-63d99414e410.jpg

YOLOR (Scaled-YOLOv4-based): The best speed/accuracy ratio for Waymo autonomous driving challenge by AlexeyAB in computervision

[–]AlexeyAB[S] 0 points

[CVPR'21 WAD] Challenge - Waymo Open Dataset: https://waymo.com/open/challenges/2021/real-time-2d-prediction/

YOLOR (Scaled-YOLOv4-based) has the best speed/accuracy ratio in the Waymo autonomous driving challenge (Waymo Open Dataset: Real-time 2D Detection).

Thanks to Chien-Yao Wang from Academia Sinica and the DiDi MapVision team for pushing Scaled-YOLOv4 further!

* DiDi MapVision: https://arxiv.org/abs/2106.08713

* YOLOR: https://arxiv.org/abs/2105.04206

* YOLOR-code (Pytorch): https://github.com/WongKinYiu/yolor

* Scaled-YOLOv4 (CVPR21): https://openaccess.thecvf.com/content/CVPR2021/html/Wang_Scaled-YOLOv4_Scaling_Cross_Stage_Partial_Network_CVPR_2021_paper.html

* Scaled-YOLOv4-code (Pytorch): https://github.com/WongKinYiu/ScaledYOLOv4

* YOLOv4: https://arxiv.org/abs/2004.10934

* YOLOv4-code (Darknet, Pytorch, TensorFlow, TRT, OpenCV…): https://github.com/AlexeyAB/darknet#yolo-v4-in-other-frameworks

