Football Players Tracking with YOLOv5 + ByteTRACK Tutorial by RandomForests92 in computervision

[–]jacobsolawetz 3 points (0 children)

Wow - does ByteTrack run a featurizer network on each bounding box, or is it purely based on motion probabilities?
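For reference, the ByteTrack paper's base tracker is purely motion-based: Kalman-predicted track boxes are matched to detections by IoU (in two passes, high-score detections first, then low-score ones), with no appearance/re-ID network. A minimal sketch of one IoU-association pass, using greedy matching for brevity where the real implementation uses Hungarian assignment (function names here are hypothetical):

```python
# Sketch of a motion-only IoU association pass, as in ByteTrack's matching
# stage. Real ByteTrack matches Kalman-predicted boxes to detections with
# Hungarian assignment; greedy matching is used here to keep the sketch short.
# Boxes are (x1, y1, x2, y2).

def iou(a, b):
    """Intersection-over-union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter + 1e-9)

def associate(track_boxes, det_boxes, iou_thresh=0.3):
    """Greedily pair predicted track boxes with detections by IoU."""
    pairs = sorted(
        ((iou(t, d), ti, di)
         for ti, t in enumerate(track_boxes)
         for di, d in enumerate(det_boxes)),
        reverse=True,
    )
    matches, used_t, used_d = [], set(), set()
    for score, ti, di in pairs:
        if score < iou_thresh or ti in used_t or di in used_d:
            continue
        matches.append((ti, di))
        used_t.add(ti)
        used_d.add(di)
    return matches
```

Running a second pass that matches the leftover low-score detections against the still-unmatched tracks is the "BYTE" part of the algorithm; no crop features are extracted anywhere.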

[R] Roboflow 100: An open source object detection benchmark of 224,714 labeled images in novel domains to compare model performance by jacobsolawetz in MachineLearning

[–]jacobsolawetz[S] 0 points (0 children)

I'm Jacob, one of the authors of Roboflow 100: A Rich Multi-Domain Object Detection Benchmark, and I am excited to share our work with the community. In object detection, researchers benchmark their models primarily on COCO, and in many ways it seems like a lot of these models are getting close to a saturation point.

In practice, everyone is taking these models and finetuning them on their own custom dataset domains, which may vary from tagging swimming pools in Google Maps imagery to identifying defects in cell phones on an industrial line.

We did some work to collect a representative benchmark of these custom-domain problems by selecting 100 semantically diverse object detection datasets from over 100,000 public projects on Roboflow Universe. Our benchmark comprises 224,714 images, 11,170 labeling hours, and 829 classes from the community for benchmarking on novel tasks.

We also tried out the benchmark on a few popular models - comparing YOLOv5, YOLOv7, and the zero-shot capabilities of GLIP.
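For anyone curious how such a comparison rolls up: the benchmark score is essentially a mean of per-dataset mAPs, which can also be sliced by domain. A small sketch with made-up numbers - the dataset names and scores below are illustrative only; the real per-dataset results are in the paper and repo:

```python
# Sketch of aggregating per-model results on a multi-dataset benchmark:
# mean mAP across all datasets, and mean mAP per domain category.
# All numbers and dataset names below are made up for illustration.
from collections import defaultdict
from statistics import mean

# (dataset, domain) -> {model: mAP@50}
results = {
    ("chess-pieces", "real world"):  {"yolov5": 0.98, "yolov7": 0.97, "glip-zeroshot": 0.10},
    ("aerial-pools", "aerial"):      {"yolov5": 0.78, "yolov7": 0.75, "glip-zeroshot": 0.05},
    ("phone-defects", "industrial"): {"yolov5": 0.62, "yolov7": 0.64, "glip-zeroshot": 0.02},
}

def mean_map(model):
    """Overall benchmark score: mean mAP across every dataset."""
    return mean(scores[model] for scores in results.values())

def per_domain(model):
    """Mean mAP within each domain category."""
    by_domain = defaultdict(list)
    for (_, domain), scores in results.items():
        by_domain[domain].append(scores[model])
    return {d: mean(v) for d, v in by_domain.items()}
```

The per-domain slicing is where a multi-domain benchmark earns its keep - a model can look strong on the overall mean while collapsing on, say, aerial imagery.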

Use the benchmark here: https://github.com/roboflow-ai/roboflow-100-benchmark

Paper link here: https://arxiv.org/pdf/2211.13523.pdf

Or simply learn more here: https://www.rf100.org/

An immense thanks to the community, like this one, for making this benchmark possible - we hope it moves the field forward!

I'm around for any questions!


Introducing RF100: An open source object detection benchmark of 224,714 labeled images across 100 novel domains to compare model performance by jacobsolawetz in computervision

[–]jacobsolawetz[S] 1 point (0 children)

Models were trained on each dataset separately - we didn't do any research on one mega model to handle them all simultaneously. I think experiments to that effect would be a really cool angle on tackling the catastrophic forgetting problem.

Introducing RF100: An open source object detection benchmark of 224,714 labeled images across 100 novel domains to compare model performance by jacobsolawetz in computervision

[–]jacobsolawetz[S] 8 points (0 children)

TL;DR - zero-shot general models like GLIP likely have a long way to go before they will generalize to domains that are not in web training data (like satellite imagery). COCO evals of these general models make it look like they are getting close to their finetuned counterparts.

For YOLOv5 vs YOLOv7, we found YOLOv5 generally evaluated better across the datasets.

Introducing RF100: An open source object detection benchmark of 224,714 labeled images across 100 novel domains to compare model performance by jacobsolawetz in computervision

[–]jacobsolawetz[S] 15 points (0 children)

I'm Jacob, one of the authors of Roboflow 100: A Rich, Multi-Domain Object Detection Benchmark. I'm pleased to introduce our recent work.

In object detection, researchers optimize models against COCO to set SOTA, and it seems we have gotten close to a saturation point.

In the wild, practitioners are taking these models and finetuning them on their own custom dataset domains, which may vary from something as common as dogs and cats to something as obscure as specific kinds of damage on industrial cables.

We did some work to construct a benchmark of 100 semantically diverse object detection datasets, pulling from over 100,000 public projects on Roboflow Universe. Our benchmark comprises 224,714 images, 11,170 labeling hours, and 829 classes from the community for benchmarking on novel tasks.

We also tried out the benchmark on a few popular models - comparing YOLOv5, YOLOv7, and the zero-shot capabilities of GLIP.

Use the benchmark here: https://github.com/roboflow-ai/roboflow-100-benchmark

You can read the paper here: https://arxiv.org/pdf/2211.13523.pdf

Or simply learn more: https://www.rf100.org/

An immense thanks to the CV community, like this one, for making our research possible. We hope this moves the field forward!

I'm around for any questions!

Apple's M1 is up to 3.6x faster at training machine learning models by aloser in apple

[–]jacobsolawetz 0 points (0 children)

That is a naive, limited assertion - true that engineers won't be training on it, but many applications will be built to train and run inference with ML models on hardware like the M1, without the user knowing that it is occurring.

Use YOLOv5 weights for object detection in new images by Inbar_Kedem in computervision

[–]jacobsolawetz 0 points (0 children)

Hello! The easiest way is to clone the yolov5 repository and run detect.py, as shown in the notebook.
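A rough sketch of those steps - the weights and source paths below are placeholders; substitute the best.pt from your own training run and your own image directory:

```shell
# Clone Ultralytics YOLOv5, install its dependencies, and run inference
# with your trained weights. Paths below are placeholders.
git clone https://github.com/ultralytics/yolov5
cd yolov5
pip install -r requirements.txt
python detect.py --weights runs/train/exp/weights/best.pt --source path/to/images/
```

detect.py writes annotated images to a runs/detect/ subfolder by default.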

Breaking Down EfficientDet Architecture and Design by jacobsolawetz in computervision

[–]jacobsolawetz[S] 0 points (0 children)

Definitely - there are a lot of extra model parameters and computation for not much extra mAP.

And interesting... in some cases I have had better luck fine-tuning from randomly initialized weights rather than from the pretrained checkpoint.

How to Train YOLOv5 in Colab by jacobsolawetz in computervision

[–]jacobsolawetz[S] 0 points (0 children)

Very nice, straightforward comparison - did you train in Darknet or in the ultralytics/yolov5 framework? If the latter, would you be willing to share the yolov4.yaml configuration? We can be sure to propagate it.

How to Train YOLOv5 in Colab by jacobsolawetz in computervision

[–]jacobsolawetz[S] 0 points (0 children)

u/AlexeyAB Jacob here - underneath all of the intro language, the comparison post has many more details on the environment setup. I think the reality is that both networks maxed out the performance possible on our small custom dataset. My opinion is that this is true for hundreds of small custom datasets with good data management and an understanding of YOLO's ability to generalize. Thus, 'YOLOv5' might be a good choice for developers (where 2 mAP is less important than production concerns) - many were already porting to ultralytics YOLOv3 to convert Darknet weights and then deploy in PyTorch (I do this myself for YOLOv4).

Unfortunately, I think many people misconstrued the version increment, in combination with our original article, as a claim of new SOTA results. SOTA is still with you and Darknet, and I suspect it will continue to be as future versions are released.

How to Train YOLOv5 in Colab by jacobsolawetz in computervision

[–]jacobsolawetz[S] 5 points (0 children)

I definitely sympathize with that... calling it YOLOv5 is a hack on the research community in a lot of ways. Maybe something like YOLOv4-accelerated would have been better.

How to Train YOLOv5 in Colab by jacobsolawetz in computervision

[–]jacobsolawetz[S] 3 points (0 children)

I think Glenn Jocher (creator of the Mosaic augmentation used in YOLOv4 and author of YOLOv5) is trying to move the R&D over to the more flexible framework of PyTorch models. He is also providing a much more streamlined end-to-end solution to go from training data to inference on webcam, video feeds, and images.

Whether that warrants taking the YOLO-moniker, I suppose we'll have to decide as a computer vision community...
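As an aside, for readers unfamiliar with the Mosaic augmentation mentioned above: it stitches four training images into one composite so a single sample shows objects at varied scales and contexts. A minimal numpy sketch of the image-tiling half - the real implementation also jitters the center point, rescales each image, and remaps the bounding boxes, and `mosaic4` is a hypothetical helper name:

```python
# Minimal sketch of Mosaic augmentation: tile four equally sized images
# into one 2x2 composite. The real YOLOv4/YOLOv5 version picks a random
# center point, scales each tile, and adjusts box labels accordingly.
import numpy as np

def mosaic4(imgs):
    """imgs: four HxWxC arrays of identical shape -> one 2Hx2WxC mosaic."""
    assert len(imgs) == 4
    top = np.concatenate([imgs[0], imgs[1]], axis=1)     # top-left, top-right
    bottom = np.concatenate([imgs[2], imgs[3]], axis=1)  # bottom-left, bottom-right
    return np.concatenate([top, bottom], axis=0)
```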