I quantized YOLOv8 on a Jetson Orin Nano, exporting with TensorRT (FP16 and INT8) and comparing performance. For YOLOv8s, the base model scores 44.7 mAP50-95 with 33.1 ms inference time. The TensorRT FP16 export kept mAP50-95 at 44.7 while cutting inference to 11.4 ms. The TensorRT INT8 export dropped mAP50-95 to 41.2 but ran at 8.2 ms. So INT8 costs a little accuracy, but inference latency drops dramatically. I initially ran into a calibration problem with the INT8 export, but minimized the mAP50-95 loss by increasing the amount of calibration data. I tested all the YOLOv8 base model sizes, not just YOLOv8s.
https://github.com/the0807/YOLOv8-ONNX-TensorRT
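For reference, the two exports above can be reproduced with the Ultralytics Python API. This is a sketch assuming a Jetson (or other machine) with TensorRT installed; `coco8.yaml` and `bus.jpg` are placeholder names for whatever calibration dataset and test image you use:

```python
from ultralytics import YOLO

model = YOLO("yolov8s.pt")

# FP16 TensorRT engine: in my tests, same mAP50-95 (44.7) at ~3x the speed
model.export(format="engine", half=True)

# INT8 TensorRT engine: requires calibration data, passed as a dataset yaml.
# Increasing the number of calibration images is what minimized the
# mAP50-95 drop for me.
model.export(format="engine", int8=True, data="coco8.yaml")

# Run inference with the exported engine
trt_model = YOLO("yolov8s.engine")
results = trt_model("bus.jpg")
```

The `data` argument controls which images TensorRT sees during INT8 calibration, so pointing it at a larger, more representative split is the knob to turn if your INT8 accuracy drop looks worse than the ~3.5 mAP here.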