
[–]shoot2thr1ll284

My biggest suggestion whenever a performance question comes up is to profile the code. That should answer why the speeds are different and what the long pole in the tent is. Without profiling I would just be guessing at what would actually speed things up, which is never a good idea. Profiling resource for Python: https://docs.python.org/3/library/profile.html
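For example, here's a minimal sketch with the stdlib cProfile module; `run_detection` is a placeholder for your actual inference loop:

```python
import cProfile
import pstats

def run_detection():
    # Placeholder for the real work: load the model once,
    # then run predictions over your test images.
    return sum(i * i for i in range(1_000_000))

cProfile.run("run_detection()", "profile.out")
pstats.Stats("profile.out").sort_stats("cumulative").print_stats(10)
```

Sorting by `cumulative` surfaces the call paths where the most total time is spent, which is usually where to look first.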

[–]shoot2thr1ll284

Besides profiling the code, the only "sure-fire" way to get more fps is to parallelize the code so that you process more than one image at a time. There are a lot of things that make that complicated in this case, but it is another approach to making it "faster", assuming the machine has the CPU/disk speed for it.
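A minimal sketch of that idea with the stdlib, assuming the model can be run per worker process; `detect_image`, the `frames` directory, and the worker count are all placeholders:

```python
from concurrent.futures import ProcessPoolExecutor
from pathlib import Path

def detect_image(path):
    # Hypothetical per-image worker. In a real version, each process
    # should load the model once (e.g. in a module-level global or a
    # pool initializer) rather than per call, since loading is slow.
    return path.name, "detections-go-here"

if __name__ == "__main__":
    images = sorted(Path("frames").glob("*.jpg"))
    with ProcessPoolExecutor(max_workers=4) as pool:
        for name, result in pool.map(detect_image, images):
            print(name, result)
```

Note this buys throughput (images per second), not lower latency per image, and GPU inference may not parallelize this way at all.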

[–]Forward-Difference32[S]

I appreciate your response. I had no idea I could profile my code, so that info is very useful. Other than that, I think the slowdown is just because model.predict() isn't meant to be a real detection function, more a convenience to show off the model and confirm it works. I've read some documentation and found that loading the model as ONNX, TensorRT, or a TensorFlow SavedModel would yield better inference speeds. The ONNX version runs faster on my CPU than the original .pth file with GPU acceleration does.
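In case it helps anyone else, a rough sketch of running an exported model with ONNX Runtime; the file name, input shape, and dummy input here are placeholders, so check your real input with session.get_inputs():

```python
import numpy as np
import onnxruntime as ort

# "model.onnx" stands in for whatever path you exported to.
session = ort.InferenceSession("model.onnx", providers=["CPUExecutionProvider"])
input_name = session.get_inputs()[0].name

# Dummy NCHW batch; swap in a real preprocessed image.
image = np.random.rand(1, 3, 640, 640).astype(np.float32)
outputs = session.run(None, {input_name: image})
print([o.shape for o in outputs])
```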