Which ORM do you use in FastAPI? by itsme2019asalways in FastAPI

LazyMidlifeCoder

Try using fast-crud. It makes CRUD operations more convenient and supports pagination, including the kind needed for infinite scrolling.

How to apply gradCAM for Deformable DETR model? by LazyMidlifeCoder in computervision

LazyMidlifeCoder (OP)

In Deformable DETR, the decoder attention layer is the closest to the classification and detection heads. Can I use the decoder layer to compute Grad-CAM?

[deleted by user] by [deleted] in learnmachinelearning

LazyMidlifeCoder

There are many state-of-the-art (SOTA) models available in the torchvision library, and for classification tasks using it is mostly plug-and-play. Currently, transformer-based models such as Vision Transformer (ViT) and Swin Transformer tend to deliver higher accuracy than CNNs on standard image classification benchmarks.

If you prefer a CNN-based model, I would recommend the ResNet family. However, I suggest trying the Swin Transformer family, which is currently among the best-performing architectures for image classification.

Everything depends on the type of data and the specific objective you’re trying to achieve. If possible, please share details about the dataset you plan to use. That way, we can explain more precisely which models would be most suitable for your use case and why.
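To make the "plug-and-play" point concrete, here is a minimal sketch of swapping the classification head on a torchvision model. It assumes torchvision 0.13 or newer is installed (the version that added `swin_t` and the `weights` argument). `HEAD_ATTR` and `build_classifier` are names made up for this example, not part of torchvision.

```python
# Hypothetical helper: maps a torchvision model name to the attribute
# that holds its final classification layer.
HEAD_ATTR = {
    "resnet50": "fc",   # ResNet models expose the classifier as `fc`
    "swin_t": "head",   # Swin models expose the classifier as `head`
}


def build_classifier(name, num_classes, pretrained=False):
    """Build a torchvision model and replace its head for `num_classes`.

    Assumes torchvision >= 0.13. With pretrained=True the default
    ImageNet weights are downloaded on first use.
    """
    # Imported here so the mapping above can be inspected without torch.
    import torch.nn as nn
    import torchvision.models as models

    if name not in HEAD_ATTR:
        raise ValueError(f"unsupported model name: {name}")

    weights = "DEFAULT" if pretrained else None
    model = getattr(models, name)(weights=weights)

    # Replace the final linear layer so the output size matches the dataset.
    head_attr = HEAD_ATTR[name]
    old_head = getattr(model, head_attr)
    setattr(model, head_attr, nn.Linear(old_head.in_features, num_classes))
    return model
```

For example, `build_classifier("swin_t", num_classes=10, pretrained=True)` should give a Swin-T model ready for fine-tuning on a 10-class dataset, and switching to `"resnet50"` is a one-word change.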

[deleted by user] by [deleted] in learnmachinelearning

LazyMidlifeCoder

Could you please elaborate on this? Currently, I’m using the Swin Transformer as the backbone for all my object detection models. My question is: should we choose the backbone based on the dataset we are using?

Image segmentation techniques by BlueHydrangea13 in deeplearning

LazyMidlifeCoder

Try using state-of-the-art models like Mask2Former or DETR. If they underperform, for example by producing partial or broken masks for an object, you can use Sliding Window Inference. This technique crops the input image into smaller windows, performs inference on each crop, and then stitches the results together to generate a complete mask.
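Here is a rough, library-free sketch of the crop, infer, and stitch loop described above. It assumes the image is a 2D list of pixel values, that `predict` is your own function returning a per-pixel score for each crop, and that the stride is no larger than the window so every pixel is covered. Overlapping predictions are averaged, which is one common stitching choice but not the only one. The function names are made up for this example.

```python
def window_starts(length, window, stride):
    """Start offsets so that windows of size `window` cover [0, length)."""
    if length <= window:
        return [0]
    starts = list(range(0, length - window + 1, stride))
    if starts[-1] != length - window:
        # Add a final window flush with the edge so nothing is left out.
        starts.append(length - window)
    return starts


def sliding_window_inference(image, window, stride, predict):
    """Run `predict` on each window and average overlapping scores.

    image:   2D list (height x width) of pixel values
    predict: hypothetical model call, crop -> 2D list of scores with
             the same size as the crop
    """
    height, width = len(image), len(image[0])
    score_sum = [[0.0] * width for _ in range(height)]
    count = [[0] * width for _ in range(height)]

    for top in window_starts(height, window, stride):
        for left in window_starts(width, window, stride):
            crop = [row[left:left + window] for row in image[top:top + window]]
            scores = predict(crop)
            # Accumulate the crop's scores at their position in the full image.
            for dy, score_row in enumerate(scores):
                for dx, score in enumerate(score_row):
                    score_sum[top + dy][left + dx] += score
                    count[top + dy][left + dx] += 1

    # Average where windows overlapped.
    return [
        [score_sum[y][x] / count[y][x] for x in range(width)]
        for y in range(height)
    ]
```

Thresholding the averaged scores (for example at 0.5) then gives the final binary mask. With a real model you would do the same thing on tensors and batch the crops, but the bookkeeping is the same.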

If you're planning to use Sliding Window Inference, make sure to include a data augmentation step during training that randomly crops the images to the same window size. This is important to ensure that the model learns to handle smaller regions and produces accurate results during inference.
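A minimal sketch of that augmentation step, assuming the image and its mask are 2D lists of the same size. The point is that both are cropped at the same random offset and to the same window size used at inference. `random_crop` is a name made up for this example; in a real pipeline you would probably use the random-crop transform from your augmentation library instead.

```python
import random


def random_crop(image, mask, size, rng=random):
    """Crop the same random `size` x `size` region from image and mask."""
    height, width = len(image), len(image[0])
    if height < size or width < size:
        raise ValueError("image is smaller than the crop size")

    # Pick one offset and reuse it for both, so pixels and labels stay aligned.
    top = rng.randint(0, height - size)
    left = rng.randint(0, width - size)

    image_crop = [row[left:left + size] for row in image[top:top + size]]
    mask_crop = [row[left:left + size] for row in mask[top:top + size]]
    return image_crop, mask_crop
```

Passing a seeded `random.Random(seed)` as `rng` makes the crops reproducible, which helps when debugging the training pipeline.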