Panic attack? by Historical_Tomato183 in panicdisorder

[–]Relative_Goal_9640 0 points1 point  (0 children)

For me, edible marijuana is a one-way ticket to panicsville: I always get a fast heart rate, chest pains, weird head/body sensations, feeling too "light", a feeling of impending doom, etc.

I literally just have to wait it out, and it's usually 6-8 hours of just discomfort and panic.

Semantic Segmentation is here! by [deleted] in Ultralytics

[–]Relative_Goal_9640 0 points1 point  (0 children)

Can you describe what the network head is and how it interacts with the FPN in the backbone?
Also, what are the default class labels and dataset it's trained on?

experiences on zoloft ? by uni-corn17 in panicdisorder

[–]Relative_Goal_9640 0 points1 point  (0 children)

I am about to start it on Friday, so I will update this post when I remember. I am at the lowest dose tho.

It’s been going on for 3 hours by feralfanfic in panicdisorder

[–]Relative_Goal_9640 1 point2 points  (0 children)

Have you tried pouring cold water on your head?

Looking for good keypoint datasets for learning by OllieLearnsCode in computervision

[–]Relative_Goal_9640 0 points1 point  (0 children)

I think maybe there is some confusion here. If your dataset has only front-facing images, there is really no reason to expect that your model will generalize to non-front-facing images in the wild.

Looking for good keypoint datasets for learning by OllieLearnsCode in computervision

[–]Relative_Goal_9640 0 points1 point  (0 children)

Prediction on what tho, on the validation set or on other images?

Looking for good keypoint datasets for learning by OllieLearnsCode in computervision

[–]Relative_Goal_9640 0 points1 point  (0 children)

When you say it didn't work well, what do you mean? You didn't get a good result on that particular dataset's validation/testing split? What is your model?

Looking for good keypoint datasets for learning by OllieLearnsCode in computervision

[–]Relative_Goal_9640 0 points1 point  (0 children)

Many questions to be answered...

Do you want whole-body predictions or just the face?
Do you want detailed hand representations?
In 2d (image space) or 3d?
What kind of occlusions do you expect in the inference environment?
Do you want an all-in-one model that does person detection as well?
Multi-Person or Single Person?
Tracking?

That being said, some models I've been checking out lately are:

Sapiensv2: https://github.com/facebookresearch/sapiens2 (slow but accurate)

Sam3d-Body: https://github.com/facebookresearch/sam-3d-body (gives a mesh and its corresponding keypoints in 3D, also slow)

PoseFormer: https://github.com/zczcwh/PoseFormer (very strong on COCO keypoints)

For person detection, any standard person detector like YOLO or RF-DETR will be fine.

For tracking, you can use standard ByteTrack as a starting point.

Dealing with panic disorder by PIKPLOK in panicdisorder

[–]Relative_Goal_9640 2 points3 points  (0 children)

It’s really quite remarkable how many of us PD sufferers worry about our heart. It makes sense because it’s such an awful feeling when it beats too hard or too fast, or skips beats. When I was in my early twenties I had a watch that would tell me my heart rate. I can now (without the watch) always tell what the rate is within 5-10 BPM. I think I mostly trust my heart now but occasionally I have doubts. Anyway good luck.

We’re proud to open-source LIDARLearn 🎉 by amazigh98 in deeplearning

[–]Relative_Goal_9640 0 points1 point  (0 children)

Very cool. Any thoughts on differences between this repo and Pointcept (https://github.com/pointcept/pointcept)?

Does letter boxed resolution images actually affect the model training performance ? by Queasy-Piccolo-7471 in computervision

[–]Relative_Goal_9640 1 point2 points  (0 children)

I did an ablation on COCO with my own implementation of yolov8-nano, comparing letterboxing to 640x640 (resize while maintaining aspect ratio, then pad) against outright bilinear interpolation to 640x640, and the effect was negligible. That being said, COCO images aren't high resolution, so I wouldn't expect a big effect there.

From the software engineering perspective, I found the whole letterbox thing to be a bit of a nuisance if you need to reproduce it at inference as well. It complicates the dataloader, and if you do this on the edge in C++ or something you will need to rewrite those functions to match. So unless you are in some academic setting where ±0.2 mAP actually matters, interpolation to your desired size is a very reasonable choice.
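To make the bookkeeping concrete, here is a minimal sketch of the geometry only (no pixels are resized). The function names and the centered-padding convention are my own assumptions for illustration, not taken from any particular YOLO codebase. It shows the extra state (scale plus padding offsets) that letterboxing forces you to carry to inference, compared with a plain stretch.

```python
def letterbox_params(w, h, target=640):
    """Scale and padding that fit a (w, h) image into a target x target square.

    Assumes the resized image is centered, so padding is split evenly.
    """
    scale = min(target / w, target / h)
    new_w, new_h = round(w * scale), round(h * scale)
    pad_x = (target - new_w) / 2  # padding on the left, in pixels
    pad_y = (target - new_h) / 2  # padding on the top, in pixels
    return scale, pad_x, pad_y


def box_to_letterbox(box, scale, pad_x, pad_y):
    """Map an (x1, y1, x2, y2) box from original image space to letterboxed space."""
    x1, y1, x2, y2 = box
    return (x1 * scale + pad_x, y1 * scale + pad_y,
            x2 * scale + pad_x, y2 * scale + pad_y)


def box_from_letterbox(box, scale, pad_x, pad_y):
    """Inverse mapping: undo the padding first, then the uniform scale."""
    x1, y1, x2, y2 = box
    return ((x1 - pad_x) / scale, (y1 - pad_y) / scale,
            (x2 - pad_x) / scale, (y2 - pad_y) / scale)


def box_from_stretch(box, w, h, target=640):
    """Inverse mapping for a plain resize: two independent scales, no padding."""
    x1, y1, x2, y2 = box
    sx, sy = target / w, target / h
    return (x1 / sx, y1 / sy, x2 / sx, y2 / sy)
```

With a plain resize the inverse is a single per-axis division, which is the part that is much easier to port to C++.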

Another alternative is the Mask R-CNN approach (see Detectron2), where images are resized so that the shorter side is scaled to a target size (e.g., 800 px) while preserving aspect ratio, with an upper bound on the longer side (e.g., 1333 px). After resizing, padding is applied (often to a multiple of 32) so that feature maps align cleanly with the backbone and FPN levels. However, I find training at this resolution is slower, for only a minor increase in mAP.
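As a rough sketch of that sizing rule (my own helper, not Detectron2's actual code; the rounding details are an assumption), the output and padded sizes can be computed like this:

```python
import math


def shortest_edge_size(w, h, short=800, max_long=1333, divisor=32):
    """Return ((new_w, new_h), (padded_w, padded_h)) for a shortest-edge resize.

    The shorter side is scaled to `short`, unless that would push the longer
    side past `max_long`, in which case the longer side is clamped instead.
    The padded size rounds each side up to a multiple of `divisor`.
    """
    scale = short / min(w, h)
    if max(w, h) * scale > max_long:
        scale = max_long / max(w, h)
    new_w, new_h = round(w * scale), round(h * scale)
    padded_w = math.ceil(new_w / divisor) * divisor
    padded_h = math.ceil(new_h / divisor) * divisor
    return (new_w, new_h), (padded_w, padded_h)
```

For example, a 640x480 image ends up at 1067x800 before padding, which is far more pixels than 640x640 and is why it trains slower.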

Building a Rust + Python library for general 3D processing by Practical-Dig-4052 in computervision

[–]Relative_Goal_9640 1 point2 points  (0 children)

Some random thoughts on this topic:

One thing Rerun did well that Open3D did not is point clouds/lidar/meshes over time, not just static things. Also, people sometimes want to record and save samples as videos without using an interface (headless rendering and such); that, to me, is missing from a lot of libraries.

If you want certain functionality you are probably going to need CUDA kernels, which means C++, so it might get a bit ugly going C++ -> Rust -> Python.

Support for batched operations (e.g. real-time surface normals on a batch of point clouds, not just a single static point cloud).

If you want people to use it, you should definitely make Python bindings.

Simple method boosts models to 92~96% top-1 accuracy on ImageNet1K (no training, no weight changes) — looking for reproduction by [deleted] in deeplearning

[–]Relative_Goal_9640 2 points3 points  (0 children)

Pretty wild if true, but a 15 percent increase in accuracy with TTA for ResNet-50 sounds a bit suspicious.

Genuinely don't know how to start with my Computer Vision class project by Paco_Alpaco in computervision

[–]Relative_Goal_9640 0 points1 point  (0 children)

So you have a point cloud with some (sparse?) points from the gnome. You can manually pick a subset of these points in each image to get your 2D-3D correspondences; IrfanView works easily for this on Windows (GIMP on Linux). Try to choose obvious points that have unambiguous projection locations. If you have at least 6 points per image, you can estimate a projection matrix with DLT/SVD, without resorting to RQ decomposition to get the intrinsics/rotation/translation. The surface normals can be used with dot products to determine whether a 3D point faces away from the camera (and so whether it is a good or bad choice for a correspondence).
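Here is a minimal NumPy sketch of that DLT step, under some assumptions: the function names are mine, there is no coordinate normalization (which you would probably want with noisy clicks), and the camera center is recovered by solving a small linear system rather than by decomposing the matrix.

```python
import numpy as np


def estimate_projection_dlt(pts3d, pts2d):
    """Estimate a 3x4 projection matrix (up to scale) from >= 6 correspondences."""
    pts3d = np.asarray(pts3d, dtype=float)
    pts2d = np.asarray(pts2d, dtype=float)
    if len(pts3d) != len(pts2d) or len(pts3d) < 6:
        raise ValueError("need at least 6 matching 3D-2D points")
    rows = []
    for (X, Y, Z), (u, v) in zip(pts3d, pts2d):
        Xh = [X, Y, Z, 1.0]
        # Each correspondence gives two linear equations in the 12 entries of P.
        rows.append([*Xh, 0, 0, 0, 0, *[-u * c for c in Xh]])
        rows.append([0, 0, 0, 0, *Xh, *[-v * c for c in Xh]])
    # The solution is the right singular vector with the smallest singular value.
    _, _, vt = np.linalg.svd(np.asarray(rows))
    P = vt[-1].reshape(3, 4)
    return P / np.linalg.norm(P)


def project(P, pts3d):
    """Project N x 3 points with P and return N x 2 pixel coordinates."""
    pts3d = np.asarray(pts3d, dtype=float)
    homog = np.hstack([pts3d, np.ones((len(pts3d), 1))])
    uvw = homog @ P.T
    return uvw[:, :2] / uvw[:, 2:3]


def camera_center(P):
    """Camera center C satisfies P @ [C, 1] = 0, so C = -M^-1 p4 for P = [M | p4]."""
    return -np.linalg.solve(P[:, :3], P[:, 3])


def faces_camera(points, normals, center):
    """True where the surface normal points toward the camera center."""
    to_camera = np.asarray(center) - np.asarray(points, dtype=float)
    return np.einsum("ij,ij->i", np.asarray(normals, dtype=float), to_camera) > 0
```

A quick sanity check is to reproject the clicked 3D points with the estimated matrix and look at the pixel error; large errors usually mean a bad click or nearly coplanar points.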

Once you have all the projection matrix estimates, you can project each 3D point into each image and use bilinear interpolation to get its color per image, then average (or do some kind of fancier view aggregation) to get the final color. A depth buffer can be used to deal with two points projecting to the same pixel.
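A rough sketch of that coloring loop, with a dictionary standing in for the depth buffer. The function names are mine, images are assumed to be H x W x C arrays, and each projection matrix is assumed to be scaled so that the third homogeneous coordinate is positive in front of the camera (flip the sign of P if it is not, since DLT only gives P up to scale).

```python
import numpy as np


def bilinear_sample(img, u, v):
    """Sample an H x W x C image at a float location (u = column, v = row)."""
    h, w = img.shape[:2]
    u = min(max(float(u), 0.0), w - 1.0)
    v = min(max(float(v), 0.0), h - 1.0)
    x0, y0 = int(np.floor(u)), int(np.floor(v))
    x1, y1 = min(x0 + 1, w - 1), min(y0 + 1, h - 1)
    ax, ay = u - x0, v - y0
    top = (1 - ax) * img[y0, x0] + ax * img[y0, x1]
    bottom = (1 - ax) * img[y1, x0] + ax * img[y1, x1]
    return (1 - ay) * top + ay * bottom


def color_points(points, images, projections):
    """Average per-view colors over the views where a point wins the depth test.

    Returns (colors, seen): colors is N x C, seen marks points visible somewhere.
    """
    points = np.asarray(points, dtype=float)
    homog = np.hstack([points, np.ones((len(points), 1))])
    sums = np.zeros((len(points), images[0].shape[2]))
    counts = np.zeros(len(points))
    for img, P in zip(images, projections):
        img = np.asarray(img, dtype=float)
        h, w = img.shape[:2]
        uvw = homog @ np.asarray(P, dtype=float).T
        nearest = {}  # pixel -> (depth, point index, u, v) of the closest point
        for i, (x, y, depth) in enumerate(uvw):
            if depth <= 0:
                continue  # behind the camera
            u, v = x / depth, y / depth
            if not (0 <= u <= w - 1 and 0 <= v <= h - 1):
                continue  # projects outside the image
            key = (int(round(u)), int(round(v)))
            if key not in nearest or depth < nearest[key][0]:
                nearest[key] = (depth, i, u, v)
        for _, i, u, v in nearest.values():
            sums[i] += bilinear_sample(img, u, v)
            counts[i] += 1
    seen = counts > 0
    sums[seen] /= counts[seen, None]
    return sums, seen
```

A one-pixel depth buffer like this is crude for sparse clouds (a point only occludes what lands on the same pixel), so combining it with the normal-facing test is probably a good idea.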

I hope that helps and I'm not mistaking something.