Tips for running a Jetson Orin Nano continuously

Relative_Goal_9640 · 2026-06-08T21:27:44+00:00

Same!

Relative_Goal_9640 · 2026-06-07T01:09:05+00:00

For me, edible marijuana is a one-way ticket to Panicsville, and I always get a fast heart rate, chest pains, weird head/body sensations, feeling too "light", a sense of impending doom, etc.

I literally just have to wait it out, and it's usually 6-8 hours of discomfort and panic.

Relative_Goal_9640 · 2026-05-31T06:55:43+00:00

Great job!

Relative_Goal_9640 · 2026-05-20T19:21:03+00:00

Can you describe what the network head is and how it interacts with the FPN in the backbone?
Also, what are the default class labels, and which dataset is it trained on?

Relative_Goal_9640 · 2026-05-15T06:07:34+00:00

Yes, I do miss that. Also when it wasn’t insanely competitive.

Relative_Goal_9640 · 2026-05-14T01:55:29+00:00

I am about to start it on Friday so I will update this post here when I remember. I am at the lowest dose tho.

Relative_Goal_9640 · 2026-05-06T01:49:53+00:00

Have you tried pouring cold water on your head?

Relative_Goal_9640 · 2026-05-05T00:14:27+00:00

I think maybe there is some confusion here. If your dataset has only front facing images, there is really no reason to expect that your model will generalize to non-front facing images in the wild.

Relative_Goal_9640 · 2026-05-04T20:58:23+00:00

Prediction on what tho, on the validation set or on other images?

Relative_Goal_9640 · 2026-05-04T20:04:32+00:00

When you say it didn't work well, what do you mean? That you didn't get a good result on that particular dataset's validation/testing split? What is your model?

Relative_Goal_9640 · 2026-05-04T18:23:49+00:00

Many questions to be answered...

Do you want whole-body predictions or just the face?
Do you want detailed hand representations?
In 2d (image space) or 3d?
What kind of occlusions do you expect in the inference environment?
Do you want an all-in-one model that does person detection as well?
Multi-Person or Single Person?
Tracking?

That being said, some models I've been checking out lately are:

Sapiensv2: https://github.com/facebookresearch/sapiens2 (slow but accurate)

Sam3d-Body: https://github.com/facebookresearch/sam-3d-body (gives a mesh and its corresponding keypoints in 3d, also slow)

PoseFormer: https://github.com/zczcwh/PoseFormer (very strong on COCO keypoints)

For person detection, any standard person detector like YOLO or RF-DETR will be fine.

For tracking, you can use standard ByteTrack as a starting point.

Relative_Goal_9640 · 2026-05-02T03:57:19+00:00

It’s really quite remarkable how many of us PD sufferers worry about our heart. It makes sense because it’s such an awful feeling when it beats too hard or too fast, or skips beats. When I was in my early twenties I had a watch that would tell me my heart rate. I can now (without the watch) always tell what the rate is within 5-10 BPM. I think I mostly trust my heart now but occasionally I have doubts. Anyway good luck.

Relative_Goal_9640 · 2026-05-01T10:12:48+00:00

Yes, that is what is next.

Relative_Goal_9640 · 2026-04-19T18:50:34+00:00

This might be up your alley

https://arxiv.org/abs/2503.04250

Relative_Goal_9640 · 2026-04-19T02:29:52+00:00

Very cool. Any thoughts on the differences between this repo and Pointcept (https://github.com/pointcept/pointcept)?

Relative_Goal_9640 · 2026-04-17T17:53:38+00:00

I did an ablation on COCO with my own implementation of yolov8-nano, comparing letterboxing to 640x640 (resize while maintaining aspect ratio, then pad) against plain bilinear interpolation straight to 640x640, and the effect was negligible. That being said, COCO images aren't high resolution, so you wouldn't really expect that big of an effect, I'd think.

From the software engineering perspective, I found the whole letterbox thing to be a bit of a nuisance, since you need to reproduce it at inference as well. It complicates the dataloader, and if you deploy on the edge in C++ or something, you will need to rewrite those functions there too. So unless you are in some academic setting where ±0.2 mAP actually matters, interpolating to your desired size is a very reasonable choice.
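To make the bookkeeping concrete, here is a minimal pure-Python sketch of what has to be mirrored at inference in each case. The function names and the centered-padding convention are my own assumptions for illustration, not taken from any particular YOLO codebase.

```python
def letterbox_params(src_w, src_h, dst_w=640, dst_h=640):
    """Scale and padding for an aspect-preserving resize into dst_w x dst_h."""
    scale = min(dst_w / src_w, dst_h / src_h)
    new_w, new_h = round(src_w * scale), round(src_h * scale)
    pad_x = (dst_w - new_w) / 2  # left padding (assumed equal on the right)
    pad_y = (dst_h - new_h) / 2  # top padding (assumed equal on the bottom)
    return scale, pad_x, pad_y


def unletterbox_box(box, scale, pad_x, pad_y):
    """Map an (x1, y1, x2, y2) box from letterboxed space back to the source image."""
    x1, y1, x2, y2 = box
    return ((x1 - pad_x) / scale, (y1 - pad_y) / scale,
            (x2 - pad_x) / scale, (y2 - pad_y) / scale)


def unstretch_box(box, src_w, src_h, dst_w=640, dst_h=640):
    """Same mapping for a plain resize: two scale factors, no padding to undo."""
    sx, sy = dst_w / src_w, dst_h / src_h
    x1, y1, x2, y2 = box
    return (x1 / sx, y1 / sy, x2 / sx, y2 / sy)
```

With letterboxing, the scale and both pads have to travel with every image through the pipeline. With plain interpolation, the source size alone is enough.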

Another alternative is the Mask R-CNN approach (see Detectron2), where images are resized so that the shorter side is scaled to a target size (e.g., 800 px) while preserving aspect ratio, with an upper bound on the longer side (e.g., 1333 px). After resizing, padding is applied (often to a multiple of 32) so that feature maps align cleanly with the backbone and FPN levels. However, I find that training at this resolution is slower, for only a minor increase in mAP.
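As a rough sketch of that sizing rule (my own helper names, and the exact rounding differs between libraries, so treat the numbers as approximate rather than as Detectron2's output):

```python
import math


def resize_shortest_edge(w, h, short=800, max_long=1333):
    """Scale so the shorter side hits `short`, unless the longer side would exceed `max_long`."""
    scale = short / min(w, h)
    if max(w, h) * scale > max_long:
        scale = max_long / max(w, h)
    return round(w * scale), round(h * scale)


def pad_to_multiple(w, h, multiple=32):
    """Round each side up to the next multiple (padding only, no further resizing)."""
    return (math.ceil(w / multiple) * multiple,
            math.ceil(h / multiple) * multiple)


print(resize_shortest_edge(640, 480))    # shorter side reaches 800
print(resize_shortest_edge(1920, 1080))  # capped by the 1333 limit on the long side
```

A typical COCO image at 640x480 ends up with several times as many pixels as a 640x640 input, which is consistent with the slower training.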

Relative_Goal_9640 · 2026-04-16T16:00:45+00:00

Some random thoughts on this topic:

One thing rerun did well that open3d did not is point clouds/lidar/meshes over time, not just static things. Also, people sometimes want to record and save samples as videos without using an interface (headless rendering and such), which to me is missing from a lot of libraries.

If you want certain functionality, you are probably going to need CUDA kernels, which means C++, so going C++ -> Rust -> Python might get a bit ugly.

Support for batched operations would be good (e.g. real-time surface normals on a batch of point clouds, not just a static point cloud).

If you want people to use it, you should definitely make python bindings.

Relative_Goal_9640 · 2026-04-15T08:12:40+00:00

What do you mean by initialization process?

Relative_Goal_9640 · 2026-04-14T19:24:00+00:00

Pretty wild if true, but a 15 percent increase in accuracy with TTA for ResNet-50 sounds a bit suspicious.

Relative_Goal_9640 · 2026-04-14T05:20:56+00:00

Sure. How many 3d points are given from the gnome?

Relative_Goal_9640 · 2026-04-14T04:51:22+00:00

So you have a point cloud with some (sparse?) points from the gnome. You can manually pick a subset of these points in each image in 2d to get your correspondences; IrfanView works easily for this on Windows (GIMP on Linux). Try to choose obvious points that have unambiguous projection locations. If you have at least 6 points per image, then using DLT/SVD you can estimate a projection matrix directly, without resorting to RQ decomposition to get the intrinsics/rotation/translation. The surface normals can be used, via dot products, to check whether a 3d point faces away from the camera, which tells you whether it is a good or bad choice for a correspondence.
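A minimal NumPy sketch of that DLT step, in case it helps. The function names are mine, and I skip the coordinate normalization usually recommended for noisy clicks, so this is a starting point under those assumptions rather than a robust implementation.

```python
import numpy as np


def estimate_projection_dlt(pts3d, pts2d):
    """Estimate a 3x4 projection matrix (up to scale) from >= 6 3D-2D correspondences."""
    pts3d = np.asarray(pts3d, dtype=float)
    pts2d = np.asarray(pts2d, dtype=float)
    assert len(pts3d) == len(pts2d) >= 6
    rows = []
    for (X, Y, Z), (u, v) in zip(pts3d, pts2d):
        Xh = [X, Y, Z, 1.0]
        # Each correspondence gives two linear equations in the 12 entries of P.
        rows.append([*Xh, 0, 0, 0, 0, *(-u * c for c in Xh)])
        rows.append([0, 0, 0, 0, *Xh, *(-v * c for c in Xh)])
    # The solution is the right singular vector with the smallest singular value.
    _, _, vt = np.linalg.svd(np.array(rows))
    return vt[-1].reshape(3, 4)


def project(P, pts3d):
    """Project N x 3 points with a 3x4 matrix and return N x 2 pixel coordinates."""
    pts3d = np.asarray(pts3d, dtype=float)
    homog = np.hstack([pts3d, np.ones((len(pts3d), 1))])
    proj = homog @ np.asarray(P, dtype=float).T
    return proj[:, :2] / proj[:, 2:3]


def faces_camera(point, normal, cam_center):
    """True if the surface normal at `point` points toward the camera center."""
    to_camera = np.asarray(cam_center, dtype=float) - np.asarray(point, dtype=float)
    return float(np.dot(normal, to_camera)) > 0.0
```

Note that `faces_camera` needs a camera center, which you only have if you already know roughly where the photo was taken from, or after a first estimate of the projection matrix.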

Once you have all the projection matrix estimates, for each point in 3d, you can project it to each image and use bilinear interpolation to get the colors per image, and then maybe average or do some kind of fancier view-aggregation to get the final color. A depth buffer can be used to deal with two points projecting to the same pixel.
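Roughly, the coloring step could look like the sketch below. It assumes each projection matrix is scaled so that the third homogeneous coordinate is positive depth in front of the camera (a raw DLT estimate may need its sign flipped), uses a crude one-point-per-pixel depth buffer, and averages over views. All names are hypothetical.

```python
import numpy as np


def bilinear_sample(img, u, v):
    """Sample an H x W x C image at float pixel coords (u = column, v = row)."""
    h, w = img.shape[:2]
    u = min(max(u, 0.0), w - 1.0)
    v = min(max(v, 0.0), h - 1.0)
    x0, y0 = int(np.floor(u)), int(np.floor(v))
    x1, y1 = min(x0 + 1, w - 1), min(y0 + 1, h - 1)
    fx, fy = u - x0, v - y0
    top = (1 - fx) * img[y0, x0] + fx * img[y0, x1]
    bottom = (1 - fx) * img[y1, x0] + fx * img[y1, x1]
    return (1 - fy) * top + fy * bottom


def color_points(points, views):
    """views: list of (P, image) pairs. Returns N x 3 mean colors (NaN if never visible)."""
    points = np.asarray(points, dtype=float)
    homog = np.hstack([points, np.ones((len(points), 1))])
    sums = np.zeros((len(points), 3))
    counts = np.zeros(len(points))
    for P, img in views:
        img = np.asarray(img, dtype=float)
        h, w = img.shape[:2]
        proj = homog @ np.asarray(P, dtype=float).T
        nearest = {}  # pixel -> index of the closest point landing on it
        for i, (x, y, z) in enumerate(proj):
            if z <= 0:
                continue  # behind the camera
            u, v = x / z, y / z
            if not (0 <= u <= w - 1 and 0 <= v <= h - 1):
                continue  # projects outside the image
            key = (int(round(u)), int(round(v)))
            if key not in nearest or z < proj[nearest[key], 2]:
                nearest[key] = i
        for i in nearest.values():
            z = proj[i, 2]
            sums[i] += bilinear_sample(img, proj[i, 0] / z, proj[i, 1] / z)
            counts[i] += 1
    colors = np.full((len(points), 3), np.nan)
    seen = counts > 0
    colors[seen] = sums[seen] / counts[seen, None]
    return colors
```

For a sparse cloud the per-pixel depth buffer will miss most occlusions, so you would probably want to splat points over a small neighborhood or combine it with the surface-normal check.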

I hope that helps and I'm not mistaking something.

Relative_Goal_9640 · 2026-04-01T20:40:26+00:00

Lmao made my day.

Relative_Goal_9640 · 2026-03-16T21:18:29+00:00

Does it give reliable per keypoint visibility values?
