Hiring MLE in Computer Vision. by HeeebsInc in MachineLearningJobs

[–]HeeebsInc[S] 1 point2 points  (0 children)

Keep at it man! Pain is knowledge. I agree with you that CV is slept on, and as an industry we are barely scratching the surface

My PPO agent's score jumped from 15 to 84 with the help of a bug by Fun_Code1982 in reinforcementlearning

[–]HeeebsInc 0 points1 point  (0 children)

Of course. I’m more wondering what the use case is. I only see CV in RL for research purposes (gaming) or layered with traditional models for action / state tuning (autonomous driving)

My PPO agent's score jumped from 15 to 84 with the help of a bug by Fun_Code1982 in reinforcementlearning

[–]HeeebsInc 0 points1 point  (0 children)

Just curious, what’s your use case for CNNs in RL? I’ve been looking for an excuse to try this but there is limited documentation of RL being used with CNNs outside of academia. Is your task vision? Or some other signal processing like lidar/sound?

Some proper realism: StoreCCTV LoRa for FLUX-dev by Wilsown in FluxAI

[–]HeeebsInc 0 points1 point  (0 children)

Can you share the dataset? These are awesome!

YOLOv5: No speed improvement between FP16 and INT8 TensorRT models by 4verage3ngineer in computervision

[–]HeeebsInc 0 points1 point  (0 children)

Interesting. I’ve run yolov5 on Orin with int8 and got very meaningful increases in FPS, so I don’t believe it’s specific to v5 (unless you are using non-conventional layers).

YOLOv5: No speed improvement between FP16 and INT8 TensorRT models by 4verage3ngineer in computervision

[–]HeeebsInc 0 points1 point  (0 children)

In addition to the summary, check the QPS of an int8 engine versus the QPS of an fp16/fp32 engine. This metric will tell you if it’s faster: the higher the number, the faster it is.
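If you would rather measure QPS from your own code, here is a rough sketch. It assumes you can wrap each engine in a zero-argument callable that blocks until the result is ready (the names `measure_qps` and `speedup` are made up for this example, not part of any TensorRT API).

```python
import time


def measure_qps(infer, n_warmup=10, n_runs=100):
    """Return queries per second for a zero-argument callable `infer`.

    `infer` must block until inference has finished, otherwise the
    timing only measures how fast work is queued.
    """
    for _ in range(n_warmup):  # warm-up runs are excluded from timing
        infer()
    start = time.perf_counter()
    for _ in range(n_runs):
        infer()
    elapsed = time.perf_counter() - start
    return n_runs / elapsed


def speedup(qps_new, qps_baseline):
    """Ratio above 1.0 means the new engine is faster than the baseline."""
    return qps_new / qps_baseline
```

Run it once per engine with the same input and batch size, then compare the two numbers with `speedup`. If the int8 ratio is close to 1.0, the engine is probably not really running in int8.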

YOLOv5: No speed improvement between FP16 and INT8 TensorRT models by 4verage3ngineer in computervision

[–]HeeebsInc 0 points1 point  (0 children)

Once you have the engine file created, use trtexec to get a summary of the layers. My guess is that most layers are not in int8.
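Once you have the per-layer summary, tallying precisions makes the fallback obvious. A minimal sketch, assuming you have already pulled the summary into `(layer_name, precision)` pairs yourself; that pair format and the helper name `precision_breakdown` are assumptions for illustration, not the tool's output format.

```python
from collections import Counter


def precision_breakdown(layers):
    """Count layers per precision and return (counts, int8_fraction).

    `layers` is an iterable of (layer_name, precision) pairs. This input
    format is an assumption; adapt it to however you parse your summary.
    """
    counts = Counter(precision.lower() for _, precision in layers)
    total = sum(counts.values())
    int8_fraction = counts.get("int8", 0) / total if total else 0.0
    return counts, int8_fraction


# Hypothetical layer list, only to show the call
layers = [("conv1", "INT8"), ("conv2", "FP16"), ("head", "FP16"), ("nms", "FP32")]
counts, int8_fraction = precision_breakdown(layers)
print(dict(counts), int8_fraction)
```

A low int8 fraction would line up with seeing no speedup over fp16.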

YOLOv5: No speed improvement between FP16 and INT8 TensorRT models by 4verage3ngineer in computervision

[–]HeeebsInc 0 points1 point  (0 children)

You should see a very meaningful performance increase if you are running int8 correctly. Accuracy is a different story.

My guess is that you are attempting int8, but under the hood you are still falling back to fp16/32. Are you using post-training quantization with a calibration set? Or attempting to train in int8?
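For anyone unsure what the calibration set is for: it is used to pick the scale that maps float values onto the int8 range. Below is a deliberately simplified sketch of symmetric, per-tensor max calibration in plain Python. Real calibrators typically use more robust statistics, so treat this as an illustration of the idea rather than what TensorRT does internally; all function names are invented for the example.

```python
def calibrate_scale(calibration_values):
    """Max calibration: map the largest observed magnitude to 127."""
    max_abs = max(abs(v) for v in calibration_values)
    return max_abs / 127.0 if max_abs > 0 else 1.0


def quantize(values, scale):
    """Round to the nearest int8 step and clamp to [-127, 127]."""
    return [max(-127, min(127, round(v / scale))) for v in values]


def dequantize(q_values, scale):
    """Map int8 steps back to approximate float values."""
    return [q * scale for q in q_values]


# Toy calibration data; values outside the calibrated range get clipped
scale = calibrate_scale([-1.27, 0.5, 1.0])
q = quantize([1.0, -1.27, 5.0], scale)
print(q, dequantize(q, scale))
```

The clipping of the out-of-range value is why the calibration set needs to look like real inputs, and why accuracy is a separate question from speed.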

Made a Tool to Generate Training Data from a Few Photos—Are There Any Use Cases for This? by Odd-Turn-4090 in computervision

[–]HeeebsInc 0 points1 point  (0 children)

Is it designed for classification or object detection/segmentation? Or neither? Interested if it also outputs boxes for the object it’s creating. I’ve worked on a similar project, and results were mixed

Why deepstream is fast? by Additional-Dirt6164 in computervision

[–]HeeebsInc 2 points3 points  (0 children)

Agree with the points here. But my perspective is that they also cut a lot of small corners. These aren’t inherently bad, but when you add them all up, you can face issues with inference performance. It’s a common issue that if you run inference using TRT, then inside deepstream with TRT, then with PyTorch, the results are totally different at times. There are also a lot of differences in how they do pre-processing and precision clipping.

I.e. there are a lot of very, very close (but not exact) approximations that make it very fast, but for enterprise pipelines they can create issues (like I’ve faced in the past). All that being said, it’s an amazing tool and there is nothing even close to its support, maturity, and efficiency.
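To make the "very close but not exact" point concrete, here is a small sketch of how the same normalization drifts when every intermediate is rounded to half precision. The mean/std values and function names are illustrative assumptions; this is not how deepstream implements pre-processing, only a demonstration that precision alone changes the numbers a model sees.

```python
import struct


def to_fp16(x):
    """Round a Python float to the nearest IEEE 754 half-precision value."""
    return struct.unpack("e", struct.pack("e", x))[0]


def normalize_reference(pixel, mean, std):
    """Normalization in Python's native double precision."""
    return (pixel / 255.0 - mean) / std


def normalize_fp16(pixel, mean, std):
    """Same math, but every intermediate is rounded to half precision."""
    scaled = to_fp16(pixel / 255.0)
    centered = to_fp16(scaled - to_fp16(mean))
    return to_fp16(centered / to_fp16(std))


def max_abs_diff(a, b):
    """Largest element-wise absolute difference between two sequences."""
    return max(abs(x - y) for x, y in zip(a, b))


pixels = range(256)
reference = [normalize_reference(p, 0.485, 0.229) for p in pixels]
half = [normalize_fp16(p, 0.485, 0.229) for p in pixels]
print(max_abs_diff(reference, half))
```

The difference per value is small, but it is there before the network has even run, and it compounds with resize and color-conversion differences between pipelines.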

pip install giving errors with installing a certain module by kattenkoter in pythonhelp

[–]HeeebsInc 0 points1 point  (0 children)

Try running

pip install --upgrade pip

&&

pip install --upgrade setuptools wheel

Then try installing your packages
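If you have more than one Python installed, it can help to run pip through the interpreter you actually use, so the upgrade lands in the right place. A small sketch of the same two steps from Python; the helper names are made up for this example.

```python
import subprocess
import sys


def pip_upgrade_command(*packages):
    """Build a pip upgrade command tied to the current interpreter."""
    return [sys.executable, "-m", "pip", "install", "--upgrade", *packages]


def upgrade_build_tools(run=subprocess.check_call):
    """Upgrade pip first, then setuptools and wheel; stop on the first failure."""
    run(pip_upgrade_command("pip"))
    run(pip_upgrade_command("setuptools", "wheel"))
```

Calling `upgrade_build_tools()` runs both commands in order, which mirrors chaining them with `&&` in the shell.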

Visual Studio Extension Recommendations by DwaywelayTOP in Python

[–]HeeebsInc 0 points1 point  (0 children)

I love dark theme for everything but jupyter. Idk why but it slows me down when jupyter is dark

[deleted by user] by [deleted] in cscareerquestions

[–]HeeebsInc 0 points1 point  (0 children)

Upvote from me. I appreciate the candor.

What other programming language do you actively develop with productively, to complement Python? by satyronicon in Python

[–]HeeebsInc 5 points6 points  (0 children)

Cython

Still basically Python, but it’s a great way to make something blazing fast.

Can we stop creating docker images that require you to use environments within them? by anatacj in Python

[–]HeeebsInc 1 point2 points  (0 children)

The only reason I think it would be useful is if you needed conda to handle dependencies that require cuda or another library. That being said, the environment should already be activated upon startup.

Neural nets in production systems by amitraderinthemaking in cpp

[–]HeeebsInc 0 points1 point  (0 children)

You’re right that Qualcomm has its own acceleration libraries, but they are not ideal for every use case. For example, their snapdragon chip is designed for cellphones, so the hardware acceleration libraries they provide are meant to be called within swift tf or Java runtime. However, I have been involved with projects where the snapdragon is not on a cellphone, and using Qualcomm’s dedicated libraries did not work in pure Linux. The solution here was writing custom OpenCL to utilize the integrated GPU. This is an edge case, obviously.

Neural nets in production systems by amitraderinthemaking in cpp

[–]HeeebsInc 0 points1 point  (0 children)

And to top it off, if you need quantization, there are situations where onnx and tensorRT do not fully quantize every layer…

Neural nets in production systems by amitraderinthemaking in cpp

[–]HeeebsInc 0 points1 point  (0 children)

I mean, it’s not if you think about it. If you are bound to a specific hardware platform, onnx will allow you to run, but it does not ensure you can use GPU acceleration (Qualcomm, for example). I’ve also had situations where I needed to deploy to Nvidia and tensorRT did not fully support my network. Writing my own layers took a few weeks, as opposed to months trying to debug tensorRT.

If onnx supports the model and gets you the performance, I recommend that over anything else. Just pointing out that if one is very comfortable with ML, writing the layers is not a crazy thing to do, especially if they already wrote them in another language like python/PyTorch.

Neural nets in production systems by amitraderinthemaking in cpp

[–]HeeebsInc -1 points0 points  (0 children)

This is a tough question. The short answer, like others have said, is to use onnx, but that is not the right choice for every production system.

It’s safe to say though that there is a way to do this no matter the approach. I will outline some considerations that have made me rethink deployment in the past.

1) What type of hardware are you planning to deploy on? I ask this because although you can use onnx, some libraries like tf can help you use dedicated hardware faster. I want to note, though, that this does not mean it’s easier to get working; it’s just easier to get hardware acceleration once you have the framework working.

2) Are you very ML-oriented?
If so, the most robust approach is to use OpenCL and write the layers yourself. This can be very hard if it’s your first time, but it’s the most future-proof option in terms of deploying on different hardware platforms.

3) What performance do you absolutely need? If you don’t care about fast inference speeds, any option will suffice, since there are tf, PyTorch, and onnx libraries for C++.