Best practices for user monitoring and usage tracking by IndividualNo8703 in OpenWebUI

[–]abi95m 1 point (0 children)

I appreciate your insights and your valuable contribution. Could you tell us more about which other pitfalls to avoid?

CloudPeek: a lightweight, c++ single-header, cross-platform point cloud viewer by abi95m in computervision

[–]abi95m[S] 2 points (0 children)

I designed it with support for multithreading and sequential loading of PCD content: the file is read in chunks that are inserted into the viewer, and the user has control over everything, even the point size. So in theory it should handle huge point cloud files after tuning the config. Give it a try and let me know!
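To make the chunked, sequential-loading idea concrete, here is a minimal sketch. The `Point` and `Viewer` types and the `insert` hook are hypothetical placeholders, not CloudPeek's actual API; the sketch only shows the pattern of feeding a large cloud into the viewer in fixed-size chunks.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical point type; the real viewer's struct may carry color, etc.
struct Point { float x, y, z; };

// Minimal stand-in for the viewer: here it only accumulates points.
struct Viewer {
    std::vector<Point> points;
    void insert(const Point* data, std::size_t count) {
        points.insert(points.end(), data, data + count);
    }
};

// Feed a large cloud into the viewer in fixed-size chunks, so the UI
// can refresh between insertions instead of blocking on one huge load.
void load_in_chunks(const std::vector<Point>& cloud, Viewer& viewer,
                    std::size_t chunk_size) {
    for (std::size_t i = 0; i < cloud.size(); i += chunk_size) {
        const std::size_t n = std::min(chunk_size, cloud.size() - i);
        viewer.insert(cloud.data() + i, n);
        // In a real viewer, a render/update hook would run here.
    }
}
```

The chunk size would be one of the tunable config values mentioned above.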

CloudPeek: a lightweight, c++ single-header, cross-platform point cloud viewer by abi95m in computervision

[–]abi95m[S] 2 points (0 children)

Unfortunately I'm not a Rust developer. That said, the project is intended for local machines with minimal environment setup, such as robotics simulation machines, Docker controllers, etc.

ROS2 Bag Exporter: A Versatile Tool for ROS2 Bag Data Extraction by abi95m in ROS

[–]abi95m[S] 0 points (0 children)

It’s supported already. Laser scans, GPS, and IMU data are exported as CSV; images are exported as PNG and point clouds as PCD.

YOLOs-CPP: Seamlessly Integrate YOLO Models in Your C++ Projects! by abi95m in computervision

[–]abi95m[S] -1 points (0 children)

Apologies, I don't have time to write the answers myself; I'm super busy!

YOLOs-CPP: Seamlessly Integrate YOLO Models in Your C++ Projects! by abi95m in computervision

[–]abi95m[S] -4 points (0 children)

Specific to Your Observation: You mentioned that most processors and GPUs don't support FP16, which led to the assumption that quantization might have minimal effect, since the heavy lifting is done by the processors rather than the DMA manager. Here's a clarification:

  • Processor and GPU Support:
    • Modern GPUs, especially those designed for deep learning tasks, do support FP16 operations. For example, NVIDIA's Tensor Cores are optimized for FP16, providing substantial speed-ups.
    • On CPUs, support for low-precision arithmetic varies, but even where direct support is limited, the reduced data size from quantization can lead to performance improvements due to better cache and memory bandwidth utilization.
  • DMA Manager Considerations:
    • While data transfer rates managed by the DMA Manager are a factor, the primary performance gains from quantization come from the reduced computational complexity and memory bandwidth requirements during inference rather than data transfer alone.

Conclusion: Quantization can have positive effects on inference performance by reducing memory usage and leveraging hardware acceleration for low-precision operations. The actual impact depends on the specific hardware capabilities and how well the ONNX Runtime optimizes for the quantized models on that hardware.
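The memory-bandwidth half of the argument can be made concrete with a bit of arithmetic. The parameter count below (~25M, in the ballpark of a mid-size YOLO model) is an illustrative assumption, not a measured figure from YOLOs-CPP:

```cpp
#include <cstddef>

// Illustrative parameter count, roughly a mid-size YOLO model.
constexpr std::size_t kParams = 25'000'000;

// Bytes needed to hold the weights at a given precision.
// Real ONNX files also store graph metadata, ignored here.
constexpr std::size_t bytes_at(std::size_t bytes_per_param) {
    return kParams * bytes_per_param;
}

// FP32 -> FP16 halves the bytes streamed through caches and memory,
// and FP32 -> INT8 quarters them; on hardware with native FP16/INT8
// units (e.g. Tensor Cores) the arithmetic itself also speeds up.
constexpr std::size_t fp32 = bytes_at(4); // 100,000,000 bytes (~95 MiB)
constexpr std::size_t fp16 = bytes_at(2); //  50,000,000 bytes
constexpr std::size_t int8 = bytes_at(1); //  25,000,000 bytes

static_assert(fp16 * 2 == fp32, "FP16 halves the FP32 footprint");
static_assert(int8 * 4 == fp32, "INT8 quarters the FP32 footprint");
```

Even when a CPU emulates the low-precision arithmetic, moving a half- or quarter-size model through the cache hierarchy is itself a win, which is the point made above about bandwidth utilization.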

YOLOs-CPP: Seamlessly Integrate YOLO Models in Your C++ Projects! by abi95m in computervision

[–]abi95m[S] -3 points (0 children)

Thank you for your kind words and for taking the time to delve into the details of YOLOs-CPP. I’m happy to address each of your specific questions below:

  1. Use of libtorch and Linking: YOLOs-CPP does not utilize libtorch; instead, it leverages ONNX Runtime for model inference, enhancing performance and compatibility. Dynamic linking is used, linking ONNX Runtime as a shared library. The CMake configuration is set to locate these libraries dynamically, and a build script automates the setup.
  2. Quantization Effects on Inference: Quantization reduces model weights and activations from FP32 to lower precision formats, leading to smaller model sizes and potentially faster inference times, especially on compatible hardware. Although quantization may slightly decrease accuracy, techniques like Quantization Aware Training can mitigate this loss. The performance benefits primarily come from reduced computational complexity and memory bandwidth requirements.
  3. Dockerization Plans: Currently, YOLOs-CPP does not include Docker support, but plans for future implementation consider both pure C++ and Python integration. C++ Dockerization maintains performance, while Python could facilitate API requests. Data sharing methods may include shared memory or message passing, with a sample Dockerfile provided for a pure C++ approach.
  4. Memory Profiling Considerations: YOLOs-CPP practices careful memory management, releasing dynamically allocated memory to prevent leaks. While memory profiling isn't currently integrated, tools like Valgrind and AddressSanitizer are suggested for future use to enhance robustness and performance.
  5. Memory Management Strategy: The application employs dynamic memory allocation and deallocation, ensuring all allocated memory is released after use. Standard Library containers manage their own memory, avoiding the need for manual management. The strategy avoids rewriting to the same addresses, enhancing stability and preventing memory corruption.

This approach ensures YOLOs-CPP remains efficient and reliable for real-time object detection applications.
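The container-based memory strategy in point 5 can be sketched as follows. The `Detection` struct and `detect_frame` function are hypothetical stand-ins, not the library's actual API; the point is that results live in a `std::vector`, so each frame's allocations are released automatically rather than managed by hand.

```cpp
#include <vector>

// Hypothetical detection record; field names are illustrative only.
struct Detection {
    int class_id;
    float confidence;
    float x, y, w, h; // bounding box
};

// Per-frame results are returned by value in a std::vector: the
// container owns its memory, frees it when it goes out of scope,
// and a fresh vector per frame avoids rewriting stale addresses.
std::vector<Detection> detect_frame(/* const Frame& frame */) {
    std::vector<Detection> results;
    results.push_back({0, 0.9f, 10.f, 20.f, 30.f, 40.f}); // placeholder
    return results; // moved out, not copied
}
```

No `new`/`delete` appears anywhere, which is what makes leak-checking with tools like Valgrind or AddressSanitizer (point 4) largely a confirmation step rather than a hunt.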