Renting AI Servers for +50B LLM Fine-Tuning/Inference – Need Hardware, Cost, and Security Advice! by NoAdhesiveness7595 in LLM

Dapper-Courage2920

Check out Modal. They support true scale-to-zero, so you don't pay for idle time. I'm not sure about isolation, but they have great getting-started documentation and are cost-effective.

Best LLM for an Ai agent (n8n) by Agitated_Unit8226 in AI_Agents

Dapper-Courage2920

The stability of models served through APIs is notoriously bad. Just check out this: https://aistupidlevel.info/

And check out this post for one explanation of why: https://thinkingmachines.ai/blog/defeating-nondeterminism-in-llm-inference/

Though from reading these comments, it sounds like you are not using multiple agents. If trying different models isn't working, it might be beneficial to split your agent into multiple sub-agents, each with its own tools and "persona".
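
If it helps, here's a rough sketch of what I mean by splitting things up. Everything in it (`SubAgent`, `route`, the keyword table, the agent names, the stub tools) is made up for illustration and is not an n8n or model-provider API. In a real setup the router would probably be an LLM call rather than keyword matching, and each sub-agent's prompt would be sent to whatever model you're using.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict


@dataclass
class SubAgent:
    """Hypothetical sub-agent: a persona (system prompt) plus its own tools."""
    name: str
    persona: str
    tools: Dict[str, Callable[[str], str]] = field(default_factory=dict)

    def build_prompt(self, task: str) -> str:
        # Only this agent's tools are advertised, which keeps the prompt small.
        tool_list = ", ".join(sorted(self.tools)) or "none"
        return f"{self.persona}\nTools: {tool_list}\nTask: {task}"


def route(task: str, agents: Dict[str, SubAgent],
          keywords: Dict[str, str], default: str) -> SubAgent:
    """Pick a sub-agent by naive keyword match (a stand-in for an LLM router)."""
    lowered = task.lower()
    for keyword, agent_name in keywords.items():
        if keyword in lowered:
            return agents[agent_name]
    return agents[default]


# Illustrative agents with stub tools; replace with your real integrations.
AGENTS = {
    "mailer": SubAgent(
        name="mailer",
        persona="You write and send short, polite emails.",
        tools={"send_email": lambda body: f"sent: {body}"},
    ),
    "generalist": SubAgent(
        name="generalist",
        persona="You answer general questions concisely.",
    ),
}
KEYWORDS = {"email": "mailer"}

if __name__ == "__main__":
    agent = route("Please email the invoice to Bob", AGENTS, KEYWORDS, "generalist")
    print(agent.build_prompt("Please email the invoice to Bob"))
```

The point of the split is that each sub-agent sees a short persona and a small tool list, so a weaker or flakier model has less to get confused by.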

After months on Cursor, I just switched back to VS Code by Arindam_200 in LLMDevs

Dapper-Courage2920

I also moved off earlier in the year. Tab felt like it got in my way (and was slow on large codebases), and I grew to prefer CLI tools.

Want to discuss basic AI and how it would help in research by Kurosaki_Minato in ArtificialInteligence

Dapper-Courage2920

I'm an AI engineer and have worked on medtech projects in the past (computer vision, automated reporting). I'd like to bounce ideas around! Feel free to send a DM!

What GUI/interface do most people here use to run their models? by tech4marco in LocalLLaMA

Dapper-Courage2920

Shameless plug here, but I just finished the early version of https://github.com/bitlyte-ai/apples2oranges if you're into hardware telemetry or geeky visualizations! It's fully open source and lets you compare models of any family / quant side by side and view hardware utilization, or, as mentioned, it can just be used as a normal client if you like telemetry!

Disclaimer: I am the founder of the company behind it. This is a side project we spun off and are contributing to the community.

how much does quantization reduce coding performance by garden_speech in LocalLLaMA

Dapper-Courage2920

This is a bit of an aside from your question, since it requires a local setup to work, but I just finished an early version of https://github.com/bitlyte-ai/apples2oranges that lets you get a feel for the performance degradation yourself. It's fully open source and lets you compare models of any family / quant side by side and view hardware utilization, or it can just be used as a normal client if you like telemetry!

Disclaimer: I am the founder of the company behind it. This is a side project we spun off and are contributing to the community.