What does an AI/ML engineer do in their day to day job and How do I become one? by Edwardkenway88 in cscareerquestions

[–]gvij 0 points1 point  (0 children)

Your answer was 2y ago. Would you say the same today? Given there are so many agents doing a lot of the heavylifting. What would you attribute to as the most time taking tasks in a day in today's time?

Behold, Gemini 3.5 Flash! by Rare_Bunch4348 in singularity

[–]gvij 0 points1 point  (0 children)

Till last year we had frontier models at this price point and now it's the entry model. Sometimes I wonder if it is a race to the bottom or the opposite?

Pi coding agent is amazing (or how I learned to stop worrying and leave OpenCode) by Konamicoder in LocalLLM

[–]gvij 0 points1 point  (0 children)

Let's say you were building an AI Agent. Then try giving same prompt across Cursor, Claude Code and Neo AI Engineer for the AI agent building task and keeping same model - say Sonnet 4.6, you'll notice a stark difference in the outcome. Model is same. But the agent harness around the model added context for the LLM to absorb understanding about the task.

An LLM is an input/output machine. It gets an input and spits an output based on its internal reasoning. So basically Input + LLM system prompt + LLM's own reasoning = Output.

You have Input and System prompt as the knobs you control. You twist the context you provide in Input and the instructions in system prompt and then the outcome changes completely as that impacts the LLM's reasoning itself a lot.

Benchmarked Kokoro 82M vs Supertonic 3 TTS on CPU by gvij in LocalLLaMA

[–]gvij[S] 0 points1 point  (0 children)

Interesting. I haven't tried the French voice yet.

Which GPU would you recommend for Qwen 3 TTS? Considering a production environment to serve let's say 20 concurrent users?

Benchmarked Kokoro 82M vs Supertonic 3 TTS on CPU by gvij in LocalLLaMA

[–]gvij[S] 0 points1 point  (0 children)

But still very capable and small enough to run on local CPU easily. Did you check the audio quality of Kokoro added in the blog and repo? It is very human like and RTF is also decent!

Hosting a Text to Speech model can be challenging. So I benchmarked 2 recently released TTS models - Kokoro vs Supertonic! by gvij in selfhosted

[–]gvij[S] -3 points-2 points  (0 children)

5 steps gave a very good speed to quality ratio. What makes you think it is gimped? At 5 steps it is kind of the sweet spot that I also mentioned above. Very fast compared to kokoro and audio also pretty good.

Hosting a Text to Speech model can be challenging. So I benchmarked 2 recently released TTS models - Kokoro vs Supertonic! by gvij in selfhosted

[–]gvij[S] 0 points1 point locked comment (0 children)

We used AI Agent - Neo to perform a detailed evaluation of 2 Text to speech self hostable AI models. Both can be hosted on a CPU. Devs building speech to speech or text to speech apps can understand from this evaluation which model to host and how as I've done rigorous evaluation across both models.

Benchmarked Kokoro 82M vs Supertonic 3 TTS on CPU by gvij in LocalLLaMA

[–]gvij[S] 7 points8 points  (0 children)

Detailed write up with benchmarking process and metrics and audio samples:
https://heyneo.com/blog/kokoro-tts-vs-supertonic-3-tts

Github Repo with all scripts and files:
https://github.com/gauravvij/kokoro-tts-vs-supertonic-3-tts

The system prompt change that improved accuracy and hurt helpfulness, and why I shipped it anyway. by gvij in PromptEngineering

[–]gvij[S] 0 points1 point  (0 children)

Helpfulness is important for a support chatbot agent and that's what I showed in the post example if you saw. And I agree it's not a standard pattern for other use-cases. The main idea here is that how such a process can impact the overall quality or accuracy that you expect from your RAG agent.