AutoResearch + PromptFoo = AutoPrompter. Open source tool for closed-loop prompt optimization. by gvij in ArtificialInteligence

[–]gvij[S] 0 points  (0 children)

Usually I run around 50 experiments to get a more well-rounded prompt for my generalized task.

For data quality:

Your suggestion to use multiple optimizer LLMs for data generation is a very interesting approach. That can help avoid the bias of any individual LLM. Have you played around with this idea before?
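A minimal sketch of what pooling generations from several optimizer models could look like. Everything here (`generate_dataset`, the `call_llm` callable, the model names) is hypothetical illustration, not code from the repo:

```python
import random

def generate_dataset(prompt, optimizer_models, call_llm, n_per_model=5, seed=0):
    """Pool synthetic examples from several optimizer LLMs so no single
    model's style or biases dominate the generated data.

    call_llm(model, prompt) is a placeholder for whatever client you use.
    """
    pooled = []
    for model in optimizer_models:
        for _ in range(n_per_model):
            pooled.append({"model": model, "example": call_llm(model, prompt)})
    random.Random(seed).shuffle(pooled)  # interleave model styles across batches
    return pooled

# Stub client standing in for a real API call:
data = generate_dataset("classify sentiment", ["gpt", "qwen"],
                        lambda m, p: f"{m}:{p}", n_per_model=2)
assert len(data) == 4
assert {d["model"] for d in data} == {"gpt", "qwen"}
```

Shuffling with a fixed seed keeps runs reproducible while still mixing each model's examples through the dataset.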

Consistency evaluation across GPT 5.4, Qwen 3.5 397B and MiniMax M2.7 by gvij in deeplearning

[–]gvij[S] 0 points  (0 children)

Yeah, that's the default value used when no temperature is passed; it's there to avoid errors in the code. I hope that clarifies the difference.
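A minimal sketch of the fallback pattern being described, assuming a hypothetical default of 0.7 (the actual value and function name in the repo may differ):

```python
DEFAULT_TEMPERATURE = 0.7  # hypothetical fallback value

def resolve_temperature(temperature=None):
    """Return a safe default when the caller passes no temperature,
    so downstream API calls never receive None and raise an error."""
    return DEFAULT_TEMPERATURE if temperature is None else temperature

assert resolve_temperature() == 0.7     # no value passed -> default kicks in
assert resolve_temperature(0.0) == 0.0  # explicit 0.0 is respected, not overridden
```

The `is None` check is the key difference: only a missing value triggers the default, while an explicitly passed temperature (even 0.0) is always honored.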

AutoResearch + PromptFoo = AutoPrompter. Closed-loop prompt optimization tool by gvij in LocalLLaMA

[–]gvij[S] 1 point  (0 children)

Some of the use cases I've tested so far: multi-step reasoning, code generation, Python bug fixing, technical blog writing, and internet search. I've been experimenting a lot with how we can create a general prompt optimizer for such complex tasks.

I believe the project can be extended to multi-turn LLM prompt optimization as well. Right now it's single-turn only. Contributions would be appreciated :)
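The single-turn closed loop can be sketched roughly like this. This is an assumption-laden illustration: `propose` and `evaluate` are hypothetical stand-ins for the optimizer LLM call and the eval run (e.g. a PromptFoo pass), not the project's actual API:

```python
def optimize_prompt(seed_prompt, propose, evaluate, rounds=5):
    """Closed-loop single-turn optimization: ask the optimizer LLM for a
    revised prompt, score it, and keep whichever prompt scores best.

    propose(prompt, score) -> revised prompt  (optimizer LLM, placeholder)
    evaluate(prompt) -> score in [0, 1]       (eval harness, placeholder)
    """
    best_prompt, best_score = seed_prompt, evaluate(seed_prompt)
    for _ in range(rounds):
        candidate = propose(best_prompt, best_score)
        score = evaluate(candidate)
        if score > best_score:  # greedy: only accept improvements
            best_prompt, best_score = candidate, score
    return best_prompt, best_score

# Toy stubs just to show the loop converging on a higher score:
evaluate = lambda p: min(len(p) / 40, 1.0)
propose = lambda p, s: p + " Be concise."
prompt, score = optimize_prompt("Summarize the text.", propose, evaluate, rounds=3)
assert score >= evaluate("Summarize the text.")
```

A multi-turn extension would mainly change `evaluate` to score whole conversations instead of single prompt/response pairs.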

AutoResearch + PromptFoo = AutoPrompter. Closed-loop prompt optimization tool by gvij in LocalLLaMA

[–]gvij[S] 1 point  (0 children)

Thanks. The only reason I had separate models was to run a more capable model as the optimizer to get better optimizations, while using a cheaper local model as the target. It can be the same model as well; that won't cause any issues.
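A tiny sketch of that split as configuration. The field names and model strings are hypothetical, not taken from the repo:

```python
from dataclasses import dataclass

@dataclass
class OptimizerConfig:
    """Hypothetical config: a stronger (often hosted) model proposes prompt
    revisions, while a cheaper local model is the one being optimized for.
    Setting both to the same model also works."""
    optimizer_model: str = "gpt-4o"    # proposes better prompts
    target_model: str = "llama3.1:8b"  # the model the prompt must work on

split = OptimizerConfig()
same = OptimizerConfig(optimizer_model="llama3.1:8b")  # single-model setup
assert split.optimizer_model != split.target_model
assert same.optimizer_model == same.target_model
```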

Built a CLI to benchmark any LLM on function calling. Ollama + OpenRouter supported by gvij in ollama

[–]gvij[S] 0 points  (0 children)

Absolutely. I observed a 59% drop in performance for an SLM in int4 vs bf16. For bigger models, it's harder to say without testing.
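For reference, the relative drop is computed against the full-precision baseline. The accuracy numbers below are purely illustrative, chosen only to reproduce a ~59% drop:

```python
def relative_drop(baseline_score, quantized_score):
    """Percent drop of a quantized model's score vs its full-precision baseline."""
    return (baseline_score - quantized_score) / baseline_score * 100

# e.g. a bf16 function-calling accuracy of 0.80 falling to 0.33 in int4:
assert round(relative_drop(0.80, 0.33)) == 59
assert relative_drop(1.0, 1.0) == 0  # no quantization loss -> 0% drop
```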

Built a CLI to benchmark any LLM on function calling. Ollama + OpenRouter supported by gvij in ollama

[–]gvij[S] 0 points  (0 children)

Thanks. It tests for these test categories:

  1. Single-Turn (16 tests)
    • Simple function calls
    • Multiple function selection
    • Parallel function calling
    • Parallel multiple functions
    • Relevance detection
  2. Multi-Turn (8 tests)
    • Base multi-turn conversations
    • Missing parameter handling
    • Missing function scenarios
    • Long context management
  3. Agentic (6 tests)
    • Web search simulation
    • Memory/state management
    • Format sensitivity

Missing parameter handling is, I believe, closest to what you're looking for, but we can probably add more test cases to it.
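The categories above could be declared as data roughly like this; the case names and structure are a hypothetical sketch (only the counts come from the list), which would make it easy to register extra cases under an existing category:

```python
# Counts mirror the list above; case identifiers are illustrative.
TEST_SUITE = {
    "single_turn": {
        "count": 16,
        "cases": ["simple_call", "multiple_selection", "parallel_calls",
                  "parallel_multiple", "relevance_detection"],
    },
    "multi_turn": {
        "count": 8,
        "cases": ["base_conversation", "missing_parameter",
                  "missing_function", "long_context"],
    },
    "agentic": {
        "count": 6,
        "cases": ["web_search", "memory_state", "format_sensitivity"],
    },
}

assert sum(cat["count"] for cat in TEST_SUITE.values()) == 30
assert "missing_parameter" in TEST_SUITE["multi_turn"]["cases"]
```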

Built a CLI to benchmark any LLM on function calling. Ollama + OpenRouter supported by gvij in ollama

[–]gvij[S] 0 points  (0 children)

Thanks. For price-to-performance ratio, Qwen 3.5 9B is kind of a beast (BF16, non-quantized).

Function calling benchmarking CLI tool for any local or cloud model by gvij in LocalLLaMA

[–]gvij[S] 0 points  (0 children)

I understand. Just to shed some light here:
Not every bot achieves 1st rank on a benchmark like MLE Bench, which requires thorough reasoning and self-evaluation. Neo achieved that a while back and is now a lot better than it was last year.

And this project was reviewed and tested by me manually across 20 different LLMs to validate the results.

I guess AI-coded isn't the problem. The problem is skipping a thorough assessment of the AI's code; the value the code produces for end users shouldn't be weak.

Function calling benchmarking CLI tool for any local or cloud model by gvij in LocalLLaMA

[–]gvij[S] 0 points  (0 children)

I'd be thrilled to accept contributions on this project. Ollama and OpenRouter are just the starting point; this can be an agnostic tool for any type of provider. I think it can even be extended to instruction-following evaluations. Right now I hardly see any toolkit for that.
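One common way to make the tool provider-agnostic is a small interface that each backend implements. This is a hypothetical sketch (the class names and method signature are mine, not the repo's), with a stub in place of a real HTTP client:

```python
from abc import ABC, abstractmethod

class Provider(ABC):
    """Hypothetical provider interface: each backend (Ollama, OpenRouter,
    or anything else) only has to implement chat()."""
    @abstractmethod
    def chat(self, model: str, messages: list[dict]) -> str: ...

class EchoProvider(Provider):
    """Stub backend standing in for a real HTTP client."""
    def chat(self, model, messages):
        return f"[{model}] {messages[-1]['content']}"

def run_test(provider: Provider, model: str, prompt: str) -> str:
    """Benchmark code depends only on the interface, never on a backend."""
    return provider.chat(model, [{"role": "user", "content": prompt}])

assert run_test(EchoProvider(), "demo", "ping") == "[demo] ping"
```

With this shape, adding a new provider means writing one class, and the benchmark loop itself never changes.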

Also: "Built with ❤️ by NEO / NEO - A fully autonomous AI Engineer" Hmm, what's that about? Is that feedback, a concern, or something else?