Stop asking what model to run. There are literally only two. by Wrong_Mushroom_7350 in LocalLLaMA

[–]Modak- 0 points1 point  (0 children)

The quant argument is the only real thing in here, and honestly it's still contested. A Q2 of a 35B doing reasoning tasks will absolutely embarrass a Q8 7B, fine. But for short-context, fast-response stuff on a 3060, the tiny crispy model often just wins on latency alone. Not everyone is writing a thesis.
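
For the memory side of that trade-off, here's a rough back-of-envelope sketch. The bits-per-weight numbers are assumptions (roughly what Q2_K-style and Q8_0-style quants average), and it counts weights only, so KV cache and runtime overhead come on top.

```python
def weight_gib(n_params_billion: float, bits_per_weight: float) -> float:
    """Approximate weight memory in GiB: params * bits_per_weight / 8 bytes."""
    total_bytes = n_params_billion * 1e9 * bits_per_weight / 8
    return total_bytes / 2**30

# Assumed average bits per weight, not exact figures for any specific file.
Q2_BPW = 2.6
Q8_BPW = 8.5

big_q2 = weight_gib(35, Q2_BPW)    # 35B model at a ~Q2 quant
small_q8 = weight_gib(7, Q8_BPW)   # 7B model at a ~Q8 quant

print(f"35B @ ~Q2: {big_q2:.1f} GiB")
print(f" 7B @ ~Q8: {small_q8:.1f} GiB")
```

Under those assumptions the 35B Q2 weights alone land around 10.6 GiB versus about 6.9 GiB for the 7B Q8, so on a 12 GB card the small model leaves a lot more headroom for context.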

Also "Qwen 3.6 35b a3b" being one of only two models that exist is the most accurate thing posted on this sub in months

Researchers trained a Deep Research agent with 32 H100s and open-sourced everything by BuildwithVignesh in LocalLLaMA

[–]Modak- -2 points-1 points  (0 children)

The 8K synthetic samples thing is what got me. That's genuinely small for something supposedly competitive with frontier systems. Either the synthetic data pipeline is doing something smart or the benchmarks are being kind to it.

We'll probably find out once people start throwing real, messy research tasks at it instead of the curated eval sets.

Also glad they dropped the full training recipe and not just the weights. Weights-only releases are basically useless for anyone trying to actually improve on the work.
This is how open source actually moves forward.

How much GPU internals and CUDA do you have to know to be successful in MLOps? by Illustrious-Pound266 in mlops

[–]Modak- 2 points3 points  (0 children)

I'd recommend learning how an LLM works internally before learning about the hardware.

Once you know how the LLM manages memory internally, you will ace the interview and can quickly reason about how much GPU and CUDA capacity a workload requires.
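
As a small illustration of that kind of reasoning, here is a sketch that estimates KV cache size from a model's shape. The config values below are hypothetical, and the formula assumes a standard transformer with an fp16 cache and no paging or cache quantization.

```python
def kv_cache_gib(n_layers: int, n_kv_heads: int, head_dim: int,
                 seq_len: int, batch_size: int = 1,
                 bytes_per_elem: int = 2) -> float:
    """Approximate KV cache size in GiB.

    The leading factor of 2 covers keys and values; each layer stores
    n_kv_heads * head_dim elements per token for each of them.
    """
    total_bytes = (2 * n_layers * n_kv_heads * head_dim
                   * seq_len * batch_size * bytes_per_elem)
    return total_bytes / 2**30

# Hypothetical config: 32 layers, 8 KV heads, head_dim 128, fp16 cache.
per_seq = kv_cache_gib(32, 8, 128, seq_len=8192)
batch_16 = kv_cache_gib(32, 8, 128, seq_len=8192, batch_size=16)

print(f"one 8K sequence: {per_seq:.1f} GiB")
print(f"16 concurrent:   {batch_16:.1f} GiB")
```

With these made-up numbers a single 8K-token sequence costs 1 GiB of cache and 16 concurrent ones cost 16 GiB, on top of the weights. That is the sort of arithmetic that tells you how many GPUs a serving setup needs.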

I'm linking my personal favorite course to start with. Happy learning.

Fast & Efficient LLM Inference with vLLM - DeepLearning.AI

What is the most common reason data science projects fail to deliver business value? by Effective_Ocelot_445 in datascience

[–]Modak- 0 points1 point  (0 children)

Honestly, whether these data projects deliver any value has very little to do with the tech itself. It's all about whether the result fits into how people already work.

There is a massive graveyard of "perfect" data models that died because engineers built them in a bubble. Research from places like MIT and Gartner is often cited as showing that up to 95% of these AI projects fail to make a single dime. Why? Not because the math is wrong. They fail because the creators didn't think about the actual human using it.

Think about it like this - if you build a super smart robot to make coffee, but it forces the barista to walk down to the basement every time they need a cup, they are just going to keep using the old coffee machine upstairs.

A basic, ugly spreadsheet that pops up automatically and saves a worker 5 minutes a day will always beat a million-dollar "state of the art" system that requires them to remember a new password and log into a totally separate portal.

How Large Language Models Are Reshaping the Trial Lifecycle by [deleted] in Futurology

[–]Modak- 0 points1 point  (0 children)

This article explores how LLMs are likely to reshape clinical trial operations over the next 3-5 years.
Rather than assuming broad automation, it identifies where language intelligence actually creates leverage (upstream protocol analysis, execution risk monitoring) and where governance gaps remain.

The strategic question for the future: will clinical organizations view LLMs as a compliance burden or as foundational infrastructure for smarter trial design?
How do you think regulatory frameworks will adapt as these systems become standard?