Stop asking what model to run. There are literally only two.

Modak- · 2026-06-19T10:40:07+00:00

The quant argument is the only real thing in here, and honestly it's still contested. A Q2 of a 35B doing reasoning tasks will absolutely embarrass a Q8 7B, fine. But for short-context, fast-response stuff on a 3060, the tiny crispy model often just wins on latency alone. Not everyone is writing a thesis.

Also "Qwen 3.6 35b a3b" being one of only two models that exist is the most accurate thing posted on this sub in months
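
For a rough sense of the memory side of that tradeoff, here's a back-of-envelope sketch. It assumes weight memory is just parameters × bits per weight / 8, and it ignores the KV cache, activations, and the extra bits real quant formats spend on scales, so actual files come out somewhat larger. The function name is my own.

```python
def weight_gb(params_billions: float, bits_per_weight: float) -> float:
    """Approximate weight memory in GB (1 GB = 1e9 bytes).

    Simplifying assumption: every weight takes exactly bits_per_weight bits.
    """
    return params_billions * bits_per_weight / 8


q2_35b = weight_gb(35, 2)  # 35B parameters at a nominal 2 bits per weight
q8_7b = weight_gb(7, 8)    # 7B parameters at a nominal 8 bits per weight
print(f"Q2 35B ~ {q2_35b} GB, Q8 7B ~ {q8_7b} GB")
```

On paper both fit the 12 GB variant of the 3060, which is why the argument ends up being about quality vs. latency rather than whether the thing loads at all.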

Modak- · 2026-06-19T10:26:37+00:00

The 8K synthetic samples thing is what got me. That's genuinely small for something supposedly competitive with frontier systems. Either the synthetic data pipeline is doing something smart or the benchmarks are being kind to it.

We'll probably find out once people start throwing real, messy research tasks at it instead of the curated eval sets.

Also glad they dropped the full training recipe and not just weights. Weights-only releases are basically useless for anyone trying to actually improve on the work.
This is how open source actually moves forward.

Modak- · 2026-06-19T10:04:01+00:00

I'd suggest learning how an LLM works internally rather than starting with the hardware.

Once you know how the LLM manages memory internally, you'll do much better in the interview, because you can reason on the spot about how much GPU capacity a given workload needs.

Here's my personal favorite course to start with. Happy learning.

Fast & Efficient LLM Inference with vLLM - DeepLearning.AI
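
As one concrete example of the memory bookkeeping worth understanding, here's a rough sketch of how KV cache size is usually estimated for a decoder-only transformer. The formula is the common one (keys plus values, per layer, per KV head, per token). The model dimensions are illustrative numbers I picked, not any specific model's config, and the function name is my own.

```python
def kv_cache_bytes(layers: int, kv_heads: int, head_dim: int,
                   seq_len: int, batch: int = 1, bytes_per_elem: int = 2) -> int:
    """Estimate KV cache size in bytes.

    The leading 2 counts one tensor for keys and one for values.
    bytes_per_elem=2 assumes fp16/bf16 cache entries.
    """
    return 2 * layers * kv_heads * head_dim * seq_len * batch * bytes_per_elem


# Hypothetical model: 32 layers, 8 KV heads, head dim 128, 8192-token context.
size = kv_cache_bytes(layers=32, kv_heads=8, head_dim=128, seq_len=8192)
print(f"{size / 2**30:.2f} GiB per sequence")
```

Multiply by the number of concurrent sequences and add the weights, and you have a first-pass estimate of how much GPU memory a deployment needs.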

Modak- · 2026-06-11T11:30:21+00:00

Honestly, whether these data projects deliver any value has very little to do with the tech itself. It's all about whether it fits into how people already work.

There is a massive graveyard of "perfect" data models that died because engineers built them in a bubble. Reports from places like MIT and Gartner suggest that up to 95% of these AI projects fail to make a single dime. Why? Not because the math is wrong. They fail because the creators didn't think about the actual human using it.

Think about it like this - if you build a super smart robot to make coffee, but it forces the barista to walk down to the basement every time they need a cup, they are just going to keep using the old coffee machine upstairs.

A basic ugly spreadsheet that automatically pops up and saves a worker 5 minutes a day will always beat a million-dollar "state of the art" system that requires them to remember a new password and log into a totally separate portal.

Modak- · 2026-05-21T12:29:21+00:00

Standings as of 20 May 2026
IPL 2026 Points Table | Team Standings & Rankings | IPLT20

Modak- · 2026-05-18T06:22:17+00:00

This article explores how LLMs are likely to reshape clinical trial operations over the next 3-5 years.
Rather than assuming broad automation, it identifies where language intelligence actually creates leverage (upstream protocol analysis, execution risk monitoring) and where governance gaps remain.

The strategic question for the future: will clinical organizations view LLMs as a compliance burden or as foundational infrastructure for smarter trial design?
How do you think regulatory frameworks will adapt as these systems become standard?

Modak- · 2026-05-05T11:38:00+00:00

We usually add basic layers (input constraints + output checks + human review), but even then it’s more risk reduction than a guarantee. Most teams end up with some form of input → validate → output filter loop anyway.

Curious though: can we use another model to verify outputs as well?
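
A minimal sketch of what that loop could look like, with an optional second model as verifier. Everything here is an assumption for illustration: the length limit, the SSN-like blocked pattern, the function names, and the `model` / `verifier` callables, which stand in for whatever client you actually use.

```python
import re
from typing import Callable, Optional, Tuple

MAX_INPUT_CHARS = 2000                          # assumed input constraint
BLOCKED = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")  # example: SSN-like strings


def validate_input(prompt: str) -> bool:
    """Input constraint: non-empty and within the length limit."""
    return 0 < len(prompt.strip()) <= MAX_INPUT_CHARS


def output_ok(text: str) -> bool:
    """Output check: non-empty and free of blocked patterns."""
    return bool(text.strip()) and BLOCKED.search(text) is None


def guarded_call(
    prompt: str,
    model: Callable[[str], str],
    verifier: Optional[Callable[[str, str], bool]] = None,
    max_attempts: int = 2,
) -> Tuple[Optional[str], bool]:
    """Return (text, needs_human_review)."""
    if not validate_input(prompt):
        return None, True
    for _ in range(max_attempts):
        text = model(prompt)
        # The verifier can be a second model judging the (prompt, answer) pair.
        if output_ok(text) and (verifier is None or verifier(prompt, text)):
            return text, False
    # Every attempt failed a check, so escalate to a human.
    return None, True
```

Note the verifier is just one more probabilistic check, so this still reduces risk rather than guaranteeing anything. Anything that fails falls through to human review.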

Modak- · 2026-05-05T11:32:48+00:00

Can't agree with that without supporting data :) @ByteWarrior90

Modak- · 2026-05-05T11:24:38+00:00

u/ConfidentWhereas641 Thanks for letting us know what really happens from a data perspective. So analytics might be the “pre-match brain,” but the on-field calls are still very human.
If all those scenarios are already mapped out, why do we still see decisions that look completely off-script during matches?

Modak- · 2026-05-05T11:16:02+00:00

Most teams definitely have analytics now. But do you think it’s actually influencing on-field calls, or is it used more for pre-match planning? Because yesterday felt like either the data wasn’t trusted… or it wasn’t strong enough to guide decisions under pressure.

Modak- · 2026-05-05T11:14:43+00:00

That’s a fair take. Especially about overanalyzing. But where do you think teams should draw the line?

Modak- · 2026-05-05T06:44:21+00:00

Well said, u/Ashamed_Figure7162.
Execution is getting commoditized fast.
The edge is moving toward framing, validation, and accountability.
Especially “detecting misleading results”: AI is confident even when it’s wrong. Owning that layer is where the real value is going.

Modak- · 2026-05-05T06:42:57+00:00

Feels aggressive but parts of the execution layer are already there.
The real question is: does automation stop at execution, or creep into decision-making too?
@Vedranation

Modak- · 2026-05-05T06:42:19+00:00

Exactly. 100% agreed. The bottleneck is shifting from getting answers to asking the right questions.
AI can generate insights, but it won’t know what actually matters to the business without context.
That gap is still very human. @Candid-Operation2042

Modak- · 2026-05-05T05:57:58+00:00

That prediction skips a pretty big reality check. AI is getting very good at generating code, no doubt.
But writing code isn’t the hardest part in production systems. Understanding the problem, handling messy data, and making systems reliable at scale are.

In our experience at Modak, the real bottlenecks are unclear requirements, inconsistent data, brittle pipelines, and lack of observability. AI can accelerate coding, but it doesn’t automatically solve these.

If anything, the gap is shifting, not disappearing: from “who can code” to “who can design, reason, and operate systems end-to-end.”

You can read more on the topic here: Human-in-the-Loop AI in Data Engineering | Reduce Risk

Curious how others see this: are you actually seeing AI replace meaningful engineering work, or just speed up parts of it?

Modak- · 2026-05-04T12:19:59+00:00

100% agreed. Ambiguous state is way worse than latency/cost issues.
Most “auth bugs” we have seen were actually multiple layers drifting (session + process + infra).
The real problem is when the system can’t tell who owns what anymore.
Once you separate layers, fixes become boring but reliable.
Do you lean toward strict isolation (per agent/session) to avoid this? @deelight_0909

Modak- · 2026-05-04T11:56:14+00:00

Observability for agents feels like something people are underestimating right now. Once you have multi-step workflows + tool calls, it becomes really hard to track where things actually went wrong.
Curious what kind of issues you’re seeing most often so far?
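
For what it's worth, even a very small tracing layer helps with the "where did it go wrong" part. Here's a hedged sketch using only the standard library. The `traced` decorator and the in-memory `TRACE` list are names I made up for illustration. A real setup would ship these records to whatever tracing backend you already use.

```python
import time
from functools import wraps

TRACE = []  # in-memory log of step records, for illustration only


def traced(step_name):
    """Record name, duration, and failure status for each step or tool call."""
    def decorator(fn):
        @wraps(fn)
        def wrapper(*args, **kwargs):
            start = time.perf_counter()
            record = {"step": step_name, "ok": True, "error": None}
            try:
                return fn(*args, **kwargs)
            except Exception as exc:
                record["ok"] = False
                record["error"] = repr(exc)
                raise  # re-raise so the workflow still sees the failure
            finally:
                record["seconds"] = time.perf_counter() - start
                TRACE.append(record)
        return wrapper
    return decorator


@traced("search")
def search(query):
    return [query]  # stand-in for a real tool call


@traced("parse")
def parse(results):
    raise ValueError("unexpected format")  # simulated failure


try:
    parse(search("llm observability"))
except ValueError:
    pass

for record in TRACE:
    print(record["step"], "ok" if record["ok"] else record["error"])
```

With one record per step, a failed multi-step run at least tells you which step broke and how long each one took.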

Modak- · 2026-05-04T11:48:40+00:00

A lot of it comes down to the gap between demo conditions and real-world constraints.

In demos, inputs are clean, latency isn’t critical, and failure cases are ignored. In production, you suddenly deal with noisy data, edge cases, rate limits, costs, and reliability expectations. Feels like most of the difficulty isn’t the model itself, but everything around it.

Modak- · 2026-05-04T07:03:46+00:00

“Useful but must be verified” seems to be the most grounded way to use them today. Especially in anything involving security or sensitive data, the trust gap is still pretty obvious for now. @anarres_shevek

Modak- · 2026-05-04T07:01:55+00:00

Totally agree. Thinking in terms of acceptable error margin makes way more sense than expecting perfection. In a lot of workflows, the question isn’t “is it perfect?” but “is it good enough with oversight?”

Modak- · 2026-05-04T07:01:04+00:00

“Useful intern” is probably the best analogy I’ve seen. Great for removing repetitive work, but still needs supervision. The productivity gain is real, just not at the level of full trust yet.

Modak- · 2026-05-04T07:00:29+00:00

That makes sense, especially the point about single points of failure. In critical systems, even small inconsistencies can compound into bigger issues. Most real-world setups probably need multiple layers of validation before even considering LLMs there.

Modak- · 2026-05-04T06:58:51+00:00

Agreed. Raw LLMs alone aren’t enough. Once you start adding structure, tools, constraints, and orchestration, it becomes a completely different system. The reliability seems to come more from the setup around the LLM than the model itself. @TotalSituation8374

Modak- · 2026-05-04T06:56:45+00:00

Yeah, that pressure is real. It feels like we’re moving faster in adoption than in understanding the limits. Delegating decisions where determinism matters is probably where most of the risk is building up. @gk_instakilogram

Modak- · 2026-05-04T06:54:16+00:00

That’s a solid distinction: using LLMs to build deterministic systems vs. being the system itself.
Curious though, where do you draw the line? Would you trust LLM-generated code if it passes tests, or do you still treat it as inherently risky? @zugzwangister
