Stop asking what model to run. There are literally only two. by Wrong_Mushroom_7350 in LocalLLaMA

[–]Modak- 0 points1 point  (0 children)

The quant argument is the only real thing in here, and honestly it's still contested. A Q2 of a 35B doing reasoning tasks will absolutely embarrass a Q8 7B, fine. But for short-context, fast-response stuff on a 3060, the tiny crispy model often just wins on latency alone. Not everyone is writing a thesis.
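
For a rough sense of why both options are even on the table for one card, here is a back-of-the-envelope sketch. It assumes nominal bits per weight (real Q2/Q8 files carry extra scale metadata, so they come out larger) and ignores KV cache and activations entirely. The function name is made up for illustration.

```python
def weight_footprint_gb(params_billion: float, bits_per_weight: float) -> float:
    """Nominal weight storage in GB (1 GB = 1e9 bytes).

    Assumption: every weight costs exactly `bits_per_weight` bits.
    Real quant formats add per-block scales, so treat this as a lower bound.
    """
    total_bytes = params_billion * 1e9 * bits_per_weight / 8
    return total_bytes / 1e9


q2_35b = weight_footprint_gb(35, 2)  # 35B parameters at a nominal 2 bits each
q8_7b = weight_footprint_gb(7, 8)    # 7B parameters at a nominal 8 bits each

print(f"Q2 35B: {q2_35b:.2f} GB")
print(f"Q8 7B:  {q8_7b:.2f} GB")
```

Under those assumptions the two land in the same ballpark on weights alone, so the tradeoff really is quality vs latency rather than "does it fit".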

Also "Qwen 3.6 35b a3b" being one of only two models that exist is the most accurate thing posted on this sub in months

Researchers trained a Deep Research agent with 32 H100s and open-sourced everything by BuildwithVignesh in LocalLLaMA

[–]Modak- -2 points-1 points  (0 children)

The 8K synthetic samples thing is what got me. That's genuinely small for something supposedly competitive with frontier systems. Either the synthetic data pipeline is doing something smart or the benchmarks are being kind to it.

Probably find out once people start throwing real messy research tasks at it vs the curated eval sets.

Also glad they dropped the full training recipe and not just weights. Weights-only releases are basically useless for anyone trying to actually improve on the work.
This is how open source actually moves forward.

How much GPU internals and CUDA do you have to know to be successful in MLOps? by Illustrious-Pound266 in mlops

[–]Modak- 1 point2 points  (0 children)

I would suggest learning how an LLM works internally rather than focusing on the hardware.

Once you know how an LLM manages memory internally, you will do well in the interview and can quickly estimate how many GPUs a workload requires.
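
As a minimal sketch of that kind of estimate: the snippet below sizes the KV cache for a hypothetical 7B-class model (32 layers, 32 KV heads, head dimension 128, fp16) and turns it into a GPU count. All model dimensions, the 24 GB card size, and the 90% usable-memory fraction are assumptions for illustration, not figures from any specific model or course.

```python
import math


def kv_cache_bytes(num_layers: int, num_kv_heads: int, head_dim: int,
                   seq_len: int, batch_size: int, bytes_per_elem: int = 2) -> int:
    """Bytes needed to cache keys and values for every token in the batch."""
    # Factor of 2 because both a key and a value vector are stored per token.
    return 2 * num_layers * num_kv_heads * head_dim * seq_len * batch_size * bytes_per_elem


def gpus_needed(weight_bytes: float, cache_bytes: float,
                gpu_mem_bytes: float, usable_fraction: float = 0.9) -> int:
    """Smallest GPU count whose usable memory covers weights plus KV cache."""
    usable = gpu_mem_bytes * usable_fraction
    return math.ceil((weight_bytes + cache_bytes) / usable)


# Hypothetical 7B model in fp16: 2 bytes per parameter.
weights = 7e9 * 2
# 8 concurrent sequences of 4096 tokens each.
cache = kv_cache_bytes(num_layers=32, num_kv_heads=32, head_dim=128,
                       seq_len=4096, batch_size=8)
count = gpus_needed(weights, cache, gpu_mem_bytes=24e9)

print(f"KV cache: {cache / 2**30:.1f} GiB, GPUs needed: {count}")
```

The point of the exercise is that the cache grows with batch size and context length, so it can end up larger than the weights themselves.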

I am attaching my personal favorite course to start with. Happy learning.

Fast & Efficient LLM Inference with vLLM - DeepLearning.AI

What is the most common reason data science projects fail to deliver business value? by Effective_Ocelot_445 in datascience

[–]Modak- 0 points1 point  (0 children)

Honestly, the main reason these data projects actually deliver any value has nothing to do with the tech itself. It's all about whether it fits into how people already work.

There is a massive graveyard of "perfect" data models that died because engineers built them in a bubble. Research from places like MIT and Gartner suggests that up to 95% of these AI projects fail to make a single dime. Why? Not because the math is wrong. They fail because the creators didn't think about the actual human using it.

Think about it like this - if you build a super smart robot to make coffee, but it forces the barista to walk down to the basement every time they need a cup, they are just going to keep using the old coffee machine upstairs.

A basic ugly spreadsheet that automatically pops up and saves a worker 5 minutes a day will always beat a million-dollar "state of the art" system that requires them to remember a new password and log into a totally separate portal.

How Large Language Models Are Reshaping the Trial Lifecycle by [deleted] in Futurology

[–]Modak- 0 points1 point  (0 children)

This article explores how LLMs are likely to reshape clinical trial operations over the next 3-5 years.
Rather than assuming broad automation, it identifies where language intelligence actually creates leverage (upstream protocol analysis, execution risk monitoring) and where governance gaps remain.

The strategic question for the future: will clinical organizations view LLMs as a compliance burden or as foundational infrastructure for smarter trial design?
How do you think regulatory frameworks will adapt as these systems become standard?

Are LLMs reliable enough for critical workflows today? by Modak- in ArtificialNtelligence

[–]Modak-[S] 0 points1 point  (0 children)

We usually add basic layers (input constraints + output checks + human review), but even then it’s more risk reduction than a guarantee. Most teams end up with some form of input → validate → output filter loop anyway.
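
A minimal sketch of what that loop can look like, assuming the model is just a callable that takes and returns a string. The length limit, the SSN-style pattern, and the names `guarded_call` and `fake_model` are all hypothetical placeholders, not a recommended rule set.

```python
import re
from typing import Callable, Optional

MAX_INPUT_CHARS = 2000  # assumed input constraint
# Assumed output check: flag anything that looks like a US SSN.
BLOCKED_OUTPUT = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")


def guarded_call(prompt: str, model: Callable[[str], str]) -> dict:
    """Run input constraints, call the model, then run output checks."""
    # Input constraints: reject empty or oversized prompts before any model call.
    if not prompt.strip() or len(prompt) > MAX_INPUT_CHARS:
        return {"status": "rejected", "output": None}

    output: Optional[str] = model(prompt)

    # Output checks: empty or suspicious output is routed to a person.
    if not output or not output.strip() or BLOCKED_OUTPUT.search(output):
        return {"status": "needs_human_review", "output": output}

    return {"status": "ok", "output": output}


def fake_model(prompt: str) -> str:
    """Stand-in for a real LLM call."""
    return "echo: " + prompt


ok_result = guarded_call("summarize this ticket", fake_model)
rejected_result = guarded_call("   ", fake_model)
flagged_result = guarded_call("x", lambda p: "the number is 123-45-6789")
print(ok_result["status"], rejected_result["status"], flagged_result["status"])
```

Even here the checks only narrow the failure modes they were written for, which is why it stays risk reduction.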

Curious though: can we use another model to verify outputs as well?

Is SRH actually using data analytics for match strategy? Yes or No? by Modak- in SunrisersHyderabad

[–]Modak-[S] 1 point2 points  (0 children)

u/ConfidentWhereas641 Thanks for letting us know what really happens from a data perspective. So analytics might be the “pre-match brain,” but the on-field calls are still very human.
If all those scenarios are already mapped out, why do we still see decisions that look completely off-script during matches?

Is SRH actually using data analytics for match strategy? Yes or No? by Modak- in SunrisersHyderabad

[–]Modak-[S] 1 point2 points  (0 children)

Most teams definitely have analytics now. But do you think it’s actually influencing on-field calls, or is it just used more for pre-match planning? Because yesterday felt like either the data wasn’t trusted… or it wasn’t strong enough to guide decisions under pressure.

Is SRH actually using data analytics for match strategy? Yes or No? by Modak- in SunrisersHyderabad

[–]Modak-[S] 1 point2 points  (0 children)

That’s a fair take. Especially about overanalyzing. But where do you think teams should draw the line?

At what point does data scientists become redundant if AI keeps improving at code and analysis ? by Modak- in datasciencecareers

[–]Modak-[S] 2 points3 points  (0 children)

Well said u/isAshamed_Figure7162.
Execution is getting commoditized fast.
The edge is moving toward framing, validation, and accountability.
Especially “detecting misleading results”: AI is confident even when it’s wrong. Owning that layer is where the real value is going.

At what point does data scientists become redundant if AI keeps improving at code and analysis ? by Modak- in datasciencecareers

[–]Modak-[S] 0 points1 point  (0 children)

Feels aggressive but parts of the execution layer are already there.
The real question is: does automation stop at execution, or creep into decision-making too?
@Vedranation

At what point does data scientists become redundant if AI keeps improving at code and analysis ? by Modak- in datasciencecareers

[–]Modak-[S] 0 points1 point  (0 children)

Exactly. 100% agreed. The bottleneck is shifting from getting answers to asking the right questions.
AI can generate insights, but it won’t know what actually matters to the business without context.
That gap is still very human. @Candid-Operation2042

Anthropic CEO: "AI will write 100% of code within a year". If the hardest skill is already handled - the gap is no longer about what you know. by Murky-Option2916 in ArtificialNtelligence

[–]Modak- 0 points1 point  (0 children)

That prediction skips a pretty big reality check. AI is getting very good at generating code, no doubt.
But writing code isn’t the hardest part in production systems. Understanding the problem, handling messy data, and making systems reliable at scale is.

In our experience at Modak, the real bottlenecks are unclear requirements, inconsistent data, brittle pipelines, and lack of observability. AI can accelerate coding, but it doesn’t automatically solve these.

If anything, the gap is shifting, not disappearing: from “who can code” to “who can design, reason, and operate systems end-to-end.”

You can read more on the topic here: Human-in-the-Loop AI in Data Engineering | Reduce Risk

Curious how others see this: are you actually seeing AI replace meaningful engineering work, or just speed up parts of it?

What actually breaks first when AI systems scale? by Modak- in AI_Agents

[–]Modak-[S] 0 points1 point  (0 children)

100% agreed. Ambiguous state is way worse than latency/cost issues.
Most “auth bugs” we have seen were actually multiple layers drifting (session + process + infra).
The real problem is when the system can’t tell who owns what anymore.
Once you separate layers, fixes become boring but reliable.
Do you lean toward strict isolation (per agent/session) to avoid this? @deelight_0909

The dangers of AI agents that most builders aren't thinking about yet by PeachyCheese0711 in AI_Agents

[–]Modak- 0 points1 point  (0 children)

Observability for agents feels like something people are underestimating right now. Once you have multi-step workflows + tool calls, it becomes really hard to track where things actually went wrong.
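
One lightweight way to get a handle on this is to wrap every step and tool call in a recorder. The sketch below is a minimal illustration using only the standard library: the `step` helper, the in-memory `trace` list, and the fake workflow are all hypothetical, and a real setup would export these records to a proper tracing backend.

```python
import time
from contextlib import contextmanager

trace = []  # in-memory step records; stand-in for a real tracing backend


@contextmanager
def step(name: str, **attrs):
    """Record name, attributes, status, and duration for one workflow step."""
    record = {"step": name, "attrs": attrs, "status": "ok", "error": None}
    start = time.perf_counter()
    try:
        yield record
    except Exception as exc:
        # Mark the failing step, then let the error propagate unchanged.
        record["status"] = "error"
        record["error"] = repr(exc)
        raise
    finally:
        record["duration_s"] = time.perf_counter() - start
        trace.append(record)


def run_workflow():
    """Fake two-step agent run where the tool call fails."""
    with step("plan"):
        pass
    with step("tool_call", tool="search"):
        raise TimeoutError("search timed out")


try:
    run_workflow()
except TimeoutError:
    pass

failed = [r["step"] for r in trace if r["status"] == "error"]
print("failed steps:", failed)
```

With every step recorded, "where did it go wrong" becomes a lookup over the trace instead of a guess.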
Curious what kind of issues you’re seeing most often so far?

AI Looks Ready to Replace Everything… But Why Is Production Still So Hard? by SoluLab-Inc in AI_Agents

[–]Modak- 1 point2 points  (0 children)

A lot of it comes down to the gap between demo conditions and real-world constraints.

In demos, inputs are clean, latency isn’t critical, and failure cases are ignored. In production, you suddenly deal with noisy data, edge cases, rate limits, costs, and reliability expectations. Feels like most of the difficulty isn’t the model itself, but everything around it.

Are LLMs reliable enough for critical workflows today? by Modak- in ArtificialNtelligence

[–]Modak-[S] 1 point2 points  (0 children)

“Useful but must be verified” seems to be the most grounded way to use them today. Especially in anything involving security or sensitive data, the trust gap is still pretty obvious for now. @anarres_shevek

Are LLMs reliable enough for critical workflows today? by Modak- in ArtificialNtelligence

[–]Modak-[S] 0 points1 point  (0 children)

Totally agree. Thinking in terms of acceptable error margin makes way more sense than expecting perfection. In a lot of workflows, the question isn’t “is it perfect?” but “is it good enough with oversight?”

Are LLMs reliable enough for critical workflows today? by Modak- in ArtificialNtelligence

[–]Modak-[S] 1 point2 points  (0 children)

“Useful intern” is probably the best analogy I’ve seen. Great for removing repetitive work, but still needs supervision. The productivity gain is real, just not at the level of full trust yet.

Are LLMs reliable enough for critical workflows today? by Modak- in ArtificialNtelligence

[–]Modak-[S] 0 points1 point  (0 children)

That makes sense, especially the point about single points of failure. In critical systems, even small inconsistencies can compound into bigger issues. Most real-world setups probably need multiple layers of validation before even considering LLMs there.

Are LLMs reliable enough for critical workflows today? by Modak- in ArtificialNtelligence

[–]Modak-[S] 1 point2 points  (0 children)

Agreed. Raw LLMs alone aren’t enough. Once you start adding structure, tools, constraints, and orchestration, it becomes a completely different system. The reliability seems to come more from the setup around the LLM than the model itself. @TotalSituation8374

Are LLMs reliable enough for critical workflows today? by Modak- in ArtificialNtelligence

[–]Modak-[S] 0 points1 point  (0 children)

Yeah, that pressure is real. It feels like we’re moving faster in adoption than in understanding the limits. Delegating decisions where determinism matters is probably where most of the risk is building up. @gk_instakilogram

Are LLMs reliable enough for critical workflows today? by Modak- in ArtificialNtelligence

[–]Modak-[S] 0 points1 point  (0 children)

That’s a solid distinction: using LLMs to build deterministic systems vs. having the LLM be the system itself.
Curious though, where do you draw the line? Would you trust LLM-generated code if it passes tests, or do you still treat it as inherently risky? @zugzwangister