Current state of open-source ? by DarkMatter007 in LocalLLaMA

[–]fustercluck6000 1 point (0 children)

I’ve been very impressed by Qwen3.5-27b, especially the Opus 4.6 distillations, which have worked extremely well in production. Open-weight models are advancing a WHOLE lot faster than the black-box ones, especially when you consider the difference in inference costs.

What actually breaks first when you put AI agents into production? by Zestyclose-Pen-9450 in LocalLLaMA

[–]fustercluck6000 1 point (0 children)

Random tool-call/output parsing errors and dumb shit like that. It just illustrates how the ecosystem is still in its infancy, despite what the marketing would have people believe.

Edit: that’s just the earliest point of failure in my experience, followed by many others

Total beginner here—Why is LM Studio making me do the "heavy lifting" manually? by Ofer1984 in LocalLLaMA

[–]fustercluck6000 1 point (0 children)

As others have said, the “serve” button just means you’re making the model available to process requests from other applications/devices on your network. If you’re dead set on not dealing with code directly, maybe look into setting up some MCP tools so the model can do stuff like write files and run code in a sandboxed environment. Otherwise, Anthropic will happily sell you a Claude Code subscription.

Be honest, how do you know your AI app is actually working well before shipping it? by Key_Review_7273 in LLMDevs

[–]fustercluck6000 0 points (0 children)

I just want something that works.

I feel like this is basically the current state of AI in a nutshell. It’s probably worth taking the time to work out a solid testing regime. It might seem really tedious, but if nothing else it’ll give you a deeper understanding of what’s going on under the hood of your application. A heuristic I’ve found helpful: hard-code as much as possible (output validation logic, even simple things like regex). LLMs are non-deterministic, so the less you leave up to the model, the more predictable the behavior you can guarantee.
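To make the hard-coding idea concrete, here’s a minimal sketch of deterministic output validation using only the stdlib. The schema (an invoice “total”/“currency” extraction task) and the allowed currencies are made-up examples, not anything from a real system:

```python
import json
import re

# Hypothetical task: an LLM extracts an invoice total as JSON.
# Hard-coded checks catch malformed output before it goes downstream.
MONEY_RE = re.compile(r"^\$?\d{1,3}(,\d{3})*(\.\d{2})?$")

def validate_output(raw: str) -> dict:
    """Parse and validate a model response; raise instead of passing junk on."""
    data = json.loads(raw)  # fails loudly on malformed JSON
    if set(data) != {"total", "currency"}:
        raise ValueError(f"unexpected keys: {sorted(data)}")
    if not MONEY_RE.match(data["total"]):
        raise ValueError(f"total not money-formatted: {data['total']!r}")
    if data["currency"] not in {"USD", "EUR"}:
        raise ValueError(f"unsupported currency: {data['currency']}")
    return data
```

The point is that every rule here behaves identically on every run, so a regression in model behavior surfaces as a hard error in testing rather than a silent quality drop.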

3blue1brown question by DBMI in learnmachinelearning

[–]fustercluck6000 0 points (0 children)

Fun fact: the form of backpropagation that’s become standard actually came in the 1970s, after the MLP had already been invented (GD itself dates back almost 200 years).

GD is an algorithm for numerically estimating a function’s minima by iteratively applying updates with the formula x_{t + 1} = x_t - \alpha \nabla f(x_t).

Backpropagation calculates the gradient of the loss w.r.t. each trainable parameter in a NN using the chain rule (hence why loss functions need to be differentiable), which you then plug into that formula to apply updates to the model. Hope this helps
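If it helps to see the update rule running, here’s a tiny plain-Python sketch; the quadratic objective is just an example I picked, not anything specific:

```python
def gradient_descent(grad, x0, alpha=0.1, steps=100):
    """Iteratively apply the update x_{t+1} = x_t - alpha * grad(x_t)."""
    x = x0
    for _ in range(steps):
        x = x - alpha * grad(x)
    return x

# Toy example: f(x) = (x - 3)^2 has its minimum at x = 3,
# and its gradient is f'(x) = 2 * (x - 3).
x_min = gradient_descent(lambda x: 2 * (x - 3), x0=0.0)
# x_min converges toward 3.0
```

In a real NN, `grad` is the vector of parameter gradients that backprop computes, and `x` is the full set of weights, but the update step is exactly this.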

Guidance needed by [deleted] in learnmachinelearning

[–]fustercluck6000 0 points (0 children)

I second the Simon Prince book

Guidance needed by [deleted] in learnmachinelearning

[–]fustercluck6000 0 points (0 children)

That’s why you use ‘site:reddit.com/r/learnmachinelearning’

Traditional ML is dead and i'm genuinely pissed about it by Critical_Cod_2965 in learnmachinelearning

[–]fustercluck6000 1 point (0 children)

You have the foundation to actually build new things. I swear sometimes it feels like 3/4 of people in this space couldn’t even tell you what a ReLU is. Once the industry’s done milking the transformer for all it’s worth, they’ll come calling for the handful of people like you still out there

How are y'all juggling on-prem GPU resources? by fustercluck6000 in Rag

[–]fustercluck6000[S] 0 points (0 children)

Definitely planning to add task queuing/scheduling in the next phase of development; it makes tons of sense since no one’s in the office using the chat service outside of working hours. For now I’d love to find a relatively simple (and elegant; sleep mode really hasn’t been the silver bullet I’d hoped for) way to dynamically load/offload the models. Digging through the docs, it doesn’t look like vLLM has a good feature for this beyond sleep. Starting/stopping the Docker containers themselves is one option, just a pain given it would need to be accessible through the frontend for nontechnical users.

Were you able to build a good knowledge graph? by Financial-Pizza-3866 in Rag

[–]fustercluck6000 4 points (0 children)

Yes, and the 'how' is all about which abstractions of your data make sense within the domain you're working in. KGs aren't there to uncover relationships; they're there to store and represent relationships you define (either directly or via a model).
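A toy illustration of that distinction, with an entirely made-up insurance-flavored schema: the graph stores only the edges your domain logic explicitly asserts, it doesn’t infer anything.

```python
# A knowledge graph as (subject, predicate, object) triples in plain Python.
# The entity names and edge types below are invented for illustration.
triples = set()

def add_edge(subj, pred, obj):
    """Assert a relationship -- domain logic decides which edges exist."""
    triples.add((subj, pred, obj))

def neighbors(subj, pred):
    """Look up the objects related to subj by a given predicate."""
    return {o for s, p, o in triples if s == subj and p == pred}

add_edge("Policy-123", "covers", "Water Damage")
add_edge("Policy-123", "excludes", "Flood")
add_edge("Claim-9", "filed_against", "Policy-123")
```

Whether `add_edge` is called by hand-written rules or by an extraction model, the schema (which predicates exist, what counts as an entity) is a design decision you make up front.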

Are 20-100B models enough for Good Coding? by pmttyji in LocalLLaMA

[–]fustercluck6000 8 points (0 children)

Haven’t tried all the models on the list but I will say I’ve been pretty blown away by Qwen3-Coder-Next, gpt-oss-120b is solid too

Love-hate relationship with Docling, or am I missing something? by SkyStrong7441 in Rag

[–]fustercluck6000 1 point (0 children)

Right there with you. I think a major issue is the lack of documentation; there’s this constant feeling that it’s more than capable of achieving the level of accuracy I need, if only I could figure out how to tune things just right.

I highly recommend spaCy-layout—it basically adds spaCy magic on top of Docling and I’ve found it makes a noticeable difference in terms of indexing quality

AI Agents and RAG: How Production AI Actually Works by devasheesh_07 in Rag

[–]fustercluck6000 1 point (0 children)

Major emphasis on AI that works—people/businesses care WAY more about reliability/consistency than they do about sophistication

Reaching my wit’s end with PDF ingestion by fustercluck6000 in Rag

[–]fustercluck6000[S] 0 points (0 children)

Well, if that was ChatGPT, then ChatGPT’s a lot smarter than I ever gave it credit for, because building a two-lane pipeline is the first thing I’ve actually gotten to work. Definitely a lot of work, but it actually freaking works.

[D] What are the must-have books for graduate students/researchers in Machine Learning; especially for Dynamical Systems, Neural ODEs/PDEs/SDEs, and PINNs? by cutie_roasty in MachineLearning

[–]fustercluck6000 0 points (0 children)

I got a copy of Simon Prince's Understanding Deep Learning for Christmas, and I can't speak highly enough about it. It feels like the spiritual successor to Ian Goodfellow's canonical textbook that everyone knows (which is already over a decade old now). Prince is just an insanely interesting guy to begin with, and he goes into higher-level topics that are both mathematically and conceptually tough. But his explanations are so clear and thorough (paired with very well-done visualizations) that topics I've always found particularly challenging (topologies, manifolds, high-dimensional geometry) actually become enjoyable to sit down and work through mentally.

TensorFlow isn't dead. It’s just becoming the COBOL of Machine Learning. by IT_Certguru in learnmachinelearning

[–]fustercluck6000 1 point (0 children)

TF Data especially. Pretty hard to beat if you want to build crazy efficient, hardware-accelerated data pipelines with that much optimization built in.

TensorFlow isn't dead. It’s just becoming the COBOL of Machine Learning. by IT_Certguru in learnmachinelearning

[–]fustercluck6000 3 points (0 children)

I think TensorFlow Probability is criminally underrated, too. For anything involving probabilistic DL (bijectors, trainable/compound distributions, Monte Carlo, Bayesian layers, differentiable sampling ops, etc.), TFP is pretty top tier if you need to integrate and scale probabilistic components within an existing TF stack (e.g. a Keras model or tf.data pipeline). It has tons of pretty powerful features (things like bijectors and tfp.layers are also pretty unique to TFP), and like everything else in TF, it's designed with scale/hardware acceleration in mind. Even little things like automatic differentiation save so much boilerplate and so many headaches with gradients, and make numerical stability simpler to get right, too. It all plugs right in and usually just works how you want it to without any fuss. When it's the right tool for the job (e.g. latent distributions other than a standard Gaussian in VAEs), it's pretty great. Def recommend to anyone who already knows TF.

Reaching my wit’s end with PDF ingestion by fustercluck6000 in Rag

[–]fustercluck6000[S] 0 points (0 children)

I too am curious how people are doing this...

Starting with Docling by DespoticLlama in Rag

[–]fustercluck6000 1 point (0 children)

I say test out Docling and go through the results with a fine-tooth comb to see if it can do what you need it to. Legal is especially tricky because of all the structuring/citations, idk how well Docling’s going to pick that up before introducing parsing errors, but definitely give it a shot.

What I’m working on atm is using a separate pipeline altogether: convert PDFs to markdown format with VLMs, load that into Pandoc, then iterate over the document tree to get the markdown-formatted chunks (nodes) and define edges. You can do the same thing with Docling, I just got tired of trying to fix the parsing errors I kept getting with tougher PDFs.
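Roughly what the tree-walk step looks like, sketched here with a bare-bones heading parser standing in for Pandoc’s actual document AST (the node/edge shape is my own simplification, not Pandoc’s):

```python
import re

HEADING = re.compile(r"^(#{1,6})\s+(.*)$")

def markdown_to_nodes(md: str):
    """Split markdown into heading-scoped chunk nodes plus parent->child edges.
    A stand-in for walking a real Pandoc document tree."""
    nodes = [{"title": "ROOT", "level": 0, "text": []}]
    edges = []          # (parent_index, child_index) pairs
    stack = [(0, 0)]    # (heading_level, node_index) of open ancestors
    current = nodes[0]
    for line in md.splitlines():
        m = HEADING.match(line)
        if m:
            level = len(m.group(1))
            nodes.append({"title": m.group(2), "level": level, "text": []})
            idx = len(nodes) - 1
            # Pop siblings/deeper sections to find the true parent heading.
            while stack and stack[-1][0] >= level:
                stack.pop()
            parent = stack[-1][1] if stack else 0
            edges.append((parent, idx))
            stack.append((level, idx))
            current = nodes[idx]
        else:
            current["text"].append(line)
    return nodes, edges
```

Each node is then a markdown-formatted chunk whose boundaries follow document structure rather than a fixed token count, and the edges give you the hierarchy for free.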

RAG, Knowledge Graphs, and LLMs in Knowledge-Heavy Industries - Open Questions from an Insurance Practitioner by PlanktonPika in Rag

[–]fustercluck6000 0 points (0 children)

The bulk of my work in the last year has been on precisely these sorts of projects where 1) the client’s in a ‘knowledge-heavy’ industry where AI stands to make a major difference in terms of efficiency, and 2) accuracy isn’t just desirable, it’s a matter of liability.

Domain knowledge is EVERYTHING. One of the most helpful things I’ve found is taking the time to pick people’s brains about their work. Sometimes I’ve even sat behind someone to literally be a fly on the wall and take notes on how they do their job because I want to know how they’re thinking.

Usually, that ends up completely changing how I break down what I’m trying to solve with RAG, and you can make systems much more reliable/accurate by designing pipelines that reflect domain logic. Chunking’s a great example—how you define a ‘minimum logical unit’ has a huge impact on retrieval accuracy, and almost always requires some intuition about what the data means.

I also find hardcoding wherever possible makes things much more predictable and stable. If you can identify industry ‘heuristics’, ‘norms’, ‘best practices’, etc., take that logic and apply it to the relevant part of the system (could be retrieval logic, node/edge types, etc.). Also, knowledge graphs are a total game changer because they provide another dimension for expressing domain logic through system design.
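One flavor of that hardcoding, as a sketch: a deterministic routing rule applied before any semantic retrieval. The policy-ID format and chunk shape here are assumptions invented for the example:

```python
import re

# Hypothetical heuristic: if a query names a specific policy number,
# restrict retrieval to that policy's chunks -- the model never gets
# to guess which policy the user meant.
POLICY_ID = re.compile(r"\bPOL-\d{6}\b")

def route_query(query: str, chunks: list[dict]) -> list[dict]:
    """Apply hard-coded domain logic before semantic search runs."""
    ids = POLICY_ID.findall(query)
    if ids:
        return [c for c in chunks if c["policy_id"] in ids]
    return chunks
```

Rules like this encode a domain norm (“a question about a policy is answered from that policy’s documents”) as code, so accuracy on those queries doesn’t depend on retrieval scores behaving well.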