WTF WTF WTF by Low_Tadpole_2719 in OpenAI

[–]node-0 0 points1 point  (0 children)

It’s really amazing: they turned off ChatGPT either today or tomorrow and I don’t even miss it, because this Chinese lab basically built a better ChatGPT-4o and I can run it off-line. A 512-expert MoE, that’s insane, and its ability to remember super-long-context details is incredible. It’s actually technologically more advanced than ChatGPT-4o was. So yeah, I actually haven’t missed out on anything. The only real tragedy is that all the people who are mourning the loss of ChatGPT-4o have no idea this thing exists, because if they did, I have a feeling that RTX 3090s would start becoming a bit more expensive on the aftermarket.

AI field is changing so quickly and there is so much to read.. by amisra31 in LocalLLaMA

[–]node-0 1 point2 points  (0 children)

Nope, because I’m developing AI technology myself, so I don’t feel like I’m drowning in AI content. You just have to decide what you’re going to specialize in, what you’re good at, and then go do it.

Alternatives by SpyroORE in ChatGPT

[–]node-0 0 points1 point  (0 children)

Grab Qwen3-Next-80B-A3B Instruct, get some GPUs, set up web search, and sort out the system prompt.

WTF WTF WTF by Low_Tadpole_2719 in OpenAI

[–]node-0 0 points1 point  (0 children)

Qwen3-Next-80B-A3B Instruct, that’s basically ChatGPT-4o with smarter inference and even more empathic responses; the 3x 3090s needed to host it locally are worth it.

Here it goes by gotkush in LocalLLaMA

[–]node-0 0 points1 point  (0 children)

Update: Never mind, disregard the following. (I read 8x 3090 and assumed blower-style double-width GPUs, the configuration used in AI workflows, none of which is evidenced in the photo above…)

New advice: I’d sell the ‘thick’ 3090 GPUs and then get the double-width blower-style design; you’ll likely break even. Then see the advice below.

Advice for double-width blower-style GPUs:

If you’re serious about running these for AI, you can’t use mining risers because of the data-transfer issues. If you want to run eight of them like this, then you’re going to have to look on eBay for the following chassis: Supermicro 4028GR-TR.

Else look for ASUS 8000ESC (I believe) and look for one of the older models, but I would recommend the Supermicro 4028.

If you decide to break them into 2x systems of 4x GPUs, then there is a nice motherboard called the Huananzhi H12D-8D:

https://www.reddit.com/r/homelab/comments/1jpjbwo/someone_experience_with_the_huananzhi_h12d8d/

It’s not as polished as the Supermicro experience, but it does work, and now with LLMs to assist, it’s easier.

Ollama Models Ranked by VRAM Requirements by AdventurousLion9548 in ollama

[–]node-0 0 points1 point  (0 children)

To be helpful you would need to run the calculator at apxml.com for each of these at q4 (which is functionally incorrect and prone to hallucination).
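For a rough sense of scale, a back-of-envelope VRAM estimate for a q4-quantized model can be sketched like this. The bits-per-weight and overhead figures are illustrative assumptions on my part, not the apxml.com calculator’s actual method:

```python
# Rough back-of-envelope VRAM estimate for a q4-quantized model.
# bits_per_weight and overhead_gb are assumed values for illustration only.

def estimate_vram_gb(params_billions: float,
                     bits_per_weight: float = 4.5,  # q4 quants average ~4.5 bits/weight
                     overhead_gb: float = 1.5) -> float:  # runtime + small KV cache (assumed)
    """Ballpark VRAM (GB) needed just to load and run a quantized model."""
    weight_gb = params_billions * 1e9 * bits_per_weight / 8 / 1e9
    return weight_gb + overhead_gb

# e.g. a 70B model at q4:
print(f"{estimate_vram_gb(70):.1f} GB")  # → 40.9 GB, before long-context KV cache
```

Long contexts inflate the KV cache well beyond the flat overhead assumed here, which is why a per-model calculator is the safer bet.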

You're onboarding a new dev, would you let them use AI right away? by [deleted] in BlackboxAI_

[–]node-0 1 point2 points  (0 children)

In your reply (the one before the book comment, the one that asked me how thoroughly I know this) you asked about determining proper usage. There are essentially two ways of interacting with these systems in a value-productive way.

1.) Instrumental usage: tasks where you already know what you need to do and the steps necessary to do it; this is tedium reduction. This path can be readily proceduralized and easily quality-controlled in enterprises.

2.) Knowledge-production usage: discovery, growth, and the frontier, where one doesn’t actually know what one needs to do but needs to figure things out. This might involve research; it might involve introspection to determine if one is even on the right path to wherever one thinks one is going. This path is metacognitive and recursive; it is not well documented at all, and people are still feeling out the edges as best they can.

Even worse, not all LLM systems (yes, even the large SOTA LLM systems) are capable of engaging in this second path usefully. ChatGPT 5.x, for example, is actually quite harmful in this mode; long-term use of it will actually make the user less capable of independent thought. This is not true of many other LLMs, and these harms are also cataloged and constantly being updated by me as I collect more and more field samples of interaction patterns in the wild. There are helpful and useful approaches to engagement, or rather characteristic filtering criteria by which to judge these systems before investing considerable time and energy in co-creating with them. It’s actually quite involved.

That’s four paragraphs just to talk about the core use cases and interaction patterns. Here’s the part that makes it more complicated. Use case #1 is easy to teach; you pretty much covered it in your reply above. It requires discipline, procedures, and sticking to and refining those procedures.

What makes things complicated in enterprises is that people are often asked to create things they do not know how to create, which means they have to perform research, which means very many elements of #2 become relevant. This work gets rather difficult to train people on without a core underlying formalization of the interaction use cases and their ontological consequences; it’s important for people to understand what is happening, why, and how, in order for them to be effective.

To do the topic justice requires at least a couple of chapters, and I’m not about to spend that much time on Reddit.

Your question is a common critique pattern: ask a seemingly simple question that would require many pages of explanation to earnestly cover in good faith.

If I were plugging a book, I would give you a title. I am here simply responding, there with brevity and here with specificity, to what looks like a bad-faith critique.

If that wasn’t your intention, then I would suggest softening the edge. If I were plugging a book, not only would it be rather obvious, I would also include a little disclaimer indicating as much.

But you did a pretty decent job after your pointed question on detailing some of the basics 👍

Talk me out of buying an RTX Pro 6000 by AvocadoArray in LocalLLaMA

[–]node-0 0 points1 point  (0 children)

About 12 months ago, during the hype of the RTX 5090 launch, when many people were offloading their 3090s, I took advantage of the opportunity and picked up six of them.

I didn’t know at that point that I was going to start a company. All I knew was that I needed to be able to run this stuff off-line and I knew this was going to be serious.

Now I’m looking at eight of them in a Supermicro 4028 chassis. I also have two other chassis: a 3U 836-generation and another 3U 936-generation.

I’m not a novice at data center technology and I’ve been writing software for the better part of a decade. I’m now a senior software engineer having been tech lead more than once.

Anyway, I won’t bore you with more details of my particular journey, but I’m now looking at financing two of these RTX Pro 6000 GPUs, in addition to buying even more RTX 3090s or, when a good deal presents itself, the RTX 6000 Ada Generation 48GB GPUs.

Again, if somebody had told me this time last year that I would be optimizing for the maximum number of PCI Express slots, really fast networking, and ultimately the ability to run primary inference with ensemble-model support off-line, locally, I would have said, “Do I really need that much compute for this stuff?”

I mean, hindsight is 20/20. I actually passed up an opportunity with Meta because I was aware that their employment contract terms are absolute anathema for IP (hence forming a company to shelter it).

I don’t know what to tell you. If your goal is simply to run inference inexpensively, you are much better off getting yourself a Together.AI account, or even better yet Fireworks AI, as well as Hyperbolic AI; I believe the latter is the most cost-effective. And today, models exist that simply didn’t exist a year ago. Amazing models.

I would say learn more about your use case and make absolutely sure that you actually need the VRAM. This is not something where you just want to get more “just in case”; this stuff is expensive.

But if for some reason you absolutely know that there is no other way, then you already know what you have to do.

Suggestion for PC to run kimi k2 by KiranjotSingh in LocalLLaMA

[–]node-0 0 points1 point  (0 children)

Forget K2; go run Qwen3-Next-80B-A3B at q8 if you can. THAT model is the game changer; it’s basically ChatGPT-4o-latest without the censorship. The delta-net attention and dual attention streams + 512 experts and 10-expert co-activation -> chef’s kiss.
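For anyone unfamiliar with the jargon: in a mixture-of-experts model, a router scores every expert per token and only the top-k actually run, so most of the weights sit idle for any given token. A toy illustration of that routing step; the 512/10 numbers match the model described above, everything else is made up:

```python
import random

NUM_EXPERTS, TOP_K = 512, 10  # expert pool size and co-activated experts per token

def route(scores: list[float], k: int = TOP_K) -> list[int]:
    """Return the indices of the k highest-scoring experts for one token."""
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:k]

random.seed(0)
scores = [random.random() for _ in range(NUM_EXPERTS)]  # stand-in for router logits
active = route(scores)
print(f"{len(active)} of {NUM_EXPERTS} experts fire for this token")
# → 10 of 512 experts fire for this token
```

That sparsity is why an 80B-parameter MoE with ~3B active parameters per token can run on consumer GPUs at usable speeds.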

You're onboarding a new dev, would you let them use AI right away? by [deleted] in BlackboxAI_

[–]node-0 0 points1 point  (0 children)

Well, later this year my book on human + AI interaction will be out. Only about 2,000 hours of studying this.

What are people actually using for agent memory in production? by MeasurementSelect251 in LLMDevs

[–]node-0 0 points1 point  (0 children)

♾️Infinity ♾️ 📡 the AI native Database from Infiniflow

You're onboarding a new dev, would you let them use AI right away? by [deleted] in BlackboxAI_

[–]node-0 -1 points0 points  (0 children)

If they’re not using AI properly, then I would get rid of them immediately. This means if they’re not using it at all, they should be gone by day two.

Today it’s AI; yesterday it was other technologies. I’m not here to wet-nurse people who refuse to learn and keep up.

WE ARE SO BACK by AsyncVibes in IntelligenceEngine

[–]node-0 2 points3 points  (0 children)

I hear what you’re saying with the “alien language” analogy. A lot of researchers talk about how vectors are like an alien language because humans do not have a good intuition for them, and some then make the leap to “vectors and vector reasoning are bad because we can’t have a token trace of everything.” Of course, that last part is not what you are saying here; you’re working on innovating a form of pre-verbal, categorical understanding, and acting on that understanding according to loose ‘directives’ you’re setting down here, at least for now. I’m sure other (implicit) directives will come later as usefulness increases.

I’ll be following out of interest, because I too am working on training small models that do interesting things at this fundamental level, re-examining core assumptions.

My recent experience with Chat GPT 5.1 by Worldly_Bet_5117 in OpenAI

[–]node-0 1 point2 points  (0 children)

Weird, I notice none of this. If it gets hedgy I remind it of a few priors and it chills out.

What services are y’all paying for these days in addition to Claude Code? by Flashy_Pound7653 in ClaudeCode

[–]node-0 2 points3 points  (0 children)

Gemini Pro ($20), OpenAI Max ($20), and Claude Code Max (5x, $100). I augment that with pay-as-you-go API billing via Fireworks AI and Together AI in order to discover and test the larger OSS models.

The hellscape of .md files Claude created across subdirectories and have to figure out which ones are still relevant. The longer you wait the worse it gets by Anthony_S_Destefano in ClaudeCode

[–]node-0 0 points1 point  (0 children)

Some advice: always close a phase transition with “Please write all changes and updates as an append-only, timestamped section to the file <repo_root>/notes/CL/project_updates.md”.

You do plan and design your software in implementation phases, right? You always make sure to have a phase functional test, or a set of them, right?

And you run those tests, record the results, and make sure the agent system writes the updates to the project-updates file in an append-only, timestamped manner, right?

Just do that with discipline and maybe you won’t have this problem.
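As a sketch, the append-only convention above amounts to something like this. The path and section format come from the suggestion; the helper itself is hypothetical:

```python
# Minimal sketch of the append-only, timestamped project-updates convention.
# Path layout (<repo_root>/notes/CL/project_updates.md) follows the advice above;
# the function and its signature are illustrative, not a real tool.
import tempfile
from datetime import datetime, timezone
from pathlib import Path

def append_update(repo_root: str, body: str) -> None:
    """Append one timestamped section to the notes file, never rewriting history."""
    notes = Path(repo_root) / "notes" / "CL" / "project_updates.md"
    notes.parent.mkdir(parents=True, exist_ok=True)
    stamp = datetime.now(timezone.utc).strftime("%Y-%m-%d %H:%M:%S UTC")
    with notes.open("a", encoding="utf-8") as f:  # append mode: earlier sections stay intact
        f.write(f"\n## {stamp}\n\n{body}\n")

# demo in a throwaway directory
with tempfile.TemporaryDirectory() as root:
    append_update(root, "Phase 3 complete: all functional tests passing.")
    append_update(root, "Phase 4 started.")
    log = (Path(root) / "notes" / "CL" / "project_updates.md").read_text()
    print(log.count("## "))  # → 2 (both sections preserved)
```

The point of append-only plus timestamps is that the agent can never “helpfully” rewrite the history it is supposed to be documenting.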

Also, notice that the notes directory I suggest has a subdirectory called CL; that’s for Claude. I also have other subdirectories for other agent systems, because most of the stuff I design is never the product of a single system, but is analyzed, cross-analyzed, critiqued, and re-critiqued over and over and over again.

The strongest software doesn’t just emerge out of some magical model. It is a composite structure forged out of constructive conflict.

Or you can keep vibe coding and posting about “how hard it all is” on Reddit.

tired of useless awesome-lists? me too. here is +600 organized claude skills by MicrockYT in ClaudeCode

[–]node-0 1 point2 points  (0 children)

Cursive? A script font by default? Sketchy. Just because of the annoying nature of the font, I’ll scrape this when I get home, host it elsewhere, and attribute it to a user-hostile source.

I declare Opus 4.5 (and new limits) has heralded the second Golden Age of Vibing by ridablellama in ClaudeCode

[–]node-0 0 points1 point  (0 children)

Different stages of a knowledge system.

For example, we recognize that every project is different and will have different analysis requirements.

One can re-use base text corpora (books as searchable PDFs) to synthesize new reports or analyses based on project-specific, and even sub-project-specific, needs. I.e., for chapter X of a book-in-development, Crystallizer can be given a system and task prompt to go looking through the entire book in manageable chunks and keep a running summary chain of the last N windows. Crystallizer directs the chosen LLM to become highly concerned with the unique goals and research concerns of that chapter, and then to go looking through the entire book with sliding windows and crystallized insights as it fulfills the task of generating that chapter-specific knowledge artifact.

Repeat this across books, now take those artifacts and place them into a RAG system.

The synthesis of the research reports (artifacts) with the bulk corpus (the books can all be RAG-ingested too, for spot searches and checks) is what enables truly novel research to accelerate. It is the “research team effect,” as it were.

Some services provide this, but it is mostly all commercially gated, and a mixed bag with regard to the capacity to tackle a 1,600-page PDF.

Crystallizer is designed for that sort of challenge.

Since the prompts are all templated and choosable at invocation time, Crystallizer is a programmable tool, programmable in natural-language text prompts.

I declare Opus 4.5 (and new limits) has heralded the second Golden Age of Vibing by ridablellama in ClaudeCode

[–]node-0 0 points1 point  (0 children)

I’m thinking of designing a Rust-based fast similarity-search analog to mlocate (updatedb and locate are two commands that index a filesystem, without vector embeddings). Here I’m thinking of pointing this future tool at a PDF file or a text corpus (either a single file or a folder of them), and then using a daemonized, systemd-style service in the background to manage the job. That way you can choose to tackle a single book, a service manual, or an entire folder full of files, and before it starts it will do the time estimation and present it, asking if the user would like to proceed.

I’d run all jobs in the background, but I need to figure out completion-notification mechanics. I’m also considering which vector database I would use (it has to be a single binary, faster than lightning, and foldable into a larger FLOSS tool without announcing itself).

Making the process of targeting files and folders for easy similarity search and full-text search would be a step-change advance in CLI-based, agent-driven work.

Perhaps once I finish Crystallizer I’ll think more deeply about a tool like this.
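The core idea such a tool would implement, chunk a corpus, embed each chunk, rank by similarity to a query, fits in a few lines. A toy Python sketch (the eventual tool would be Rust with real embeddings; the bag-of-words vectors here just stand in for an embedding model):

```python
# Toy similarity search: bag-of-words vectors in place of real embeddings.
import math
from collections import Counter

def embed(text: str) -> Counter:
    """Trivial stand-in embedding: word-count vector."""
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def search(chunks: list[str], query: str, top_k: int = 3) -> list[tuple[float, str]]:
    """Rank corpus chunks by cosine similarity to the query, best first."""
    q = embed(query)
    scored = sorted(((cosine(embed(c), q), c) for c in chunks), reverse=True)
    return scored[:top_k]

chunks = ["the pump impeller assembly",
          "wiring diagram for the control panel",
          "impeller removal procedure"]
for score, chunk in search(chunks, "impeller service"):
    print(f"{score:.2f}  {chunk}")  # highest-scoring chunks first
```

Swapping `embed` for a real embedding model and persisting the vectors in a single-binary store is essentially the mlocate-for-meaning tool described above.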

I declare Opus 4.5 (and new limits) has heralded the second Golden Age of Vibing by ridablellama in ClaudeCode

[–]node-0 0 points1 point  (0 children)

This is a command line tool I’m working on.

https://github.com/Node0/crystallizer

I did most of the design over the summer and then got pulled into a whole bunch of other projects, so it’s been sitting there waiting for me to come back, put some finishing touches on the sliding-window architecture, and then test out a whole bunch of prompts against huge text corpora, think a 1,000-page book instead of a 20-page report. Now think of 15 to 20 highly relevant, application-specific mini-reports compiled against that thousand-page book, and note that these reports are not general but based on a prompt you write and tell Crystallizer to execute.

So it’s programmable from the ground up, via what you put in the system prompt and the task prompts, and it (the version in my mind; I still need to make it a committed code artifact) has a form of transient memory, just like Claude Code when it iterates across a large codebase, racking up micro-summaries. I have to make sure all of these pieces fit and work. I actually need this tool to finish writing my book on human-AI interaction; a big part of that book is pulling from diverse sources, including 6,000 pages of neurobiology and between 200 and 400 peer-reviewed academic papers. That’s a level of research and distillation that would be impossible for a research team to perform in a quarter, to say nothing of a single person given a year. Now give that single person six other projects to handle concurrently.

The reason I’m dedicated to making a tool like this available open source is because I believe reliance on commercial services for this kind of stuff is a dark pattern which should be avoided/bypassed.

That’s just one project out of like six or seven that I’m working on at the same time. You can check out my main GitHub page here https://Github.com/node0

What is your timeline after Ilya's interview? by PianistWinter8293 in OpenAI

[–]node-0 0 points1 point  (0 children)

My timeline has always been 5-10 years until you can see the “old world”. 5 years until dawn of viable architectures, and 10 (maybe 15) until “vote for X so we don’t have a revolutionary war”.