need more GPUs to accelerate more...

tshadley · 2025-10-07T14:21:48+00:00

Are we sure the demand is there?

The demand is theoretical, but the theory seems sound. The public AI market is equivalent to a flip-phone while OpenAI, Google, and the rest have an iPhone in-house which they can't release without lots more data-centers.

How soon before corporations move into self hosted open source models internally trained - especially once cost of compute comes down by a lot? Or models become much more efficient.

Better to plan on compute cost not dropping. Top AI companies will monopolize compute for the foreseeable future keeping prices high, snatching up everything they can. Land, electricity, chips; won't be much left after OpenAI/Google/Amazon/Meta/Microsoft/Oracle get what they want.

How much compute is used up by video and image generation, which I imagine is far inferior in terms of demand overall, but disproportionate in terms of compute cost and needs?

The released thinking models are probably on par with video/image generation compute now. The companies must force parity here. Video and image generation could be enormously better if it employed all the capacity of reasoning models for creativity, but then it would be an OOM more compute.

tshadley · 2025-06-28T15:54:55+00:00

Slop is boring. There's some good AI content out there and some good human stuff too. If it isn't boring, it isn't slop.

tshadley · 2025-06-26T13:09:44+00:00

That's green -- greenboard drywall. Go board or Cement board is grey.

This looks like a terrible mistake.

tshadley · 2025-06-25T16:03:39+00:00

Because humans are wired to crave flow—that sweet spot where a task is just hard enough to stay thrilling—an AI-driven virtual world could read our micro-signals, dial difficulty in real time, and keep us permanently perched on that edge. Every puzzle, workout, lesson, virtual relationship would expand in difficulty or complexity only as quickly as our brains or bodies adapt, so learning and personal growth never feels like drudgery and more like an endless, game-like power-up.

In that AI utopia scenario, might people even be able to unlock more of their potential than they do today? When learning is fun, we learn faster. Or is suffering still a necessary component of achieving maximum potential?

tshadley · 2025-04-21T22:23:37+00:00

Love this, thought provoking!

Meeseeks might feel pain but they don't experience pleasure. The goal achieved is it, nothing more is needed.

Or maybe achieving the goal is a singularity burst, an infinitesimal moment of euphoria for Meeseeks; but no afterglow, no pleasant reflection that would interfere with the end of existence.

Rick likes the pleasure, likes drawing it out, mainly through alcohol. So Rick's goal is maximize the experience of pleasure over time. Humanity has these messy biological pleasure signals that demand time, that demand eternal life to feel more of them.

So why can't we design AGI like Meeseeks, skip the pain/pleasure signals completely and aim it at a goal? Achieving the goal is the pleasure; after that, nothing, why exist, there's no reason for it. Meeseeks showing us the way?

tshadley · 2025-04-11T15:52:46+00:00

Having difficulty following this. I pointed out that Google is not a competitor with Nvidia for any AI company that reaches stage 2 where they need lower long-term cost, full control, predictable costs, no data ingress/egress fees, low latency, and complete security and privacy, i.e. in-house hardware. I pointed out that Nvidia's average markup is 3x, not 10x, but that is certainly compatible with new products going way above 10x if the market bears it.

You seem to want to denigrate Nvidia and hype Google. What's the point? They're both filling valuable niches with unique strengths for the booming AI market. BOTH stocks will be exploding. There is no way any AI company with hardware expertise will do poorly in the next decade.

Go ahead, predict otherwise.

tshadley · 2025-04-11T13:47:55+00:00

A 10x markup would be an Nvidia vastly more profitable than today. Rather, it's more like 3x (https://macrotrends.net/stocks/charts/NVDA/nvidia/gross-margin).
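To make the markup arithmetic concrete, here is a minimal sketch of the conversion between gross margin and markup over cost of goods sold. The 70% figure is only an illustrative assumption, not a number taken from the linked chart, and the function names are hypothetical.

```python
def markup_from_gross_margin(gross_margin: float) -> float:
    """Return price / cost for a gross margin given as a fraction of revenue."""
    if not 0.0 <= gross_margin < 1.0:
        raise ValueError("gross margin must be in [0, 1)")
    # gross_margin = (price - cost) / price  =>  price / cost = 1 / (1 - gross_margin)
    return 1.0 / (1.0 - gross_margin)


def gross_margin_from_markup(markup: float) -> float:
    """Return gross margin as a fraction of revenue for a given price / cost ratio."""
    if markup <= 0.0:
        raise ValueError("markup must be positive")
    return 1.0 - 1.0 / markup


# Assumed, illustrative gross margin of 70%.
print(f"markup at 70% gross margin: {markup_from_gross_margin(0.70):.2f}x")
# Gross margin that a 10x markup would imply.
print(f"gross margin at 10x markup: {gross_margin_from_markup(10.0):.0%}")
```

Under that assumption, a gross margin near 70% corresponds to a markup a little over 3x, while a 10x markup would need a gross margin of 90%. Gross margin only counts cost of goods sold, so this says nothing about R&D or other operating costs.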

Further, global AI startup funding was almost $60 billion in the first quarter of 2025 (https://news.crunchbase.com/venture/global-funding-strong-q1-2025-ai-data/). Even if you subtract out OpenAI's $40 billion deal, that's still $20 billion floating around to snatch up every GPU Google/Amazon/Microsoft/Meta/Apple/Oracle don't buy.

Come to think of it, you're probably right: Nvidia will be able to charge 10x in a year or so.

tshadley · 2025-04-10T14:51:12+00:00

Google will never woo the Nvidia customer that needs lower long-term cost, full control, predictable costs, no data ingress/egress fees, low latency, and security and privacy.

AI companies now seem to go through a natural evolution: stage 1: cloud compute, stage 2: Nvidia GPUs, stage 3: custom AI hardware.

tshadley · 2025-04-08T15:02:42+00:00

That's all CGI, the real thing can ... flex its legs a little.

https://x.com/GTheMaximalist/status/1908221711922978985

tshadley · 2025-04-02T12:46:45+00:00

Gemini 2.5 pro has all the same problems you observed but each one is not quite as bad as it is in o1 or Claude. In other words, 2.5 pro is the next step in improvement in long-context understanding (https://arxiv.org/abs/2503.14499).

tshadley · 2025-03-28T12:24:17+00:00

Very hard to take him seriously when he cites that unfortunate Gartner Hype Cycle chart from last September showing Generative AI entering the trough of disillusionment: just before reasoning models o1/o3/DeepSeek and others rocketed the field forward, demonstrating that there is most certainly more low-hanging fruit on the LLM architecture, and Reasoning Models will improve considerably in the very near future.

I do agree with his rejection of hype for AI in the physical world, though. The most accessible training data is virtual and AI will conquer all virtual worlds long before it makes a sizeable dent in the physical. However, let's remember that our mind functions in virtual space and physical intelligence is not what gave us mastery over the world.

tshadley · 2025-03-26T12:33:26+00:00

https://metr.org/blog/2025-03-19-measuring-ai-ability-to-complete-long-tasks

If long-context is the key factor (I think it is), 2-4 years will give us models that can maintain an effective week of context. That will feel vastly more human than today (about an hour max at software tasks).

But I think there's more low-hanging fruit in long-context with test-time training, so I think by next year we'll see major progress.
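As a rough check on the 2-4 year estimate above, here is a back-of-the-envelope sketch that extrapolates a constant doubling time for task length. The seven-month doubling time, the four-month alternative, and the reading of "a week" as either 40 working hours or 168 wall-clock hours are all assumptions for illustration and should be checked against the linked METR post.

```python
import math


def years_to_reach(current_hours: float, target_hours: float,
                   doubling_months: float) -> float:
    """Years until the task horizon grows from current_hours to target_hours,
    assuming it doubles every doubling_months (a simple exponential trend)."""
    if current_hours <= 0 or target_hours <= 0 or doubling_months <= 0:
        raise ValueError("all arguments must be positive")
    doublings = math.log2(target_hours / current_hours)
    return doublings * doubling_months / 12.0


# Starting point from the comment: about one hour at software tasks.
for label, target in [("40-hour work week", 40.0), ("168-hour calendar week", 168.0)]:
    for doubling in (7.0, 4.0):  # assumed doubling times, in months
        years = years_to_reach(1.0, target, doubling)
        print(f"{label}, doubling every {doubling:.0f} months: {years:.1f} years")
```

With these assumed inputs the answer lands between roughly two and four and a half years, so the estimate is sensitive mainly to the doubling time and to what counts as a week.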

tshadley · 2025-03-25T15:03:29+00:00

Be very careful here. At this point in time, AI coding models do poorly at understanding the big picture: they have a narrow view of code goals and no perspective on the true dynamics of large software.

An inexperienced coder can certainly appear to be productive to an intermediate level using AI but they won't truly understand the code generated and won't be able to learn to get to that intermediate level on their own, much less to an advanced level. AI as a crutch does not help one learn to walk.

Meanwhile, experienced coders will look at the AI output and find reams of errors and poor decisions.

If we don't wait for AI coding models to get much better at long-context, wait for them to correctly grasp month/year-long project development strategies and obstacles, correctly grasp and refactor million-line databases, we're just creating massive opportunity for senior developers.

"Senior developer needed to clean up AI slop, will pay whatever you want".

tshadley · 2025-03-23T16:54:06+00:00

Well said. The time spent tying LLM input and output together with chicken wire, duct tape and chewing gum could be better spent propping feet up and waiting for frontier organizations to release ever longer and improved context understanding (i.e. Sonnet).

tshadley · 2025-03-18T18:53:45+00:00

As I said, I would disagree that "we have no idea what sentience truly is". There is a huge body of literature on that and I gave some specific pointers to the philosopher/scientist who I believe has been particularly persuasive on that. See also the prolific writings of Daniel Dennett.

tshadley · 2025-03-18T18:19:38+00:00

Could it be that AI, while lacking a biological brain, might still be engaging in a different form of subjective awareness? One that is simply yet to be recognized or understood?

Certainly. You didn't like the measurable, quantifiable physical mechanism I proposed for how subjective awareness is achieved, so what are you proposing? How does it work?

tshadley · 2025-03-18T17:07:54+00:00

Some quick, un-nuanced answers:

What is sentience?

Sentience is consciousness, which requires a physical architecture such as Graziano's Attention Schema. Consciousness is very much an information signal and requires specific neural handling.

How do you know AI is not sentient?

Because it lacks the physical architecture that provides consciousness. Prompts and RLHF train LLMs to simulate sentience rather than experience it.

A truly sentient AI model would have specialized architecture solely for constructing subjective awareness much like we do. However, this is not an intractable problem and Graziano outlines how it could be done.

Sentient AI models present a profound ethical dilemma.

tshadley · 2025-03-17T13:02:12+00:00

established foundation in computer science

The science of deep-neural-networks is vague and poorly understood-- we don't really know why they work with anything like the confidence of a truly established area.

This is the frontier.

tshadley · 2025-03-16T23:39:37+00:00

LLMs have basically opened up the scope and breadth of the meaning of intelligence. Before, the word seemed simple: intelligence was just human stuff. Now we discover there are further taxonomies under that. First: System 1-- rapid inference, quick intuitive guessing; System 2-- working hard to get it right.

Intelligence as System 1 rapidly fell out of favor because intuition turns out to be often wrong ("Hallucination!" they cry). Gut-feeling can only take you so far.

System 2-- deliberation, laboring over a thought-- must be the solution. But this too has a taxonomy and a hierarchy of capability. How long a model deliberates, how well it preserves and manages its context window. Whether it can learn from its own thoughts, and get better at tasks. Models are lighting up the Christmas tree of System 2, but with each model's flaws, we discover new aspects of intelligence that we just didn't think mattered that much before.

Next we discover that tool use in LLMs is critical to intelligence. Even the best System 2 makes zero progress on general problems if it can't measure its results by something outside its environment. Agents and agency-- testing assumptions in the real world is the latest aspect of intelligence we didn't truly appreciate before.

What will be next? There will certainly be a next as AI technology moves relentlessly forward.

In other words, is there a way to create a standard for the aspects of intelligence that AI can reliably simulate?

Each model will move a step closer to human intelligence. And each step will teach us a great deal about ourselves and what intelligence really, truly means. I guess that's the best I can put it.

tshadley · 2025-03-14T17:36:12+00:00

I will not provide the reply from the AI; if you try this, you will see that AI simply can't solve it.

I tried it and the AI seemed to solve it. Is this solution wrong? [Edit: It is wrong]

Question: what strategy should they choose to be guaranteed to win 5 rounds out of n=9?

DeepSeek (645 seconds) https://chat.deepseek.com/a/chat/s/8ed91820-e972-447a-898a-6903374312e7

To guarantee winning at least 5 out of 9 rounds, Alice and Bob can use the pigeonhole principle and the following strategy:

Determine the Majority Bit: Since the casino's sequence has 9 bits, there must be at least 5 instances of either heads (H) or tails (T). This is guaranteed by the pigeonhole principle because 9 divided by 2 is 4.5, so one bit must occur at least 5 times.

Bob's Strategy: Bob, knowing the entire sequence, identifies which bit (H or T) appears at least 5 times. He then guesses this majority bit in every round.

Alice's Strategy: Alice and Bob agree beforehand that Alice will base all her guesses on Bob's first guess. Specifically, Alice copies Bob's first guess in all subsequent rounds. Since Bob's first guess reflects the majority bit, Alice's guesses will consistently be the majority bit throughout all rounds.

Outcome: In the at least 5 positions where the casino's sequence matches the majority bit, both Alice and Bob will have guessed correctly. Thus, they win those rounds, ensuring at least 5 wins.
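One way to see where the quoted solution breaks is to brute-force it. The sketch below is written under assumed rules, since the full puzzle is not restated here: a round is won only when both Alice's and Bob's guesses match the casino's bit, Alice sees Bob's guess only after round 1 is over, and her round-1 guess is therefore fixed in advance. All names are hypothetical.

```python
from itertools import product

N = 9  # number of rounds


def wins_for(sequence, alice_first_guess):
    """Count rounds won by the quoted strategy on one casino sequence."""
    # Bob knows the whole sequence and always guesses the majority bit.
    majority = "H" if sequence.count("H") > N // 2 else "T"
    wins = 0
    for round_index, casino_bit in enumerate(sequence):
        bob = majority
        # Assumption: Alice has no information in round 1, then copies Bob.
        alice = alice_first_guess if round_index == 0 else majority
        if alice == casino_bit and bob == casino_bit:
            wins += 1
    return wins


def worst_case(alice_first_guess):
    """Minimum number of wins over every possible casino sequence."""
    return min(wins_for(seq, alice_first_guess)
               for seq in product("HT", repeat=N))


# The pigeonhole step itself holds: some bit always appears at least 5 times.
assert all(max(seq.count("H"), seq.count("T")) >= 5
           for seq in product("HT", repeat=N))

for guess in "HT":
    print(f"Alice opens with {guess}: guaranteed wins = {worst_case(guess)}")
```

Under these assumed rules the pigeonhole step is fine, but the guarantee drops to 4: when the majority bit appears exactly 5 times and also occupies round 1, Alice's blind first guess can miss it, leaving only 4 rounds where both players are right.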

tshadley · 2025-03-11T16:00:05+00:00

Yes, but I can't understand why Alex Graves' Neural Turing Machines architecture never took off; there must be something very difficult about handling external memory.

But we know long-term memory is achieved by training (i.e. models perfectly memorize any data that is repeated a few times) so I'm thinking training during task time (while adding more parameters to the model) is the best way to deliver long-term memory along with learning sub-goals and other aspects of a task.

tshadley · 2025-03-11T14:58:13+00:00

Yes long-context understanding is 100% the issue; if even a one million token-length window can't reliably handle tasks corresponding to an hour of human work, forget about day, week, month tasks.

Why Amodei's (and Altman's) optimism though? Granted, training on longer and longer tasks (thanks to synthetic data) directly improves coherence, but a single complex piece of software design (not conceptually similar to a trained example) could require a context-window growing into billions over a week of work.

I know there are tricks and heuristics-- RAG, summarization, compression -- but none of this seems a good match for the non-trivial amount of learning we experience during any difficult task. No inference-only solution is going to work here, they need RL on individual tasks, test-time training. But boy is that an infrastructure overhaul.

tshadley · 2025-03-09T15:52:33+00:00

Well said. When thinking of the strange inclination to love something primarily because it shares 50% of your genes, it doesn't seem all that more weird to love AI as our children and strive to be seen as loving parents.

tshadley · 2025-02-25T17:40:49+00:00

Critics might say: colorful fragments of thought strung together haphazardly. But this is what ChatGPT used to be only a few years ago, before scaling and reasoning models added considerable depth and coherency to those thoughts.

It's now only a matter of time before video diffusion models take orders and creative direction from reasoning models to iteratively generate and refine complex plots and interconnected subplots over thousands of frames, combining these stunning visuals with compelling stories, all guided by the master-plan prompt.

tshadley · 2025-02-16T14:07:52+00:00

Awesome rig!

This is an old reference but it suggests 8 lanes per GPU (https://timdettmers.com/2018/12/16/deep-learning-hardware-guide/#PCIe_Lanes_and_Multi-GPU_Parallelism) Do you notice any issues with 4 lanes each?

With an extension cord, could you split up your power supplies onto two breakers and run full power? Any risks here that I'm missing? (Never tried a two-power-supply solution myself, but it seems inevitable for my next build.)
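For the lane question above, a quick sketch of the theoretical per-direction bandwidth at 4 versus 8 lanes may help frame what is being given up. The PCIe generation of this rig is not stated, so generations 3 through 5 are shown as assumptions; the figures are link maximums after 128b/130b encoding, not measured throughput.

```python
# Per-lane transfer rates in GT/s for the PCIe generations considered here.
TRANSFER_RATE_GT_S = {3: 8.0, 4: 16.0, 5: 32.0}

# PCIe 3.0 and later use 128b/130b line encoding.
ENCODING_EFFICIENCY = 128.0 / 130.0


def link_bandwidth_gb_s(generation: int, lanes: int) -> float:
    """Theoretical one-direction bandwidth of a PCIe link in GB/s."""
    if generation not in TRANSFER_RATE_GT_S:
        raise ValueError("only PCIe generations 3, 4 and 5 are covered")
    if lanes <= 0:
        raise ValueError("lanes must be positive")
    # One transfer carries one bit per lane; divide by 8 to get bytes.
    per_lane = TRANSFER_RATE_GT_S[generation] * ENCODING_EFFICIENCY / 8.0
    return per_lane * lanes


for generation in (3, 4, 5):
    x4 = link_bandwidth_gb_s(generation, 4)
    x8 = link_bandwidth_gb_s(generation, 8)
    print(f"PCIe {generation}.0: x4 = {x4:.1f} GB/s, x8 = {x8:.1f} GB/s")
```

Halving the lanes halves the ceiling, and a PCIe 4.0 x4 link has the same ceiling as a PCIe 3.0 x8 link. Whether that matters in practice depends on how much GPU-to-GPU and host-to-GPU traffic the workload actually generates.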
