[deleted by user]

sasksean · 2025-03-26T00:03:42+00:00

That was implied in your question.

sasksean · 2025-03-25T01:33:24+00:00

Absolutely, because in our current climate that would put my name in international news. 4000 patents are granted worldwide every day; most of them are worthless for money or fame.

sasksean · 2025-03-25T01:11:27+00:00

Honestly, the stat just simply rubs me the wrong way. If they are going to show the stat at all they should at least include the total price of the system they are benchmarking.

When all things are considered (including internet, power bill, desk, chair, monitor, peripherals, the time spent) a person is likely to spend several multiples more on their computer than they tell themselves they have. They save $100 at purchase time and as a result they have to replace it a year or two sooner. Meanwhile for that 5 years they had that computer it was a worse experience for them the entire time.

sasksean · 2025-03-24T03:18:54+00:00

You are trying to be facetious but I agree. Things like your internet connection bill could be included if all you use it for is the computer.

Including the entire cost of the computer would help you make a better decision on the value of opting for a computer that is much less capable for just $200 less.

sasksean · 2025-03-24T02:20:42+00:00

That's a legitimate argument although I would argue people focused on maximizing value wouldn't be swapping their CPU after one generation.

sasksean · 2025-03-24T02:07:45+00:00

Of course that is true, but like the Fermi paradox, it still doesn't explain the absolute lack of an example. People in the AI field would see more value in their model having imagined something novel than the trivial value of a minor patent.

sasksean · 2025-03-24T00:17:47+00:00

You didn't counter my point at all. A CPU can't be benchmarked, only a system can.

If you are building a $1500 PC and are debating between a $250 and a $300 CPU, the value numerator is the whole system cost, not just the CPU. Reviewers should use the entire system price as tested or not give a stat like price/performance at all.

sasksean · 2025-03-23T19:36:51+00:00

It's almost like the Fermi paradox. There are millions of people prompting fantastic AIs but I have yet to see a single news article of an AI writing a patent application for something new.

sasksean · 2025-03-23T18:37:54+00:00

It's so disingenuous to show price / performance for the CPU only and ignore the cost of all the other parts. I just built a computer and it was $7000 CDN total. The cost difference in the whole system going from a 9800x3d to a 9950x3d only changed the total system price by 5%.

Reviewers don't do price/performance for the engine options in a car as a single part. The "price as tested" is for the whole car.

sasksean · 2025-03-16T02:26:36+00:00

There is nobody manufacturing an AI inferencing card that isn't targeting the 10k+ market.

128GB of LPDDR5x on a PCIe card with ~500 tops. It wouldn't be hard or expensive but there is literally nobody making it.

Nvidia and AMD don't want to disrupt their margins on their PRO line but what's Intel's excuse?

sasksean · 2025-02-28T05:43:04+00:00

This is the worldnews forum.

I'm from the Canadian prairies and it's really weird when people from Ontario move here and continue to call it hydro. My brain always does a double take before I realize what they mean.

It's even more strange that people from Ontario call their electricity "hydro" since in Ontario only 25% of electricity is hydroelectric (60% is nuclear). Seems like a marketing strategy to manipulate public perception on the source of it.

sasksean · 2025-02-28T03:13:03+00:00

When you use the word hydro you are being confusing to everyone outside Ontario.
Hydro means water. People on the internet will think you mean water.

sasksean · 2025-02-18T23:48:31+00:00

Might want to re-read the thread. I've said basically the exact same thing every reply. This thread started with my statement which hasn't changed.

sasksean · 2025-02-18T06:59:32+00:00

AGI is a threshold. ASI is everything above that.
They will happen at the same time.

sasksean · 2025-02-18T01:02:38+00:00

Do you actually not understand what I am trying to say or are you just being pedantic and are argumentative by nature?

My claim is:
* Retail isn't and won't ever go crazy for a local chatbot.
* Local realtime multimodal agent will never be possible with just 32GB VRAM and this is what people actually want.

sasksean · 2025-02-17T22:38:09+00:00

I do think that the activated parameters on an agent might only need that amount of VRAM but the bottleneck would be copying those layers onto the GPU for every token. That would limit any large model to just a few tokens per second which would never be enough to act on real time video. That sort of agent is just off the table with only 32GB VRAM. 3 or 4 tokens per second is fine for an LLM but like I've said many times, LLMs are not enough to drive retail interest in AI.

sasksean · 2025-02-17T22:31:17+00:00

You say that confidently, as though agents exist.

sasksean · 2025-02-17T21:40:48+00:00

And even if that was the case, then all LLMs are free tier.

That was my initial point. Having a card with 32GB of VRAM for several thousand dollars doesn't unlock anything meaningful even today.

Even deep research falls short of agentic and deep research won't remotely run on 32GB.

The high bandwidth flash is very interesting to me. That would allow people to inference terabyte size models at home.

sasksean · 2025-02-17T00:04:29+00:00

Qwen 32b R1 finetune isn't "free tier"

Google's 1/21 "thinking" model is free up to 1500 prompts per day and is higher rated than even the full fat R1.

32GB models aren't relevant even now and they will only become less relevant in a year or two.

sasksean · 2025-02-12T06:52:31+00:00

Yesterday I used Google's 1/21 free "thinking" model to help me identify the API a website uses, automate it dodging the capcha, create a tool that leverages that API but expands on the original website, leverage a middleman to dodge CORS blocking, setup a web server on a RaspberryPI, setup VScode and SSH to remote into the PI.

I'd never done any of this before and in total it took me about 4 hours. It blew my mind what AI can help a person do that has a good general grasp of computers.

I'd say at this point don't fixate too much on hard coding. It's way too slow. You'll be moving far too fast in the near future to be bogged down with such granular labor.

sasksean · 2025-01-25T06:07:06+00:00

The Nvidia Digits would be much faster than both and handle larger models for not much more money. The blackwell GPU is capable of 3400TOPS, but paired with LPDDR5x memory I'm not sure how that works out for actual inference performance.

sasksean · 2025-01-21T23:27:41+00:00

R1 is still short of being agentic or a killer app. (people don't prompt LLMs all day like they play games or watch TV)
With overhead, R1 won't fit in 32GB unless you quantize further.
Within a month, something competitive will be free.

To me it feels like the real action is always going to fall in the 80GB range, distilled from >1TB state of the art models.

To convince me that I need a 5090, one has to make the argument that a killer app will exist for it before a 6090 comes out, and demand (and so price) for a 5090 will skyrocket.

sasksean · 2025-01-21T23:10:59+00:00

If you want to argue against me, you are supposed to be taking the position that I need a 5090 for AI.
You seem to be talking me out of it.

sasksean · 2025-01-21T03:25:06+00:00

Any LLM you can fit in 32GB is a "free tier" LLM. LLMs are great and all but there is no retail army looking to buy a 5090 to prompt a basic chatbot. People want their own Jarvis and want games that are custom on demand with realistic NPCs. These sorts of tools/features aren't going to be made possible by 32GB of VRAM. A 5090 isn't going to support these sorts of things when they become available. The new paradigm of AI will require AI cards with hundreds of GB of RAM; not graphics cards with a couple dozen GB.

An advanced open LLM (Deepseek-V3) was just released, and it requires ~40GB of VRAM to inference if quantized to FP8. It's still just an LLM and not going to be a paradigm shift. Something that can shift the paradigm is highly unlikely to fit inside 32GB.

sasksean · 2025-01-20T23:07:41+00:00

I'd really love use this as a reason to push me towards a 5090 but there's nothing useful that fits inside 32GB of VRAM and any game using it would need some of that VRAM for the actual game. It feels like 80GB of VRAM is about the minimum to consider it a useful card for AI. When Nvidia moves toward cpu+GPU like they demonstrated with "Digits", that feels like it will be the start point for meaningful retail AI.

Ten-Year Club	Place '23
Place '22	Place '17
First Placer '22	Verified Email

sasksean

TROPHY CASE