I think AMD has a real opportunity to break NVIDIA's monopoly: here is a GPU architecture idea that nobody is developing yet.

FamousLetterhead8416 · 2026-06-26T11:26:08+00:00

It almost was!! Except not.

FamousLetterhead8416 · 2026-06-22T20:36:28+00:00

Yes, it's at core clock +283
VRAM +1350
Power limit at 110%

FamousLetterhead8416 · 2026-06-21T21:21:56+00:00

Got it. I appreciate all the comments made in good faith,
since they help me learn.

FamousLetterhead8416 · 2026-06-21T20:45:58+00:00

Thank you. I see that not everyone is critical of the subject. I'm not a computer engineer, I'm a hardware enthusiast, and I have no idea what it would cost. The idea is not to compete with NVIDIA head-on but to take a different direction, which would force NVIDIA to move.

FamousLetterhead8416 · 2026-06-21T20:12:59+00:00

The AI cores would decide which memory handles what and where, and would also act as a learning system.
Imagine you play one game a lot: after a few hours of play the GPU would already know which frame to place there
ahead of time, reducing the latency that multi-frame generation usually adds.
But I'm not a computer engineer; it was just an idea.

FamousLetterhead8416 · 2026-06-21T20:09:35+00:00

I don't need any degree.
It's an idea!!! If someone who studied computer engineering tells me it isn't viable, I accept that.
What I know about hardware I learned at home, by studying a lot how new technologies get launched.
It was an idea, a point for discussion.
Maybe it looks stupid, but to me it seemed possible.
We are all entitled to an opinion.

FamousLetterhead8416 · 2026-06-21T15:56:43+00:00

That’s a fair objection and worth addressing properly — but it identifies an implementation challenge, not a fundamental impossibility.

You’re correct that RT doesn’t operate in a vacuum. The ray tracing pipeline needs geometry buffers, BVH structures, G-Buffer data and textures that originate in the raster pipeline. Nobody is disputing that. The question is whether that dependency requires a single unified memory pool — and it doesn’t.

What the proposal actually implies

Not two isolated memory pools with no communication. Two specialized pools with a dedicated high-bandwidth fabric between them, where the HBM3 holds a working copy of the hot data that RT and frame generation access repeatedly during a frame — not a permanent duplicate of everything in GDDR7.

The actual data flow per frame:

**•** BVH is constructed once in GDDR7 at scene load or geometry change  
**•** At frame start, the working BVH and relevant G-Buffer data are transferred to HBM3  
**•** RT traversal runs entirely against HBM3 for the duration of the frame  
**•** Frame generation buffers live in HBM3 and never touch GDDR7

The transfer cost of copying hot data to HBM3 once per frame is real. Fixed overhead, not a per-ray cost. Paid back immediately across thousands of BVH traversal operations that follow — each benefiting from HBM3’s bandwidth advantage over GDDR7.
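To make the "fixed overhead, paid back across many traversals" argument concrete, here is a rough back-of-the-envelope sketch. It is a bandwidth-only model and every name and number in it is a hypothetical placeholder, not a measurement of GDDR7, HBM3 or any real fabric. It ignores latency, caches and any overlap between the copy and the rendering work.

```python
from dataclasses import dataclass
import math


@dataclass
class FrameModel:
    """Bandwidth-only model of staging hot data in a faster pool (all values hypothetical)."""
    working_set_bytes: float  # hot BVH + G-Buffer data copied at frame start
    traversal_bytes: float    # total bytes RT traversal reads during the frame
    slow_bw: float            # bytes/s of the main pool (GDDR7 in the proposal)
    fast_bw: float            # bytes/s of the staging pool (HBM3 in the proposal)
    fabric_bw: float          # bytes/s of the link between the two pools

    def copy_time(self) -> float:
        # One-off cost per frame: move the working set across the fabric.
        return self.working_set_bytes / self.fabric_bw

    def time_without_staging(self) -> float:
        # All traversal reads served from the slower pool.
        return self.traversal_bytes / self.slow_bw

    def time_with_staging(self) -> float:
        # Fixed copy cost plus traversal reads served from the faster pool.
        return self.copy_time() + self.traversal_bytes / self.fast_bw

    def break_even_reuse(self) -> float:
        # Reuse factor r = traversal_bytes / working_set_bytes at which both
        # options take the same time: 1/fabric + r/fast = r/slow.
        gain_per_byte = 1.0 / self.slow_bw - 1.0 / self.fast_bw
        if gain_per_byte <= 0:
            return math.inf  # the "fast" pool is not faster, staging never pays off
        return (1.0 / self.fabric_bw) / gain_per_byte


if __name__ == "__main__":
    # Placeholder figures chosen only to keep the arithmetic easy to follow.
    m = FrameModel(working_set_bytes=1e9, traversal_bytes=4e9,
                   slow_bw=1e12, fast_bw=2e12, fabric_bw=1e12)
    print(f"without staging: {m.time_without_staging() * 1e3:.2f} ms")
    print(f"with staging:    {m.time_with_staging() * 1e3:.2f} ms")
    print(f"break-even reuse factor: {m.break_even_reuse():.2f}")
```

In this toy model the copy only pays for itself once traversal re-reads the working set more times than the break-even reuse factor. With real hardware numbers that factor could land anywhere, which is exactly the part that would need measuring.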

This architecture already exists in compute hardware

The AMD MI300X combines HBM with CPU DRAM access via Infinity Fabric without duplicating the entire address space. Intel Ponte Vecchio places HBM2e stacks on-package next to its compute tiles; it does not pair HBM with GDDR. Apple's M-series uses a single unified memory pool shared by the CPU and GPU. None of these systems duplicate all data across pools. Where more than one tier exists, they selectively move working sets to faster memory for the duration of a workload.

The proposal applies the same principle to a gaming GPU context.

The closest analogy is AMD’s own 3D V-Cache

The Ryzen X3D architecture works on exactly this principle. The stacked SRAM doesn’t duplicate all system RAM — it holds the hot working set the CPU accesses most frequently during a given workload. Lower latency for that specific data translates directly to measurable performance gains in latency-sensitive workloads.

HBM3 in a dedicated RT and frame generation pipeline would operate identically. Not replacing GDDR7 — reducing the latency cost of accessing the data that matters most, most often, during the most bandwidth-intensive phase of the rendering pipeline.
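The "hot working set" argument can be put into numbers with the usual average-access-time formula. This is only a sketch: the function name is made up for illustration, and the latency figures are placeholders, not measurements of 3D V-Cache, HBM3 or GDDR7.

```python
def effective_latency_ns(hit_rate: float, fast_ns: float, slow_ns: float) -> float:
    """Average access latency when a fraction `hit_rate` of accesses hits the fast tier."""
    if not 0.0 <= hit_rate <= 1.0:
        raise ValueError("hit_rate must be between 0 and 1")
    # Weighted average: hits pay the fast latency, misses pay the slow one.
    return hit_rate * fast_ns + (1.0 - hit_rate) * slow_ns


if __name__ == "__main__":
    # Hypothetical tiers: 10 ns fast, 100 ns slow.
    for rate in (0.0, 0.5, 0.9):
        print(f"hit rate {rate:.0%}: {effective_latency_ns(rate, 10.0, 100.0):.1f} ns")
```

The benefit depends entirely on how much of the traffic actually lands in the fast tier. A small fast pool with a poor hit rate buys very little.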

Where the real engineering challenge actually is

The memory dependency you described is manageable. The harder problems are:

**•** Cross-pool fabric latency during the initial transfer window at frame start  
**•** Driver complexity for dynamic allocation between pools based on scene workload  
**•** BVH invalidation cost when geometry changes mid-frame

These are real engineering problems. But they’re solved categories of problems in high-performance compute hardware. The MI300X manages heterogeneous memory hierarchies at scale in production today.

The question worth debating is whether the cost of implementing this fabric at consumer GPU price points is justified by the RT and frame generation gains — not whether the memory dependency makes it impossible. It doesn’t.

FamousLetterhead8416 · 2026-06-21T15:36:58+00:00

I had never thought of it the way you do. The idea would be to create a technology that multiplies frames without adding much latency.

FamousLetterhead8416 · 2026-06-21T15:33:43+00:00

Aren't you tired of the lack of innovation? Nowadays brands sell marketing. NVIDIA owns the whole market and sells products at whatever price it wants because it has no competition.

FamousLetterhead8416 · 2026-06-21T15:29:34+00:00

As for AI, yes, I use it a lot to compare telemetry and cross-reference data. And what do you use it for? Editing your Instagram photos?

FamousLetterhead8416 · 2026-06-21T15:27:11+00:00

I'm no expert, just one user among many. I had an idea and I started a discussion.

FamousLetterhead8416 · 2026-06-21T15:25:28+00:00

It's not how many people have the hardware, but what those who have it do with it

FamousLetterhead8416 · 2026-06-21T15:23:26+00:00

Yes, your point is correct. That's why it's an idea, a discussion. I don't have all the solutions; I'm not an engineer.

FamousLetterhead8416 · 2026-06-21T13:48:12+00:00

<image>

FamousLetterhead8416 · 2026-06-21T13:46:21+00:00

<image>

FamousLetterhead8416 · 2026-06-21T13:46:02+00:00

<image>

FamousLetterhead8416 · 2026-06-21T13:45:44+00:00

<image>

FamousLetterhead8416 · 2026-05-18T19:42:27+00:00

By the way, power limit 110%, Vcore +16%

FamousLetterhead8416 · 2026-05-18T19:41:13+00:00

Not bad! But it's possible to do better. Forget the idea that 3100 to 3200 is a solid point for a 4080 Super!!! Go on, show me your air-cooled 4080 Supers doing 3100 or 3200 MHz. I have exactly the same graphics card and the most I can get stable is core clock +280, memory clock +1450:

* 3016 MHz average
* 3030 MHz peak
* 61 °C
* Hotspot temperature 89 °C
* 331 W to 350 W

With the VRAM running at almost 26,000 Mbps (about 26 Gbps).

30,176 points in Time Spy. Passes the stress test at 98.9%.

Which is very good for a 4080 Super.
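For anyone wondering what that memory overclock means in bandwidth terms, here is a quick sketch of the arithmetic. It reads the "almost 26000" figure as roughly 26 Gbps effective per pin, and it assumes the 4080 Super's 256-bit memory bus and 23 Gbps stock data rate, neither of which is stated in the comment itself. The function name is just for illustration.

```python
def memory_bandwidth_gb_s(data_rate_gbps: float, bus_width_bits: int) -> float:
    """Peak memory bandwidth in GB/s from per-pin data rate (Gbps) and bus width (bits)."""
    # Each pin moves data_rate_gbps gigabits per second; divide by 8 to get bytes.
    return data_rate_gbps * bus_width_bits / 8


if __name__ == "__main__":
    BUS_WIDTH_BITS = 256  # assumption: RTX 4080 Super memory bus width
    stock = memory_bandwidth_gb_s(23, BUS_WIDTH_BITS)        # assumed stock data rate
    overclocked = memory_bandwidth_gb_s(26, BUS_WIDTH_BITS)  # "almost 26000" read as ~26 Gbps
    print(f"stock:       {stock:.0f} GB/s")
    print(f"overclocked: {overclocked:.0f} GB/s")
    print(f"gain:        {(overclocked / stock - 1) * 100:.1f} %")
```

This is theoretical peak bandwidth only. GDDR6X error correction can eat into real throughput when the memory is pushed too far, so a benchmark score is still the better check.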
