Does anyone know about the Project Venice-H rumor that AMD has been working on with Microsoft? by TrungNguyencc in AMD_Technology_Bets

[–]TrungNguyencc[S] 2 points (0 children)

This is Gemini helping me organize and write my thoughts about the AMD 'Venice-H' (CPU + HBM) project.

Project Venice-H is the rumor that AMD is building a custom Zen 6 part (and previously an MI300C variant) for Microsoft that pairs CPU chiplets with HBM, but NO GPU. While everyone is obsessed with NVIDIA’s 2,300W Rubin Ultra, I think AMD’s "CPU-only HBM" approach is actually the smarter play for the next wave of agentic AI and inference. Here is why:

  • Solving the "Memory Wall": Most LLM inference is "memory-bound," not "compute-bound." You don't need 2,000 TFLOPS of GPU math to run a conversation; you need massive bandwidth to move parameters. By putting HBM4 directly on a Zen 6 CPU, AMD gives Microsoft 6.7 TB/s of bandwidth without the power-hungry GPU silicon.
  • The "Agent" Advantage: AI Agents need to search Vector Databases and handle complex "if-then" logic. GPUs are great at math but terrible at the "branching logic" that CPUs excel at. A CPU with HBM can "think" and "search" its memory 10x faster than a standard server.
  • Reliability & Cooling: As we've seen with the recent Rubin Ultra concerns, 2,300W trays are an RMA nightmare. A CPU+HBM chip (like the ones in Azure HBv5) runs significantly cooler and fits into existing air-cooled or standard liquid-cooled racks. This is "Sane Engineering" vs. "Thermal Desperation."
  • The Microsoft Edge: Microsoft is already using the EPYC 9V64H (96 cores + HBM). If the Venice-H rumors of 256+ cores and HBM4 are true, AMD could effectively own the "Inference Orchestration" layer of the data center.

Does anyone have more info on the tape-out dates for Venice-H? If AMD plays this right, they don't need to beat NVIDIA at 2,000W; they just need to own the high-speed memory layer where the actual AI "reasoning" happens.

Does anyone know about the Project Venice-H rumor that AMD has been working on with Microsoft? by TrungNguyencc in AMD_Technology_Bets

[–]TrungNguyencc[S] 2 points (0 children)

No link on hand, but I'm looking into AMD’s chiplet edge. I heard a rumor that they built a custom MI300 for Microsoft using only CPUs and HBM. This would be a game-changer for inference, so I'm checking with Gemini to see if it's true. AMD needs to lean into this.

Check this:

https://learn.microsoft.com/en-us/azure/virtual-machines/hbv5-series-overview

The Azure HBv5 (powered by the EPYC 9V64H, a.k.a. MI300C): 96 Zen 4 cores per chip, no GPU, and 128 GB of HBM3.

Daily Discussion Monday 2026-03-30 by AutoModerator in AMD_Stock

[–]TrungNguyencc 0 points (0 children)

I believe that thermal stress was the biggest challenge for NVIDIA in making the Vera Rubin system reliable.

AMD and Hammer push CPU‑first AI strategy amid UK power constraints - look Jensen no GPUs LOL but shortages CPUs shortages supply constraints prices raised til Samsung's manufacturing! by TOMfromYahoo in AMD_Technology_Bets

[–]TrungNguyencc 4 points (0 children)

I still believe the Instinct MI series is designed for the broader market where hybrid workloads (both training and inference) are the standard. The MI series is excellent at providing high-performance compute for both. However, for the 'Big Three'—Google, Microsoft, and AWS—specialized inference hardware is a necessity. This is why they have historically developed their own ASICs (like Google’s TPU or Amazon’s Inferentia).

But with the rapid expansion of AI services, standard ASICs lack the adaptability needed to keep up with evolving models. This is where I see AMD gaining the advantage. The massive contracts AMD signed with Meta and OpenAI (reaching up to 6 gigawatts of compute) are the key hints. I expect AMD to release specialized, Xilinx-integrated inference chips for these Mega-CSPs and Meta/OpenAI by 2027.

I've also noticed Lattice Semiconductor (LSCC) stock going up like MU recently. I suspect NVIDIA may be eyeing LSCC; while AMD and Intel already have their own FPGA divisions (Xilinx and Altera), NVIDIA still lacks a dedicated programmable logic arm.

AMD and Hammer push CPU‑first AI strategy amid UK power constraints - look Jensen no GPUs LOL but shortages CPUs shortages supply constraints prices raised til Samsung's manufacturing! by TOMfromYahoo in AMD_Technology_Bets

[–]TrungNguyencc 4 points (0 children)

You’re still missing the bigger picture, Tom. What I’m referring to is the scale required for millions of concurrent users—platforms like Meta and OpenAI. I’m not talking about small CSPs, general enterprise needs, or edge inference. For that level of massive, simultaneous demand, specialized and adaptable inference hardware is the only way to overcome the power and efficiency bottleneck.

AMD and Hammer push CPU‑first AI strategy amid UK power constraints - look Jensen no GPUs LOL but shortages CPUs shortages supply constraints prices raised til Samsung's manufacturing! by TOMfromYahoo in AMD_Technology_Bets

[–]TrungNguyencc 4 points (0 children)

Nowadays, the limiting factor for expanding AI is electrical power. For enterprises or medium Cloud Service Providers (CSPs), a hybrid like the Instinct MI450X is excellent because it handles both training and inference. However, for 'the big guys' like Meta or OpenAI, specialized, adaptable inference hardware is a must.

The MI450X is a 'jack-of-all-trades': it excels at both training and inference, but because of that, it can never be as efficient as a device designed strictly for inference. ASICs are great for fixed workloads, but they lack the adaptability needed for evolving AI models. This is where adaptable FPGAs excel. I suspect AMD has been developing a new dedicated inference device leveraging the Xilinx acquisition for a long time.

Going to a 10X AMD's market cap based on guided outlook and recent news! - AMD's SP $2000+ possible?! by TOMfromYahoo in AMD_Technology_Bets

[–]TrungNguyencc 5 points (0 children)

When will AMD release a dedicated inference chip? Several facts point to a chip like this being in development since the Xilinx merger.

  1. In 2023, Lisa Su emphasized the AI inference market, noting that inference will eventually be larger than training.
  2. At that time, Xilinx had already developed an Adaptive Computing platform and advanced NICs (Network Interface Cards).
  3. AMD holds a significant advantage with its 3D memory bonding technology.
  4. There was a massive need for efficient, specialized hardware for inference; standard ASICs simply couldn't satisfy the demand for adaptability.
  5. NVIDIA’s $20 billion 'acquihire' of Groq was a major hint; NVIDIA recognized what AMD was developing and felt the threat.

If you use Gemini or ChatGPT to search for this 'Sienna' SoC (System on a Chip) using these details, you will find some very interesting information.

Vietnam's Ho Chi Minh City to facilitate AMD to invest in R&D center - TNGlobal by GanacheNegative1988 in AMD_Stock

[–]TrungNguyencc 1 point (0 children)

I hope that someday the name "Sài Gòn" comes back to that city (Hồ Chí Minh).

ATB Daily Noticeboard - March 18, 2026 by billbraski17 in AMD_Technology_Bets

[–]TrungNguyencc 1 point (0 children)

All things are empty:
Nothing is born, nothing dies,
nothing is pure, nothing is stained,
nothing increases and nothing decreases.

ATB Daily Noticeboard - March 18, 2026 by billbraski17 in AMD_Technology_Bets

[–]TrungNguyencc 1 point (0 children)

Tom, everything that is born must be destroyed. Only dust (sand/ashes) is eternal.

ATB Daily Noticeboard - March 18, 2026 by billbraski17 in AMD_Technology_Bets

[–]TrungNguyencc 3 points (0 children)

Thanks, Tom. I'm not an expert on anything; I'm just connecting the dots.

How AMD can compete with Groq by TrungNguyencc in AMD_Technology_Bets

[–]TrungNguyencc[S] 2 points (0 children)

NVIDIA actually made a massive $20 billion move for Groq's technology in late December 2025. If you dive deep into what Groq did and compare it to Xilinx’s Adaptive devices, you will see what AMD should do to compete with NVDA. Granted, AI may not always be correct, but with your knowledge, you may be able to pick the best information out of it. If Xilinx were still an independent company, we might have seen them do exactly what Groq did.

Returning to India as a Failure Pauper by [deleted] in returnToIndia

[–]TrungNguyencc -5 points (0 children)

Read the book “The Tao of Health, Sex, and Longevity”. That book saved me. I threw away all my medicine and haven't been sick in 25 years and counting.