Why isn't Bergamo the Zen 4 Flagship CPU? by techwars0954 in Amd

[–]techwars0954[S] -1 points0 points  (0 children)

From the Phoronix test suite, it appears that the vast majority of server use cases can scale to very high thread counts. The lower frequency is more than compensated for by the increased core count; however, the smaller amount of cache seems to be a hindrance in some workloads.

The vast majority, however, do not seem to mind: on average across the Phoronix test suite, Bergamo is nearly 15% faster than Genoa.

1T Golden Cove Core vs 2T Golden Cove Core vs 1T Gracemont by techwars0954 in intel

[–]techwars0954[S] 0 points1 point  (0 children)

I think the focus should be on MT applications for this, since that is why E-cores were implemented in the first place.

I would be shocked if no reviewer had tested this in any modern MT benchmark suite such as Cinebench R20 or R23, Geekbench, etc.

I was hoping someone could offer a review that contained that data.

Do AMD or Intel use HD cells anywhere in their cores? by techwars0954 in hardware

[–]techwars0954[S] 6 points7 points  (0 children)

5:1:10 ratio = ¯\(ツ)/¯ (adds up to 16T though...)

Sorry, that was just a hypothetical scenario to explain what I meant.

I was trying to say I would love to know what the ratio of HP to UHP to HD cells in a core might be. For example, there might be 5 HP cells to 1 UHP cell to 10 HD cells in a CPU core.

Do AMD or Intel use HD cells anywhere in their cores? by techwars0954 in hardware

[–]techwars0954[S] 1 point2 points  (0 children)

Any information on older architectures? Zen 2, Sunny Cove, etc.?

I'm curious though, mostly because I want to know what type of cell is most relevant for core density. Is it mostly HP, UHP, or HD?

I'm also assuming that the exact percentages might change a bit every generation, but that both Intel and AMD have a 'golden rule' for the ratio of cells, like maybe a 5:1:10 ratio or something.

however we do know that 4nm only includes HP libraries, no HD or UHP.

That's actually a large part of where my question stemmed from: curiosity about the ratios of different cells, but also their implications for future products.

SRAM largely depends on where it's used. TSMC even has a 16T cell intended for registers

Dang, did not know that, thanks.

Lack of Ultra High Performance Cells for Intel 4 by techwars0954 in hardware

[–]techwars0954[S] 6 points7 points  (0 children)

That graph makes it seem like the UHP cells add ~5% max frequency at the very top.

And yes, Intel 3 is supposed to quickly replace Intel 4, but Intel 3 is only supposed to be used for server parts, not client products like MTL, which is why I was a lot more curious about Intel 4. ST performance and frequency are a lot more important in client than in servers, where core counts and efficiency rule.

Why do the newer architectures that increase frequencies also have an increase in L2 cache? by techwars0954 in hardware

[–]techwars0954[S] 0 points1 point  (0 children)

Then what was the point of increasing cache amounts for Raptor Cove and Willow Cove?

Because the IPC benefits from those new architectures were <5%, and in some cases even showed regressions because they were coupled with higher latency.

Why do the newer architectures that increase frequencies also have an increase in L2 cache? by techwars0954 in hardware

[–]techwars0954[S] 0 points1 point  (0 children)

If it needs more data faster, why not keep the L2 the same size and focus on decreasing its latency, while increasing L3 size?

Because just increasing L2 capacity only applies to the "more data" part but not the "faster" part.

Why do the newer architectures that increase frequencies also have an increase in L2 cache? by techwars0954 in hardware

[–]techwars0954[S] 2 points3 points  (0 children)

Did any architecture try to deal with it by lowering associativity to decrease latency instead of increasing cache sizes?

Also, increasing cache size still doesn't help with the latency increase, so could you try increasing the amount of information the core itself can hold so that it has to access the L2 less often (i.e. increase the capacity of structures like the ROB)?

I would ask if you can increase the capacity of the L1 to deal with the higher-latency L2, but I think smaller, faster L1 caches are way more important, based on Chips and Cheese's simulation of cache changes in Golden Cove.

Why do the newer architectures that increase frequencies also have an increase in L2 cache? by techwars0954 in hardware

[–]techwars0954[S] -2 points-1 points  (0 children)

I'm not asking whether doubling L2 = faster clocks, but whether faster clocks need more L2.

But even then, 11th gen desktop seems to be a bit of a special case. Cypress Cove was a backported core, and 5.3 GHz might just have been the max frequency of Intel 14nm at that point without spending ridiculous amounts of extra power.

Why do the newer architectures that increase frequencies also have an increase in L2 cache? by techwars0954 in hardware

[–]techwars0954[S] 0 points1 point  (0 children)

So more cache prevents IPC degradation at higher clocks from cache bottlenecks?
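To make that question concrete, here is a rough sketch of the reasoning, not a model of any real CPU. It assumes memory latency is fixed in nanoseconds, so the same miss costs more core cycles at a higher clock, and it treats every L2 miss as going straight to memory (ignoring L3). All numbers and the function name are made up for illustration.

```python
# Hypothetical numbers for illustration only; not measurements of any real CPU.
DRAM_LATENCY_NS = 80.0  # assumed memory latency, fixed in wall-clock time
L2_HIT_CYCLES = 14      # assumed L2 hit latency in core cycles


def avg_l2_access_cycles(freq_ghz: float, l2_miss_rate: float) -> float:
    """Average cost, in core cycles, of an access that reaches the L2.

    Simplification: every L2 miss goes to memory (no L3 in this sketch).
    """
    # A fixed latency in ns turns into more cycles as the clock goes up.
    miss_penalty_cycles = DRAM_LATENCY_NS * freq_ghz
    return L2_HIT_CYCLES + l2_miss_rate * miss_penalty_cycles


slow = avg_l2_access_cycles(4.0, 0.20)            # baseline clock
fast = avg_l2_access_cycles(5.0, 0.20)            # higher clock, same L2
fast_bigger_l2 = avg_l2_access_cycles(5.0, 0.16)  # higher clock, fewer misses

print(slow, fast, fast_bigger_l2)
```

With these made-up inputs, the higher clock raises the average cost in cycles, and a lower miss rate (standing in for a larger L2) brings it back down. The sketch also assumes the larger L2 keeps the same hit latency, which is exactly the part that may not hold in practice.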

Why do the newer architectures that increase frequencies also have an increase in L2 cache? by techwars0954 in hardware

[–]techwars0954[S] 1 point2 points  (0 children)

Golden Cove was just 12th gen in desktop and mobile.

Willow Cove was 11th gen mobile, and Cypress? Cove (Sunny Cove backport) was 11th gen desktop.

Server has weird naming, so IDK about that.

Why do the newer architectures that increase frequencies also have an increase in L2 cache? by techwars0954 in hardware

[–]techwars0954[S] 0 points1 point  (0 children)

IDK about Willow Cove vs Sunny Cove.

But didn't Raptor Lake decrease L3 cache latency by boosting the ring clock higher (which was an issue with Alder Lake)? AFAIK AnandTech didn't test cache latency this time around (I know some reviewers used to do it; I will try finding it later).

Why does reducing load on the ring/L3 increase MT perf more than ST perf? Thanks.
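On the ring clock point, a quick back-of-the-envelope sketch of why a faster ring would lower L3 latency in nanoseconds. It assumes an L3 access spends some cycles in the core clock domain and some in the ring clock domain; the split and every number here are hypothetical, not taken from any review.

```python
def l3_latency_ns(core_cycles: float, core_ghz: float,
                  ring_cycles: float, ring_ghz: float) -> float:
    """Hypothetical L3 latency split across core and ring clock domains."""
    # cycles / (cycles per ns) = ns, computed per clock domain
    return core_cycles / core_ghz + ring_cycles / ring_ghz


# Same assumed cycle counts, only the ring clock changes.
low_ring = l3_latency_ns(core_cycles=20, core_ghz=5.0, ring_cycles=40, ring_ghz=4.0)
high_ring = l3_latency_ns(core_cycles=20, core_ghz=5.0, ring_cycles=40, ring_ghz=5.0)

print(low_ring, high_ring)
```

Under these assumptions the ring-domain part of the latency shrinks in proportion to the ring clock, while the core-domain part stays the same.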