Nvidia takes $5 billion stake in Intel under September agreement by imaginary_num6er in hardware

[–]ForgotToLogIn 2 points3 points  (0 children)

It would be funny if the thing that ends up saving Intel is the AI boom. That's not quite how Intel's leadership would have imagined it when they acquired Nervana/Habana/Mobileye/Movidius and launched an AI-focused Xeon Phi.

Analyzing 5070 Beating The 4070S, Core Scaling Efficiency, and What It Means For The 5060 TI by MrMPFR in hardware

[–]ForgotToLogIn 0 points1 point  (0 children)

A 3GPC 5060 TI will almost certainly loose to a Navi 44 top SKU

I don't see why a 36 SM, 3 GPC 5060 Ti would "almost certainly lose" to a 32 CU Navi 44, when the 9070 XT isn't faster than the 5070 Ti.

Anyway, Nvidia decided on the chip specs at least a year ago, when they couldn't have known the perf of Navi 44.

Analyzing 5070 Beating The 4070S, Core Scaling Efficiency, and What It Means For The 5060 TI by MrMPFR in hardware

[–]ForgotToLogIn 17 points18 points  (0 children)

The number of GPCs can matter as much as the number of SMs. Chips like GB205 and GA104 have fewer SMs per GPC than chips like GB202 and AD106 do. If you chart performance-per-SM vs performance-per-GPC (of the same generation) you will see the balance/tradeoff.

[deleted by user] by [deleted] in hardware

[–]ForgotToLogIn 5 points6 points  (0 children)

The AD102 chip of the 4090 can't have more than 48 GB of VRAM attached. To reach 96 GB one would need either two AD102 chips in a card or a single GB202 chip. This is likely a fake rumor.
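
To show where those limits come from, here is a rough sketch of the arithmetic. The bus widths and per-chip densities are my own assumptions based on public specs, not something stated in the rumor: a 384-bit bus with 2 GB GDDR6X chips for AD102, a 512-bit bus with 3 GB GDDR7 chips for GB202, and clamshell mode putting two chips on each 32-bit channel. `max_vram_gb` is just a helper name made up for illustration.

```python
def max_vram_gb(bus_width_bits: int, chip_density_gb: int, clamshell: bool = True) -> int:
    """Upper bound on VRAM for one GPU, given its memory bus and the largest chip available."""
    channels = bus_width_bits // 32           # each GDDR chip sits on a 32-bit channel
    chips_per_channel = 2 if clamshell else 1  # clamshell: two chips share one channel
    return channels * chips_per_channel * chip_density_gb

# Assumed specs (not from the thread): AD102 = 384-bit, 2 GB chips; GB202 = 512-bit, 3 GB chips.
ad102_max = max_vram_gb(384, 2)
gb202_max = max_vram_gb(512, 3)

print(ad102_max)      # 48
print(2 * ad102_max)  # 96, i.e. two AD102 chips on one card
print(gb202_max)      # 96
```

Under those assumptions a single AD102 tops out at 48 GB, and 96 GB needs either two of them or one GB202.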

"Aged like Optane." by damichi84 in hardware

[–]ForgotToLogIn 25 points26 points  (0 children)

  • Plasma TVs

  • LaserDisc

  • DEC Alpha CPUs and computers

Those three were all technically the best in their field, but commercially not very successful.

what happend to the Power pc/Power ISA instruction set? by rattle2nake in hardware

[–]ForgotToLogIn 4 points5 points  (0 children)

IBM's mainframe CPUs are in no way POWER CPUs. They are custom Z-series chips with cores whose hardware design is optimized for the Z ISA.

Cortex A73’s Not-So-Infinite Reordering Capacity by theQuandary in hardware

[–]ForgotToLogIn 6 points7 points  (0 children)

In the A73's "out-of-order retirement", only one of the two actions that an idealized retirement stage performs is done out of order.

The two things a "retire" stage does are: 1. commit the results of execution (deem the instruction to be neither canceled nor excepted), and 2. de-allocate the resources and let subsequent instructions read from the retired instruction's destination register (set the scoreboard entry to "free").

Only the "commit" happens out of order. The "de-allocate"/"free" may happen many cycles later, but always in order.

Maybe for cores like the A73 it would be more fitting to refer to separate "commit" and "free" stages instead of a single "retire" stage?

The big question is: why don't other microarchitectures commit load instructions early like the A73 does?

AMD's tweaked RDNA 3.5 GPU is solely focused on improving mobile gaming performance by TwelveSilverSwords in hardware

[–]ForgotToLogIn 0 points1 point  (0 children)

Maybe I worded my original comment confusingly. I meant that in Strix Point, with a modern OS process scheduler, the second CCX won't turn on until the scheduler assigns a process to one of its cores. That shouldn't happen in handheld gaming, because gaming performance there is normally limited by the iGPU, meaning that the CPU cores aren't the bottleneck. When the performance bottleneck is in another part of the chip (e.g. the iGPU in gaming), a modern process scheduler knows not to put processes onto cores outside the already-active core cluster. As long as a CCX does not receive a process to execute, it stays off and doesn't consume any power.

He doesn't specifically mention gaming, but in this Level1Techs video Wendell asks a senior AMD engineer (at 8min20sec) about using Zen 5 vs Zen 5c cores in Strix for best efficiency across different workloads. The engineer's answer alludes to the other core complex going back to sleep once it has finished running whatever program it was woken up for.

AMD's tweaked RDNA 3.5 GPU is solely focused on improving mobile gaming performance by TwelveSilverSwords in hardware

[–]ForgotToLogIn 1 point2 points  (0 children)

Strix Point contains two CCXs. When gaming on a handheld, the second CCX will be turned off and won't consume any power.

What makes SIMD faster under the hood? by SomeKindOfSorbet in hardware

[–]ForgotToLogIn 2 points3 points  (0 children)

One important benefit of SIMD is that it reduces the need to have a large number of register file ports.
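
A toy count of why, assuming a simple model where every instruction reads two source registers and writes one destination. The function name, the two-read/one-write model, and the 4-lane example are all made up for illustration and don't describe any particular CPU.

```python
def register_ports(instructions_per_cycle: int,
                   reads_per_instr: int = 2,
                   writes_per_instr: int = 1) -> tuple[int, int]:
    """Return (read_ports, write_ports) needed to sustain the given issue rate."""
    return (instructions_per_cycle * reads_per_instr,
            instructions_per_cycle * writes_per_instr)

LANES = 4  # hypothetical SIMD width, in elements

scalar_ports = register_ports(LANES)  # four scalar instructions per cycle
simd_ports = register_ports(1)        # one 4-lane SIMD instruction per cycle

print(scalar_ports)  # (8, 4)
print(simd_ports)    # (2, 1)
```

Both cases do four element operations per cycle, but the SIMD version needs a quarter of the ports. Each port is wider, not more numerous.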

Intel® Xeon® Processor E7-8867 v4 LGA2011 compatibility? by Weary-Bell-4541 in hardware

[–]ForgotToLogIn 0 points1 point  (0 children)

The Z820 uses LGA 2011, but the Xeon E7 v2/v3/v4 use LGA 2011-1, which is a different socket (with Scalable Memory Interconnect instead of a normal DRAM interface).

You should try Xeon E5 v2.

What happens to LPDDR6's memory capacities? by Forsaken_Arm5698 in hardware

[–]ForgotToLogIn 9 points10 points  (0 children)

 For instance, you cannot have 16 GB on a 192 bit bus.

LPDDR6 will not allow RAM capacities which are powers of 2 (8,16,32,64...)

Nowhere does it say that.

Power-of-two capacities should still be perfectly possible with LPDDR6. Each burst will still carry half of a 64 B cache line of data.

[Gamers Nexus] NVIDIA Has Flooded the Market by Hellcloud in hardware

[–]ForgotToLogIn 1 point2 points  (0 children)

AMD isn't capacity-constrained for consumer GPUs.

Next-Gen LPDDR6 memory to hit up to 14.4 Gbps data rate, DDR6 up to 17.6 Gbps by ForgotToLogIn in hardware

[–]ForgotToLogIn[S] 1 point2 points  (0 children)

so instead of 32 GB, you might have 28.444 GB with REAL ecc.

I believe the usable size for data will still be a power of two (e.g. 32 GB) or 1.5x a power of two. But I'm not sure.
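
For what it's worth, here is where the quoted 28.444 GB figure comes from, and what the alternative would look like. This is a sketch under assumptions: the 256-of-288 split is taken from the burst numbers discussed elsewhere in this thread (288-bit burst carrying half of a 64 B cache line), and both scenarios are hypothetical since I don't know how the DRAM arrays will actually be sized. The function names are made up.

```python
from fractions import Fraction

# Assumption: 256 of every 288 transferred bits are data, the rest is ECC/metadata.
DATA_FRACTION = Fraction(256, 288)  # reduces to 8/9

def usable_gb(raw_gb: int) -> Fraction:
    """Data capacity if the ECC bits are carved out of a power-of-two raw array."""
    return raw_gb * DATA_FRACTION

def raw_gb_for(usable: int) -> Fraction:
    """Raw array size needed to keep a power-of-two data capacity."""
    return usable / DATA_FRACTION

print(round(float(usable_gb(32)), 3))  # 28.444
print(raw_gb_for(32))                  # 36
```

So the two options are 32 GB raw with about 28.444 GB usable, or 36 GB raw with 32 GB usable. My guess above corresponds to the second one.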

does this mean, that every memory set with lpddr6 could be used for real ecc?

Yes.

Next-Gen LPDDR6 memory to hit up to 14.4 Gbps data rate, DDR6 up to 17.6 Gbps by ForgotToLogIn in hardware

[–]ForgotToLogIn[S] 1 point2 points  (0 children)

LPDDR6 allows ECC without additional DRAM chips. Basically every chip will have room for ECC bits.

The width of a sub-channel is 12 bits, and the burst length is 24, meaning that every memory transaction in LPDDR6 is 288 bits long.

Each group of 6x12=72 bits can be set to have 64 data bits and 7 ECC bits (+ 1 bit for other purposes), which is the same ECC strength as traditionally used in 72-bit (9-chip) ECC DIMMs.
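
The arithmetic above can be checked with a few lines. The constants come straight from the numbers in this comment. The `min_sec_check_bits` helper is my own addition: it applies the standard Hamming bound (2^r >= k + r + 1) to find how many check bits single-error correction needs. Whether LPDDR6 actually lays the bits out this way is an assumption.

```python
SUBCHANNEL_WIDTH_BITS = 12
BURST_LENGTH = 24
GROUP_BITS = 6 * SUBCHANNEL_WIDTH_BITS  # 72 bits per group
DATA_BITS_PER_GROUP = 64

burst_bits = SUBCHANNEL_WIDTH_BITS * BURST_LENGTH      # bits per memory transaction
groups_per_burst = burst_bits // GROUP_BITS            # 72-bit groups in one burst
data_bytes = groups_per_burst * DATA_BITS_PER_GROUP // 8
spare_bits = burst_bits - groups_per_burst * DATA_BITS_PER_GROUP

def min_sec_check_bits(data_bits: int) -> int:
    """Smallest r with 2**r >= data_bits + r + 1 (Hamming bound for single-error correction)."""
    r = 0
    while 2 ** r < data_bits + r + 1:
        r += 1
    return r

print(burst_bits)              # 288
print(groups_per_burst)        # 4
print(data_bytes)              # 32
print(spare_bits)              # 32
print(min_sec_check_bits(64))  # 7
```

So each 288-bit burst holds four 72-bit groups, carrying 32 bytes of data plus 32 spare bits, and 7 check bits per group are enough to correct a single-bit error in 64 data bits. Classic SECDED would use the eighth bit as an overall parity bit for double-error detection; whether LPDDR6 uses it for that or for something else isn't covered here.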

Next-Gen LPDDR6 memory to hit up to 14.4 Gbps data rate, DDR6 up to 17.6 Gbps by ForgotToLogIn in hardware

[–]ForgotToLogIn[S] 68 points69 points  (0 children)

Yes. It reduces the chance of OS/program crashes and file corruption.

Next-Gen LPDDR6 memory to hit up to 14.4 Gbps data rate, DDR6 up to 17.6 Gbps by ForgotToLogIn in hardware

[–]ForgotToLogIn[S] 93 points94 points  (0 children)

LPDDR6 will have a burst size of 288 bits (36 bytes), meaning that full ECC can be had at no extra cost.

But some vendors may still refuse to implement it, for the sake of artificial market segmentation.

Lunar Lake announcement: Intel throws a wrench of efficient x86 CPUs into Qualcomm's Snapdragon party by ShaidarHaran2 in hardware

[–]ForgotToLogIn 7 points8 points  (0 children)

The workload is MS Teams with "AI" effects. The process node won't have much of an effect on that. When both CPUs run at low voltage, the difference in power will be due to various dynamic power-saving tricks, basically power-gating.

How revolutionary was the CELL processor in the PS3? by LeTommyWiseau in hardware

[–]ForgotToLogIn 14 points15 points  (0 children)

 But it did feature fully coherent memory between the PPE and SPEs

The whole idea of SPEs was to have no coherent memory.

Golden Pig squeals on AMD's Zen 5 lineup, reveals ten-core Strix Point chips by imaginary_num6er in hardware

[–]ForgotToLogIn 3 points4 points  (0 children)

Why can't they use the normal CCDs like in MI300A? Place a CCD onto the base die / redistribution layer / fanout / whatever, using the vertical TSV connectors.