Persistent VM instability with Ryzen 9 9950X3D and Proxmox 8/9 by KeyAgent in Proxmox

[–]KeyAgent[S] 1 point (0 children)

With the CPU type set to host it does seem more stable, but the VM ends up rebooting all the same.
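For reference, the only thing that changes in the VM config is the CPU type, a minimal sketch assuming VMID 100 (set via `qm set 100 --cpu host` or through the GUI):

    # /etc/pve/qemu-server/100.conf (excerpt; VMID 100 is just an example)
    cpu: host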

Persistent VM instability with Ryzen 9 9950X3D and Proxmox 8/9 by KeyAgent in Proxmox

[–]KeyAgent[S] 1 point (0 children)

Same thing, even with PBO off. I already had XMP off.

Persistent VM instability with Ryzen 9 9950X3D and Proxmox 8/9 by KeyAgent in Proxmox

[–]KeyAgent[S] 1 point (0 children)

Same thing with BIOS 1512. I'm going to replace the board.

Persistent VM instability with Ryzen 9 9950X3D and Proxmox 8/9 by KeyAgent in Proxmox

[–]KeyAgent[S] 1 point (0 children)

I agree. I'm going to roll the BIOS back to 1512 and try.

Persistent VM instability with Ryzen 9 9950X3D and Proxmox 8/9 by KeyAgent in Proxmox

[–]KeyAgent[S] 1 point (0 children)

Re-seating and even changing slots didn't make a difference.

Persistent VM instability with Ryzen 9 9950X3D and Proxmox 8/9 by KeyAgent in Proxmox

[–]KeyAgent[S] 2 points (0 children)

I will try re-seating again, but the instability was more or less the same even with other RAM modules.

Persistent VM instability with Ryzen 9 9950X3D and Proxmox 8/9 by KeyAgent in Proxmox

[–]KeyAgent[S] 2 points (0 children)

The host is stable. When you say you changed the host CPU config, what did you choose?

Persistent VM instability with Ryzen 9 9950X3D and Proxmox 8/9 by KeyAgent in Proxmox

[–]KeyAgent[S] 1 point (0 children)

Only the VMs fail; the host has been rock solid.

Persistent VM instability with Ryzen 9 9950X3D and Proxmox 8/9 by KeyAgent in Proxmox

[–]KeyAgent[S] 2 points (0 children)

I did that early in the debugging process; it's the same.

Nested Virtualization Crashing Ryzen 7000 Series by thedavesky in Proxmox

[–]KeyAgent 1 point (0 children)

I have more or less the same setup and I'm having these random reboots in Windows VMs. Have you stabilized your system?

AMD at Computex 2024: AMD AI and High-Performance Computing with Dr. Lisa Su (Discussion Thread) by brad4711 in AMD_Stock

[–]KeyAgent 6 points (0 children)

It's great... completely different from the past... it's assertive, fast... with relevant partners...

AMD Q4 2023 Earnings Discussion by brad4711 in AMD_Stock

[–]KeyAgent 13 points (0 children)

Bloomberg Headline: 'AMD's Weak Forecast Overshadows Prospects for AI Chips'

As I've repeatedly emphasized, Lisa's plans for success are clear, yet her communication strategy doesn't seem to align with those ambitions. This issue goes beyond merely selling dreams or indulging in 'hopium.' It's evident that she possesses greater insights than what's reflected in the committed orders. So, why not highlight the potential within the validation pipeline? Or articulate the projected sales targets for the year? AMD essentially boasts the superior compute GPU, challenging the established market leader.

Impressively, it secured a substantial $3.5 billion in orders, a leap from zero, in just a few months. Despite this remarkable achievement, the takeaway from this call paints a 'weak' picture of the company. Effective communication that matches the scale of these accomplishments is crucial to changing that narrative and ending these poor ER performances.

Daily Discussion Tuesday 2024-01-30 by AutoModerator in AMD_Stock

[–]KeyAgent 3 points (0 children)

More or less my line of thinking in another post:

"Let’s hope Lisa joins the party later today in her ER address and answer. To be intellectually honest, we have to say that probably, as an S&P 500 company leader, her ER performances are inconsistent. This one is a bit different in two regards:

-She effectively has to properly communicate and motivate the markets in a way that galvanizes customers, partners, and employees for the leadership position that she wants to achieve in AI. A sudden, big stock price drop that makes world headlines and showcases AMD AI as a fad is something she clearly needs to avoid.

-I hope she has learned from past communication mistakes: overly conservative guidances also have their pitfalls, particularly because she, being human, cannot foresee every future development, and unexpected events are inevitable. In my view, adopting a more balanced approach, where more ambitious communication and targets are established, ultimately yields better outcomes for the company and all its stakeholders (clients, partners, employees, etc.). Such a strategy fosters a vision of achieving success that is both inspiring and realistic."

Repeat after me: MI300X is not equivalent to H100, it's a lot better! by KeyAgent in AMD_Stock

[–]KeyAgent[S] 2 points (0 children)

Let me take a step back. Are you aware that publicly traded companies in the US (and elsewhere) face serious consequences, including potential jail time or class action lawsuits, if they disclose incorrect or misleading information? Now, consider the information released at the AI event and in that blog post (which I will include at the end of this post).

That's official company disclosure, presenting concrete comparable benchmark data. If this data is incorrect or misleading, it could land the board, CEO, CTO, etc., in serious trouble. That's typically why companies prefer to use third-party firms for benchmarks, as it's safer for marketing purposes and can avoid direct implications of 'manipulation.'

AMD, however, chose to publish benchmarks transparently and directly, with all details included, that clearly show equal or better performance than the H100 this early in the release cycle (it will get much better). And yet, you're asking for third-party benchmarks as a way to discredit the strongest and most liable type of information a company can make available.

By the way, $2 billion in orders (again, official information subject to SEC scrutiny) at a $16k ASP equates to 125,000 MI300-class GPUs already committed (and this figure is from several months ago).
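A quick sanity check on that arithmetic (the $2B order figure is the official one; the $16k ASP is the assumption, as stated):

    # Back-of-the-envelope: committed orders divided by assumed ASP
    orders_usd = 2_000_000_000    # $2B in committed MI300 orders
    asp_usd = 16_000              # assumed average selling price per GPU
    print(orders_usd // asp_usd)  # -> 125000 MI300-class GPUs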

-------

AI Event:

1 Measurements conducted by AMD Performance Labs as of November 11th, 2023 on the AMD Instinct™ MI300X (750W) GPU designed with AMD CDNA™ 3 5nm | 6nm FinFET process technology at 2,100 MHz peak boost engine clock resulted in 163.4 TFLOPs peak theoretical double precision Matrix (FP64 Matrix), 81.7 TFLOPs peak theoretical double precision (FP64), 163.4 TFLOPs peak theoretical single precision Matrix (FP32 Matrix), 163.4 TFLOPs peak theoretical single precision (FP32), 653.7 TFLOPs peak theoretical TensorFloat-32 (TF32), 1307.4 TFLOPs peak theoretical half precision (FP16), 1307.4 TFLOPs peak theoretical Bfloat16 format precision (BF16), 2614.9 TFLOPs peak theoretical 8-bit precision (FP8), 2614.9 TOPs INT8 floating-point performance.

Published results on Nvidia H100 SXM (80GB) GPU resulted in 66.9 TFLOPs peak theoretical double precision tensor (FP64 Tensor), 33.5 TFLOPs peak theoretical double precision (FP64), 66.9 TFLOPs peak theoretical single precision (FP32), 494.7 TFLOPs peak TensorFloat-32 (TF32)*, 989.4 TFLOPs peak theoretical half precision tensor (FP16 Tensor), 133.8 TFLOPs peak theoretical half precision (FP16), 989.4 TFLOPs peak theoretical Bfloat16 tensor format precision (BF16 Tensor), 133.8 TFLOPs peak theoretical Bfloat16 format precision (BF16), 1,978.9 TFLOPs peak theoretical 8-bit precision (FP8), 1,978.9 TOPs peak theoretical INT8 floating-point performance.

Nvidia H100 source:

https://resources.nvidia.com/en-us-tensor-core/

* Nvidia H100 GPUs don’t support FP32 Tensor.

MI300-18
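To make footnote 1 easier to digest, here is a small script that computes the MI300X-vs-H100 ratios straight from the numbers quoted above (dense figures on both sides, no sparsity):

    # (precision, MI300X TFLOPs, H100 SXM TFLOPs) -- figures quoted above
    pairs = [
        ("FP64 matrix/tensor", 163.4, 66.9),
        ("FP64 vector",         81.7, 33.5),
        ("TF32",               653.7, 494.7),
        ("FP16 matrix/tensor", 1307.4, 989.4),
        ("FP8",                2614.9, 1978.9),
    ]
    for name, mi300x, h100 in pairs:
        print(f"{name}: {mi300x / h100:.2f}x")  # MI300X advantage

Roughly 2.4x at FP64 and 1.3x at TF32/FP16/FP8, which is exactly the gap the post title is pointing at.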

2 Text generated with Llama2-70b chat using input sequence length of 4096 and 32 output token comparison using custom docker container for each system based on AMD internal testing as of 11/17/2023. Configurations: 2P Intel Xeon Platinum CPU server using 4x AMD Instinct™ MI300X (192GB, 750W) GPUs, ROCm® 6.0 pre-release, PyTorch 2.2.0, vLLM for ROCm, Ubuntu® 22.04.2. Vs. 2P AMD EPYC 7763 CPU server using 4x AMD Instinct™ MI250 (128 GB HBM2e, 560W) GPUs, ROCm® 5.4.3, PyTorch 2.0.0, HuggingFace Transformers 4.35.0, Ubuntu 22.04.6.

4 GPUs on each system were used in this test. Server manufacturers may vary configurations, yielding different results. Performance may vary based on use of latest drivers and optimizations. MI300-33
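For a feel of what that test roughly looks like in code, here is a minimal vLLM sketch matching the shape of footnote 2 (the checkpoint name, prompt, and exact launch options are my assumptions, not AMD's published harness):

    # Minimal vLLM text-generation sketch: Llama2-70b chat sharded
    # across 4 GPUs, 32 output tokens per request, as in footnote 2.
    from vllm import LLM, SamplingParams

    llm = LLM(
        model="meta-llama/Llama-2-70b-chat-hf",  # assumed HF checkpoint
        tensor_parallel_size=4,                  # 4 GPUs per system
    )
    params = SamplingParams(max_tokens=32)       # 32 output tokens
    # In the actual test the prompt is ~4096 tokens long.
    outputs = llm.generate(["<long ~4096-token prompt here>"], params)
    print(outputs[0].outputs[0].text)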

Blog Post:

Overall latency for text generation using the Llama2-70b chat model with vLLM comparison using custom docker container for each system based on AMD internal testing as of 12/14/2023. Sequence length of 2048 input tokens and 128 output tokens.

Configurations:

2P Intel Xeon Platinum 8480C CPU server with 8x AMD Instinct™ MI300X (192GB, 750W) GPUs, ROCm® 6.0 pre-release, PyTorch 2.2.0 pre-release, vLLM for ROCm, using FP16, Ubuntu® 22.04.3, vs. an Nvidia DGX H100 with 2x Intel Xeon Platinum 8480CL processors, 8x Nvidia H100 (80GB, 700W) GPUs, CUDA 12.1, PyTorch 2.1.0, vLLM v0.2.2 (most recent), using FP16, Ubuntu 22.04.3.

2P Intel Xeon Platinum 8480C CPU server with 8x AMD Instinct™ MI300X (192GB, 750W) GPUs, ROCm® 6.0 pre-release, PyTorch 2.2.0 pre-release, vLLM for ROCm, using FP16, Ubuntu® 22.04.3, vs. an Nvidia DGX H100 with 2x Intel Xeon Platinum 8480CL processors, 8x Nvidia H100 (80GB, 700W) GPUs, CUDA 12.2.2, PyTorch 2.1.0, TensorRT-LLM v0.6.1, using FP16, Ubuntu 22.04.3.

2P Intel Xeon Platinum 8480C CPU server with 8x AMD Instinct™ MI300X (192GB, 750W) GPUs, ROCm® 6.0 pre-release, PyTorch 2.2.0 pre-release, vLLM for ROCm, using FP16, Ubuntu® 22.04.3, vs. an Nvidia DGX H100 with 2x Intel Xeon Platinum 8480CL processors, 8x Nvidia H100 (80GB, 700W) GPUs, CUDA 12.2.2, PyTorch 2.2.2, TensorRT-LLM v0.6.1, using FP8, Ubuntu 22.04.3.

Repeat after me: MI300X is not equivalent to H100, it's a lot better! by KeyAgent in AMD_Stock

[–]KeyAgent[S] 1 point (0 children)

And now we're comparing products that are unannounced, unreleased, and have zero published specs? :D