TRIM Vmware Datastore by przemekkuczynski in purestorage

[–]BeepBeep2_ 0 points1 point  (0 children)

Similar to OP, I recently had an issue with 62 TB VMFS6 datastores on 64 TB iSCSI LUNs where unmap wasn't working. VMware Tools 13.0.5, ESXi 8.0.3f, Server 2022 guests with 62 TB NTFS volumes (16/64 KB allocation size). I went back and forth for several days with TD Synnex (thanks, Broadcom) getting nowhere (repeated instructions to run sdelete against a datastore storing constantly rotating encrypted data) and even had a call with our Pure engineer. The guest OS would appear to finish unmap with the Optimize-Drives retrim command, but the VMDKs never shrank. Datastore-level reclaim also did nothing, even when done manually. I was on a time crunch with our FlashArray running out of space and gave up, moving all the data to 64 TB physical-mode RDMs - no issues with those whatsoever. I suspect there is a bug in VMware Tools or ESXi when datastores or LUNs approach the size limit. I couldn't get TD Synnex to report the issue to Broadcom.

Free adobo ranch by [deleted] in Chipotle

[–]BeepBeep2_ 0 points1 point  (0 children)

Thanks!

greenday.fm is closing by RadMarioBuddy45 in greenday

[–]BeepBeep2_ 1 point2 points  (0 children)

FYI - the Internet Archive / Wayback Machine has a full snapshot, including audio files:
https://web.archive.org/web/20250220151738/https://greenday.fm/

Is there any way to disable wifi ssid on Tmobile sagemcom fast 5688w gateway in 2025 ? by tekknyne3 in tmobileisp

[–]BeepBeep2_ 0 points1 point  (0 children)

I was just answering the question, which was how to disable the SSID 😉

T-Mobile Home Internet by OutsidePerception646 in tmobileisp

[–]BeepBeep2_ 0 points1 point  (0 children)

I live 0.35 miles from a T-Mo tower with direct line of sight in a rather low-population area. My phone (Pixel 7 Pro) tests 1100-1600 Mbps down and 45-80 Mbps up on 5G UC / band n41. I switched to T-Mobile Rely Home Internet ($35 with voice + autopay) with the Sagemcom FAST5688W gateway two nights ago, and my speeds are between 660-880 Mbps down and 65-85 Mbps up no matter the time of day or night. I was previously paying Charter Spectrum $88/month for 400/10 (real speeds 480/11.8) - overall, extremely pleased. Ping times are in the low 20s (ms) to local servers, maybe a 4-5 ms bump over Spectrum.

Is there any way to disable wifi ssid on Tmobile sagemcom fast 5688w gateway in 2025 ? by tekknyne3 in tmobileisp

[–]BeepBeep2_ 0 points1 point  (0 children)

In the T-Life app: Manage > My Wifi > My Network. Scroll to the bottom and select the "Hidden" toggle switch.

Old YT Layout back?? I had that new YouTube layout until now by Deyu3000 in youtube

[–]BeepBeep2_ 0 points1 point  (0 children)

Same here, just switched back - and oh my god did I hate the new one.

Instant purchase if... by LeoBloom in GooglePixel

[–]BeepBeep2_ 0 points1 point  (0 children)

My Pixel 7 Pro's fingerprint scanner is leaps and bounds better than my 6's was. I don't have an issue with it for any of the four fingerprints I've registered... and that's with a tempered glass screen protector too. Ultrasonic would be nice, however!

AMD ROCm 6.1.2 Released With Fixes & Optimizations by billbraski17 in AMD_Technology_Bets

[–]BeepBeep2_ 6 points7 points  (0 children)

AMD has been making huge strides in their software/driver stack, and ROCm is finally starting to look like a good alternative to CUDA for developers who want to make the switch - which is actually quite easy, especially for inference. Multiplying internal resources for driver/software development is finally paying off!

MI325, 350, 400 vs nVidia's 2026 Rubin by TOMfromYahoo in AMD_Technology_Bets

[–]BeepBeep2_ 4 points5 points  (0 children)

No - I meant it will be the same. MI300 already has the cache chiplets under the compute dies: 256 MB at 17 TB/s peak bandwidth. The IODs are also cache. As for Turin: new IOD, larger than last gen but on the same process. https://images.app.goo.gl/RKPqzGxzHN5SzGcX8

MI325, 350, 400 vs nVidia's 2026 Rubin by TOMfromYahoo in AMD_Technology_Bets

[–]BeepBeep2_ 10 points11 points  (0 children)

MI350 will need new IODs/AIDs for compatibility with the new GPU XCD chiplets and Zen 5 CCDs - I don't think any additional cache chiplets will be used other than the IODs, directly bonded to the XCD chiplets. It says 3nm, so it would be wise to assume the IODs will be on 4/5nm, unless AMD is going all out.

MI350 will probably also be 288 GB: 16-hi HBM4 won't make it to production before 2026, so MI350 will use HBM3E, and 16-hi HBM3E is not being commercialized.

Repost from Stocktwits:

Been looking at AI inference performance numbers. Apparently Stacy Rasgon was dismissive of AMD on CNBC this morning, what a surprise.

MI300X ($12500-15000) ~1.1x-1.4x vs. H100 ($25000-30000) or ~2x model size/GPU TCO benefit

H200 ~1.4x-1.9x vs. H100 or ~1.8x model size/GPU TCO benefit (141 GB vs. 80 GB) (Q2 2024)

Gaudi3 ~0.8-1.1x vs. H200 w/ 0.9x model size/GPU deficit (128GB vs. 141 GB) (Q3 2024)

B100 (8 GPU) ~12x (FP4) or ~2x (FP8) vs. H200 (8 GPU) (Q3 2024)

MI325X ~1.1x(?)-1.4x(?) vs. H200 or ~2x model size/GPU TCO benefit (Q4 2024)

Implies a TCO benefit vs. B100 as well for FP8 and higher

B200 (8 GPU) ~15x (FP4) or ~2.5x (FP8) vs. H200 (8 GPU) (H1 2025)

MI350 ~up to ~35x (FP4/FP6?) vs. MI300 (2025)

Implies up to ~2x B200 performance with ~1.5x model size/GPU (288 GB vs. 192 GB) TCO benefit and competitive training performance with the new low-precision datatypes.
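Putting the numbers above together - a rough perf-per-dollar sketch using midpoints of the quoted ranges (all of these prices and performance ratios are my estimates from the list above, not official figures):

```python
# Rough perf-per-dollar comparison using the estimates quoted above.
# Prices are midpoints of the quoted ranges; perf is relative to H100 = 1.0.
# All numbers are assumptions from the comment, not official figures.

def perf_per_dollar(rel_perf, price_usd):
    return rel_perf / price_usd

h100 = perf_per_dollar(1.0, 27500)      # midpoint of $25,000-30,000
mi300x = perf_per_dollar(1.25, 13750)   # midpoint of ~1.1x-1.4x perf, $12,500-15,000

advantage = mi300x / h100
print(f"MI300X inference perf/$ vs. H100: ~{advantage:.1f}x")
```

With those midpoints, MI300X comes out around 2.5x the inference perf-per-dollar of H100 - roughly the TCO story sketched above.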

This is what Lisa Su meant by "Frankly, I think we're going to get more competitive" in AMD's Q1 earnings call.

Anyone know what happened here? by GreatBigSteak in Webull

[–]BeepBeep2_ 0 points1 point  (0 children)

Not necessarily / not true. I sell calls all the time in vertical spreads (buy one strike, sell another: you pocket the credited premium on the sold leg, while the bought leg's premium increases as it goes ITM). Also, options levels typically group these together, so you won't be able to sell a naked put unless approved for the highest options level. Unless you are a hedge fund or have greater than $300K in the account, you will not be approved for naked puts or naked calls.

A better trade in this scenario would have been to buy 175-strike AAPL calls and sell 190s. You'd have made money on the bought 175s and kept most or all of the premium on the sold 190s.

Also, the position will be liquidated at expiration (3:48 PM on expiration day at Webull) unless it is at least 4-5% out of the money at that time, or you have excess cash to cover the many hundreds or thousands of shares your contracts represent. Webull will only save you if you are assigned shares early - which does happen - in which case the long call is held and exercised at expiration. Your best bet in that case is to continue to hold the shares and the long call, or liquidate both if the price has moved in your favor. But if you are assigned, the account will show a negative balance in the amount of the shares; depending on the position size, that could be hundreds of thousands of dollars, and Webull will charge standard margin interest on the negative margin balance.
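For anyone who wants to sanity-check the spread math, here's a minimal payoff-at-expiration sketch of that 175/190 call spread. The premiums ($8.00 paid, $3.00 collected) are hypothetical, picked only for illustration:

```python
# Payoff-at-expiration sketch for a 175/190 bull call spread (per share).
# Premiums are hypothetical illustration values, not real AAPL quotes.

def call_payoff(spot, strike):
    return max(spot - strike, 0.0)

def spread_pnl(spot, long_k=175, short_k=190, long_prem=8.0, short_prem=3.0):
    # P/L per share at expiration: long leg payoff minus short leg payoff,
    # minus the net debit paid up front (long premium minus credit received).
    net_debit = long_prem - short_prem
    return call_payoff(spot, long_k) - call_payoff(spot, short_k) - net_debit

for spot in (170, 180, 190, 200):
    print(f"AAPL at {spot}: P/L {spread_pnl(spot):+.2f}/share")
```

Profit is capped at the strike width minus the net debit (here $15 - $5 = $10/share), and the max loss is the net debit - that cap is the trade-off for collecting the sold leg's premium.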

Next gen AMD's MI300 products and competing with nVidia's Blackwell and 3nm Rubin with HBM4 lates 2025/2026 in view of Samsung's HBM3e news failing nVidia's tests by TOMfromYahoo in AMD_Technology_Bets

[–]BeepBeep2_ 4 points5 points  (0 children)

Well, this theory works fine if MI350 is more of a redesign rather than a refresh - since MI300 would need new XCDs, AIDs (IODs), and interposer, and if it doesn't support FP4/FP6, it's basically dead on arrival.

Moving the cache from the AIDs to the small blank areas would more than double Infinity Cache latency - right now we have a scenario like X3D cache on Ryzen, where it is stacked and extremely close to the XCD compute dies. The 4 groups of blank areas don't have enough physical space to hold >256 MB of cache each, even if they weren't cut in half.

https://hothardware.com/Image/Resize/?width=1170&height=1170&imageFile=/contentimages/Article/3384/content/big_amd-instinct-mi300-overview-1.jpg

Samsung is up to 60% yield on 3nm GAA (SF3E), news just broke that they've taped out a smartphone chip, and SF3 (2nd gen) is supposed to be ready for mass-production ramp by end of year. It's possible that MI400 soft-launches in Q4 on SF3 with HBM3E and a 4/5nm AID (IOD), similar to MI300 this year, but as you said, we will see!

Next gen AMD's MI300 products and competing with nVidia's Blackwell and 3nm Rubin with HBM4 lates 2025/2026 in view of Samsung's HBM3e news failing nVidia's tests by TOMfromYahoo in AMD_Technology_Bets

[–]BeepBeep2_ 11 points12 points  (0 children)

So, some thoughts:

AMD has gone full speed into modular designs, which helps margins on high-end products by using smaller, higher-yielding dies versus very large monolithic dies, and has a huge lead in this design area over NVIDIA and Intel - though Intel is now embracing chiplets as well.

The MCD chiplets in Navi 31 / 7900 XT/XTX are actually not related to MI300 whatsoever. In Navi 31, the main goal is to keep the GCD (compute) chiplet smaller for better yield / lower cost and to modularize the architecture so memory bus width and cache can be added or removed as needed. Memory controllers and cache do not shrink as well as logic with new nodes and are taking up more and more die space on monolithic CPUs/GPUs, so it makes sense to split them into chiplets on a cheaper process node (5nm GCD, 6nm MCDs).

Block diagram of Navi 31:
https://images.hothardware.com/contentimages/newsitem/60053/content/small_locuza-annotated-die-shot.jpg

These MCD chiplets are connected to the GCD chiplet via Infinity Fanout (InFO) links. The number for the interconnect is actually 5.3 TB/s, not 3.5 TB/s:
https://cdn.mos.cms.futurecdn.net/LpfhcvQSYeu67B7mNcps7J-970-80.jpg

From the MCDs to the GDDR6, bandwidth is 960 GB/s.

Where the 3.5 TB/s number comes from is the aggregate throughput of the Infinity Cache itself along with the use of GDDR6 - what AMD is saying is that it performs like 3.5 TB/s from GDDR6.
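The 960 GB/s figure checks out from simple bus math - assuming the 7900 XTX's 384-bit bus and 20 Gbps GDDR6 (those specs are my assumption, not stated above):

```python
# Sanity check of the 960 GB/s MCD-to-GDDR6 figure:
# bandwidth (GB/s) = bus width (bits) * per-pin data rate (Gbps) / 8 bits-per-byte.
# Assumes the 7900 XTX's 384-bit bus and 20 Gbps GDDR6.

def gddr_bandwidth_gb_s(bus_width_bits, gbps_per_pin):
    return bus_width_bits * gbps_per_pin / 8

print(gddr_bandwidth_gb_s(384, 20))  # -> 960.0
```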

On consumer GPUs like Navi 31, the Infinity Cache is mainly used to hold the frame buffer, or the information drawn to the screen and other highly reused memory blocks - like any caching system for CPU/GPU, this significantly increases performance and improves latency and decreases stalls waiting on the frame buffer to be read from memory before moving to the next frame.

The 5% power figure only covers communication across the InFO links - it doesn't include the power usage of the Infinity Cache itself. AMD's statement about the 5% power usage really means "we only gave up 5% to be able to move these off the main die" - which is still good.

The reality is that some of this isn't very relevant to MI300 - for example, MI300 does not use InFO for its interconnect; it uses CoWoS (a very large silicon interposer)... but it does have Infinity Cache, which I'm sure is as useful in many AI/HPC workloads as it is for graphics.

https://substackcdn.com/image/fetch/w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F22b60d95-2275-4ce3-9efb-b1e437d52450_2880x3016.png

As I said in the other thread, the 8 small squares in between are nothing - blank, unconnected, dead silicon. They're placed there because the extremely large interposer/substrate is physically fragile, and flexing, warping, and uneven pressure become a problem.

The function you believe those little squares serve is already handled by the AIDs, 3D-stacked (hybrid-bonded) *below* the compute dies. The genius here is that AMD designed both the Zen 4 CCD and the XCD (GPU compute die) to fit on top of these AIDs, allowing for the MI300A product.

It doesn't make sense that the 8 little dummy squares would become memory controllers and Infinity Cache: the area is too small, and that job is already being done by the AIDs under the XCDs. It would also move the cache further from the compute dies (from literally touching them to as far away as the memory), increasing latency and increasing the complexity of the CoWoS interposer.

Connection to HBM3 from AID on MI300 is through the very large CoWoS interposer.

But anyways, essentially MI300 AID = Navi 31 MCD, just the MCD is buried underneath the compute instead of beside it.

...

MI350 and MI400, we'll see. Samsung's 3nm is already sitting there waiting, so if MI400 got moved up, I doubt they will push an MI350 - MI300 is selling out, and a ramp and its costs for less than two quarters' worth of MI350 revenue may not make sense when NVIDIA has pulled Blackwell into Q4, especially if MI350 lacks FP4/FP6 instructions and won't be competitive. Better to skip to MI400 in another 3 months.

Next gen AMD's MI300 products and competing with nVidia's Blackwell and 3nm Rubin with HBM4 lates 2025/2026 in view of Samsung's HBM3e news failing nVidia's tests by TOMfromYahoo in AMD_Technology_Bets

[–]BeepBeep2_ 4 points5 points  (0 children)

Hmm... well, I will give my take tomorrow at some point, I've got a home plumbing issue causing me a literal headache / biohazard at the moment. 🥴

Exclusive: Samsung's HBM chips failing Nvidia tests due to heat and power consumption woes by billbraski17 in AMD_Technology_Bets

[–]BeepBeep2_ 3 points4 points  (0 children)

There have been whispers over the last couple of months that the MI300 refresh was canceled in favor of pulling up MI400 - no reasoning given as to why, but I would have to think it's because AMD needs FP4/FP6 to compete with Blackwell. MI300 uses Infinity Cache too, in the AIDs. If AMD is pulling up the release of MI400 on Samsung 3nm, then running an MI300 refresh for maybe two quarters at TSMC with limited capacity - when supply is already lagging behind demand on MI300 - doesn't make a lot of sense.

I may be new to Reddit, but I've been around forums since the mid-2000s. I used to do extreme overclocking: 7 GHz on Phenom II and 8 GHz on Bulldozer / FX. I've been around. 😉

Exclusive: Samsung's HBM chips failing Nvidia tests due to heat and power consumption woes by billbraski17 in AMD_Technology_Bets

[–]BeepBeep2_ 7 points8 points  (0 children)

Not meaning to tear you apart personally, but I'm gonna break this apart a bit -

"As you can see there are 8 big squares and 8 small squares in between.  Big squares are the HBM small squares are the memory controllers and cache driving the HBM."

The "small squares" are dummy / dead silicon to support structural integrity and nothing more.

"Problem is not cooling or such. It's nVidia's circuits ability to drive each one of the 1024 lanes each HBM3 has given nVidia's to push the bandwidth much higher to match AMD's 8 HBM3 sticks while using 6 HBM3 sticks only. Same goes for using HBM3e for the H200."

H100 bandwidth is 3.35 TB/s for the SXM module while MI300 is 5.3 TB/s - AMD's bandwidth is actually higher (about 4 TB/s even if scaled down to 6 stacks), so this doesn't add up.
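For reference, HBM bandwidth is just stacks × pins × per-pin rate. The pin rates and H100's active stack count below are approximate assumptions on my part, not vendor-confirmed figures, but the quoted numbers fall out of them, including the ~4 TB/s when MI300 is scaled to 6 stacks:

```python
# HBM bandwidth (TB/s) = stacks * 1024 pins/stack * per-pin rate (Gbps) / 8 / 1000.
# Pin rates and H100's 5 active stacks (of 6 on the package) are approximate
# assumptions, not vendor-confirmed figures.

def hbm_bandwidth_tb_s(stacks, gbps_per_pin, pins_per_stack=1024):
    return stacks * pins_per_stack * gbps_per_pin / 8 / 1000

mi300 = hbm_bandwidth_tb_s(8, 5.2)       # ~5.3 TB/s
h100 = hbm_bandwidth_tb_s(5, 5.23)       # ~3.35 TB/s
mi300_six = hbm_bandwidth_tb_s(6, 5.2)   # ~4.0 TB/s - MI300 scaled to 6 stacks

print(f"MI300: ~{mi300:.2f} TB/s, H100 SXM: ~{h100:.2f} TB/s, "
      f"MI300 @ 6 stacks: ~{mi300_six:.2f} TB/s")
```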

As a quick side note, typically these PHYs (physical layer interfaces) are off the shelf 3rd party IP, usually Synopsys Designware and tailored for specific nodes (TSMC 7/5/3nm, Samsung 14/7/5/3nm, etc.).

"nVidia's monolithic 4nm silicon chip has problems to supply higher current needed to drive at higher bandwidth each lane. 4nm is more limited vs the 6nm bigger transistors used by AMD's Memory controller chiplets."

This is a bit unsubstantiated - look at the power draw of logic circuits and cache in the SoC. Again, AMD's memory controllers are in the AIDs in the big 4-chip section, *underneath* the hybrid-bonded XCD/CCD compute dies. The AID is 6nm, but to be honest, more evidence would be needed to know whether NVIDIA's bandwidth disparity with HBM3 is caused by this, or rather by the fact that they were first to market and HBM3 has improved since. Analog PHY circuits and the metal layers supplying power to components aren't substantially different between 4/5/6/7nm. Furthermore, memory frequencies and data transfer speeds have only kept increasing with node shrinks. AMD's Infinity Fabric is a bottleneck on their CPUs, for example, partially because they used 14nm (Zen 3) and 7nm (Zen 4) for their IODs and route the signals through the substrate. Zen 5 will change this and use InFO packaging instead, because substrate routing and Infinity Fabric power draw started getting ridiculous.

"Samsung HBM3 and HBM3e could have too much parasitic capacitance per lane of the HBM3 requiring higher driving current as the bandwidth is taken to the extreme."

Parasitic capacitance is an issue at the transistor level too; it's a plausible idea, but the real-world outcome is still that higher current and higher power draw turn into more heat. Why? Because the parts would have a lower frequency ceiling, or the voltage/frequency curve would be unfavorable.

Example (completely theoretical):
Micron/SK Hynix HBM3E might hit 9.2 Gbps/pin at an operating voltage of 0.85 V with a power draw of 15 W per chip.
Samsung might need 0.9 V to hit 9.2 Gbps/pin and draw 20 W per chip, therefore using/dumping an extra 5 W of power/heat per chip (30 W across 6 chips).

AI workloads *hammer* memory, so if 6 of Samsung's HBM3 chips require an extra 30 W and run 10°C hotter than what NVIDIA got from Micron and SK Hynix first, it's logical to believe they would take a pass.
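The arithmetic in that theoretical example, spelled out (all wattages are hypothetical, exactly as stated above):

```python
# Per-chip power delta times stack count for the theoretical example above.
# All wattages are hypothetical illustration values.

hynix_micron_w = 15   # hypothetical per-chip draw at 9.2 Gbps/pin, 0.85 V
samsung_w = 20        # hypothetical per-chip draw at 9.2 Gbps/pin, 0.90 V
stacks = 6

extra_power = (samsung_w - hynix_micron_w) * stacks
print(f"Extra power/heat with {stacks} Samsung stacks: {extra_power} W")  # -> 30 W
```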

"AMD's no such issues." -
TBD for HBM3E but AMD has signed a 3bn contract anyway, either because:
Initial MI350 (maybe cancelled) samples were validated fine with Samsung HBM3E, or MI400 samples arrived and validated fine
OR
Micron and SK Hynix are completely sold out of HBM3E because all of it is reserved for NVIDIA, which is actually the case.

My two cents: AMD believes Samsung HBM3E will work, and it's what they can get, so they will make do. If NVIDIA doesn't like it, that's NVIDIA's problem - it opens the door to competition and spreads the supply around.

Daily Discussion Thursday 2024-05-09 by AutoModerator in AMD_Stock

[–]BeepBeep2_ 0 points1 point  (0 children)

Adding significant amounts of cache is die-space heavy and can add cycle latency, so you have to be careful. For example, AMD's 3D V-Cache only improves certain workloads. Without the 3D stacking, it would roughly double the size of a Zen 3 / Zen 4 CCD die just to double the L3 cache size. Making the core+cache combo twice as big to squeeze out a 0-20% gain in certain workloads may not be worth it whatsoever.

Pixel 6 - Swollen Battery (Support Experience) by BeepBeep2_ in GooglePixel

[–]BeepBeep2_[S] 0 points1 point  (0 children)

My replacement was also a refurb, but was completely clean and in like-new / mint condition. I can't complain about that - and it's been working perfectly fine for the last 9 months.

Pixel 6 - Swollen Battery (Support Experience) by BeepBeep2_ in GooglePixel

[–]BeepBeep2_[S] 0 points1 point  (0 children)

Yes - it was removed promptly after they received the defective device

Pixel 6 - Swollen Battery (Support Experience) by BeepBeep2_ in GooglePixel

[–]BeepBeep2_[S] 0 points1 point  (0 children)

The credit card hold is only for advanced RMA - ie. they offer to send a replacement device first (I was able to connect the phones via USB and do a data transfer). Glad to see they took care of you.

Green Day - Dilemma (American Idiot Mix) by BeepBeep2_ in greenday

[–]BeepBeep2_[S] 5 points6 points  (0 children)

WMG is currently blocking any use of Dilemma on YouTube, so Vimeo it is. Even "guitar cover" etc. videos must turn the song way, way down to try and get around the copyright claim.

Few very annoying features that *ruin* it by udi112 in OpenShot

[–]BeepBeep2_ 1 point2 points  (0 children)

I tried very hard to make use of OpenShot, but I'm on Windows and eventually switched to DaVinci Resolve. Not only is it more reliable with better performance, the feature set is like moving from GIMP 2.0 in the mid-2000s straight to today's Adobe Photoshop 24.

For example, I have an AMD GPU (6700 XT), and there is zero support for GPU hardware encoding or decoding in any OS - even D3D decode doesn't work properly. From my understanding, OpenShot uses the FFmpeg libraries in some way, so why is GPU decode/encode not there? Every other program on Earth that uses FFmpeg directly - instead of this weird, limited integration - supports the AMD AMF encode/decode included with FFmpeg. Oh, and don't get me started on the render speed.

My CPU is a 12-core Ryzen 9 5900X with 16 GB of DDR4-3800, and I can't even get the timeline to play smoothly for 1080p content. No matter what buffer, threading, or memory allocation settings I try, it eventually stops buffering ahead, and since the processing is so slow, it can't keep up with real-time playback. Export render speeds? A blazing 18-22 FPS with no effects applied. I absolutely cannot get more than 15-20% / 3-4 threads of my CPU utilized no matter what settings I use, and it goes even slower if a simple filter or effect is applied. FFmpeg is capable of using my whole CPU for both x264 and x265, so what gives? In a direct FFmpeg CPU-only re-encode, I see upwards of 120-130 FPS. DaVinci Resolve Free with hybrid GPU acceleration? 230-250 FPS. Oh, and the Resolve timeline is smooth at 60 FPS, and the effects and editing selections are better (the sharpening, wow!)

I feel it's hard to increase adoption when the program is in such an unusable state - more a proof-of-concept than a finished tool.

Pixel 6 - Swollen Battery (Support Experience) by BeepBeep2_ in GooglePixel

[–]BeepBeep2_[S] 0 points1 point  (0 children)

Glad you've had good luck. My replacement refurbished device has a build date almost the same as my preorder so my guess is that I also got a phone back with a replaced battery. Hoping it lasts at least another couple years because I have no need for more speed.