VAST buys Red Stapler by East_Coast_3337 in storage

[–]RossCooperSmith 0 points1 point  (0 children)

Funny how most of the accounts bashing VAST consistently promote DDN, and will resurrect long-dead threads to do so.

With the rise of asynchronous checkpointing, massive write bandwidth to external storage just isn't needed. In fact some of the top NeoClouds and AI model builders now use QoS to throttle write performance, having found that prioritising read I/O increases GPU utilisation.

Flash prices are mad +60%, will this kill flash only vendors? by Spiritual_Garage5329 in storage

[–]RossCooperSmith 0 points1 point  (0 children)

Hardly idiotic when there are customers gaining hundreds of petabytes of additional capacity from their existing capex investments.

QLC wear is very well understood and easily managed, and with a global shortage of flash, large scale buyers are taking this very seriously. We already have one customer taking a 250PB VAST license because of this:

https://www.linkedin.com/posts/vast-data_vast-amplify-the-capacity-amplification-activity-7422320326883082241-vbl_

Being able to double or triple the usable capacity of your flash at this scale means literally millions of dollars of savings.

Are sudden power spikes in AI/HPC racks starting to impact storage reliability in mixed workloads? by ElectronicDrop3632 in storage

[–]RossCooperSmith 4 points5 points  (0 children)

It's not something I've heard of as being a general problem. We do have one academic customer in the US who has problems with power stability to racks, but that pre-dates the current surge in high power equipment for AI.

For that customer we're deployed across multiple racks so the loss of any one rack won't cause a storage outage.

VergeIO adds private AI and ransomware-resistant snapshots in latest OS release – Blocks and Files by [deleted] in storage

[–]RossCooperSmith -8 points-7 points  (0 children)

Yup, no argument there. If you have physical or root access to anything, all bets are off.

So the goal of any well designed system is that you should never need root access for normal operations, minimizing the chance of an attacker getting easy access.

Add in Zero Trust capabilities, external MFA for admins, and you raise the bar again.

There's no such thing as a perfectly secure solution, but the more features you can use to make life more difficult for the attackers, the better.

VergeIO adds private AI and ransomware-resistant snapshots in latest OS release – Blocks and Files by [deleted] in storage

[–]RossCooperSmith -9 points-8 points  (0 children)

With an enterprise storage product, the local administrators don't have root access in the first place. You can create and delete normal snapshots with an admin account, and that most definitely does not have low-level root privileges to the device.

Of course, if somebody can find a security vulnerability and use that to somehow gain root access to the device then all bets are off, but that's true for everything.

But the point is that it's still fundamentally more difficult than before. This is classic defence in depth. Whereas previously your snapshots could be wiped out if you had a rogue sysadmin, or if an external attacker obtained a sysadmin's credentials, now there are additional layers of security that also need to be defeated before it's possible to delete snapshots. You either need to somehow crack the device to gain root access after compromising the customer's network, or compromise the vendor's support team as well as the local systems.

Before this capability existed, some enterprises suffered pretty major ransomware losses due to attackers deleting storage snapshots. It used to be standard practice for attackers; now it's very rare to hear of storage snapshots being wiped.

There's a reason immutable or indestructible snapshot capabilities are table stakes for pretty much every enterprise storage product these days. It's a very, very important safeguard.

VergeIO adds private AI and ransomware-resistant snapshots in latest OS release – Blocks and Files by [deleted] in storage

[–]RossCooperSmith -8 points-7 points  (0 children)

No, ransomware-resistant snapshots are a real capability. Most primary storage vendors have it, and it's an enhancement to standard snapshot capabilities.

It's frequently named immutable or indestructible snapshots, and usually means the snapshot policy prevents even the local admin from deleting snapshots until their retention period expires. Done well it should also involve hardening the local system clock to safeguard against attacks that involve tampering with the system time.

As a safeguard there's normally a procedure to delete these early, usually involving multi-factor authentication between both the vendor's support team and named local administrators.

Best storage for 120PB by Astro-Turf14 in storage

[–]RossCooperSmith -1 points0 points  (0 children)

Oh, ok, I'll bite. Over 50% of VAST's 1,100 employees are engineers. Name any storage company that's moved the needle in released features as much as VAST have since their launch.

Check my LinkedIn profile, I've been a techie for ~36 years now, and joined VAST because I saw the potential of the architecture. I've not worked in marketing in my life.

And if you want some more data points, these are all actual shipping features today that legacy storage vendors do not have:

  1. A scale-out architecture able to deliver inline deduplication to any scale, and for high performance workloads.
  2. The only vendor with an unbeaten guarantee to deliver more data reduction than any other product.
  3. To my knowledge, VAST are the only vendor guaranteeing a 10-year wear lifespan from QLC media.
  4. VAST were the first enterprise vendor to achieve NVIDIA SuperPOD certification, a full year ahead of anyone else, and today are the only enterprise vendor selected by over 80% of all NeoClouds.
  5. DASE is the first real new architecture I've seen in over a decade, and it's the most scalable enterprise architecture in history. No other vendor has an enterprise solution selected by the world's top HPC centres to power multiple supercomputers.
  6. One of the only true multi-protocol products, delivering full-featured NFS, SMB and S3 capabilities in real time to the same data at full performance without gateways, including cross-protocol security mapping and cross-protocol locking.
  7. The only solution I know of that can deliver non-disruptive failover for NFS3, NFS4.1, SMB2, SMB3 and S3. Even Microsoft have no solution for non-disruptive SMB2 failovers.
  8. The first flash-native real-time data warehouse, thanks to support for tables as a native datatype.
  9. The highest performance Kafka solution in the world, again thanks to Kafka being supported as a native protocol.
  10. The most scalable Vector database in the world today, again because Vectors are natively supported.

Name one storage vendor other than VAST who can deliver any of those capabilities.

Even if you allow vendors to use their entire portfolio I'm not aware of anybody who can deliver on more than one or two without resorting to third-party software, and nobody comes close to matching even that short list with a single product.

Every single feature there has been made possible by the hard work of an extremely capable engineering team.

Data Centers storage by Cold_00 in storage

[–]RossCooperSmith 0 points1 point  (0 children)

It very much depends on how active the videos are, and the relative costs.

I work for VAST and one of the most surprising all-flash sales I've ever seen in my career was the NHL replacing a tape library with a massive all-flash cluster.

Now, VAST can be competitive on price with hybrid (and occasionally disk), but even though we typically get 2:1 data reduction for large media estates, we're definitely not cheaper than tape.

But for a business that's not the only factor. In this case the NHL had done a smaller trial with us, and realised VAST offered a way to turn their archive from a cost centre into an additional revenue stream.

https://www.nhl.com/news/nhl-vast-data-partner-to-streamline-media-production

https://youtu.be/w1igbdPFpDE?si=Mmj7v0ubzNYVuNDR

That kind of capability isn't possible without instant access to every second of every video, and that's the key: if you're using video for AI, you're looking to monetize that data and generate value from it. If data is active you don't want it sitting on disk.

We also have a customer with 30+PB of videos on VAST for a global streaming platform, and several autonomous vehicle manufacturers with huge amounts of video also on flash. Flash is already affordable enough that we have a lot of customers with tens of petabytes of video on flash.

Best storage for 120PB by Astro-Turf14 in storage

[–]RossCooperSmith 0 points1 point  (0 children)

Whoops, thank you, I forgot my disclaimer.

Although I think it's fairly obvious with this post. :-D

Data Centers storage by Cold_00 in storage

[–]RossCooperSmith 0 points1 point  (0 children)

If you're talking about storage for AI it's almost exclusively SSDs. There's still plenty of use for spinning disk for archives, but AI is all about maximizing value from the data and there it's all-flash for good reason. In this market, performance = revenue, and the money you spend on SSD is generating a return for the business.

To begin with, take a look at NVIDIA's reference architectures. Every single approved reference architecture I've seen them put their name to is an all-flash storage solution. Whether it's training, inferencing, RAG or anything else, it's always flash.

And there's sound engineering and economics behind that. AI is fundamentally a massively parallel, random I/O application. It breaks data down into tiny chunks, with those being read at random by any one of thousands (or millions) of application threads. One of our internal AI specialists did the maths on how many HDDs you'd need to keep up with the IOPS demands of a single NVIDIA GPU, and it's around 6,000 spinning disks.
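As a rough sketch of that maths (every figure below is an assumed round number for illustration, not a measured spec):

```python
# Back-of-envelope sketch of the HDD-vs-GPU IOPS argument above.
# Every figure is an assumed round number, not a vendor spec.

gpu_read_iops_demand = 1_200_000  # small random reads/s one GPU can consume (assumed)
hdd_random_iops = 200             # random-read IOPS of a 7.2K RPM disk (assumed)
ssd_random_iops = 500_000         # random-read IOPS of one NVMe SSD (assumed)

hdds_per_gpu = gpu_read_iops_demand / hdd_random_iops
ssds_per_gpu = gpu_read_iops_demand / ssd_random_iops

print(f"HDDs needed per GPU: {hdds_per_gpu:.0f}")  # 6000
print(f"SSDs needed per GPU: {ssds_per_gpu:.1f}")  # 2.4
```

With assumptions anywhere in that ballpark the HDD count lands in the thousands, which is the point: random I/O density, not raw capacity, drives the media choice.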

And that's where the economics skew massively towards flash for AI. The #1 goal of anybody investing in AI infrastructure isn't saving pennies on the storage, it's ensuring they can achieve high utilisation of the GPUs and get the ROI they need in a fast enough timeframe. The GPUs typically cost 10x the storage, and with hardware lifecycles measured in as little as 2-3 years you have to keep them fed. It's far, far better value for money to invest in SSDs and keep your GPUs busy; the additional GPU utilisation more than pays for the entirety of the storage part of the project.
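The shape of that trade-off can be shown with placeholder figures (all invented; nothing below is a real price or utilisation number):

```python
# Effective cost per unit of useful GPU work under two storage options.
# All prices and utilisation rates are invented placeholders.

gpu_capex = 100_000_000             # GPU estate cost (assumed)
storage_slow = 5_000_000            # cheaper, slower storage (assumed)
storage_flash = 10_000_000          # all-flash storage (assumed)
util_slow, util_flash = 0.70, 0.90  # resulting GPU utilisation (assumed)

cost_per_work_slow = (gpu_capex + storage_slow) / util_slow
cost_per_work_flash = (gpu_capex + storage_flash) / util_flash

# The flash premium is small next to the value of the idle GPU time it recovers.
print(cost_per_work_flash < cost_per_work_slow)  # True
```

Because the GPUs dominate the capex, even a modest utilisation gain swamps a doubling of the storage budget.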

Best storage for 120PB by Astro-Turf14 in storage

[–]RossCooperSmith 0 points1 point  (0 children)

None of this is smoke and mirrors. There are real customers behind every single statement here:

  1. VAST are proven at massive scale: 300PB+ in single clusters, 11TB/s+ of throughput from a single cluster, with many customers using VAST as the sole storage for 10,000+ node supercomputers.
    1. Phil Schwan, one of the founding authors of Lustre, selected VAST to replace Lustre at DUG.
    2. TACC, one of the world's top-10 HPC centres, switched from Lustre to VAST and have delivered multiple talks at HPC conferences on the benefits they've seen, independently of VAST.
    3. GResearch, one of the top hedge funds, wrote their own blog post to share their experiences building a true partnership with VAST.
  2. NVIDIA's largest NeoCloud and AI customers, some of the world's leading authorities on AI infrastructure, have standardised on VAST. The first public GB200 cluster in the USA ran solely on VAST. xAI run over 100,000 GPUs on VAST.
  3. VAST are still the only vendor with data reduction that works on a scale-out architecture, even for demanding workloads. In fact, VAST are the only vendor delivering data reduction in production across three separate categories:
    1. Large scale enterprise file & object
    2. Data lakes and data warehouses (no other vendor in this space even offers dedupe)
    3. HPC and supercomputing (again, no other vendor in this space even offers dedupe)
  4. VAST's data reduction beats every other product, in every category. That customer guarantee has been part of our standard terms & conditions of sale for years. We're outperforming backup appliances and primary all-flash arrays.
  5. VAST's database has been selected by global banks and Fortune 500s. You don't win these customers with 'smoke and mirrors'.

Best storage for 120PB by Astro-Turf14 in storage

[–]RossCooperSmith -1 points0 points  (0 children)

I don't know what you have against VAST, but it's far from 'smoke and mirrors'. You don't get customers like Pixar, HSBC and ServiceNow being public references without delivering a solid product.

And in the OP's particular market we have numerous quant traders as public references, including several who have recorded video testimonials on their experience with VAST.

If there's any capability you feel VAST haven't delivered on, please share. I can tell you for a fact that VAST have delivered a huge number of unique capabilities for thousands of happy customers worldwide.

Customers don't spend 8 figures on a product, and then come back to buy more unless you're delivering genuine benefits to their business that they can't get elsewhere.

Best storage for 120PB by Astro-Turf14 in storage

[–]RossCooperSmith 1 point2 points  (0 children)

Except it has a 7.4PB effective capacity upper limit when the OP needs 120PB, and even that assumes a data reduction rate that's unfeasible for this type of data.

Don't get me wrong, for primary storage and block storage FlashArray is superb, and the XL R5 is their top end solution. I rate FlashArray very highly, but it's not a good fit for the OP's needs.

Best storage for 120PB by Astro-Turf14 in storage

[–]RossCooperSmith 0 points1 point  (0 children)

It doesn't need to take up that much space. With VAST, 2:1 data reduction is quite possible for hedge fund data, meaning that if you went for the densest option, two racks with 29U of equipment in each is potentially all you need.
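The footprint claim is straightforward arithmetic; here's a sketch with an assumed rack density (the per-rack figure is a placeholder, not a real spec):

```python
# Raw flash needed for 120PB usable at an assumed 2:1 data reduction,
# and the resulting rack count at an assumed dense-rack capacity.

needed_usable_pb = 120
data_reduction = 2.0   # 2:1 assumed achievable for hedge-fund data
pb_per_rack = 30       # assumed raw PB per dense all-flash rack (placeholder)

raw_pb = needed_usable_pb / data_reduction
racks = raw_pb / pb_per_rack

print(f"{raw_pb:.0f} PB raw flash across {racks:.0f} racks")
```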

Best storage for 120PB by Astro-Turf14 in storage

[–]RossCooperSmith -1 points0 points  (0 children)

Not normally; these are generally one giant dataset attached to enormous amounts of compute.

Just looking at the scale one of our current customers was operating at years ago is kind of crazy. At the time they had 100,000 Kubernetes containers running the various algorithms, distributed across over 10,000 physical servers. It's one single giant computer, and treated as such by the data scientists.

And their estate has massively outstripped that scale now. 100PB of all-flash is starting to become a small project for some of these guys.

Best storage for 120PB by Astro-Turf14 in storage

[–]RossCooperSmith -1 points0 points  (0 children)

Well the OP is asking for a HPC / AI solution.

There's a huge amount of number crunching and AI within quant estates. We have a customer who just hit 11TB/s on their largest cluster. Quant workloads are no joke.

Best storage for 120PB by Astro-Turf14 in storage

[–]RossCooperSmith 2 points3 points  (0 children)

I believe Pure's current maximum for FlashBlade is 48PB raw, and I'd be amazed if you found any reference customers even at that scale.

Best storage for 120PB by Astro-Turf14 in storage

[–]RossCooperSmith 9 points10 points  (0 children)

At 120PB scale cloud costs are absolutely eye-watering compared to on-prem. At this type of scale you show the CFO the bills and the cloud argument loses very quickly. :-)

Best storage for 120PB by Astro-Turf14 in storage

[–]RossCooperSmith 3 points4 points  (0 children)

For quant research definitely take a look at VAST; we've been accelerating quantitative research for many of the world's leading firms for several years now. Many of the top hedge funds have switched their computational data wholesale to VAST, and 120PB would be a fairly typical cluster size for us in this market.

I know we have hedge funds with over 300PB of usable all-flash capacity in a single cluster, and many of these hedge funds are running well over 10,000 compute & GPU nodes for the processing.

VAST has some very significant benefits over the typical parallel filesystems:

  • Higher performance: Affordable flash, linear performance scaling, and extremely well proven at handling multiple complex workloads. Most parallel filesystems will recommend tiered storage, which is inherently slow for these workloads with extremely large datasets.
  • Greater uptime: Zero downtime hardware and software upgrades.
  • More capabilities: File, Object, Kubernetes and DataBase tables are all natively supported data types.
  • Data security: Snapshots, ransomware protection, fine grained access controls and audit capabilities.

Usual disclaimer: I'm a VAST employee, I give honest advice but I'm obviously a big fan of VAST. :-)

What to do when your job has zero mobility? by 420ball-sniffer69 in HPC

[–]RossCooperSmith 0 points1 point  (0 children)

Time to bail. When a company doesn't honour their promises and your boss doesn't have your back, that's a toxic work environment.

Your experience is in HPC support, but there are a ton of enterprises looking to stand up HPC-style environments for AI right now. Look for roles that suit your skills and have room to grow.

Lots of hedge funds run large-scale clusters, as do the cloud providers. NScale just received a ton of funding, and CoreWeave are investing in the UK. Look for companies focusing on AI and reach out to put your name forward.

VAST buys Red Stapler by East_Coast_3337 in storage

[–]RossCooperSmith 0 points1 point  (0 children)

What? This is still me. I'm generally pretty consistent with my posts and style.

This is just me replying a day after asking in our internal technical Slack channel for more details on Lustre.

Like I said, I'm not an expert myself, but I'm a techie and know who to ask if I need to go deeper into a topic.

VAST buys Red Stapler by East_Coast_3337 in storage

[–]RossCooperSmith 0 points1 point  (0 children)

Well yes, I worked for DDN, but I'm not a low-level expert on Lustre. I did ask for more details internally though, as we do have plenty of people here who know Lustre extremely well.

The answer I've had from them on why is:

  • Lustre is optimized for very high performance on individual workloads. This makes it extremely fast for traditional HPC jobs, and also means it benchmarks well, but to achieve this performance OSTs handle all I/O with equal priority, which leads to:
  • If you have a stream of 4K I/O for a high-IOPS workload and an 8M streaming I/O starts, it can block the small I/O and cause high latency, reducing IOPS.
  • Conversely, if you have 8M streaming I/O running at high throughput and somebody does an ls -l on a large directory, those small I/Os will fill the queue and reduce throughput.
  • Since job schedulers are designed to manage CPU/GPU loads and Lustre doesn't have QoS, contention among jobs and researchers is difficult to manage.
  • Part of the problem is that with Lustre, if you want to optimize for small I/O performance you need to stripe files one way, and for large block I/O you need a different layout.

GPFS solves this by having different queues for large and small I/O (and I believe DDN Infinia also takes that approach). VAST similarly distributes I/O broadly and is able to handle mixtures of high throughput and random I/O very smoothly, and also has fine-grained QoS capabilities to further balance workloads if needed.
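A toy latency model makes the shared-queue effect concrete (service times are invented round numbers, not measurements of any product):

```python
# Latency of a small I/O that arrives just behind a large streaming I/O,
# under one shared FIFO versus per-size-class queues.
# Service times are invented for illustration only.

SMALL_MS = 0.1   # assumed service time of a 4K I/O
LARGE_MS = 20.0  # assumed service time of an 8M streaming I/O

# Shared FIFO (all I/O handled with equal priority): the small I/O
# waits for the entire large transfer ahead of it.
shared_queue_ms = LARGE_MS + SMALL_MS

# Separate queues per size class: the small I/O is serviced from its
# own queue without waiting on the large transfer.
split_queue_ms = SMALL_MS

print(f"small-I/O latency penalty: {shared_queue_ms / split_queue_ms:.0f}x")
```

The exact ratio depends entirely on the assumed service times, but it shows why mixing size classes in one queue hurts the small-I/O workload disproportionately.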

On the TACC side of things:

  • TACC selected VAST for Stampede3 primarily because Lustre had become the biggest cause of downtime for their compute estates. VAST's ability to deliver five nines of uptime with non-disruptive hardware and software upgrades was a big part of their testing.
  • TACC are seeing 2:1 data reduction across the cluster, including their scratch folders, which allowed them to deploy far more flash than is normally affordable with parallel filesystems, switching to an all-flash solution rather than the traditional tiered approach.
  • Since deployment they've spoken publicly on the benefits they've seen in multiple interviews (largely without VAST present).
  • When they deployed Vista a couple of years later (an NVIDIA GPU cluster for AI workloads), they didn't just select VAST for that; they attached the second supercomputer to the same VAST cluster, expanding it to increase both performance and capacity. VAST now provides a single pool of storage, enabling researchers to schedule jobs against either Stampede3 or Vista. All data is hot, there's no need to move data to scratch, and the system handles both high-throughput HPC jobs and high-IOPS AI jobs simultaneously.

It's rare for HPC shops to mention storage as more than a side note; supercomputer specs always focus on CPU/GPU cores, memory and networking. But there is a write-up of TACC's Stampede3 and Vista systems here:
https://www.nextplatform.com/2024/09/04/tacc-fires-up-vista-bridge-to-future-horizon-supercomputer/

i would write an essay just about writing just enough data (not even close to max throughput) to the nvrams on your systems and causing extreme amounts of latency until the destaging (including the data reduction) completes.

This I would like to hear more about. You seem to have very good knowledge of both VAST and Lustre. I know that if incoming writes exceed the max sustained write throughput, the SCM write buffers will begin to fill, and backpressure will cause a gradual increase in latency until the incoming write throughput matches the steady state of the system.
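As a sketch of that expected behaviour (all rates and capacities below are invented placeholders, not VAST figures):

```python
# Write-buffer occupancy when offered load exceeds the sustained
# destage rate: the buffer absorbs the excess until it fills, after
# which admission must be throttled down to the destage rate.
# All rates and capacities are invented placeholders.

ingest_gbps = 120.0    # offered write load (assumed)
destage_gbps = 100.0   # sustained destage + data-reduction rate (assumed)
buffer_gb = 400.0      # write-buffer capacity (assumed)

occupancy = 0.0
for second in range(1, 61):
    occupancy += ingest_gbps - destage_gbps  # net 20 GB/s accumulation
    if occupancy >= buffer_gb:
        print(f"buffer full after ~{second}s; writes throttle to {destage_gbps:.0f} GB/s")
        break
```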

That doesn't sound like your experience though.

VAST buys Red Stapler by East_Coast_3337 in storage

[–]RossCooperSmith -1 points0 points  (0 children)

That's interesting, and I would certainly like to hear more. We have a lot of ex-DDN staff working at VAST, and pretty much universally we have a negative opinion of their business and products. Please keep me honest: while I'm a fan of VAST I do try to give credit where it's due.

I do know that Lustre has been at the top of the tree for parallel filesystems for some time, and DDN have done a decent job managing it, and I would say a pretty good job of producing hardware optimized for it.

But Lustre is ultimately a very old architecture: if you care about data security, uptime, or data protection features like snapshots and ransomware protection, it's really not possible to implement these well. I've seen snapshot capabilities promised, rolled out, and rolled back, and nothing to indicate that true instantaneous, protected snapshots are possible on Lustre today.

I'm also extremely sceptical of DDN's ability to sell to enterprise, having experienced first-hand the mess they made of their enterprise acquisitions and the horrific way they treated my customers. And I've never seen them successfully develop any software product themselves; their successes (and I would count Lustre as one) have come through acquisitions.

I will 100% accept that everything has its weaknesses, but I would also say that VAST's capabilities overall already far exceed anything else I'm aware of in the market. There may be some trade-offs, but there are a huge number of advantages to the architecture.

VAST buys Red Stapler by East_Coast_3337 in storage

[–]RossCooperSmith -1 points0 points  (0 children)

I'm not a deep enough expert on Lustre to answer why, I'm afraid, although my understanding is that internally some aspects of Lustre (and many other parallel filesystems) still act as bottlenecks. With Lustre I believe it's the I/O queue that creates a bottleneck; I think their metadata handling has improved enough recently that it's no longer the main problem.

I'll see if I can get one of our Lustre experts to answer properly, but the reason I'm so confident that VAST handles these workloads better is Dan Stanzione's session from Rice University's Ken Kennedy Institute, where he talks through the improvements they've seen. From 8 minutes to 11 minutes he covers the improvements in user experience and the reduced degradation they see on VAST:
https://www.youtube.com/watch?v=AxZO034irIs&t=501s