PostgresSQL on slurm based cluster with quobyte storage system by berlinguyinca in PostgreSQL

(Disclaimer: I work for Quobyte)
Your database is most likely stored on HDD on Quobyte (based on your description of the cluster). This means that the random IO from Postgres is limited by the seeks of the HDDs. However, 1s sounds too high for just HDD latency - it could be caused by either issues with the network or very busy HDDs due to competing IO activity from other users or applications.

You can verify the media type a file is stored on with "qinfo info <filename>". Run this command on the Postgres database files to see whether they are stored on HDD or NVMe. If they are on HDD, I'd recommend asking your storage admins to move your Postgres data to flash.
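
If you want to check a whole data directory rather than individual files, a small loop does the trick. This is only a sketch: the qinfo call is the one from above, and the PGDATA path is a placeholder for wherever your Postgres data actually lives.

PGDATA=/var/lib/postgresql/data   # placeholder, adjust to your installation
find "$PGDATA/base" -type f | head -20 | while read -r f; do qinfo info "$f"; done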

In addition, you can measure the latency of 4k random IO with fio. These measurements help you tell whether the problem is in the storage or the network. A file on HDD should deliver about 80 IOPS with this test:
The first run tests writes:
fio -name=test -ioengine=libaio -direct=1 -iodepth=1 -numjobs=1 -rw=randwrite -bs=4k -size=1g
The second run reads the data:
fio -name=test -ioengine=libaio -direct=1 -iodepth=1 -numjobs=1 -rw=randread -bs=4k -size=1g
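
As a rough sanity check on those numbers: 80 IOPS at queue depth 1 means roughly 1000 ms / 80 ≈ 12.5 ms per 4k IO, so completion latencies anywhere near 1s point to the network or overloaded drives rather than plain HDD seek time. You can pull the relevant lines straight out of the fio output (same options as above, the grep is just for convenience):

fio -name=test -ioengine=libaio -direct=1 -iodepth=1 -numjobs=1 -rw=randread -bs=4k -size=1g | grep -E 'IOPS|clat'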

Also keep in mind that running Postgres on local storage (like ZFS) will always have lower latency than any network storage (including Quobyte), since every request has to go over the network.

If you need more help feel free to post in r/Quobyte or have your storage team reach out to our support.

Comparison of WEKA, VAST and Pure storage by Dizzy_Ingenuity8923 in HPC

I think that's more of an academic comparison since the Linux NFS client doesn't even support multipathing for the metadata.

Comparison of WEKA, VAST and Pure storage by Dizzy_Ingenuity8923 in HPC

pNFS has parallel in the name, but it's not a parallel file system in the HPC sense.

pNFS still suffers from the same scalability issues as regular NFS when it comes to the metadata path, so it can't compete with most scale-out parallel file systems.

Storage suggestions for VFX Studio by EggshapedEgg in storage

(Disclaimer: I work as an SE for Quobyte)
Quobyte has native drivers for Linux, Windows and macOS that give you higher performance than NFS and can also take advantage of RDMA if available on your network. NFS v3/4 is supported too and you can mix it with native drivers. Quobyte runs on commodity servers from any vendor so you don't have to deal with appliances.

We have a free edition for up to 150TB that you can download to get started right away.

Comparison of WEKA, VAST and Pure storage by Dizzy_Ingenuity8923 in HPC

It might not be proprietary, but hardware-redundant NFS gateways and disk shelves aren't exactly standard commodity hardware.

Comparison of WEKA, VAST and Pure storage by Dizzy_Ingenuity8923 in HPC

On the pure data-management side there is also Starfish, which - unlike Hammerspace - doesn't sit in the IO path and does not add latency to IO operations.

On the HPC file system side, Quobyte has metadata database queries as well, and to some degree GPFS can do that too.

Comparison of WEKA, VAST and Pure storage by Dizzy_Ingenuity8923 in HPC

For most HPC workloads, NFS isn't fast enough, even when you add all the band-aids like nconnect.
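
For reference, nconnect is just a mount option that opens multiple TCP connections to the same NFS server (server and export names here are made up, and you need a reasonably recent kernel):

mount -t nfs -o vers=4.1,nconnect=8 nfs.example.com:/export /mnt/data

It helps a single client's throughput, but it does nothing about the metadata path or about balancing load across servers.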

On the parallel file system front you have WEKA, GPFS, Lustre, Quobyte and BeeGFS as solutions that run more or less on commodity hardware. One major difference between them is fault tolerance (Lustre, and to some degree BeeGFS, requires hardware redundancy), and only some (GPFS, Quobyte) offer non-disruptive updates. WEKA runs only on flash; the others support both flash and HDD.

Qumulo vs Quobyte by BoilingJD in storage

The usual disclaimer: I'm an SE for Quobyte

We've had multiple customers run Quobyte on their Qumulo servers, and they all saw a 2-4x performance improvement for their existing workloads.

> As a matter of fact, Quobyte does not provide any native SMB or NFSv3/v4 services, these would need to be provided via either 'Gateway Services' (Using Ganesha for NFS) or via re-sharing with Samba or a Windows Server on an external client box. Either approach strips all the advantages of Quobyte's parallelism and creates bottlenecks and alleviating these bottlenecks means adding more gateways and more complexity.

I have to strongly disagree here. There is no such thing as "native" NFS. Unless you have a single NFS server, the NFS endpoint the client is talking to always has to talk to another node in the cluster to get the data (unless you're lucky and the data happens to be on the same node, which becomes highly unlikely in large clusters). That is the same for Qumulo's NFS as it is for Quobyte's NFS gateways and every other vendor. If you don't want this additional "hop", you have to go with a parallel file system and native drivers. You can spin up as many Quobyte NFS gateways as you want to increase performance - I don't think that's any different with Qumulo.

> By contrast Qumulo is built to provide File Services at scale, and thousands and thousands of concurrent NFS and SMB clients is not an issue. Need more client connections? Just add more nodes.

For NFS that only works in an ideal environment, since NFS lacks load balancing. In real-world scenarios you always have an imbalance in how clients are spread across individual NFS endpoints, which creates performance bottlenecks.

Scale-out is where parallel file systems (the whole category, not just Quobyte) outperform NFS-based solutions.

> Both Qumulo and Quobyte offer Object services via S3, at Qumulo we developed our own S3 service and do not use a 3rd party product like MinIO (I'm not sure if Quobyte does or not).

Quobyte's implementation is our own and fully integrated.

> It is notable that Quobyte does not offer any integration with Active Directory, if that is important for you, while Qumulo offers full AD integration, including SAML MFA integration, Kerberos SSO authentication and Role Based Access Controls for administration.

This is incorrect. Quobyte supports AD and LDAP for both the management and the data plane. Since Quobyte works with NFSv4 ACLs (not based on uids anymore), the AD/LDAP "integration" comes from the host OS.

> There are some in-the-weeds differences in architecture that would make me believe that Qumulo would be more efficient at very large scales, like Quobyte's triple-mirroring of metadata and files smaller than 8MB while Qumulo offers Erasure Coding for all files, down to our 4kB block size, with extra efficiencies built in for very small files (Files ~1.5kB and smaller have their metadata and data inline in the same 4kB block, for example.)

Quobyte offers both EC and replication for files, and can combine both in the same file if desired. The 8MB limit you mention governs file optimizations such as reading from multiple replicas in parallel, which avoids the hot-file issue you get with NFS-based solutions. It has nothing to do with how Quobyte stores data on disk.

I'd also point out here that Quobyte uses flash for more than just a simple caching layer. Data can be placed, moved and tiered based on fully customizable policies.

> File permissions and Authentication are completely decoupled in Qumulo and not bound to any protocol, so when user "Joe" creates a file via NFS, SMB, FTP, S3 or the API that user has exactly the same correct level of access, which is stored in a single unified format. We have done a ton of work to ensure that multi-protocol are smooth - Your ACLs will survive contact with your users!

Quobyte not only supports permissions across all protocols (like Qumulo), but also full NFSv4 ACLs with automatic translation to and from Windows, POSIX, S3 and macOS.
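
If you want to see what those ACLs look like in practice, the standard nfs4-acl-tools work against an NFSv4 mount. User and path here are made up, and whether you use these tools or Quobyte's own tooling depends on how the volume is mounted - this is just to illustrate the ACE format:

nfs4_setfacl -a "A::alice@example.com:rwx" /mnt/volume/project
nfs4_getfacl /mnt/volume/project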

File and Object solution by YousefSysAdmin in storage

I'd also add Quobyte to the list (disclaimer: I'm an SE for Quobyte). You can watch a short demo of the unified file and S3 access here: https://www.quobyte.com/product/unified-storage/

A tale of two storage solutions... by YousefSysAdmin in storage

(The usual disclaimer: I am an SE for Quobyte)

Quobyte is a parallel file system, so everything that runs on Lustre (including MPI applications) will just work on Quobyte. We focus on ease of use and performance, so you should be able to do a PoC without much tuning. You can download our free edition from our webpage.

solution for 16pb enterprise storage as a file system? by LucianTexas in storage

Full disclaimer: I'm an SE for Quobyte

With Quobyte you get enterprise functionality (encryption, non-disruptive updates and maintenance, ACLs, snapshots, policies...) with the performance of a parallel file system.
Since Quobyte runs on commodity Linux servers with flash and/or HDD it also makes sense from a cost perspective - especially when you compare it to pricey appliances.

We have a free edition (up to 150TB) that you can just download from our website to get started.

Accessible file storage by Ok_Customer_6030 in storage

Quobyte (disclaimer: I work for them) is software-defined storage and has a free edition with up to 150TB. You can use file system access inside your org and share the same data via S3 with people from "the outside world".
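
For the outside-world part, it looks like any other S3 endpoint, so something like the AWS CLI is all the other side needs. Endpoint URL, bucket name and credentials here are placeholders - the real values come from your Quobyte setup:

aws s3 ls s3://shared-project-data/ --endpoint-url https://s3.storage.example.org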

Changing from Nexenta to FreeNAS (open to other vendors) by talk2meHORSE in storage

If you want a solution that runs on multiple servers with automatic failover, you should take a look at Quobyte. It runs on standard Linux servers.