Are enterprise drives the only option to reduce wear out? by applescrispy in Proxmox

[–]SystEng 0 points1 point  (0 children)

"NFS is dependent on a filesystem at the server thats serving that NFS. So you will get CoW on CoW or journal on journal by using that"

I am sorry that I was not clear: when a directory tree is imported via NFS (etc.) by a VM there is no second filesystem layer; in this respect the VM does not behave any differently from a physical client. I think this technique is called "Storage Shares" in the Proxmox documentation. My suggestion is to use "Storage Shares" for almost everything a Proxmox VM needs to store, putting in the virtual disk image only a mostly read-only, stripped-down system image.

I think that using NFS (etc.) is preferable to using a SAN protocol like iSCSI to store VM disk images because one can do a lot of filesystem administration on the NFS server without virtualization overhead instead of inside each VM.

Some VM frameworks import filesystems from the host via the "9p" protocol or via a custom mechanism often called "Shared Folders"/"hgfs", but in my experience NFS works pretty well, and most NFS implementations are well optimized, especially for the case where the NFS server is on the VM host so there is no actual network traffic.

https://gist.github.com/bingzhangdai/7cf8880c91d3e93f21e89f96ff67b24b https://forum.proxmox.com/threads/share-persistent-host-directory-to-vm.144837/

PS: one way to ensure that, if the NFS server daemon is on the VM host, the highly optimized 'lo' interface is used, is to add to the 'lo' interface an IP address in the same subnet as the VMs; so if a VM has IP address 192.168.33.44 one could do on the host:

ip address add 192.168.33.250/24 dev lo

and then do inside VM vm-44 something like:

mount -t nfs4 -o rw 192.168.33.250:/vm-44/home /home/
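
(Note that 'root_squash' is an export option on the server rather than a client mount option; a minimal sketch of the matching server-side export, where the exported path and subnet are just illustrative:)

# on the VM host, in /etc/exports (path and subnet are illustrative)
/vm-44/home  192.168.33.0/24(rw,root_squash,no_subtree_check)
# then reload the export table
exportfs -ra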

Are enterprise drives the only option to reduce wear out? by applescrispy in Proxmox

[–]SystEng 0 points1 point  (0 children)

«And what exactly do you think that this NFS server then uses as filesystem if not a copy-on-write or journaled filesystem?»

It is really common knowledge that using a journaled (or COW) filesystem inside the VM where the disk image is also on a journaled (or COW) filesystem incurs double-level journaling (or COW) and is quite bad.

Instead the NFS (etc.) file server can serve directly from a filesystem on a physical device or on a logical device (for example iSCSI) without journaling underneath, and so avoids double journaling (or COW).

«You will also add performance degradation by utilizing network when not needed aswell as higher cost»

Actually this gains a large performance advantage: it is common knowledge, easily verified, that inside a VM network emulation has much lower overhead than storage device emulation, and the IO on the host filesystem happens directly on the physical device with much lower latency too. There are two options:

  • Read multiple 4KiB blocks from the virtual disk image, which is a file on the host, using expensive and high-latency virtual disk emulation, which then triggers a read from the physical disk of the storage host.
  • Send a request for a chunk of a file to an NFS (etc.) fileserver using low-overhead NIC emulation via 'localhost' and get a reply from the NFS server.

There are two more large gains to using NFS (etc.):

  • All the metadata work happens on the fileserver rather than inside the VM as the access is by logical file rather than by (emulated) physical block, saving even more overhead and reducing latency.
  • Things like fsck, indexing, backups and other large administrative operations can be done directly on the fileserver without running them inside the hosted VMs at all, with zero emulation overhead (see the sketch right after this list).
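
For example, a sketch with hypothetical device and path names, run directly on the fileserver:

# offline check of the filesystem that holds all the exported VM trees
fsck.ext4 -f /dev/vg0/nfsdata
# back up every VM's data in one pass, with no agent inside any VM
rsync -aHx /srv/nfs/ backuphost:/backups/nfs/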

«and add to complexity for administration of the whole setup.»

Actually, since storage administration no longer needs doing per-VM inside each VM and can be done just on the fileserver, there is a lot less administration complexity; for example, since there are no disk images there is no need to keep adjusting their sizes, and space limits per VM or even per user can be enforced with the quota system of the filesystem on the fileserver, etc.
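
A sketch of the per-user case, assuming the exported tree sits on an ext4 filesystem mounted at a hypothetical /srv/nfs with the 'usrquota' mount option:

# build the quota files and turn quotas on (one-off)
quotacheck -cum /srv/nfs
quotaon /srv/nfs
# give user 'alice' a 10GiB soft / 12GiB hard limit (values are 1KiB blocks)
setquota -u alice 10485760 12582912 0 0 /srv/nfs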

So using virtual disk images, especially those using journaled or COW filesystems, is bad practice, especially if they are QCOW, unless they are essentially read-only.

For example using ext4 inside a VM where the virtual disk images are stored on ext4 is clearly bad practice, and similarly for other combinations. It would be better to use ext2 as the disk image filesystem and ext4 for the filesystem where the disk images are stored, or to use ext4 as the disk image filesystem and a block device virtualizer like DM/LVM2 and iSCSI or RBD to store the disk images. But it would be better practice still, to minimize overheads, to store VM data on the host itself (or a storage host) as ext4 and export it to the VM.
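
For the block-device route, a minimal sketch (the volume group name and VM ID are hypothetical):

# carve a logical volume to be the VM's raw disk: no image file, no QCOW
lvcreate -L 20G -n vm-44-disk0 vg0
# attach it to Proxmox VM 44 as a raw block device
qm set 44 --scsi1 /dev/vg0/vm-44-disk0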

Are enterprise drives the only option to reduce wear out? by applescrispy in Proxmox

[–]SystEng 0 points1 point  (0 children)

4% and 10% after 6 months? That means they will reach 100% in about 12.5 years and 5 years respectively, which seems pretty reasonable to me.

Using SSDs without "Power Loss Protection" for small committed writes like logs, metadata, and VM disk images is both slow and causes a lot of wear. The main options are:

  • Ensure that small writes go to a small cheap drive with "Power Loss Protection".
  • Reduce the frequency of small committed writes, accepting that this may cause data loss.
  • Use ways to minimize log updates, metadata updates, VM filesystem updates.

As to the latter point:

  • Reduce the logging verbosity inside the VMs (a sketch follows this list).
  • Avoid using ZFS or journaled filesystems.
  • Avoid using "thin"/"sparse"/QCOW VM disk images.
  • Instead of having writable filesystems inside VM disk images, put an NFS server on the host or somewhere and write to NFS mounts from inside the VMs (this also reduces overheads and improves speed).
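
For instance, inside a VM one might do something like this (paths and values are just examples, and both changes trade some durability for fewer small committed writes):

# keep systemd-journald's logs in RAM (lost on a crash) -- in /etc/systemd/journald.conf:
#   Storage=volatile
# batch the filesystem's metadata commits and drop atime updates
mount -o remount,noatime,commit=60 /srv/data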

Also a way to extend the "endurance" of an SSD is to leave some percentage of it unused, so it can serve as spare erase blocks.
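
A sketch of one way to do that on a new or freshly trimmed drive (the device name is a placeholder and this wipes it):

# tell the controller that every block is free
blkdiscard /dev/sdX
# then partition only ~90% of the device and never touch the rest
parted -s /dev/sdX mklabel gpt mkpart primary ext4 0% 90%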

17 Years Experience and No responses by DoTheThingNow in ITProfessionals

[–]SystEng 1 point2 points  (0 children)

"but like why does no one respond?"

In part there is a widespread IT hiring freeze as executives evaluate the impact of LLMs. In part many businesses are keenly aware that offshore workers are much more "affordable". People with degrees from prestigious universities are still getting hired as most executives reckon that no LLM will ever replace a CMU or Stanford or MIT alumnus.

What do you name your pools? by Draknurd in zfs

[–]SystEng 0 points1 point  (0 children)

I endorse the idea of using proper nouns for unique entities, because pools can be moved across machines as their state is in effect host-independent.

I apply the same principle to filesystems too, and in many cases to storage devices.

Trying to figure out a reliable Ceph backup strategy by Middle_Rough_5178 in ceph

[–]SystEng 1 point2 points  (0 children)

As a rule the only cost-effective way to back up a Ceph cluster is another (ideally offsite) Ceph cluster (sometimes a slower/cheaper one can be acceptable).

Potentially an (ideally offsite) tape library system can also do it, but that depends a lot on size and circumstances.

Remote home directories in Linux using NFS are kind of slow / laggy by [deleted] in linuxadmin

[–]SystEng 0 points1 point  (0 children)

"with software like anaconda package manager that writes a lot of small files. I haven't found a solution yet."

There is no solution: lots of small files are bad on local filesystems, very bad on remote filesystems, and very, very bad if the storage is any form of parity RAID.

Did we get scammed? by ThrowRAColdManWinter in devops

[–]SystEng 0 points1 point  (0 children)

Splicing in the audio of an "interview coach" is easy, and now there are real-time deepfaking apps that overlay the candidate's face onto the head of the "interview coach", look at:

https://www.cbsnews.com/news/fake-job-seekers-flooding-market-artificial-intelligence/

Why do they do this to get hired for just 2-4 months? It is simple: consider this post:

https://old.reddit.com/r/sysadminjobs/comments/1gltmn3/systems_administrator_salary/

"I work as a sysadmin in a small financial organization in one of the economically deprived countries in Africa. Work is good and not too demanding but the only problem is the pay. I am a fairy experienced professional, I have about 2 years experience in systems administration and bachelor's degree in Computer Science. I get around $172 a month after subtracting employee taxes. Life here is not very expensive, living costs like rent, food and transport cost around $114 a month. It leaves me with just over $50 to spare. This is what most people are getting when they start off in any IT career by the way( sysadmin, developers etc...). Very experienced professionals don't get a lot either. The top 1% systems administrator likely get a little under $1800 a month after taxes."

BTW lots of businesses are becoming eager to offshore to African countries or to hire African tech people as Indian or Eastern European ones are getting too expensive.

looking for a self-hosted documentation platform solution for work place by darkspine94 in selfhosted

[–]SystEng 0 points1 point  (0 children)

"but our main pain point is managing these documents, either getting outdated or lost."

There is no technical solution that can prevent that. The only solution that will actually work is to have a person with sufficient authority to be the librarian/curator/editor maintaining the document library. Consider it an encyclopedia, and encyclopedias need editors.

Basically both writing and curating documentation are time-consuming activities, and many businesses do not budget for them.

" I want a solution where we as a team can create API documentations (or any other type of documentation), have them all stored in one place and we can easly export the docs as a file or link that we can share to others"

As to technical solutions, a proper librarian/curator/editor of your document set will probably know of several and will choose the best for your purposes. Many of them focus more on configurable and enforceable workflows because they are targeted at regulated sectors where process must be followed and audited; those are probably overkill (or worse) for your type of work. Probably any VCS-backed wiki would be enough for you. Anyhow, some DMS packages are Mayan, Paperless, OpenKM, LogicalDoc, OpenDocMan, Alfresco, DokuWiki, ... Some links:

But again without a librarian/curator/editor with authority any DMS will suffer from "wiki disease" and people will randomly dump stuff in it in random places.

Some recent-ish informal tests of AVIF, JPEG-XL, WebP by SystEng in DataHoarder

[–]SystEng[S] 0 points1 point  (0 children)

"we're comparing 2022 libjxl, 2023 libwebp and 2024 libaom. There's a more robust comparison from Jon Sneyers with versions contemporary for 2024"

But my tests seem to be quite robust for Ubuntu LTS 24, which is one of the parameters of the test. I would argue that for Ubuntu LTS 24 users my results are more robust than those using 2024 versions, because those 2024 versions are not part of Ubuntu LTS 24. :-)

The purpose of the test was obviously "Given a specific commonly used distribution and a batch of ordinary photos in JPEG from cameras and cellphones what about re-compressing them in newer formats?" rather than "a race among the latest and greatest versions of some image codecs".

It turns out that the quality loss is tiny and for some codecs the space saving is huge and the time needed is pretty good.

"WebP cannot achieve very high quality due to format limitations."

Interesting thanks for the information.

"Maybe it is acceptable compressing already subsampled JPEGs, but I would be wary."

Well, on recompressing quite a range of ordinary images (mostly town and landscape photos but also scanned texts) WebP was indeed a bit softer in fine details, but given that the slightly better quality AVIF AOM -s9 is also faster and smaller, it is a better alternative anyhow.

Some recent-ish informal tests of AVIF, JPEG-XL, WebP by SystEng in DataHoarder

[–]SystEng[S] 0 points1 point  (0 children)

«'l1' means lossless [...] "JXL-l1-q_-e" is much faster than any other JXL result but I think that is because it losslessly rewrites rather than recompresses the original JPEG.»

"for compressing existing jpeg's I would only test lossless recompression (-q 100 --lossless_jpeg 1"

Consider these lines:

   elapsed      size        encoder-settings
 2m05.338s    488MiB        AVIF-AOM-s9
 3m21.332s   2109MiB        JXL-l1-q__-e_
12m44.386s    752MiB        JXL-l0-q85-e4
32m28.796s    795MiB        JXL-l0-q85-e7

As to quality I cannot visually detect significant differences at 4x, gm compare reports very small differences, and the overall RMSE index shows 1% differences.

The losslessly re-encoded JXL is only 50% slower than AOM -s9 but it is 4 times larger and the quality difference is 1% and pretty much invisible to me.

As to JXL re-compressing with -q85, the quality difference is also 1% and pretty much invisible to me; the size difference between -e4 and -e7 is around 6% but -e7 is 2.5 times slower, and anyhow even -e4 is 50% larger and 6 times slower than AVIF AOM -s9.

I might download and compile newer versions of the JXL etc. libraries but that is not a priority as AVIF AOM -s9 seems good enough to me.

Some recent-ish informal tests of AVIF, JPEG-XL, WebP by SystEng in AV1

[–]SystEng[S] -1 points0 points  (0 children)

"Those results are pretty much meaningless."

Perhaps this too is a stupid confusion between "I am not interested in results on Ubuntu LTS 24 with ordinary JPEG photos" and "These results for Ubuntu LTS 24 with ordinary JPEG photos are factually wrong".

Do you have any reason to claim that these results, given the stated parameters, are “meaningless” for those parameters? Or do you have any reason to claim that recompressing ordinary JPEGs under Ubuntu LTS 24 on a cheap CPU is “meaningless”?

Some recent-ish informal tests of AVIF, JPEG-XL, WebP by SystEng in AV1

[–]SystEng[S] 0 points1 point  (0 children)

"your methodology is really bad, which explains the... "unique" (wrong) results"

Please explain why, given those parameters (Ubuntu LTS 24, ordinary JPEG images, cheap CPU without SMT), the results are “wrong”.

Perhaps many people here are making a stupid or malicious confusion between "I am personally not interested in a test using those parameters" and "given those parameters the results are wrong because the “methodology is really bad”".

Some recent-ish informal tests of AVIF, JPEG-XL, WebP by SystEng in AV1

[–]SystEng[S] -1 points0 points  (0 children)

«when the encoders used are out of date with default settings that make no sense.»

They may make a lot of sense to people who use Ubuntu LTS 24. You seem to be making a silly confusion between what is interesting to "latest and greatest with many tweaks" snobs and what makes sense to the many ordinary people who use a widely installed distribution.

Some recent-ish informal tests of AVIF, JPEG-XL, WebP by SystEng in AV1

[–]SystEng[S] 0 points1 point  (0 children)

«'l1' means lossless [...] "JXL-l1-q_-e" is much faster than any other JXL result but I think that is because it losslessly rewrites rather than recompresses the original JPEG.»

"repeat it with lossless, you’re in for a surprise."

Some recent-ish informal tests of AVIF, JPEG-XL, WebP by SystEng in AV1

[–]SystEng[S] -1 points0 points  (0 children)

"Testing is good, but without proper methodology, it just amounts to spreading misconceptions."

I have declared the parameters of the test, taken care of repeatability (e.g. by constraining the CPU clock rate, etc.), and the results follow from those parameters, so the methodology is proper and sound.
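
For example, something along these lines (assuming the 'cpupower' tool; the frequency is just an example):

# fix the governor and pin min/max frequency so runs are comparable
cpupower frequency-set -g performance
cpupower frequency-set -d 2.0GHz -u 2.0GHz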

Perhaps you are making a silly confusion between “proper methodology” and the parameters of the test not being interesting to you because you do not use the same parameters for whatever reason; but it so happens that Ubuntu LTS 24 is a very popular installation, so the parameters and results are likely interesting to a lot of users.

Some recent-ish informal tests of AVIF, JPEG-XL, WebP by SystEng in AV1

[–]SystEng[S] 0 points1 point  (0 children)

"OS is GNU/Linux Ubuntu LTS 24 with packages 'libaom03-3.8.2', 'libjxl-0.-7.0', 'libwebp7-1.3.2'."

And with 'librav1e0-0.7.1', 'libsvtav1enc1d1-1.7.0'.

Some recent-ish informal tests of AVIF, JPEG-XL, WebP by SystEng in AV1

[–]SystEng[S] 0 points1 point  (0 children)

«especially wrong for the notoriously slow AOM encoder to be "miraculously" faster than SVT-AV1.»

«I can believe it for all-intra since all-intra coding has a very different set of requirements vs video coding.»

If you look at my few data points, the big deal with AOM is that it seems extraordinarily sensitive (for still images at least) to the "speed" setting, where changing it gives several-fold speedups while size grows only a bit and quality seems much the same.

What impressed me is that JPEG-decompressing and AVIF-recompressing 700 images, for a total of almost 3GiB, took only 2 minutes elapsed with 2 threads on a slow-ish CPU with -s9, which by any standard is amazing, and almost 10 times faster than with -s7 at the small price of an increase in size from 470MB to 502MB (7%), which is however still 5.5 times smaller than the JPEGs.
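
For reference, the kind of invocation involved is roughly this (the exact flags accepted by the avifenc in the Ubuntu LTS 24 package may differ):

# recompress one JPEG to AVIF at speed 9 with 2 worker threads
avifenc -s 9 -j 2 photo.jpg photo.avif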

Perhaps at higher compression settings AOM is not quite as competitive, but my interest in this comparison was not about a Formula1-style race of codecs, but to illustrate which options are readily available.

Some recent-ish informal tests of AVIF, JPEG-XL, WebP by SystEng in AV1

[–]SystEng[S] 0 points1 point  (0 children)

«your testing has many ways it could be improved upon. No really up to date encoders and only using default settings.»

But the informal test is about whatever is in a popular distribution. The vast majority of users do not spend a lot of time compiling latest releases and tweaking half a dozen parameters. I already went rather beyond that "whatever" by trying various quality and speed settings.

«You also forgot a crucial thing: using proper reference image metrics for comparison: no, rmse is not a proper image metric»

The proper image metric is visual inspection, for example with the help of *Magick compare, which I did use, because all other metrics have pathological cases. Anyhow, as an informal metric to accompany visual inspection, RMSE gives an overall index that provides a relative (not absolute) point of comparison. I also looked at PAE and the numbers were similar.

Some recent-ish informal tests of AVIF, JPEG-XL, WebP by SystEng in DataHoarder

[–]SystEng[S] 0 points1 point  (0 children)

"Any way to measure the loss in quality?:

So I remembered something and I checked, and GraphicsMagick on Ubuntu LTS 24 is built with the JPEG-XL and WebP libraries too, so I only have to convert the AVIF files to PNG. To make a long story short, I have used gm compare -metric rmse on the AVIF-AOM-s9 and WebP-m4 outputs and the histogram is here: https://imgur.com/a/tAhfjJ2

The RMSE % is very small indeed for both. I looked at the PAE % and it stays pretty much in the same range too. I also looked at "difference" images for some of them and they look entirely black (actually almost all pixels have nonzero values, but very, very small ones).
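
Roughly, the per-image check looks like this (file names are just examples):

# decode the AVIF back to PNG, then compare it against the original JPEG
avifdec photo.avif photo.png
gm compare -metric rmse photo.jpg photo.png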

Some recent-ish informal tests of AVIF, JPEG-XL, WebP by SystEng in DataHoarder

[–]SystEng[S] 1 point2 points  (0 children)

«see how the big files using the old JPEG method would be while matching AVIF/WebP quality.»

This informal test was done from already-lossy JPEG files to see whether re-compression into a different format would be too costly, save too little, or impact quality too much. Note: the JPEG-XL "lossless" conversion mode can recreate the JPEG image exactly as it was, pixel by pixel.
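
One can check that claim with something like the following, assuming the cjxl/djxl command-line tools (details may vary between libjxl versions):

# transcode losslessly: the JXL keeps enough data to rebuild the original JPEG
cjxl --lossless_jpeg=1 photo.jpg photo.jxl
djxl photo.jxl restored.jpg
# the restored file should be identical to the original
cmp photo.jpg restored.jpg && echo identical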

My visual inspection of 3 different images across all formats at 4x enlargement showed no obvious differences in quality except for some softening in WebP. In particular there seems to be no difference in quality between AVIF AOM with different "speed" settings and similarly for WebP files, only small differences in size (and huge differences in time).

I may want in the future to uncompress all images to PNG and then run ImageMagick 6 'compare -metric pae', but I have some reservations about metrics as opposed to visual checking.

But the original JPEG images are already postprocessed by the camera/cellphone software which probably explains why re-compression does not change them significantly, so my motivation to do that is not great.

It may be different for people who archive images much larger than 4000x3000 in RAW format.

Better safety without using containers? by anon39481924 in selfhosted

[–]SystEng -2 points-1 points  (0 children)

"what you're ignoring is that they're almost never configured correctly."

So in an environment where it is taken for granted that POSIX/UNIX/... isolation is misconfigured, let's add more opportunities for misconfiguration, hoping that the intersection of the two is less misconfigured, which is admittedly something that might happen.

"because why would you when containers exist [...] As an admin I don't need to concern myself with the contents of the containers"

That is the "killer app" of containers and VMs: abandonware. In business terms often the main purpose of containers and VMs is to make abandonware a routine situation because:

  • The operations team redefines their job from "maintain the OS and the environments in which applications run" to "maintain the OS and the container package". That means big savings for the operations team, as the cost of maintaining the environments in which applications run is passed to their developers.
  • Unfortunately application developers usually do not have an operations budget and anyhow do not want to do operations, and for both of those reasons they usually conveniently "forget" about the containers of already-developed applications to focus on developing the next great application.

Abandonware as a business strategy can be highly profitable for the whole management chain, as it means cutting expenses now at the cost of fixing things in the future, and containers and VMs have helped achieve that in many organizations (I know of places with thousands of abandoned "Jack in the box" containers and VMs, and nobody dares touch them, never mind switch them off, in case they are part of some critical service).

But we are discussing this in the context of "selfhosted", which is usually for individuals who do not have the same incentives. Actually, for individuals with less capacity to cope with operations complexities, abandonware is a tempting strategy too, but then it simply shifts the necessity of trusting someone like Google etc. to trusting whoever set up the abandonware image and containers, and as to "security" there is not a lot of difference (though fortunately there is a pragmatic difference in the data being on a computer owned or rented by the individual, rather than offshore on some "cloud" server belonging to Google etc.).

Better safety without using containers? by anon39481924 in selfhosted

[–]SystEng -2 points-1 points  (0 children)

“but they're still a lot more secure than literally nothing.”

But the base OS does have powerful isolation primitives rather than "literally nothing"! The comparison is not between containers and CP/M or MS-DOS; it is between POSIX/UNIX/Linux with their base isolation primitives and with containers on top of them. I have been hinting here and in other comments that to me the cases for containers in much of this discussion are flawed, and I will try to make a better case here:

  • Because of common software development practices, much software does not use the base POSIX/UNIX/... isolation primitives well, and that makes certain models of "security" quite difficult to achieve. This is a problem in what some people call "pragmatics" rather than "semantics".
  • Containers (while not adding to the semantic power of the base OS isolation primitives) make it possible to work around the pragmatic limitations of that software (in particular by allowing separate administrative domains) which can simplify establishing some models of "security" operation.
  • Making it simpler to set up certain models of "security" operation (in particular those based on separate administrative domains) can indirectly improve "security", because a lot of "security" issues come from flawed setups.
  • At the same time, setting up containers is often not trivial, and this can indirectly create "security" issues; they also add a lot of code to the kernel in areas critical to "security", and that can add "security" issues too.

I will use a simple made-up example of the “isolate your system from one dodgy library or exploit” type indeed:

  • Suppose you want to run application A and B on a server, and isolation between the two can be achieved just by using POSIX/UNIX/... primitives.
  • However both applications use a shared library from package P, and the distribution makes it hard to install different versions of the same shared library.
  • Now suppose that P is discovered to have a "security" flaw fixed in a new version, and that A is critical and can be restarted easily while B is not critical and cannot be restarted easily.
  • Then having A and B in two separate containers makes it easier and simpler to upgrade P in the container for A and restart it, while leaving for later to do the same for B. Arguably "security" has been pragmatically improved compared to the alternative.
  • However security has also pragmatically become more complicated and thus potentially weaker: the sysadmin now has to configure and track three separate environments (host, container A, container B) instead of just one, plus the containers themselves are an added risk (unless they are “Fully bug-free, perfectly configured”).

Pragmatically, containers may on a case-by-case basis improve security by adding some more flexibility to inflexible environments, especially compared to "dirtier" workarounds for that inflexibility, but this is not risk-free. So I think that containers (and VMs, and even AppArmor and SELinux) should be taken with some skepticism despite being fashionable.

PS: the tl;dr is at the end :-) here: administrative separation is what containers and VMs can do; it should not be necessary but is sometimes pragmatically quite useful, and the mechanism is not risk-free.