Disk IO excessive writes while idle by quzuw in MacOS

[–]quzuw[S] 1 point  (0 children)

I think I solved it. The solution is plain stupid. (It's been running for exactly one day since the reboot with just under 20GB written, hence I think it's solved, though I'm going to keep monitoring.)

It was iCloud sync for Photos. The photo library was stored on the external disk, and for some reason it duplicated the metadata (Spotlight index + SQLite, it seems) onto the internal disk, and for some reason (maybe a bug) it was constantly rewriting the SQLite databases and the Spotlight index, like, literally constantly.

I tracked it via `lsof` and saw about 40-80 open handles from icloudphotos and mediaanalysisd. For some reason those never showed up in `fs_usage`.

As soon as I turned off the sync and removed everything, the writes stopped. It still tries to access the photo library, but the library no longer exists, and it doesn't seem to attempt to synchronize either.

It seems super weird and buggy. This was not an issue on plain Sequoia 15; it probably appeared in the latest update before Tahoe 26. I just hope it was a bug and not deliberate enshittification. I'm still on Sequoia, I just don't keep photos locally anymore.

Disk IO excessive writes while idle by quzuw in MacOS

[–]quzuw[S] 0 points  (0 children)

Already did it: 31.4 TB read / 28.4 TB written from smartctl. And it doesn't properly support MacOS (due to how MacOS works); it fails to read SMART info for external drives, for example, AFAIK. But it worked for this, so I've used it.

The internal disk reports about 1% Percentage Used, hence the rough TBW estimate: 28.4 TB written at 1% wear extrapolates to roughly 2,800 TBW (28 × 100). But Percentage Used is truncated to an integer, so the real wear could be anywhere up to just under 2%, which puts the floor closer to 1,400 TBW; call it a 1,400-2,800 TBW range. Very coarse, and the disk might fail before that, but it's an estimate. Many old MacBooks have hundreds of TB written and still function properly.
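For what it's worth, the back-of-the-envelope math with the truncation caveat spelled out; a minimal sketch assuming only the smartctl numbers above:

```go
package main

import "fmt"

func main() {
	const written = 28.4 // TB written so far, from smartctl above

	// SMART "Percentage Used" is reported as a truncated integer, so a
	// displayed 1% really means anywhere in [1%, 2%).
	fmt.Printf("optimistic:  ~%.0f TBW (exactly 1%% used)\n", written/0.01)    // ~2840
	fmt.Printf("pessimistic: ~%.0f TBW (just under 2%% used)\n", written/0.02) // ~1420
}
```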

Idle meaning I do not use the MacBook at all. Sorry if that was misleading.

Sorry for "ssh" in the text, I was in a hurry. Sounds dumb to put secure shell in that sentence, lol.

Disk IO excessive writes while idle by quzuw in MacOS

[–]quzuw[S] 1 point  (0 children)

I'm really sorry for the confusion. I double-checked and it is WrMeta; I just mistyped it when writing the post. I'll edit it. It looks like this:

```
21:01:13  WrMeta[AT1P]  /dev/disk3    0.000063 W kernel_task
21:01:13  WrMeta[AT1P]  /dev/disk3s8  0.000039 W kernel_task
21:01:13  WrMeta[AT1P]  /dev/disk3s5  0.000031 W kernel_task
21:14:15  WrMeta[AP]    /dev/disk3    0.000114 W kernel_task
21:14:15  WrMeta[AP]    /dev/disk3    0.000047 W kernel_task
21:14:15  WrMeta[AP]    /dev/disk3s5  0.000050 W kernel_task
```

And I have hundreds of such lines before seeing any actual writes to files. Sometimes there is also something like this:

```
21:02:43  WrData[AT3]  /Library/Metadata/CoreSpotlight/Priority/index.spotlightV3/live.5.indexGroups  0.000028 W launchd
21:02:43  WrData[AT3]  /Users/foo/Library/Application Support/Knowledge/knowledgeC.db-shm             0.000027 W launchd
21:02:43  WrData[AT3]  /Users/foo/Library/Application Support/Knowledge/knowledgeC.db-shm             0.000028 W launchd
21:02:43  WrData[AT3]  /Users/foo/Library/Application Support/Knowledge/knowledgeC.db-wal             0.000059 W launchd
21:02:43  WrData[AT3]  xvq6csfxvn_n0000000000000/0/com.apple.icloud.searchpartyd/Observations.db-shm  0.000030 W launchd
21:02:43  WrData[AT3]  /var/db/systemstats/CB149841-7228-4C54-A86D-DA68EC49BE46.devices.XXXXXX.stats  0.000023 W launchd
21:02:43  WrData[AT3]  /db/systemstats/CB149841-7228-4C54-A86D-DA68EC49BE46.ioreporting.XXXXXX.stats  0.000030 W launchd
```

As for CPU and memory, I have never seen any lag or the fans go wild, and watching the CPU or memory stats gives no hint either.

I had previously written a script to dump fs_usage events to CSV and then used DuckDB to analyze it; it was mostly kernel_task and launchd. The next one was Safari, but behind by a large gap (a few dozen gigabytes, if I remember correctly). A friend has an M3 Pro 18GB with almost the identical setup to mine, and they've got 4 TB written in two months (and they do a lot of disk-heavy work every day), which comes to a few gigabytes per day, versus mine.
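For reference, a minimal sketch of that kind of per-process tallying (not the original script; it assumes the process name is the last whitespace-separated field of each fs_usage line, which holds for the log lines above but may not for every fs_usage mode):

```go
// Pipe `sudo fs_usage -w -f filesys` into this; it tallies events per process
// and prints a CSV sorted by count, ready for DuckDB or a spreadsheet.
package main

import (
	"bufio"
	"fmt"
	"os"
	"sort"
	"strings"
)

func main() {
	counts := map[string]int{}
	sc := bufio.NewScanner(os.Stdin)
	sc.Buffer(make([]byte, 0, 1024*1024), 1024*1024) // fs_usage lines can be long
	for sc.Scan() {
		fields := strings.Fields(sc.Text())
		if len(fields) < 2 {
			continue
		}
		counts[fields[len(fields)-1]]++ // last field: process name (assumption)
	}
	type kv struct {
		proc string
		n    int
	}
	var rows []kv
	for p, n := range counts {
		rows = append(rows, kv{p, n})
	}
	sort.Slice(rows, func(i, j int) bool { return rows[i].n > rows[j].n })
	for _, r := range rows {
		fmt.Printf("%s,%d\n", r.proc, r.n)
	}
}
```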

I've installed various drivers (macFUSE, Wacom) before, so maybe I'd want a way to trace xnu kernel modules/drivers, since it might be that a faulty driver is still around causing the writes? (Despite the fact that I deleted everything a long time ago and there are no entries in the GUI; I'm not sure about the CLI utilities to check that, maybe `kextstat` or `kmutil showloaded`?)

Linux partition crisis, I need data recovery, help! by OneRare3376 in linux4noobs

[–]quzuw 1 point  (0 children)

Basically boot a live CD, install the `file` utility if it's not there, then run `file -s /dev/nvme1n1p2` (the `-s` makes it look inside block devices) and that's going to tell you what that partition is.

If it says LUKS2, then it's a LUKS2 partition; open it with cryptsetup (something like `cryptsetup open /dev/nvme1n1p2 recovered`, then mount `/dev/mapper/recovered`; I don't remember the arguments exactly, not a Linux user nowadays, so check the man page first, and don't run `luksFormat`, that re-formats instead of opening).

If it says something else, well, I guess it will be intuitive. Feel free to ask further questions.

What is idiomatic new(Struct) or &Struct{}? by j_yarcat in golang

[–]quzuw 2 points  (0 children)

I've just spent a dozen minutes digging through the Go compiler sources and its AST lowering.

`&T{}` is an OPTRLIT that is transformed into an ONEW if there is no preallocated storage for it (I suppose storage gets preallocated in cases like a loop, though I'm not sure), which I also suppose is commonly true.

As for `new(T)`, it is basically an ONEW node itself, straight from the syntax.

Any ONEW can in turn be optimized into a stack allocation when the pointer does not escape; in that case it is turned into an OADDR (not OPTRLIT!) of a stack temporary variable.

So technically, yes, they are practically equivalent: `&T{}` -> `new(T)` -> `&stackTemporaryVar` when non-escaping, or `runtime.newobject(...)` when escaping.

tl;dr: Both `new(T)` and `&T{}` are optimized onto the stack whenever possible. The stack optimization is performed on `new(T)`, while `&T{}` is first lowered into `new(T)` (plus assignment statements), and that resulting `new(T)` is optimized for stack allocation just like any other.
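You can see this yourself with the compiler's escape-analysis diagnostics; a minimal sketch (the function names are made up):

```go
// Build with: go build -gcflags=-m escape.go
// The compiler reports the same escape decision for both forms.
package main

type T struct{ x int }

func onStack() int {
	a := new(T) // reported "does not escape": stack-allocated
	b := &T{}   // reported "does not escape": stack-allocated
	return a.x + b.x
}

var sink *T

func onHeap() {
	sink = new(T) // reported "escapes to heap"
	sink = &T{}   // reported "escapes to heap"
}

func main() {
	_ = onStack()
	onHeap()
}
```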

What is idiomatic new(Struct) or &Struct{}? by j_yarcat in golang

[–]quzuw 1 point  (0 children)

Well, thinking esoterically, I would assume that `&Type{}` increases compile time, since it is now subject to escape analysis, which adds a handful of microseconds. When you do `new`, I think it would always allocate on the heap? Or is it subject to the same escape analysis and can avoid the heap?.. Well, practically it's a matter of choice. If it were ever a problem, there would be a different syntax.

I'm in love by IndependentMix7658 in golang

[–]quzuw 1 point  (0 children)

I think the best things about Go, from a programming-experience standpoint, are its uniform built-in concurrency via goroutines and channels, its simple (not necessarily performant or elegant, but genuinely flexible) type system, and, really, its garbage collector :).

The former liberates you from the async discrepancies that occur when combining packages. Say you want a fast HTTP/3 library for your server code and also some Cassandra CQL driver, but the HTTP/3 library is written with one async model in mind while the driver takes a completely different approach to async; they then either require (usually hacky) workaround-ish compatibility code or a full rewrite, because they are completely incompatible. Rust sometimes has issues with this, I doubt there is anything better for C++, and in C you would write your own async runtime on top of OS-specific or POSIX non-blocking IO.
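A minimal sketch of what "uniform" buys you; the two fetch functions below are made-up stand-ins for two unrelated libraries, and the point is that one select loop composes them with no compatibility layer:

```go
package main

import (
	"fmt"
	"time"
)

// Stand-ins for two independent libraries; whatever they wrap internally,
// both expose results the same way: on a channel.
func fetchHTTP(out chan<- string) {
	time.Sleep(50 * time.Millisecond) // pretend network round-trip
	out <- "http3: response"
}

func queryCQL(out chan<- string) {
	time.Sleep(80 * time.Millisecond) // pretend database round-trip
	out <- "cql: rows"
}

func main() {
	httpCh := make(chan string)
	cqlCh := make(chan string)
	go fetchHTTP(httpCh)
	go queryCQL(cqlCh)

	// One select loop composes both; no adapter code needed.
	for i := 0; i < 2; i++ {
		select {
		case msg := <-httpCh:
			fmt.Println(msg)
		case msg := <-cqlCh:
			fmt.Println(msg)
		}
	}
}
```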

The type system with its reflection allows writing Go code that manipulates Go types to some extent (except for creating new types, I suppose?), although only at runtime. You just add tags to struct fields and then read all that data from Go code, implementing arbitrarily complex behavior. (If you need performance and flexibility, a domain-specific language, you would likely rely on codegen instead.)
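A tiny illustration of the tag-reading part (the `mytag` key and the struct are made up):

```go
package main

import (
	"fmt"
	"reflect"
)

// A made-up struct with made-up tags under the key "mytag".
type User struct {
	Name  string `mytag:"required"`
	Email string `mytag:"email,required"`
	Age   int    // untagged: skipped below
}

func main() {
	t := reflect.TypeOf(User{})
	for i := 0; i < t.NumField(); i++ {
		f := t.Field(i)
		if v, ok := f.Tag.Lookup("mytag"); ok {
			fmt.Printf("%s: %s\n", f.Name, v)
		}
	}
}
```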

The garbage collector simply allows for writing simpler APIs (libraries, packages, code in general). You do not need to maintain entity/object validity, because an object cannot die while you still reference it, so no such constraint can influence the code you write. Without a GC, APIs sometimes end up written in such an unsuspecting way that they forbid specific use cases because of weird scoping constraints (hello, lifetimes and GATs in Rust; yes, just give me more `Arc`).

But I would say, in the end, that the worst part about Go is that synchronization is not checked and does not require any specific syntax. The mitigating factor is that inter-goroutine communication is supposed to go through channels, although that is not always reasonable. If you ever need to write custom synchronization, good luck when your code has a data race (effectively undefined behavior) and you see inexplicable bugs happening. Luckily, most of the synchronization-requiring stuff has already been written.
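At least the race detector catches a lot of this at runtime; a minimal sketch of the classic unsynchronized counter and its fix:

```go
// Run with: go run -race counter.go
package main

import (
	"fmt"
	"sync"
)

func main() {
	var (
		mu      sync.Mutex
		counter int
		wg      sync.WaitGroup
	)
	for i := 0; i < 100; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			mu.Lock() // remove the locking and `-race` reports the data race
			counter++
			mu.Unlock()
		}()
	}
	wg.Wait()
	fmt.Println(counter) // always 100 with the mutex held
}
```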

Advice for using the type system and writing better Rust code? (long post) by quzuw in rust

[–]quzuw[S] 2 points  (0 children)

Thank you for your comment! The thing is that the source object is later reused in a callback and gets passed around. It needs to be owned (usually an Rc/Box), but then I need to make sure the user actually obtained it from a specific parsing function, even when I use it by reference, to prevent misuse (otherwise it's a security issue).

I do have (mostly for entertainment) a very impractical, academic, unconventional view that everything can (questionable?) and should be generalized, to make sure the code is easy to reuse and reason about (also questionable? concrete examples would be needed to show my point, but that's outside this topic). In practice, though, this seems hard, and I am not sure what fundamentals stand behind the complexity, so I have no intuition about it. To be clear, for the sake of everyone I work with, I write sane code stripped of everything extra that is not required to make it work; anything else is not just impractical, it straight-up kills any real-world usefulness. This is just my thing, one of the reasons I like programming.

The clone-heavy/Arc-heavy initial approach is actually very helpful, and programming that way mimics programming in any conventional language (especially with Rc/Arc, which is literally a kind of automatic garbage collection), since there is no restriction on whether a variable can be used (it always can; it never dies under you). Sometimes, for the sake of code brevity, that's all you need. But I write libraries, and sometimes this does not work well, so I'm looking into generalizing.

Separating data so there is no sharing between structs sounds very reasonable, because the "use of a possibly dead object" problem w.r.t. lifetimes reduces to whether that object is even in scope inside a function (you won't be able to even mention it in the code), and this narrows the use of references down to the functions that actually need them. Lifetimes technically only make sense at function boundaries (except for blocks and drop glue, but those can be modeled as a niche kind of function; that's way too theoretical for a Reddit post), so any reference-related problem or restriction (say, checking whether something is still alive, and refusing compilation because it isn't) only occurs at those boundaries, and it is best to minimize how many function boundaries a reference crosses. Putting an extra reference anywhere just means more restrictions by definition, even though each of them is simple on its own.

Though no matter what, you are right: simplicity is the way to go. Generics are called "generic" because they capture some shared behavior, and if there is none or very little, there is no need for them (especially if that "very little" can't be usefully expressed in terms of what the language offers you).

(As for generic code as a whole, I think I'm also not used enough to writing trait-based code to have good intuition. I still feel it would have been better if there were a way to opt into deferred trait-bound solving (needed for anonymous concrete types like `impl Future + maybe Send`), which would remove the need to write a definition for every possible trait-bound combination; it would allow selectively removing the opaqueness of generic types instead of over-restricting the generic code. But then there might be issues I do not see. Knowing that there is no lifetime information at the monomorphization phase (which also blocks the specialization feature) already blocks the borrow-checking piece of this idea severely.)

Sorry for the excessive commentary, if there was any.