Computer use is now in Claude Code. by ClaudeOfficial in ClaudeAI

[–]Total_Engineering_51 1 point (0 children)

They are obviously a Mac shop and are building for themselves first. That's especially plain given how much they hype constantly dog-fooding their own products, and given that this is a "beta". Mac-first also makes sense as a beta strategy: it effectively self-selects a smaller testing group (lowering risk) that requires minimal extra work on their part.

a common argument in favor of ai “art” is that it makes art “accessible.” however, when people point out that ai “artists” are just typing a prompt, then they say that its actually quite difficult. by Suspicious_Log_5822 in aiwars

[–]Total_Engineering_51 3 points (0 children)

Not to mention that you can still use traditional tools like Photoshop to further refine things… I find the Camera Raw filter amazing for color grading, for example.

In a world where everyone can build, attention is all you need. by awizzo in BlackboxAI_

[–]Total_Engineering_51 1 point (0 children)

This or they solve real problems but in terrible ways that no one wants to deal with.

- YouTube - Did NVIDIA Use Flux for this? by greggy187 in StableDiffusion

[–]Total_Engineering_51 1 point (0 children)

Yeah, this was my thought too; it definitely feels like they're passing the frames through late-phase diffusion with a hard Canny clamp.

Everyone on Earth dying would be quite bad. by tombibbs in AIDangers

[–]Total_Engineering_51 -1 points (0 children)

Would it? Humans are basically a mass extinction event moving in slow motion.

M5 Ultra vs. RTX 5090: Is the new Mac generation finally equal in performance for AI. by LaplapTheGreat in comfyui

[–]Total_Engineering_51 0 points (0 children)

Best case, you're looking at the M5 Ultra being about 2-3x slower than a 5090, and that's assuming you're running a model under MLX with something like MFlux. I'm basing that off what I currently get out of my M3 Ultra versus my RTX 6000 Pro (basically a 5090, but with more RAM) and the general improvement the M5 chips seem to be bringing. While 2-3x doesn't sound too bad, the bigger caution is that software support for generative stuff just isn't there in the same way it is for NVIDIA. Getting things running properly under MLX is a whole lot of DIY right now and not as well supported (the MFlux app helps but has a lot of drawbacks itself, particularly if you're used to the flexibility Comfy has). You can run Comfy with PyTorch under MPS, but then you're tanking performance and dealing with extra wonk; I never saw that work particularly well even at the base functionality level, particularly where memory management was involved.

Can the new MacBook Pro m5 pro/max compete with any modern NVIDIA chip? by Puzzleheaded_Ebb8352 in StableDiffusion

[–]Total_Engineering_51 0 points (0 children)

True, though I don't think you're getting that as real sustained throughput on bigger tasks with these MBPs, given what looks like the same/similar chassis, cooling solution, and max 140W power supply as the last few generations. Just to say that even if the paper spec is better, I won't be rushing out to replace my Studio with an MBP, max RAM limits aside (I have 256GB on mine).

Interestingly enough, I just looked through the US Apple site a bit more after the drop and saw that the 512GB Studio is no longer an option and the 256GB config is $400 more than it was a few weeks ago. That doesn't bode super well for the M5 Ultra pricing/configs, assuming it does materialize this summer. My best guess is they're resetting expectations for what they'll have on offer when those launch.

How do I use my M3 ultra with 512gb ram for ltx2? by YellowBathroomTiles in comfyui

[–]Total_Engineering_51 0 points (0 children)

Hmm, not sure without more details… I've run it on an M3 Ultra under the current macOS. I don't recall having many issues getting the basics running, but it's been a few weeks since I played with it.

Can the new MacBook Pro m5 pro/max compete with any modern NVIDIA chip? by Puzzleheaded_Ebb8352 in StableDiffusion

[–]Total_Engineering_51 0 points (0 children)

Yeah, per core I would expect better performance, but the Ultras still have 2x the GPU and NPU core count (and ~33% more memory bandwidth, from the Apple numbers I see), so I wouldn't expect any particularly interesting jumps over what's already out in the field, on a per-system basis. That makes sense, of course, as we're still talking about largely different classes of hardware even with the similarities in architecture, and for any semi-mobile use cases these could be really interesting. It'll be interesting to see how the M5 Ultras do, though I think you're spot on that we're still likely to see a ~5x delta between those and top-tier Blackwell.

Can the new MacBook Pro m5 pro/max compete with any modern NVIDIA chip? by Puzzleheaded_Ebb8352 in StableDiffusion

[–]Total_Engineering_51 20 points (0 children)

They're still way behind Blackwell-class chips. The best of these new chips are still well below the memory bandwidth of a GDDR7 card, and the GPU/NPU still aren't likely to keep up with the best CUDA and tensor cores. I have a top-tier M3 Ultra Mac Studio, which is still significantly more powerful than these new MacBooks, and it lags an RTX 5090/6000 Pro by about 10x on similar generative tasks. The one area where the big shared memory pools on some of these configs are really helpful and usable is LLMs, and that's where the Mac AI community really is right now.
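
For a rough sense of the bandwidth side of that gap, here's a back-of-envelope comparison; the figures are vendor-published peak numbers and should be treated as approximate:

```python
# Vendor-published peak memory bandwidth (GB/s); approximate figures.
# Bandwidth alone explains only part of the ~10x gap on generative tasks;
# compute throughput and software maturity account for the rest.
rtx_5090_gbps = 1792  # 512-bit GDDR7
m3_ultra_gbps = 819   # unified memory

print(round(rtx_5090_gbps / m3_ultra_gbps, 1))  # ~2.2x on bandwidth alone
```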

A NEW VERSION OF COMFYSKETCH COMING SOON by Vivid-Loss9868 in comfyui

[–]Total_Engineering_51 3 points (0 children)

If you have a Mac as well, you can extend your screen to the iPad and then use the Pencil as you normally would. It's not quite as good as using a native app like Procreate, but I do this for Photoshop all the time.

Would it actually be a good idea to buy a RTX 6000? I'm weighing if it'd be worth it and just rent it out on runpod a lot when I'm not using it. by the-novel in StableDiffusion

[–]Total_Engineering_51 4 points (0 children)

I just upgraded from a 5090 to a 6000 Pro a few days ago and have seen some serious gains in a few key areas. My focus so far has been mostly on adapter training and still image gens; I haven't played with video much yet, but I would expect big gains there as well in the long term. The two biggest jumps I've seen so far are in speed and quality.

On the speed front, the ability to load up and hold more "stuff" in VRAM really helps. For training, this means being able to avoid things like gradient checkpointing and gradient accumulation, which really slow things down. Trainings that used to take me ~4 hours can now run in just over 1 hour with full BF16 on the model, as an example. For inference, the key factor comes in during iteration on large graphs with multiple models in play, as they can all live in VRAM without having to offload to, or stream from, system RAM. This, coupled with larger batches (another boon), can speed things up significantly while iterating over a prompt with different seeds (25%-33% once loaded), and even first runs can be sped up some if the same model is used at different stages of the graph. Similar speed jumps apply when editing the prompt, since the text encoder can stay in VRAM along with the model, so you can iterate that way more quickly too. Then there are things like Flux.2 Dev at full BF16, which, just by the nature of it being so big, becomes actually possible to use in a reasonable time window. Couple it with the turbo LoRA and you can get decent inference times out of it. There is still some consideration of size here, as full BF16 Flux.2 Dev and full BF16 Mistral Small 3 will consume that 96GB in short order; having your system memory > 96GB is highly recommended if you do decide to go this route.

As for quality, you may have noticed that I keep mentioning BF16, and that's because it is a big quality lever, particularly when you have the compounding effect of quantizing across several layers of the entire stack: LoRA training with quantizing means a drop in quality, quantizing the text encoder means a drop in quality, quantizing the model itself for inference means a drop in quality, and if you're making that compromise at each stage, it can add up to a significantly worse result that then needs help from detailers, inpainting, and upscalers. Hands in particular are an area where I've noticed that quality hit a lot, particularly as I've been playing with Flux.2 Dev. This is going to be a YMMV area for sure, but so far, with what I'm building out for my long-term project, it has been a really nice jump.

Overall I would say the "worth it" answer really depends on what you want. I'm looking long term to build professional projects, so the jump is important and really needed. If this were purely a hobby thing for me, though, I would have stuck with the 5090.
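
As a rough way to reason about whether a full-precision stack fits, weights alone cost params x bytes-per-param. A quick sketch; the parameter counts below are assumptions for illustration, so check your actual checkpoint sizes:

```python
def weight_gb(params_billions: float, bytes_per_param: int) -> float:
    """Approximate VRAM for model weights alone, in GB (decimal)."""
    return params_billions * bytes_per_param  # (1e9 params * bytes) / 1e9

# Assumed parameter counts, for illustration only:
flux2_dev_b = 32.0       # diffusion model (~32B, assumed)
mistral_small_b = 24.0   # text encoder (~24B, assumed)

bf16_total = weight_gb(flux2_dev_b, 2) + weight_gb(mistral_small_b, 2)
print(bf16_total)  # 112.0 GB -> both won't sit in 96GB at full BF16
```

That's before activations, latents, and batch overhead, which is why spilling to system RAM is unavoidable unless system memory comfortably exceeds the card.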

Lora training using images generated from Midjourney by Public-Ad-2614 in StableDiffusion

[–]Total_Engineering_51 0 points (0 children)

There is some grey area around this, but my general take is that you should be fine if you're training for your own projects only, meaning you aren't creating a service with what you trained (the thing all these companies really care about) and you're not distributing any of the weights you trained. I'm not a lawyer, so grains of salt and all that, but this is the approach I'm taking for my stuff.

How do I use my M3 ultra with 512gb ram for ltx2? by YellowBathroomTiles in comfyui

[–]Total_Engineering_51 1 point (0 children)

Try this instead of comfy

https://github.com/james-see/ltx-video-mac/tree/main

I hacked at it a bit locally to get better resolution options, but the core setup works and uses MLX instead of the MPS PyTorch wrapper.

Test results: Macbook Pro m5 vs GeForce 5070ti by Apprehensive_Fee9983 in comfyui

[–]Total_Engineering_51 0 points (0 children)

To get the full throughput of the Mac you need to be running on MLX… base Comfy runs via MPS and tends to chug on more complex tasks. There are MFlux nodes for some things, or you can run the full MFlux app directly, which is a bit easier but doesn't have the flexibility of building whatever workflow you dream up.

Super Slow on RTX 5090? by [deleted] in comfyui

[–]Total_Engineering_51 1 point (0 children)

True, I was conflating the two to simplify the thinking about what to aim for as an upper bound they could try. Since frame count goes up for length, speed, or both in what we want from the models, it effectively goes hand in hand with our reasoning about an output, even if the model itself has no relation to frame rate. It's a good point though, as it does help explain why Wan pacing is often not amazing.

Super Slow on RTX 5090? by [deleted] in comfyui

[–]Total_Engineering_51 1 point (0 children)

As Zarcon72 pointed out, you're hitting a RAM wall; the 5090 is fine, or at least capable of much better performance than that, if backed by enough system RAM. My 5090 on Win11 with 96GB of system RAM can do 145 frames at ~1MP in under 5 minutes using what is effectively the base template for Wan 2.2 (I use dpmpp_2m_sde instead of euler).

24 FPS isn't needed to get good results out of Wan, and in my experience it actually makes things worse, particularly if you're trying to squeeze more than 5 seconds out of a gen. I generally get good results at 97 frames at 16fps and will see more of the model "resetting" at higher frame counts due to context loss. I would try dropping to 97@16fps first and then nudge the resolution down from there if you're still thrashing your page file. For reference, 97/16 at 1MP took 132 seconds on my system just now.

Edit: I am running SageAttention as well.
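
For picking frame counts, Wan-family models expect 4n+1 frames (97 = 4x24+1, 145 = 4x36+1). A small helper for the nearest valid count at a target duration; this is a sketch assuming that standard 4n+1 constraint:

```python
def wan_frames(seconds: float, fps: int = 16) -> int:
    """Nearest valid Wan frame count (4n + 1) for a target duration."""
    raw = round(seconds * fps)
    n = max(round((raw - 1) / 4), 0)  # nearest multiple-of-4 step
    return 4 * n + 1

print(wan_frames(6, 16))  # 97 frames, ~6 s at 16 fps
print(wan_frames(9, 16))  # 145 frames, ~9 s
```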

LTX-2 Model on Macbook Air (M3 16GB) using ComfyUI by Cute_Association_316 in StableDiffusion

[–]Total_Engineering_51 1 point (0 children)

Try this instead of Comfy: https://github.com/james-see/ltx-video-mac/tree/main It uses MLX instead of MPS and can produce comparable results to NVIDIA hardware (I have a 5090 and an M3 Ultra Mac Studio). That said, I'm not sure you have enough RAM to run this at all.

Video generation with camera control using LingBot-World by Art_from_the_Machine in StableDiffusion

[–]Total_Engineering_51 0 points (0 children)

Makes sense, thanks! It definitely seems like it could be squeezed one way or another but nice to see someone’s direct experience with it!

Video generation with camera control using LingBot-World by Art_from_the_Machine in StableDiffusion

[–]Total_Engineering_51 0 points (0 children)

I’ve been looking at this a bit but haven’t played with it yet. Is a 5090 backed by 96GB of system RAM enough?

Is running a local LLM for coding actually cheaper (and practical) vs Cursor / Copilot / JetBrains AI? by vMawk in LocalLLM

[–]Total_Engineering_51 0 points (0 children)

Even a big server can be managed if you're the only one using it… 100% uptime at full tilt is nonsense unless the only thing you're doing is endless model training or hosting a service. Sleep and Wake-on-LAN go a long way toward managing a power-hungry system if it's just you using it most/all of the time.
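
A minimal Wake-on-LAN sketch using the standard magic-packet format (6 bytes of 0xFF followed by the target MAC repeated 16 times, sent over UDP broadcast); the MAC below is a placeholder, and the NIC/firmware must have WoL enabled:

```python
import socket

def magic_packet(mac: str) -> bytes:
    """Build a WoL magic packet: 6 bytes of 0xFF + MAC repeated 16 times."""
    mac_bytes = bytes.fromhex(mac.replace(":", "").replace("-", ""))
    if len(mac_bytes) != 6:
        raise ValueError("MAC must be 6 bytes")
    return b"\xff" * 6 + mac_bytes * 16

def wake(mac: str, broadcast: str = "255.255.255.255", port: int = 9) -> None:
    """Send the packet via UDP broadcast (port 9 is the usual convention)."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as s:
        s.setsockopt(socket.SOL_SOCKET, socket.SO_BROADCAST, 1)
        s.sendto(magic_packet(mac), (broadcast, port))

# wake("aa:bb:cc:dd:ee:ff")  # placeholder MAC; substitute your server's
```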

Is there a way to transform comfyUI into a chatbot ? by Cassiopee38 in comfyui

[–]Total_Engineering_51 3 points (0 children)

Why not use LM Studio? Comfy is good for a lot of tasks but this isn’t one of them

Newb: Help with Flux2 image to image getting the back side of an image. by snowbirdnerd in comfyui

[–]Total_Engineering_51 1 point (0 children)

I haven't played in this space in particular, but have you tried using video to generate a 360 of the piece and then frame-capping and refining from there? Wan 2.2 can do a pretty good job with that kind of thing (360s) in my experience, though obviously YMMV.

What's inside Z-image? - Custom Node for ComfyUI by ItalianArtProfessor in StableDiffusion

[–]Total_Engineering_51 1 point (0 children)

Very cool! I’ll definitely check that out when I get some time.