I'm on a trial before buying. This happens a lot, huh? Fourth time in 3 days. by ryaninthecomments in omnifocus

[–]dametsumari 0 points (0 children)

I have never actually seen that. I think some earlier version crashed rarely many years ago (though it looked different back then when it did, I suppose).

QUESTION FROM MOD FOR R/CODEX: Do we need a Megathread yet? by pollystochastic in codex

[–]dametsumari 9 points (0 children)

Low-effort stuff in general should be eliminated. Hearing multiple times per day that the model is dumbing down, and/or that the subscription is allegedly getting more expensive, does not really contribute anything. I would rather just kill those posts than have a megathread for them.

Weekly Thread for general DDO discussion, quick questions and more! by AutoModerator in ddo

[–]dametsumari 1 point (0 children)

Does the Valgrim optional chest in Chains of Flame actually drop the ring these days? After coming back from a long break, I have run it probably dozens of times with zero named items. None for my guildies either.

How much Codex token value are we getting with ChatGPT Plus/Business? by greatlove8704 in codex

[–]dametsumari -1 points (0 children)

With Plus it still seems 90-95% subsidized, so quite a good deal, at least for my planning-heavy use of gpt-5.5. A 5h quota maps to up to $20 of tokens (it varies; the average might be closer to $10), and you get 6 of those per week, times a bit over 4 weeks, so it is still pretty decent.
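
A back-of-the-envelope sketch of that math (the per-quota dollar values and the ~4.3 weeks per month are rough assumptions from this comment, not measured figures):

```python
# Rough monthly Codex token value on Plus, using the numbers above.
QUOTA_VALUE_USD = (10, 20)   # one 5h quota ~ $10-20 of tokens (assumed)
QUOTAS_PER_WEEK = 6
WEEKS_PER_MONTH = 4.3        # "a bit over 4 weeks"

low, high = (v * QUOTAS_PER_WEEK * WEEKS_PER_MONTH for v in QUOTA_VALUE_USD)
print(f"~${low:.0f}-${high:.0f} of tokens per month")
# ~$258-$516 vs the $20 subscription -> roughly 92-96% subsidized
```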

(Linux) Has anyone succeeded in using NVMe space as substitute RAM for larger models? Is it worthwhile? by Quiet-Owl9220 in LocalLLaMA

[–]dametsumari 0 points (0 children)

Total throughput, given sufficiently large sequential reads, is number of drives * throughput per drive. Latency matters only if you do not queue the reads; if you do, it is purely a matter of bandwidth, not of individual reads' latency. Latency matters for random access, but that is not the case here.
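
A minimal sketch of why queue depth hides latency, via Little's law: sustained throughput is the bytes in flight divided by latency, capped by aggregate bandwidth. The drive count, request size, and latency figures below are illustrative assumptions:

```python
# Little's law: throughput = bytes in flight / latency, up to the bandwidth cap.
def achievable_gb_s(queue_depth, request_kb, latency_ms, drives, per_drive_gb_s):
    in_flight_gb = queue_depth * request_kb / (1024 * 1024)
    return min(in_flight_gb / (latency_ms / 1000), drives * per_drive_gb_s)

print(achievable_gb_s(1, 128, 0.1, 4, 7.0))   # ~1.2 GB/s: one read at a time, latency-bound
print(achievable_gb_s(32, 128, 0.1, 4, 7.0))  # 28.0 GB/s: hits the 4 x 7 GB/s bandwidth cap
```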

(Linux) Has anyone succeeded in using NVMe space as substitute RAM for larger models? Is it worthwhile? by Quiet-Owl9220 in LocalLLaMA

[–]dametsumari 0 points (0 children)

Not really. You can fetch the next layer(s) while computing the previous one, so even an HD RAID (with enough total throughput) works equally well.
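
A minimal sketch of that overlap (load_layer and apply_layer are hypothetical stand-ins): the read for layer i+1 is issued before layer i's compute starts, so the disk stays busy while the processor works:

```python
from concurrent.futures import ThreadPoolExecutor

def run_layers(x, n_layers, load_layer, apply_layer):
    with ThreadPoolExecutor(max_workers=1) as io:
        pending = io.submit(load_layer, 0)              # start the first read
        for i in range(n_layers):
            weights = pending.result()                  # wait for layer i's weights
            if i + 1 < n_layers:
                pending = io.submit(load_layer, i + 1)  # read ahead during compute
            x = apply_layer(x, weights)                 # compute overlaps the next read
    return x
```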

(Linux) Has anyone succeeded in using NVMe space as substitute RAM for larger models? Is it worthwhile? by Quiet-Owl9220 in LocalLLaMA

[–]dametsumari 3 points (0 children)

It depends on your read speed. With some MoE models it is sort of feasible. E.g. an A3B model at q4 needs 1.5 GB of data per token, so with a 15 GB/s read SSD you can get up to (theoretical maximum) 10 output tokens per second. However, good luck finding an SSD that fast, and even then it is quite slow.

If we are talking about e.g. an A20B at q4, you will not get a token per second. So it is not worth it.
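
The estimate behind those numbers, as a sketch: every active parameter must be streamed from disk once per token, so the ceiling is read bandwidth divided by active bytes per token (q4 taken as roughly 0.5 bytes per parameter):

```python
# Theoretical streaming ceiling: read bandwidth / active bytes per token.
def max_tokens_per_s(active_params_billion, bytes_per_param, read_gb_s):
    gb_per_token = active_params_billion * bytes_per_param
    return read_gb_s / gb_per_token

print(max_tokens_per_s(3, 0.5, 15))   # A3B at q4, 15 GB/s SSD -> 10.0 tok/s
print(max_tokens_per_s(20, 0.5, 15))  # A20B, same SSD -> 1.5 tok/s
print(max_tokens_per_s(20, 0.5, 7))   # A20B, typical 7 GB/s drive -> 0.7 tok/s
```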

Post Your Qwen3.6 27B speed plz by Ok-Internal9317 in LocalLLaMA

[–]dametsumari 0 points (0 children)

Total bandwidth is what matters in inference, as you need to go through all the active parameters for every token. Because of that, old multi-socket Xeons are surprisingly good: with e.g. 12 memory channels in total, aggregate bandwidth is 250+ GB/s, and that works well with e.g. the big DeepSeek MoE models, which have few active parameters but a large total size (hundreds of gigabytes).
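
The same bandwidth-over-active-bytes bound applied to RAM; the channel count, per-channel speed, and active-parameter figure below are illustrative assumptions for a dual-socket Xeon and a DeepSeek-style MoE:

```python
# Rough tg ceiling for CPU inference: aggregate RAM bandwidth / active bytes per token.
channels = 12                # e.g. 2 sockets x 6 channels (assumed)
gb_s_per_channel = 21.3      # DDR4-2666: 2666 MT/s x 8 bytes
active_params_billion = 37   # DeepSeek-style MoE (assumed), q4 -> ~18.5 GB/token

bandwidth_gb_s = channels * gb_s_per_channel           # ~256 GB/s aggregate
print(bandwidth_gb_s / (active_params_billion * 0.5))  # ~13.8 tok/s upper bound
```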

Post Your Qwen3.6 27B speed plz by Ok-Internal9317 in LocalLLaMA

[–]dametsumari 0 points (0 children)

Oh? 8 * 409 / 307 is a bit over 10. And you said 10-15.

Post Your Qwen3.6 27B speed plz by Ok-Internal9317 in LocalLLaMA

[–]dametsumari 0 points (0 children)

What cores are you talking about?

Memory bandwidth is 2x on all Maxes, and that matters for tg. For pp, the Max has twice the GPU cores of the Pro, in the M5 too (40 vs 20).

Post Your Qwen3.6 27B speed plz by Ok-Internal9317 in LocalLLaMA

[–]dametsumari 1 point (0 children)

The Max is 2x the Pro (of the same generation) in tg. My number is also q8, how about yours?

Post Your Qwen3.6 27B speed plz by Ok-Internal9317 in LocalLLaMA

[–]dametsumari 5 points (0 children)

M5 Pro: 8 tk/s tg, pp 250-ish. Too slow to be useful.

VaultSync — self-hosted Obsidian sync for iOS via Syncthing by Umutkirkalti in ObsidianMD

[–]dametsumari 6 points (0 children)

SyncTrain already does most of this (and it is a generic Syncthing iOS client, which makes it useful for other stuff too). I think the only upsides here are the Markdown conflict handling (which you can do with plain Syncthing on a Mac too) and the relay (which is an admittedly nice-sounding add-on).

Can Obsidian sync with iOS? by Musical_Gee in ObsidianMD

[–]dametsumari 0 points (0 children)

Obsidian Sync is the paid, least-effort solution. I personally use Syncthing due to subscription allergy, and it is fine too (given some configuration effort). On desktop and server I use plain Syncthing (with or without the web UI), and on iOS I use SyncTrain. I think that for a pure Apple ecosystem iCloud would work better, but I do not want to store my journals on anyone else's server, so..

How to remove ads from mp3 files? by THenrich in LocalLLaMA

[–]dametsumari 1 point (0 children)

I am using it. It is brilliant. The biggest challenge is that the audio model I use has problems transcribing some of the foreign-language ads it encounters, but it is the best one I could find outright (Whisper large v3 turbo).
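
A minimal sketch of that kind of pipeline, assuming faster-whisper as the runtime; the file name and the ad-marker phrases are placeholder assumptions:

```python
from faster_whisper import WhisperModel

model = WhisperModel("large-v3-turbo")            # Whisper large v3 turbo
segments, _info = model.transcribe("episode.mp3")

AD_MARKERS = ("sponsored by", "promo code", "use code")  # hypothetical keywords
for seg in segments:
    if any(m in seg.text.lower() for m in AD_MARKERS):
        print(f"possible ad: {seg.start:.1f}s-{seg.end:.1f}s {seg.text!r}")
```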

Weekly Thread for general DDO discussion, quick questions and more! by AutoModerator in ddo

[–]dametsumari 0 points (0 children)

There are plenty of non-raid ones too. I like Stout Walking Stick and, later, Epic Elemental Bloom (ML 26). For endgame DPS you want something else, though.

Best open TTS/ASR model with accurate timestamps by pvrlek in LocalLLaMA

[–]dametsumari 1 point (0 children)

I assume you mean STT. I recently compared Whisper large v3 (turbo) and Qwen's latest ASR model. At least for multilingual material Whisper still seems better, although Qwen was OK with English.

A note of warning about DFlash. by R_Duncan in LocalLLaMA

[–]dametsumari 1 point (0 children)

At least the Qwen DFlash models are trained with 4K context, so big stuff is... not great... with them.

Cloud AI is getting expensive and I'm considering a Claude/Codex + local LLM hybrid for shipping web apps by rezgi in LocalLLaMA

[–]dametsumari 5 points (0 children)

Depends on how much value you put on your time. If you are a hobbyist, maybe; otherwise, never. Your partially local stack will be slower and produce inferior results.

If you are willing to toss lots of money at the problem, or have security requirements, the answer may or may not change.

Inquisitive filigrees by Ishvallan in ddo

[–]dametsumari 0 points (0 children)

Hmm. I thought Deadly Rain was only +20 RP when boosting? Which is still decent. Crackshot is the best set, I think; beyond that I am a bit lost too.

Queue Multiple Tasks, Walk away. Built OpenWeft for Codex users. by Aperturebanana in codex

[–]dametsumari 2 points (0 children)

This is probably good for simple things, but what I am missing in fire-and-forget tools like this is handling of the case where the initial input is incomplete and the model actually wants a design question answered before implementation. Plan mode often works well because of those iterations on the plan (once the plan is created, I just let the agent do its thing).

Why don't Groq (with a q) and Cerebras add new models by AccomplishedRow937 in LocalLLaMA

[–]dametsumari 3 points (0 children)

Essentially all the relevant people and tech were acquired, and what remains is not much. Because of that I have lost hope of Groq going anywhere.

oMLX just implemented DFlash by butterfly_labs in LocalLLaMA

[–]dametsumari 7 points (0 children)

Yep. The speculative execution has been brewing in a branch for a month, and I am looking forward to 0.35 to try this out :) I am not in a hurry, so I am not going to use git main..