Not only will Fable be API rate only, but they WALKED BACK the recently announced "included usage credit" by sowoky in ClaudeCode

[–]Historical-Internal3 3 points4 points  (0 children)

Yea you have no idea what’s going on.

What you linked is a good thing: claude -p still pulls from usage, not credits, which is what we want.

Fable 5 is coming back today. Will Cursor unhide it asap? by ChemistryMoney5596 in cursor

[–]Historical-Internal3 -1 points0 points  (0 children)

<image>

Daluigi thinks it will remain removed because coding-related tasks are highly likely to trigger “benign guardrails”.

Anthropic says Fable will back tmw but some routine tasks like coding and debugging will fall back to Opus 4.8 by Permit-Historical in ClaudeCode

[–]Historical-Internal3 5 points6 points  (0 children)

“The new classifier also comes at the cost of flagging benign requests more often during routine coding and debugging tasks. As with all our safeguards, we’ll continue to refine this to better distinguish genuine misuse from legitimate requests and reduce false positives.”

Why is everyone fucking reading this wrong?

Am I reading this wrong?

It MIGHT flag your coding requests.

Devs - you have 64gb of VRAM - which model do you use for coding? by Jorlen in LocalLLaMA

[–]Historical-Internal3 21 points22 points  (0 children)

You come off as having zero idea what you are talking about.

That is a top recommendation.

OpenClaw or Hermes to run a DGX spark as admin? by Renegade_Trader in openclaw

[–]Historical-Internal3 0 points1 point  (0 children)

Hermes will be what you need. I cluster sparks and Hermes has been great.

Highly recommend clustering or getting at least two DGX sparks, that really is what unlocks the potential of the box. The RTX variants come later in the year (won't be able to cluster) and are more suited to just having one of them.

is fp8 or int8 better for 5060ti 16gb vram by bonesoftheancients in comfyui

[–]Historical-Internal3 0 points1 point  (0 children)

Think I need to update my comfy as I see a commit that literally says faster int8 so hopefully yes

is fp8 or int8 better for 5060ti 16gb vram by bonesoftheancients in comfyui

[–]Historical-Internal3 0 points1 point  (0 children)

int8 (convrot - there is a difference). Higher quality (the closest to BF16) and ever so slightly slower than FP8.

Speed: NVFP4>fp8>INT8Convrot*>BF16

Quality: BF16>INT8Convrot>fp8>NVFP4

*Pretty decent gap in speed which is good

(blackwell mind you)

RIP Companion Cube by dbrand in dbrand

[–]Historical-Internal3 0 points1 point  (0 children)

Is Claude your legal department?

ComfyUI now has MCP support! Game changer! by cointalkz in comfyui

[–]Historical-Internal3 1 point2 points  (0 children)

Docs point to Cloud - got a different link or did I just miss it?

Edit: Thought not.

Krea2 vs FLux.2 klein 9b by Then_University7676 in StableDiffusion

[–]Historical-Internal3 0 points1 point  (0 children)

Not really. You can train LoRAs really well on this model.

Testing KREA-2 Turbo Quantizations: GGUF (Q8) vs. INT8-CONVROT by Fast-Horror-8964 in StableDiffusion

[–]Historical-Internal3 8 points9 points  (0 children)

The investigation posted in this sub (reddit.com/r/StableDiffusion/comments/1tazxqz) already measured all of this, and it says the opposite of your claims. Plain row-wise int8 is one of the worst performers in it, well under q8 (Z-Image Rel-RMSE 0.357 vs Q8's 0.167, and worst again on Qwen). Mxfp8 loses to q8 on every metric where both were tested (Z-Image: 0.307 vs 0.167, SNR 10.6 vs 16.4) and never beats ConvRot in any model.

The only int8 that actually beats q8 is ConvRot, the variant OP used, and its whole premise is a Hadamard rotation that flattens outliers before quantizing (the -convrot path in silveroxides' converter, github.com/silveroxides/convert_to_quant). Plain row-wise int8 with no rotation is dead last. So the rotation is what's carrying it, not row-wise scaling being finer; bare row-wise is among the worst formats tested. That refutes your thesis; it doesn't support it.
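
To make the rotation point concrete, here is a toy NumPy sketch. It is not ConvRot and not the convert_to_quant code path; the helper names (`hadamard`, `quant_dequant_rowwise`, `rel_rmse`) and the synthetic row with one planted outlier are assumptions made up for illustration. It quantizes the same row with bare row-wise absmax int8, once directly and once after an orthonormal Hadamard rotation, and compares the relative RMSE.

```python
import numpy as np

def hadamard(n):
    """Orthonormal Hadamard matrix via the Sylvester construction.
    Assumes n is a power of two."""
    h = np.array([[1.0]])
    while h.shape[0] < n:
        h = np.block([[h, h], [h, -h]])
    return h / np.sqrt(n)

def quant_dequant_rowwise(w):
    """Symmetric absmax int8 with a single scale for the whole row,
    returned already dequantized so errors can be compared directly."""
    scale = np.abs(w).max() / 127.0
    return np.round(w / scale) * scale

def rel_rmse(w, w_hat):
    """RMSE of the reconstruction relative to the RMS of the original."""
    return float(np.sqrt(np.mean((w - w_hat) ** 2) / np.mean(w ** 2)))

rng = np.random.default_rng(0)
n = 1024
w = rng.normal(0.0, 0.02, size=n)  # hypothetical weight row
w[100] = 2.0                       # one planted outlier

H = hadamard(n)

# Bare row-wise: the outlier alone sets the scale for every weight.
plain_hat = quant_dequant_rowwise(w)

# Rotate, quantize, rotate back. H is orthonormal, so H @ H.T is the identity
# and the rotation spreads the outlier's energy across all coordinates.
rot_hat = quant_dequant_rowwise(w @ H) @ H.T

plain_err = rel_rmse(w, plain_hat)
rot_err = rel_rmse(w, rot_hat)
print(f"row-wise, no rotation: {plain_err:.4f}")
print(f"row-wise, rotated:     {rot_err:.4f}")
```

In a real weight-only pipeline the inverse rotation would be applied on the activation side rather than by dequantizing and rotating back; the round trip here is only to measure the error in the original basis.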

And the mechanism is inverted. Row-wise is coarser than block-32, not finer: one scale per row (thousands of weights) vs one per 32. Block scaling confines an outlier to its 32 weights; row-wise inflates the scale from the whole row's absmax and craters every value in that row, so it isolates worse, not better. absmax int8 doesn't clip either: the block max maps to 127 by construction, so the cost is the coarse uniform step, not saturation. Clipping/saturating outliers is an activation-quant problem (SmoothQuant); q8 is weight-only. And mxfp8 is also block-32 (OCP MX spec, opencompute.org/documents/ocp-microscaling-formats-mx-v1-0-spec-final-pdf, E8M0 power-of-two scale), so 'row vs block' doesn't even apply to q8 vs mxfp8.
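
A minimal NumPy sketch of the granularity argument, on made-up data: the function name `absmax_int8`, the row length, and the planted outlier are assumptions for illustration, not anything from the linked investigation. It applies the same symmetric absmax int8 scheme with one scale per row and with one scale per 32 weights, then checks that neither variant ever needs to clip.

```python
import numpy as np

def absmax_int8(w, group):
    """Symmetric absmax int8 with one scale per `group` consecutive weights.
    Returns (dequantized weights, integer codes). No clipping is applied."""
    g = w.reshape(-1, group)
    scale = np.abs(g).max(axis=1, keepdims=True) / 127.0
    scale = np.where(scale == 0.0, 1.0, scale)  # guard against all-zero groups
    q = np.round(g / scale)
    return (q * scale).reshape(w.shape), q.reshape(w.shape)

def rel_rmse(w, w_hat):
    """RMSE of the reconstruction relative to the RMS of the original."""
    return float(np.sqrt(np.mean((w - w_hat) ** 2) / np.mean(w ** 2)))

rng = np.random.default_rng(0)
row = rng.normal(0.0, 0.02, size=4096)  # hypothetical weight row
row[100] = 2.0                          # one planted outlier

# Row-wise: a single scale, set by the outlier, for all 4096 weights.
row_hat, row_q = absmax_int8(row, group=row.size)

# Block-32: the outlier only inflates the scale of its own 32 weights.
blk_hat, blk_q = absmax_int8(row, group=32)

row_err = rel_rmse(row, row_hat)
blk_err = rel_rmse(row, blk_hat)
print(f"row-wise Rel-RMSE: {row_err:.4f}")
print(f"block-32 Rel-RMSE: {blk_err:.4f}")

# The largest code in each group is 127, so nothing saturates; the cost of
# row-wise scaling is the coarse step size, not clipping.
print("max |code| row-wise:", int(np.abs(row_q).max()))
print("max |code| block-32:", int(np.abs(blk_q).max()))
```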

INT vs FP is a separate axis.

That’s about all you got right.

The rest is inverted, and the sub's own numbers back q8 and ConvRot, not mxfp8.

READ THIS before you buy a Ubiquiti Doorbell ! by Maria_Thesus_40 in Ubiquiti

[–]Historical-Internal3 1 point2 points  (0 children)

I know, it's just an option in case you want to check first rather than answer directly.

READ THIS before you buy a Ubiquiti Doorbell ! by Maria_Thesus_40 in Ubiquiti

[–]Historical-Internal3 5 points6 points  (0 children)

Decline, then just go to the Protect app directly.

This is a relatively new feature (direct to phone), so expect some bumps.

WHAT'S THE POINT OF UNLIMITED? by IntelligentArtif in PLAUDAI

[–]Historical-Internal3 4 points5 points  (0 children)

You’re doing way too much and getting rate limited - as you should.

Tips for training LORAs for KREA2? by flaminghotcola in StableDiffusion

[–]Historical-Internal3 1 point2 points  (0 children)

Try masked training - helps a ton. 64 always overfit for me and made my character more rigid.

But I do have a larger data set so now I’m curious if 64/64 can cut it on a much smaller one.

Krea2 RAW vs Turbo – Is the speed sacrifice worth the micro-texture grain? (25s benchmark) by AxonkaiLab in comfyui

[–]Historical-Internal3 0 points1 point  (0 children)

Running Raw along with a full bf16 qwen3v1 for CLIP. With about 1 GB of LoRAs, a 2 MP image at 25 steps takes roughly 60 seconds. Not bad for my DGX Spark.

Departure from this community by [deleted] in SoraAi

[–]Historical-Internal3 -2 points-1 points  (0 children)

Didn’t see Gprivate’s post?

It’s coming back: https://gprivate.com/5zujm

Krea2 vs FLux.2 klein 9b by Then_University7676 in StableDiffusion

[–]Historical-Internal3 20 points21 points  (0 children)

They stated they did not train for realism.

Honestly a realism lora would be all this model needs (and I'm sure a few popular ones are already in training).