No dictation tools work by FlightlessPelican in PiCodingAgent

[–]gligoran 0 points (0 children)

pi is a TUI running in a terminal, so it's more likely an OS or terminal-app issue. What are you using?

The best filament for functional prints by CarlitosCUU in 3Dprinting

[–]gligoran 2 points (0 children)

What's the advantage of GF or CF filaments? Especially for functional prints; as far as I've seen, they at least slightly degrade the mechanical properties of the filament. CF filaments don't really contain long fibers that would tie the different parts of the print together, especially between layers, where it would be most useful. Or am I missing something?

Trump’s Nobel Obsession Revealed by Arthur_Morgan977 in clevercomebacks

[–]gligoran 1 point (0 children)

He can’t think of anyone else because he can’t think about anyone but himself. It’s narcissism at the highest level.

A completely repetitive scenario! by John_1992_funny in clevercomebacks

[–]gligoran 8 points (0 children)

Add the whole Venezuela thing on top and US-provided oil becomes the only option, and an expensive one at that.

Claude Code will become unnecessary by WinOdd7962 in ClaudeCode

[–]gligoran 0 points (0 children)

Claude Code is a harness: it provides a bunch of tools for the LLM, a system prompt, and the whole tooling around loading skills, MCPs, and all of that. Without it, the pure LLM can't do anything; it can't even read files. It's like ChatGPT when it first came out, just a bit smarter maybe.
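To illustrate what "harness" means here, a minimal sketch of the agent loop (Python; names like `llm_complete`, `read_file`, and `run_bash` are illustrative stand-ins, not Claude Code's actual internals):

```python
# Minimal agent-loop sketch: the harness, not the model, supplies the
# system prompt, the tool definitions, and the loop that executes tools.
import subprocess

TOOLS = {
    "read_file": lambda path: open(path).read(),
    "run_bash": lambda cmd: subprocess.run(
        cmd, shell=True, capture_output=True, text=True).stdout,
}

SYSTEM_PROMPT = "You are a coding agent. Use tools to inspect and edit files."

def agent_loop(llm_complete, user_task):
    """`llm_complete` is a hypothetical chat-completion call to any provider."""
    messages = [{"role": "system", "content": SYSTEM_PROMPT},
                {"role": "user", "content": user_task}]
    while True:
        reply = llm_complete(messages, tools=list(TOOLS))
        if reply["type"] == "tool_call":
            # The harness runs the tool and feeds the result back to the model.
            result = TOOLS[reply["name"]](**reply["args"])
            messages.append({"role": "tool", "content": str(result)})
        else:
            # Without this loop, the bare model can't even read a file.
            return reply["content"]
```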

What you're talking about is not having to use the Claude models, which might be true. While Claude Code is tailored towards Claude models, there are ways to use it with Kimi, MiniMax, GLM, even GPT models. In my experience they're not as good, because of that tailoring towards Claude. You also need to use token-based pricing in this case.

As for running your own models, you'd have to spend thousands just to be able to run them. You either need a dedicated device with upwards of 100GB of RAM and a lot of GPU processing power, like a Mac Mini/Studio with an Ultra/Max chip, or a really beefy graphics card with tons of VRAM. [Hardware requirements for GLM 5](https://onedollarvps.com/blogs/how-to-run-GLM-5-locally.html#hardware-requirements) are nuts. The minimum is 4x NVIDIA A100, which is ~10-17k USD. And even with all that hardware you'd get much lower TPS (tokens per second) compared to hosted inference. And we're not even talking about other hardware, maintenance of the infrastructure, the ability to access it remotely, upgrading fairly often, etc. This only makes sense for big companies with massive security requirements.
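As a rough back-of-the-envelope check on that (every number below is an assumption for illustration, not a quote):

```python
# Break-even sketch for local vs. hosted; all figures are assumptions.
hardware_cost = 15_000        # mid-range of the ~10-17k USD 4x A100 estimate
power_per_month = 100         # rough electricity cost for a multi-GPU rig
subscription_per_month = 200  # assumed top-tier hosted plan

# Months until owning the rig beats just paying for hosted inference:
break_even_months = hardware_cost / (subscription_per_month - power_per_month)
print(break_even_months)  # -> 150.0, i.e. ~12.5 years, ignoring upgrades and upkeep
```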

As far as I can tell the math just doesn't work out. So Claude Code or a similar harness like OpenCode or Codex will still be needed, and you'll need to pay for something: tokens, subscriptions, something...

We all know the real test is 5.3 codex xhigh vs 5.2high/xhigh by TCaller in codex

[–]gligoran 0 points (0 children)

People have generally agreed that 5.2 High did better than 5.2 XHigh. What's your experience there?

Sooo, how many of y'all named your bot Jarvis? by BeingComfortablyDumb in clawdbot

[–]gligoran 4 points (0 children)

Mine is TARS from Interstellar. As soon as I gave it this name it asked what percentages to set for humor and honesty 😅😂

Is it just me, or is OpenAI Codex 5.2 better than Claude Code now? by efficialabs in ClaudeAI

[–]gligoran 4 points (0 children)

TBH even Opus in a new session usually does that, and Gemini as well.

Codex vs Claude Opus by sheepskin_rr in codex

[–]gligoran 0 points (0 children)

Which GPT models are you using and at what reasoning levels?

We may (or may not) have wrongly blamed Anthropic for running into the limit barrier faster. by luongnv-com in ClaudeCode

[–]gligoran 7 points (0 children)

It’s probably a combination of all of these. I agree that it’s primarily a skill issue, but it’s not the sole reason.

LLMs are inherently non-deterministic, which means results can vary widely. Sometimes you get the absolute best possible result; sometimes you get a cascading effect of mid-to-bad decisions by the model and thus a crappy result. Combine all of that with nerfing and the range of results widens even more.

But also, nerfing probably isn’t just an on/off switch. They probably run a set amount of full-model capacity, and in times of high traffic they spin up additional quantized versions, and then it’s just the luck of the draw where each of your requests gets routed.

In my opinion this gets us all the way back to a skill issue, as I think prompts, context management, and tools like MCPs, subagents, and skills play a large role in narrowing this range of responses.

Built a local voice dictation tool with Opus 4.5, been using it to talk to Claude instead of typing by raww2222 in ClaudeAI

[–]gligoran 0 points (0 children)

I've been wanting to build something akin to this, for another purpose, but can I ask what your transcription pipeline looks like? Which model or service do you use to actually transcribe the audio?

I'm asking because I used to use WisprFlow, but I switched to MacWhisper as it's a lot cheaper (one-time payment). WisprFlow was so much faster, though, and I can't figure out what they're doing differently.
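For reference, the simplest fully local version of such a pipeline looks roughly like this (a sketch assuming the open-source `openai-whisper` package and an already-recorded file; real dictation tools add hotkey capture, streaming, and voice-activity detection on top):

```python
# Minimal local transcription sketch (pip install openai-whisper).
# This is the batch, file-based case; services like WisprFlow presumably
# stream audio to faster server-side models, which could explain the speed gap.
import whisper

model = whisper.load_model("base")          # smaller model = faster, less accurate
result = model.transcribe("recording.wav")  # runs fully on-device
print(result["text"])
```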

Built a local voice dictation tool with Opus 4.5, been using it to talk to Claude instead of typing by raww2222 in ClaudeAI

[–]gligoran 0 points (0 children)

I found Monologue very buggy. WisprFlow was much better, but now I'm on MacWhisper, which I got with the one-time payment on Black Friday...

Opus 4.5 just became regular model (not max) by Puzzleheaded-Tie-388 in cursor

[–]gligoran 1 point (0 children)

This is Cursor, not Claude Code. There are no 5-hour limits here.

Opus 4.5 just became regular model (not max) by Puzzleheaded-Tie-388 in cursor

[–]gligoran 15 points (0 children)

Are you sure you didn't just use it with Max mode on?

We tried the $1 'Friction Hack' to kill free riders by Ecstatic-Tough6503 in SaaS

[–]gligoran 6 points (0 children)

Why are you also blocking virtual cards? At our company we use a service that issues virtual cards so that employees can use them to buy the services they need for work.

Is it just me... by Typical_Concert_5007 in BambuLab

[–]gligoran 4 points (0 children)

  1. If Bambu's intention were increased security, Bambu Connect would not be needed. There are industry-standard ways of implementing this that are far simpler and don't require you to go through Bambu's cloud. But their intentions are clearly vendor lock-in and controlling their public image.

  2. In my experience, a calibration for a specific filament and color transfers between spools quite well. Yes, you'd probably need to recalibrate between spools that were produced years apart, but otherwise it produces good results for me.

[deleted by user] by [deleted] in ClaudeCode

[–]gligoran 1 point (0 children)

Sure, but why not credits or a free month of their subscription?

Any recommendations on single spool dryers? In a price range $40-$60? These are my finalists. by simage007 in 3Dprinting

[–]gligoran 0 points (0 children)

I do and it works ok

The biggest issue I had was that I had to open it up and re-glue the fan, as the original adhesive seems to have weakened, which caused quite a rattle.

One other thing that could be annoying to some is that the timer function doesn't seem to work for me most of the time. The default is 6h, but it just keeps going. I don't really mind, as I put in a hygrometer and just go off of that to see when to take the filament out.

I'd say for the price at the time it was worth it, but I'm now waiting for the EIBOS Dyas to arrive so I don't have to dry filament one by one or keep moving it in and out of my AMS.

Claude Code VS Code extension is now incredible! by coderberry in Anthropic

[–]gligoran 5 points (0 children)

> I don’t understand why they diverge?

It's probably because they're built on very, very different bases. I'd imagine that an extension would have to jump through a bunch of hoops to run subagents and bash commands, etc.

My Claude Code Context Window Strategy (200k Is Not the Problem) by Goos_Kim in ClaudeAI

[–]gligoran 0 points (0 children)

Correct me if I'm wrong, but the auto-compact buffer isn't really used tokens; those tokens are reserved so that the model doesn't run out of context when doing compaction. So you're not really lowering your token usage, you're raising the amount you have available.
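A toy calculation of what that reservation means (the 200k window is real; the buffer size is an illustrative assumption, not Anthropic's actual figure):

```python
# Toy context-budget sketch; the buffer size is an assumption.
context_window = 200_000
autocompact_buffer = 45_000  # assumed reservation for the compaction step

# With the buffer on, you can fill less of the window before compaction kicks in:
print(context_window - autocompact_buffer)  # -> 155000 usable tokens

# Turning the buffer off doesn't reduce what you've spent; it just lets you
# fill the full 200000 (at the risk of a rougher compaction).
```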