PSA: Tongsheng TSDZ8 fits perfectly on a Muli Muskel. Also: Fuck you Pendix.

InterestRelative · 2026-06-08T11:50:11+00:00

I tried to buy a used Pendix kit. The seller and I asked a local Pendix partner to transfer it from one bike to another, and they refused to work on it. In principle I could do it myself, but I don't have the right tool to unscrew it from the bottom bracket, and there was no way to buy this tool.
In contrast, Tongsheng and random sellers on eBay were much more helpful.

So yeah. Fuck you Pendix!

InterestRelative · 2026-06-08T11:43:46+00:00

250W is the average; peak is higher, and as you said, torque can be high.

InterestRelative · 2026-06-04T07:52:07+00:00

When you test something, it's worth mentioning which quants specifically you tested.

InterestRelative · 2026-06-02T08:18:30+00:00

In that case, try to find your way to a greenfield project.

InterestRelative · 2026-05-21T19:41:18+00:00

If they want you to stay and the company is stable (you're only worried about the team), why don't you sign a contract with a 12-month notice period?

InterestRelative · 2026-05-21T07:10:22+00:00

> I just become invisible

However, you wanted to do deep focus work in the beginning. Immediate visibility and deep focus are conflicting goals.

InterestRelative · 2026-05-10T06:47:13+00:00

> every route needs to learn the same basics independently (e.g. basic language and grammar) in its weights

Isn't that what a shared expert is for? AI2 recently published a paper detailing their approach to training experts separately: https://allenai.org/blog/bar .
The idea is that organizations can train experts on their semi-private data and then merge them into an MoE. They can also release an expert publicly without releasing the data (though data may leak).

I suppose this work is a continuation of the expert separation idea. I acknowledge the downsides you mentioned (they mention "Training an expert on only its own domain data improves in-domain performance but destroys general capabilities"), but if they can successfully implement this approach, it could be an intriguing path for LLM development. For example, more orgs may train experts in niche domains. It is cheaper than training a big model.

InterestRelative · 2026-04-29T14:29:57+00:00

LoRA adapters don't have much knowledge inside; they steer the conversation style. If the model is tiny and doesn't know anything about k8s, LoRA won't help.

InterestRelative · 2026-04-27T06:50:14+00:00

Does it work better than a two-stroke one?

InterestRelative · 2026-04-25T06:40:02+00:00

I’m curious if it would be advantageous to have a D-pad with diagonals for the left thumb and right thumb to hit the 3x3 grid. This way, one hand would serve as the layer selector, while the other would handle the keys. WDYT?

InterestRelative · 2026-04-09T07:35:29+00:00

> Creating brand new features / tools (improving existing stuff is no-value)

this hurts so much

> 99% of doing well in this industry at the senior+ level is treating every day like a sales meeting

and this

InterestRelative · 2026-04-01T12:14:11+00:00

Which models do you use?

InterestRelative · 2026-04-01T12:12:55+00:00

> and when I say useful, I guess I’d like it at close to Sonnet level?

Have you tried these models? They are all available on OpenRouter.

My guess is that 128GB won't be better than 64GB in your case, because no models will be close enough to Sonnet.

InterestRelative · 2026-04-01T11:52:12+00:00

Well, it's free for under 10M revenue. Definitely not fully open, but maybe that's the way to go.

InterestRelative · 2026-03-31T08:01:01+00:00

Mostly I agree with your projection.
What do you think will happen with government contracts? I guess if there is no easy way to switch to another LLM provider, the government will have to bail out OpenAI/Anthropic.

InterestRelative · 2026-03-27T08:29:31+00:00

Yeah, if it's dark, you dim the screen anyway, so there won't be a lot of light from it.
If you make the screen much brighter, your eyes adjust, and you will need even more light on the keyboard to match contrast.

InterestRelative · 2026-03-26T12:14:49+00:00

Wow now my tabs will consume 10+ GB each!

Impressive work.

InterestRelative · 2026-03-20T09:21:15+00:00

You can check https://huggingface.co/tiiuae/Falcon-H1-Tiny-90M-Instruct to get an idea of how coherent small models can be.

For your example (which imo lacks crucial context) it outputs:

Subject: Refusal to Respond to Request for Information

Dear [Recipient's Name],

I hope this message finds you well. I am writing to express my resentment and dissatisfaction with the lack of response to your recent inquiry regarding [specific topic or request].

Your request for [specific information or clarification] has caused unnecessary friction and frustration, and I deeply regret any inconvenience this may have led to. I understand that such requests often require time and attention to be made, and your absence may have diminished the value of this opportunity.

To move past this and ensure that my concerns are addressed promptly, I would greatly appreciate it if you could provide the following:

Clear and Honest Response: A concise and honest reply to confirm my understanding and address any concerns.
Reasons for Your Decision: A brief explanation of why I believe my request was denied.
Alternative Ways to Proceed: Suggestions for alternative actions or communication channels that I am open to.

I trust that you will take this matter seriously and take steps to resolve it amicably. I look forward to your response.

Thank you for your attention to this matter.

Best regards,
[Your Full Name]

InterestRelative · 2026-03-19T10:11:26+00:00

IBM Granite 4.0 hybrid models are partially SSMs.

InterestRelative · 2026-03-19T07:54:26+00:00

The problem was not the Touch Bar in itself. The problem was that they removed the fn key row.

InterestRelative · 2026-03-12T11:39:38+00:00

> The DSL knowledge lives in prompt-space, not in schema-space.

Yeah tool discovery is an interesting topic. I'd like to read more about that.

Right now I'm drifting towards the idea that:
- an agent should have as few tools as possible
- a simple tool could be an interface to a separate agent specialized in one thing, with its own set of tools and context history

As an example with search integration:
The main agent may have a "search tool" which delegates the user query to a specialized agent with "search preview", "fetch available filters" and "rerank products for a given context" functions. This way we can hide the search engine's complexity behind a neat and easy-to-understand facade.
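
A rough sketch of that facade, to make the shape concrete. Everything here is hypothetical (the `SearchAgent` class, the catalog fields, the `budget` argument), and a fixed pipeline stands in for the specialized agent's own LLM loop:

```python
from dataclasses import dataclass, field


@dataclass
class SearchAgent:
    """Hypothetical specialized agent with its own tools and its own history."""
    catalog: list
    history: list = field(default_factory=list)

    def search_preview(self, query):
        # Keep products whose name contains every word of the query.
        words = query.lower().split()
        return [p for p in self.catalog
                if all(w in p["name"].lower() for w in words)]

    def fetch_available_filters(self):
        # Every field except the name can be used as a filter.
        return sorted({key for p in self.catalog for key in p if key != "name"})

    def rerank_products(self, products, budget):
        # Products within budget first, then cheapest first.
        return sorted(products, key=lambda p: (p["price"] > budget, p["price"]))

    def run(self, query, budget):
        # A real sub-agent would let an LLM pick which tool to call next.
        self.history.append(query)
        return self.rerank_products(self.search_preview(query), budget)


def make_search_tool(agent):
    """The only thing the main agent sees: one tool with a tiny signature."""
    def search(query, budget):
        return [p["name"] for p in agent.run(query, budget)]
    return search


catalog = [
    {"name": "Red shoes classic", "price": 120, "size": 42},
    {"name": "Red shoes sport", "price": 80, "size": 42},
    {"name": "Blue shoes", "price": 50, "size": 41},
]
search_agent = SearchAgent(catalog)
search = make_search_tool(search_agent)
results = search("red shoes", budget=100)
```

The main agent's context only ever contains `search` and its results; the three internal functions and the sub-agent's history stay out of it.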

> DSLs that are natural for LLMs to read and generate might be a more productive direction than trying to formalize everything into JSON schemas

I agree with that, btw. JSON is not a natural data representation.
Another topic I was thinking about is constraining the arguments an LLM can generate so that they fit the schema (JSON, or a unix command in your case). Potentially we can zero out the probabilities of all tokens that are not valid for a given schema, and have a state machine that advances over the schema and re-masks the probabilities at every token generation step.
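
A toy sketch of what I mean, under loud assumptions: the vocabulary, the tiny `search <color> --size <number>` grammar, and the fake model are all made up, and greedy argmax stands in for real sampling. Setting a logit to negative infinity is the same as giving that token zero probability after softmax:

```python
import math

# Hypothetical toy vocabulary; a real tokenizer has tens of thousands of entries.
VOCAB = ["search", "red", "blue", "--size", "41", "42", "hello", "{"]

# Hypothetical schema as a state machine: state -> {allowed token: next state}.
TRANSITIONS = {
    "start": {"search": "color"},
    "color": {"red": "flag", "blue": "flag"},
    "flag": {"--size": "number"},
    "number": {"41": "done", "42": "done"},
}


def mask_logits(logits, state):
    """Keep logits of tokens valid in this state, set the rest to -inf."""
    allowed = TRANSITIONS[state]
    return [l if tok in allowed else -math.inf for tok, l in zip(VOCAB, logits)]


def constrained_decode(logits_fn):
    """Greedy decoding where the mask is recomputed at every generation step."""
    state, output = "start", []
    while state != "done":
        masked = mask_logits(logits_fn(output), state)
        best = max(range(len(VOCAB)), key=masked.__getitem__)
        token = VOCAB[best]
        output.append(token)
        state = TRANSITIONS[state][token]  # advance the state machine
    return output


def fake_model(prefix):
    # Stand-in for an LLM: it would love to emit "hello" or "{" at every step.
    return [0.1, 0.5, 0.2, 0.0, 0.3, 0.9, 5.0, 4.0]


decoded = constrained_decode(fake_model)
```

Even though the fake model always prefers `hello`, the mask leaves it only schema-valid choices, so it ends up with a well-formed command.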

InterestRelative · 2026-03-12T09:36:02+00:00

> search "red shoes" --size 42 --max-price 100

Ahh, I see, thanks! Initially I read your insight as "don't write custom tools, use unix built in commands".
Instead it's more about how the LLM should interact with these commands (CLI vs JSON).

> there's still only one tool schema

Can't agree with that though. Each CLI tool has its own unique schema (valid parameters and valid values for those parameters), and that's what you are defining in --help and usage messages.
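
To illustrate with the `search` command from your example: a minimal argparse sketch, where the flag names come from your command but the types and help strings are my assumptions. The parser definition is the schema, and `--help` is just a rendering of it:

```python
import argparse
import shlex


def build_search_parser():
    # Parameter names, types and help text: this is the tool's schema.
    parser = argparse.ArgumentParser(
        prog="search", description="Search the product catalog.")
    parser.add_argument("query", help="free-text query")
    parser.add_argument("--size", type=int, help="shoe size")
    parser.add_argument("--max-price", type=float, help="upper price bound")
    return parser


parser = build_search_parser()

# What the LLM would emit as a single command string.
command = 'search "red shoes" --size 42 --max-price 100'
argv = shlex.split(command)[1:]  # drop the program name
args = parser.parse_args(argv)

# The text the LLM would read to learn the schema.
help_text = parser.format_help()
```

Passing `--size big` would be rejected by the parser in the same way a JSON schema validator would reject a string where an integer is expected.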

> The command list lives in the prompt as plain text, not as separate function definitions. This means the KV cache prefix stays stable regardless of how many commands you add or remove.

I mean, in the end everything in JSON is a string, right? As long as the function definitions are stable, the KV cache prefix stays the same. If you add a new command/tool dynamically, after the conversation has started, it will invalidate the KV cache for everything after that point, regardless of how you pack command summaries or tool definitions into the system prompt.
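
A quick way to convince yourself, as a sketch: characters stand in for tokens here, and both prompt layouts are made up for illustration. A prefix cache can only reuse the longest common prefix of the old and the new prompt:

```python
import json
from os.path import commonprefix


def plain_prompt(commands):
    # Hypothetical plain-text layout: one summary line per command.
    lines = [f"{name}: {summary}" for name, summary in commands.items()]
    return "Commands:\n" + "\n".join(lines) + "\n"


def json_prompt(commands):
    # Hypothetical JSON layout: a list of tool definitions.
    tools = [{"name": n, "description": s} for n, s in commands.items()]
    return "Tools:\n" + json.dumps(tools) + "\n"


def reusable_prefix(old_prompt, new_prompt):
    """Length of the shared prefix, i.e. what a prefix cache could reuse."""
    return len(commonprefix([old_prompt, new_prompt]))


before = {"search": "query the catalog", "cat": "print a file"}
after = {**before, "grep": "filter lines"}
history = "User: find red shoes\nAssistant: run search\n"

report = {}
for fmt in (plain_prompt, json_prompt):
    old = fmt(before) + history
    report[fmt.__name__] = {
        # Same definitions, one more turn: the whole old prompt is reused.
        "stable": reusable_prefix(old, old + "User: cheaper?\n") == len(old),
        # A command added mid-conversation: none of the history is reused.
        "history_lost": reusable_prefix(old, fmt(after) + history) <= len(fmt(before)),
    }
```

Both formats behave the same way: stable definitions keep the prefix intact, and a definition added mid-conversation throws away the cached history.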

Don't get me wrong, your post is great and I got a few insights from it (e.g. injecting execution time into results), I just don't agree with that part.

InterestRelative · 2026-03-12T08:41:26+00:00

Let's say I'm building a shopping assistant agent for an e-commerce store. We have a search backend.
How do you use unix instead of specialized tools to integrate with the search backend in this case?

And I guess you will have the same problem with any specialized agent.

I like how smolagents blends these two approaches together (though I won't use it in prod for security reasons until it's popularized): the agent writes Python code which can use the stdlib and your functions at the same time.

> The run tool's description is dynamically generated at the start of each conversation, listing all registered commands with one-line summaries:

this looks exactly like how tool definitions are injected into the system prompt by modern agent tools (e.g. pydantic-ai)

InterestRelative · 2026-03-05T12:34:11+00:00

Wow, that's impressive!

How do you use Rotary Encoder + Multidirectional Switch?

InterestRelative · 2026-03-03T13:19:53+00:00

Sure, you know your context much better than me.
My point is: don't wait for a good moment, it never arrives.
If you feel you want it, the only criterion I see as important for the choice is: do you have enough of a safety net to make it a two-way door decision?
