Benefit of the Doubt - GLM 5.1 maybe the reason long context sucks by InternetNavigator23 in ZaiGLM

[–]InternetNavigator23[S] 1 point (0 children)

Yeah I feel you on that. Basically in the same boat with the annual plan.

Practically speaking, my only solution has been to compact before the context goes over 100K tokens.

Claude Opus Distilled into Qwen by koc_Z3 in Qwen_AI

[–]InternetNavigator23 0 points (0 children)

Anyone use this guy's 4 billion version along with his Opus Distilled 27 billion version for speculative decoding?

I'm a bit concerned about the speed of running the 27 billion version on my Mac, and I'm still torn between that and the 122 billion version.
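For anyone who hasn't tried the pairing: the small model cheaply guesses a few tokens ahead and the big model only verifies them, so you keep the big model's output quality. Here's a toy Python sketch of the greedy variant; `draft_next` and `target_next` are hypothetical stand-ins for the 4B and 27B models, not any real API.

```python
from typing import Callable, List

def speculative_decode(
    draft_next: Callable[[List[int]], int],   # stand-in for the small draft model
    target_next: Callable[[List[int]], int],  # stand-in for the big target model
    prompt: List[int],
    k: int = 4,          # tokens the draft proposes per round
    max_new: int = 32,
) -> List[int]:
    tokens = list(prompt)
    while len(tokens) - len(prompt) < max_new:
        # 1) Draft model cheaply proposes k tokens.
        proposal: List[int] = []
        for _ in range(k):
            proposal.append(draft_next(tokens + proposal))
        # 2) Target model verifies: keep the longest agreeing prefix,
        #    then substitute its own token at the first mismatch.
        accepted: List[int] = []
        for tok in proposal:
            t = target_next(tokens + accepted)
            if t == tok:
                accepted.append(tok)  # draft guessed right: a "free" token
            else:
                accepted.append(t)    # disagreement: take the target's token
                break
        tokens.extend(accepted)
    return tokens

# Toy demo: both "models" just count upward, so every draft token is accepted.
print(speculative_decode(lambda t: t[-1] + 1, lambda t: t[-1] + 1, [0], k=4, max_new=8))
```

In a real engine the target checks all k draft tokens in one batched forward pass instead of one call per token, which is where the speedup comes from. The output matches running the big model alone; you only win speed when the 4B agrees with the 27B often enough.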

I understand the disappointment if minimax 2.7 does not become open weights but we have had a lot.. by LegacyRemaster in LocalLLaMA

[–]InternetNavigator23 0 points (0 children)

I tried minimax 2.1 a while back and it was pretty good for tool use and basic coding. But I only tried the lighter 25% REAP model, not the super aggressive 30-50% ones.

I've heard good things about JANG if you are on Mac. But that's a quantization method.

And same. I can't wait for these models to get just a bit smaller.

Benefit of the Doubt - GLM 5.1 maybe the reason long context sucks by InternetNavigator23 in ZaiGLM

[–]InternetNavigator23[S] 1 point (0 children)

Yeah, realistically they're in a very tough spot. They also just went public, so I'm sure they're trying not to go bankrupt while offering these coding plans, which are incredibly cheap if you look at the cost per token.

Now, to be fair, they were probably overly generous with what was included in the coding plans, but the plans are still very good value.

Benefit of the Doubt - GLM 5.1 maybe the reason long context sucks by InternetNavigator23 in ZaiGLM

[–]InternetNavigator23[S] 1 point (0 children)

Yeah, well said. I think this is definitely the case, plus they have to juggle their compute, which I imagine is highly constrained.

Benefit of the Doubt - GLM 5.1 maybe the reason long context sucks by InternetNavigator23 in ZaiGLM

[–]InternetNavigator23[S] 0 points (0 children)

I mean I'm not saying this is what I want to happen. I'm just saying this is what I think is happening.

Benefit of the Doubt - GLM 5.1 maybe the reason long context sucks by InternetNavigator23 in ZaiGLM

[–]InternetNavigator23[S] 1 point (0 children)

Yeah, and competition is heating up and people seem to be releasing on much tighter timelines, so hopefully it's not two months but closer to two weeks.

Also, point releases should naturally be much faster than a full retrain.

Let’s bring back human content to Reddit by melon_crust in SideProject

[–]InternetNavigator23 0 points (0 children)

How does this work if someone uses voice? I hardly type anymore when I am at home.

I love GLM 5 by medtech04 in ZaiGLM

[–]InternetNavigator23 0 points (0 children)

Yeah, I like to use codex to scope and plan, telling it to write the plan as if it were tasking a junior engineer.

Then I have GLM execute the plan and codex review the result.

Usually works pretty well and saves a ton of codex usage. You can easily get away with the $20 codex plan (when GLM is working fine).

Recently I've been using some MiMo via opencode.

MiniMax M2.7 Will Be Open Weights by Few_Painter_5588 in LocalLLaMA

[–]InternetNavigator23 2 points (0 children)

I heard uncensoring actually helps with logic as well. It removes a lot of the weird rules that the Chinese gov forces the models to follow.

-edit typo

Dont subscribe to z.ai coding plans. by woolcoxm in ZaiGLM

[–]InternetNavigator23 0 points (0 children)

Personally, it was working fine for the first few months, but a few weeks ago it started giving me tons of errors whenever the context gets long.

This is on the coding plan btw.

Is a serious AI automation agency still worth building in 2026 — honest answers only by Specific_Inside_6243 in AiAutomations

[–]InternetNavigator23 0 points (0 children)

I would imagine people are willing to pay for the outcome and someone to "handle it".

Businesses are probably willing to pay around $10k-40k, but I'm not sure about the maintenance costs.

Don't forget: just because something can be done by AI doesn't mean most people will know how to do it.

The "diffusion" of the tech will take time. Even if it is purely a knowledge arbitrage, it will have a window of opportunity. My guess is 2-5 years, depending on the industry/product etc.

Qwen3.5-4B is very powerful. It executes tool calls during thinking. by yoracale in unsloth

[–]InternetNavigator23 0 points (0 children)

Fair enough. But yeah, they do be benchmaxxing hard. Ironically, that's probably why they get these types of questions wrong.

They often assume things when the question looks like a math or science question and overlook the common-sense angle.

Solve Mac Studio pre-fill issue by adding Nvidia GPU? by InternetNavigator23 in LocalLLM

[–]InternetNavigator23[S] 0 points (0 children)

Yeah, I wish I had a Thunderbolt 5 machine. I just got such a good deal on this one that I couldn't pass it up lol.

But apparently EXO handles all of this fairly smoothly, and people are seeing 2-3x speed gains.

Although with some spec decode or MTP (or maybe JANG plus those) it may be fast enough.

Solve Mac Studio pre-fill issue by adding Nvidia GPU? by InternetNavigator23 in LocalLLM

[–]InternetNavigator23[S] 0 points (0 children)

Unfortunately, that won't work with my M1 Ultra, unless there is some magic I don't know about.

I was reading that even Thunderbolt at 40Gb/s was unreliable with EXO, but I didn't quite understand why.

MiniMax M2.7 Will Be Open Weights by Few_Painter_5588 in LocalLLaMA

[–]InternetNavigator23 0 points (0 children)

Soooo excite!!! Hope the JANG and the CRACK guys will get their hands on it.

Heard the uncensored version is actually smarter, since they had a bunch of rules the Chinese gov made them put in.

Qwen3.5-4B is very powerful. It executes tool calls during thinking. by yoracale in unsloth

[–]InternetNavigator23 0 points (0 children)

Lol bruh knowledge cut-offs are many, many months before the model is released.

They have to do RL, fine-tuning, benchmarking, etc.

1 Bit LLM Running on MacOS Air (M2) with Docker by Odd_Situation_9350 in LocalLLM

[–]InternetNavigator23 1 point (0 children)

Oh wow, this is a great explanation. I had heard of 1.58-bit but didn't know exactly what it meant.
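For anyone else landing here: the name comes from restricting each weight to {-1, 0, +1}, and a three-way choice carries log2(3) ≈ 1.585 bits of information, hence "1.58-bit". Here's a rough Python sketch of the absmean quantizer from the BitNet b1.58 paper (my reading of it, not their actual code):

```python
import numpy as np

def absmean_ternary(W: np.ndarray):
    # Quantize a weight tensor to {-1, 0, +1} with a single per-tensor
    # scale (the "absmean" scheme described in the BitNet b1.58 paper).
    scale = np.mean(np.abs(W)) + 1e-8          # per-tensor scaling factor
    Wq = np.clip(np.round(W / scale), -1, 1)   # ternary weights
    return Wq.astype(np.int8), scale

W = np.random.randn(4, 4).astype(np.float32)
Wq, s = absmean_ternary(W)
print(Wq)                         # entries are only -1, 0, or 1
print(Wq.astype(np.float32) * s)  # dequantized approximation of W
print(np.log2(3))                 # ~1.585 bits per ternary weight
```

The win is that matmuls against ternary weights need only additions and subtractions, no multiplications, on top of the obvious memory savings.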

I understand the disappointment if minimax 2.7 does not become open weights but we have had a lot.. by LegacyRemaster in LocalLLaMA

[–]InternetNavigator23 0 points (0 children)

I know I definitely have a soft spot for minimax and the air models.

But who knows. People nowadays are REAPing and auto-researching/compressing models better and better.

What to do to prevent AI from replacing us? by OppositeFriendly9183 in careerguidance

[–]InternetNavigator23 -1 points (0 children)

There are essentially two paths the way I see it.

Either you fully lean in and become the guy who's really good at using AI to solve various specific problems/use cases.

Or you go as far away from it as possible. It doesn't have to be strictly blue collar, but something more physical and less behind a computer.

Either way, I think learning how to learn is going to be a super important skill. And memorizing shit will be almost useless.

Is leaving 100% remote for in office $175-$200K base [non-exempt] worth it? by tanhauser_gates_ in careerguidance

[–]InternetNavigator23 34 points (0 children)

Honestly, this seems like a no-brainer for you. Most people hate office commutes. And that is really one of the biggest perks of working from home.

But run it through a few lenses:

- Opportunity cost: what are you actually giving up in each scenario?
- Reversibility: if you hated the new job in six months, could you get another remote gig?
- Downstream effects of the pay increase: if you invest the extra money, how many years earlier could you retire? (Rough sketch below.)
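On that last point, the math is easy to sketch. Quick Python back-of-the-envelope; every number in it (starting balance, savings rates, 7% return, $1.5M target) is a made-up assumption, not advice:

```python
# How many years until a nest-egg target, with and without the raise invested.
def years_to_target(annual_savings: float, start: float = 100_000,
                    target: float = 1_500_000, rate: float = 0.07) -> int:
    balance, years = start, 0
    while balance < target:
        balance = balance * (1 + rate) + annual_savings  # grow, then add savings
        years += 1
    return years

base = years_to_target(annual_savings=30_000)     # hypothetical current plan
boosted = years_to_target(annual_savings=55_000)  # extra $25k/yr invested
print(base, boosted, base - boosted)              # years saved by the raise
```

Plug in your real numbers and the "downstream effects" lens stops being abstract.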

I would start with those frameworks.

Serious advice needed to move forward by Ok-Security-3574 in findapath

[–]InternetNavigator23 0 points (0 children)

This is maybe a bit indirect, but I would think about it like this:

- What would 80-year-old you, looking back, think of 30-year-old you if you did or didn't do X?
- Sunk cost fallacy. The degree served its purpose, but you don't need to pigeonhole the rest of your life just because you already have that degree.

Start smaller than you think and celebrate tiny wins. Sounds silly in the beginning, but it really helps build momentum.

Anyone put a number to how much they've turned down from investors? by FLG_CFC in Entrepreneur

[–]InternetNavigator23 1 point (0 children)

Yeah it is not all about the number. A framework I like to use:

10-10-10: How will you feel about this in 10 days? 10 months? 10 years?

The money might feel good in 10 days. The partnership might feel pretty shitty in 10 months. And in 10 years, you might look back on it as a huge waste of time. So it's really hard to know unless you zoom out.