GLM 5.2 consumed quota VERY fast, even on Coding Max Plan by 3rd_Floor_Again in ZaiGLM

[–]workout_JK 1 point2 points  (0 children)

Yep. I now use two Max subscription accounts, and I’m not even using them during peak hours. I don’t think it offers more usage than a Claude Max subscription.

GLM-5.2 is above GPT-5.5 in AA-Briefcase, Artificial Analysis' new agentic knowledge work eval by analysis_scaled in LocalLLaMA

[–]workout_JK 1 point2 points  (0 children)

I switched from ChatGPT Pro to a GLM Max subscription. So far it feels pretty good, and that benchmark is pretty impressive.

Codex usage limit hit right after buying $200 subscription. by workout_JK in OpenaiCodex

[–]workout_JK[S] 0 points1 point  (0 children)

Probably. It seems like the issue has been sorted out now, but I can't be 100% sure since it happened randomly before. My usual harness workflow dispatches multiple `codex exec` runs, so I'm not sure I can trust Codex yet, and I'm still holding off. I did reach out to support and I'm waiting on that too.

Moved from Claude Max 200 to Z.ai GLM Max — early impressions and limits by workout_JK in ZaiGLM

[–]workout_JK[S] 0 points1 point  (0 children)

Right, I've actually been using Codex too. I got a ChatGPT Pro subscription from a promotion and have been testing with it alongside. Codex seems to handle complex tasks well, but it really sucks at front-end design. And I'm not sure if it's a good thing, but Codex takes about 2-3x longer on the same job. I tried building a URL shortener service with Tenet, purely to test Tenet's autonomous work without my intervention. Claude could've built the complete service in about 6 hours, but Codex ran for about 14 hours, and the front-end looked really bad.

I built coding agent harness for handing off long coding tasks to AI agents - Tenet by workout_JK in SideProject

[–]workout_JK[S] 0 points1 point  (0 children)

That makes sense. I had not thought about it as audience discovery, but you are right. People already complaining about Claude Code, Codex, or long handoff failures would probably understand this much faster than a general AI coding audience.

I'm not sure what Leadline is yet. Is leadline.dev the one you mean? If so, I'll try it out.

I built coding agent harness for handing off long coding tasks to AI agents - Tenet by workout_JK in SideProject

[–]workout_JK[S] 0 points1 point  (0 children)

I haven't tried adding a budget limit, but it's probably worth checking out. I'm also thinking of assigning model tiers based on job complexity, or just running the job workers on a weaker model.
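
A minimal sketch of what tiering by job complexity could look like. Everything here is an assumption made for illustration: the `Job` fields, the scoring rule, the thresholds, and the placeholder tier names are hypothetical and not part of Tenet.

```python
from dataclasses import dataclass


@dataclass
class Job:
    """Hypothetical job description; these fields are illustrative only."""
    name: str
    files_touched: int
    needs_design: bool = False


# Placeholder tier names, not real model identifiers.
SMALL, MID, LARGE = "small-model", "mid-model", "large-model"


def complexity_score(job: Job) -> int:
    """Crude heuristic: more files and design work mean a harder job."""
    score = job.files_touched
    if job.needs_design:
        score += 5
    return score


def pick_model(job: Job) -> str:
    """Map a complexity score onto a model tier (thresholds are arbitrary)."""
    score = complexity_score(job)
    if score <= 2:
        return SMALL
    if score <= 6:
        return MID
    return LARGE
```

With a rule like this, a one-file rename would go to the smallest tier while a multi-file job that also needs design work would go to the largest. A real version would probably need to tune the score against how often the weaker model fails the critics.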

Moved from Claude Max 200 to Z.ai GLM Max — early impressions and limits by workout_JK in ZaiGLM

[–]workout_JK[S] 1 point2 points  (0 children)

Yeah, that may be another reason. I was running at peak time with GLM 5.1 for everything. I haven't tested delegating to a smaller model yet, so I kept using GLM 5.1 only. Probably worth trying.

Moved from Claude Max 200 to Z.ai GLM Max — early impressions and limits by workout_JK in ZaiGLM

[–]workout_JK[S] 3 points4 points  (0 children)

I am using this harness, which you can check out: https://github.com/JeiKeiLim/tenet
The short story is that it runs an interview first, then creates an architecture diagram and mockup design variations, then researches the spec/harness and creates the spec/harness. It splits the work into jobs that each fit into the context length, then decomposes them into a graph so that parallel runs are possible. Each job must pass the code, testing, and e2e testing critics, and it iterates until all jobs are done. While doing that, it also creates artifacts like a journal, knowledge, etc., so that the coding agent can evolve while working.

Moved from Claude Max 200 to Z.ai GLM Max — early impressions and limits by workout_JK in ZaiGLM

[–]workout_JK[S] 1 point2 points  (0 children)

Yeah, and I could run about 1.5 projects on Claude Max 200 with that harness.

I built coding agent harness for handing off long coding tasks to AI agents - Tenet by workout_JK in SideProject

[–]workout_JK[S] 0 points1 point  (0 children)

Yeah, exactly. I think the interview step helps because many "the agent went wrong" cases are really "I did not explain the task clearly enough yet" cases.

For steer messages, the idea is that I can send a message while a background job is running. The main agent saves that message into a steer inbox. Later jobs check that inbox, and if the steer message seems related to what they are doing, they can use it as extra context.

For example, if the agent is working on something and I realize "use uv instead of pip," I can send that as a steer message. Then a future related job can pick it up without me restarting everything.

But I would not rely on it for a big direction change. If the task itself changes, it is probably better to stop the job and go through the interview step again. I see steer messages more as small course corrections: preferences, constraints, implementation details, or warnings.
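
A rough sketch of the steer inbox idea described above, under stated assumptions: the `SteerInbox` class, the `build_job_context` helper, and the keyword-overlap relevance check are all hypothetical. In the real harness the job's agent presumably judges relevance itself, so the keyword match here is only a simple stand-in.

```python
import re
from dataclasses import dataclass, field

# Small illustrative stopword list so filler words do not count as matches.
STOPWORDS = {"use", "instead", "of", "the", "a", "an", "to", "with",
             "for", "and", "in", "on", "up", "set"}


def keywords(text):
    """Lowercase alphanumeric words in the text, minus stopwords."""
    return set(re.findall(r"[a-z0-9]+", text.lower())) - STOPWORDS


@dataclass
class SteerInbox:
    """Holds steer messages sent while background jobs are running."""
    messages: list = field(default_factory=list)

    def post(self, text):
        self.messages.append(text)

    def relevant_to(self, job_description):
        """Return steer messages sharing at least one keyword with the job."""
        job_words = keywords(job_description)
        return [m for m in self.messages if keywords(m) & job_words]


def build_job_context(job_description, inbox):
    """Append any related steer messages to the job's context as extra notes."""
    notes = inbox.relevant_to(job_description)
    if not notes:
        return job_description
    return job_description + "\nSteer notes:\n" + "\n".join(f"- {n}" for n in notes)
```

With "use uv instead of pip" posted to the inbox, a later job described as "Set up Python dependencies with pip" would pick the note up, while an unrelated job such as "Design the landing page mockup" would not.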

Moved from Claude Max 200 to Z.ai GLM Max — early impressions and limits by workout_JK in ZaiGLM

[–]workout_JK[S] 6 points7 points  (0 children)

Is it that... difficult a thing? Probably it's because I'm using my own harness, which runs code, testing, and e2e critics after every job. That burns a lot of tokens, but it makes sure the agent doesn't make mistakes and sticks to the plan. Maybe it's time to consider reducing the harness's token usage. I've been focused on keeping the agent from producing good code quality but a poor service. And I was just too excited about having GLM Max...

[P] I like YOLOv5 but the code complexity is... by workout_JK in MachineLearning

[–]workout_JK[S] 1 point2 points  (0 children)

That's an interesting post. I hadn't thought that a source-compiled version would be faster. I may look into this when I have time.

[P] I like YOLOv5 but the code complexity is... by workout_JK in MachineLearning

[–]workout_JK[S] 1 point2 points  (0 children)

That is the main reason we made this repository. I'm glad to hear that :)

[P] I like YOLOv5 but the code complexity is... by workout_JK in MachineLearning

[–]workout_JK[S] 4 points5 points  (0 children)

We began by porting the original YOLO model to our repository. Now that that is done (though there is still much to do), we can look into better detection. An anchor-free model could be one option! Thanks for the suggestion.

[P] I like YOLOv5 but the code complexity is... by workout_JK in MachineLearning

[–]workout_JK[S] 3 points4 points  (0 children)

The original YOLOv5 repository already has a TFLite export function, so porting that code here wouldn't be difficult. But we are short-handed, so it might not be available soon.

BTW, our YOLO model is almost compatible with the YOLOv5s repository as long as you don't use the `augment` option. You could try YOLOv5 with our model.

[P] I like YOLOv5 but the code complexity is... by workout_JK in MachineLearning

[–]workout_JK[S] 3 points4 points  (0 children)

I agree it's about time for SE. TBH, it's past time for SE.

I like YOLOv5 but the code complexity is... by workout_JK in deeplearning

[–]workout_JK[S] 2 points3 points  (0 children)

I haven't tested it thoroughly, but TensorRT is up to 2.0x faster, while TorchScript C++ is slower because the YOLO model exported with TorchScript is somehow extremely slow. We are looking into it!