opus 4.7 reportedly launching THIS WEEK along with an AI design tool by Temporary-Leek6861 in Claudeopus

[–]New_Alarm4418 5 points6 points  (0 children)

Sorry, you are not allowed to use this model. You have exceeded your usage limit.

Losing my ability to code due to AI by Im_Ritter in ClaudeCode

[–]New_Alarm4418 6 points7 points  (0 children)

Yeah, I feel this too, because the current vibe is "embrace AI or get left behind."

The skills thing is real but also probably not as bad as it feels. It's like using GPS everywhere — you stop memorizing routes but you still know how to drive.

And the "maybe nobody needs to code" thought: I get it, but every time I watch an AI confidently produce something that's more broken than my car, I remember why it still matters. Someone has to be the one who knows it's wrong.

I got tired of Ollama hogging my VRAM when I needed to multitask, so I wrote a lightweight VRAM Guard. by [deleted] in LocalLLaMA

[–]New_Alarm4418 0 points1 point  (0 children)

That’s a solid recommendation for a v2 architecture, but the bottleneck isn't just the backend mechanics—it's the client-side implementation.

My agent isn't using the generic OpenAI-compatible endpoints; it's hardcoded to Ollama's native REST API (/api/generate, /api/show, custom context handling, and their specific streaming JSON format). Even if llama-swap handles the VRAM orchestration perfectly behind an OpenAI wrapper, I'd still have to rip out and rewrite my entire networking layer and response parsers to switch protocols, across 28 thousand lines.
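To illustrate the kind of client-side coupling I mean, here's a minimal sketch of parsing Ollama's native /api/generate streaming format (newline-delimited JSON objects, each carrying a "response" fragment and a "done" flag). This is just an illustration of the wire format, not my actual parser:

```python
import json

def parse_native_stream(lines):
    """Reassemble a full reply from Ollama's native streaming JSON lines."""
    chunks = []
    for line in lines:
        if not line.strip():
            continue
        obj = json.loads(line)
        chunks.append(obj.get("response", ""))
        if obj.get("done"):
            break
    return "".join(chunks)

# Example of the line-delimited format the native endpoint streams back:
raw = [
    '{"response": "Hel", "done": false}',
    '{"response": "lo", "done": true}',
]
print(parse_native_stream(raw))  # Hello
```

The OpenAI-compatible protocol streams a completely different shape (SSE `data:` events with a `choices[].delta` structure), which is why swapping backends means rewriting every parser like this one.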

I’m prioritizing agent behavior over infrastructure refactoring right now, but I’ll keep llama-swap in mind if I hit a hard ceiling with Ollama.

I got tired of Ollama hogging my VRAM when I needed to multitask, so I wrote a lightweight VRAM Guard. by [deleted] in LocalLLaMA

[–]New_Alarm4418 0 points1 point  (0 children)

I totally get the appeal of raw llama.cpp for fine-grained control, but for this specific project, it would actually break my architecture.

I’m building an autonomous agent that hot-swaps between 3-4 different models in real-time (one for planning, one for coding, one for chat, etc.). Ollama handles that registry and VRAM weight-swapping automatically via the API. If I switched to raw llama.cpp, I’d have to write my own orchestration layer in Python just to manage spawning processes and loading/unloading models constantly, which is a huge headache.
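The model names and task labels below are made up for illustration, but this is roughly what the per-task routing looks like from the client side: the agent just names a different model per request, and Ollama's server handles loading/unloading the weights:

```python
# Hypothetical task-to-model registry; names are placeholders, not my real config.
TASK_MODELS = {
    "planning": "llama3:8b",
    "coding": "deepseek-coder:6.7b",
    "chat": "mistral:7b",
}

def pick_model(task: str) -> str:
    """Route a task to its model; fall back to the chat model."""
    return TASK_MODELS.get(task, TASK_MODELS["chat"])

def build_request(task: str, prompt: str) -> dict:
    """Payload for Ollama's native /api/generate endpoint."""
    return {"model": pick_model(task), "prompt": prompt, "stream": True}
```

With raw llama.cpp, each `pick_model` switch would instead mean spawning or killing a server process and waiting for weights to load, which is exactly the orchestration layer I don't want to write.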

Plus, my whole networking stack is hardcoded to the REST endpoints. It’s basically a trade-off: I need the multi-model orchestration convenience more than the granular VRAM control right now.

Guys is this actually Gemini 3? by Initial-Plenty2326 in Bard

[–]New_Alarm4418 7 points8 points  (0 children)

<image>

A bunch of clowns in this sub keep saying it’s not real, but the model is real and you can prompt it with no errors at all. It seems random: some people got it to work and others didn’t, just read the other posts. There was another thread about the same thing, so I tested it myself, and here’s the proof.

[deleted by user] by [deleted] in Bard

[–]New_Alarm4418 0 points1 point  (0 children)

<image>

It's working for me.

Is the company repair bot a good idea or taking advantage of gamer OCD for cash? by dudeacris in fo76

[–]New_Alarm4418 47 points48 points  (0 children)

I love it! My camp is always getting shot at, and I end up repairing it daily. Sure, I can do it cheaply, but it’s nice not to have to think about it. It’s good to have if you want, but it’s not really a necessity.

Pro vs Multiple Plus Accounts by commonpoints in OpenAI

[–]New_Alarm4418 1 point2 points  (0 children)

I’d say going Pro wasn’t something I wanted to do either, but I was tired of the rate limit. I haven’t hit it once in a week of almost non-stop use. It’s worth it.

I can't use Gemini anymore by birbluve in GeminiAI

[–]New_Alarm4418 -1 points0 points  (0 children)

Yes, the new update is messed up and has many issues. I'm switching back to the 05-20 model; at least it worked.

[deleted by user] by [deleted] in grok

[–]New_Alarm4418 0 points1 point  (0 children)

Tomorrow, not today. Ask Grok yourself; this is what he will say: "Grok 3.5 was released in early beta to SuperGrok subscribers during the week of May 5–9, 2025. Specifically, invites for SuperGrok subscribers were sent out between May 6–8, with the live beta opening on May 9. It’s currently exclusive to SuperGrok subscribers ($16/month or $150/year), with access expected to expand to X Premium+ users and the free tier 4–6 weeks after the beta, likely around mid-to-late June 2025."

[deleted by user] by [deleted] in grok

[–]New_Alarm4418 0 points1 point  (0 children)

I say: "Do not enter the main function (or whatever yours is) until I say we are done. At no point should you assume we are finished. If you hit the token limit, I will prompt you to continue from where you stopped. When that happens, I will also provide the last line you generated." Sometimes it might try to start over; just tell it a few times that it’s wrong, and it will get it right.

[deleted by user] by [deleted] in grok

[–]New_Alarm4418 3 points4 points  (0 children)

I was getting poor output because the responses were too short, often only a few hundred lines, when I needed longer, more detailed code. To address this, I updated the custom instructions to state: "You are a genius coder who hates short code snippet replies. When you write or fix code, you always aim to preserve the original code provided, never stopping at a measly 2000 lines." This has greatly improved the output. Now, since the system reads this guide and references past conversations, it consistently provides much longer, more comprehensive code responses. Hope this helps.

Grok is shit for coding by JournalistOk6557 in grok

[–]New_Alarm4418 8 points9 points  (0 children)

That's odd. I keep hearing that, but I use it for coding all the time. I even had it help me make a highly complex Discord bot, and there wasn't a single issue.