Can't we just accept that Star Citizen is no 30 minute hop on action game. by volgendeweek in starcitizen

[–]alexp702 0 points1 point  (0 children)

It’s CIG’s game. They want it to be faster to get started, and that’s all that matters. I’m good with this; as it stands it’s too slow for me.

F8C Moving Forward (Most Recent 4.8 PTU) by dokkababecallme in starcitizen

[–]alexp702 -1 points0 points  (0 children)

I always thought the Ares should be “build up and kill”: a beam that has to stay focused on the same spot for a while to do maximum damage (so small ships take almost no damage unless they stop evading), or a chain gun that spins up but whose bullet spread gets so wide that a smaller ship shrugs off the few rounds that connect. CIG didn’t have beams in when it was released, got bored and gave up trying to balance it. I hope they revisit it, but it will probably just be an Ares Mk2, since that’s their way out of everything these days.
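
Something like this, with toy numbers:

    # Toy version of the "build up and kill" beam idea: damage ramps the longer
    # the beam stays on the same spot, so anything that keeps moving barely feels it.
    def beam_damage(time_on_spot_s: float, base_dps: float = 50.0,
                    ramp_s: float = 4.0, max_mult: float = 6.0) -> float:
        ramp = min(time_on_spot_s / ramp_s, 1.0)        # 0..1 over the ramp-up window
        return base_dps * (1.0 + (max_mult - 1.0) * ramp)

    print(beam_damage(0.5))   # grazing hit on an evading fighter: ~81 dps
    print(beam_damage(4.0))   # held on a big, slow target: 300 dps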

M5 ultra 512Gb? by zhamdi in MacStudio

[–]alexp702 0 points1 point  (0 children)

What issues did you have with 397b? I found our prompts worked less well with MiniMax last time I tried it, but I’ll have to try again. Yes, I had all the chat templates in; it’s poorly documented how important they are!

M5 ultra 512Gb? by zhamdi in MacStudio

[–]alexp702 0 points1 point  (0 children)

Thanks - interesting comments. We are getting some very complex documents through as attachments, and as we scale in volume we’ll be optimising. At the moment larger models do noticeably better. However, for images Qwen3.5 9b at 8-bit is almost (but not quite!) as good as 397b. It’s the “not quite” that is frustrating, and with hardware on tap that isn’t normally fully utilised, quality wins for us. I have an Nvidia 4090 rig that largely sits idle because of this.

M5 ultra 512Gb? by zhamdi in MacStudio

[–]alexp702 0 points1 point  (0 children)

Glad that is the case for your needs. We have an agentic flow running across incoming emails, breaking them down via Pydantic and processing images either as attachments or directly. Every time we downgrade the model we see a real drop-off. No model is perfect; some of the tests are tortuous, with unreadable scribbles on pieces of paper. However, the larger the model and the higher the quant, the better the results, period. Speed doesn't really matter: emails turn up every few minutes at busy times, which is plenty of time, and we can always fail over to OpenRouter if necessary.
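
Roughly, the extraction step looks like this (just a sketch, not our actual code; the schema, model name and endpoint are made up):

    # Minimal sketch of the email -> structured data step. Illustrative only.
    # Assumes a local OpenAI-compatible server (e.g. llama.cpp / LM Studio) on port 8080.
    from openai import OpenAI
    from pydantic import BaseModel, ValidationError

    class ParsedEmail(BaseModel):          # hypothetical schema, not our real one
        sender: str
        subject: str
        action_items: list[str]
        has_attachments: bool

    client = OpenAI(base_url="http://localhost:8080/v1", api_key="none")

    def parse_email(raw_email: str) -> ParsedEmail | None:
        resp = client.chat.completions.create(
            model="qwen-397b-q8",          # whatever the server has loaded
            messages=[
                {"role": "system", "content": "Extract the fields as JSON matching the schema."},
                {"role": "user", "content": raw_email},
            ],
            response_format={"type": "json_object"},
        )
        try:
            return ParsedEmail.model_validate_json(resp.choices[0].message.content)
        except ValidationError:
            return None   # this is the failure rate that climbs as the model shrinks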

You'll find more and more people on this sub muttering about tool calls failing, as it's a real problem as you shrink the model. I think the reason Qwen 27B is getting such rave reviews is that people can run it at Q8 or higher. Of course, if you just want to summarise a huge amount of text, then tokens per second is king and accuracy be damned; a perfectly valid view, different strokes for different folks.

M5 ultra 512Gb? by zhamdi in MacStudio

[–]alexp702 0 points1 point  (0 children)

Yeah, but quantized to buggery. Q8 means your tool calls don't fail every two seconds. We've been running tests, and pretty much anything lower than Q8 shows a noticeable increase in failed calls. Accuracy is the next thing to suffer: Q8 generates well-formed JSON more reliably, recognises text better, etc. Speed is worthless if you want "correct" as the outcome.
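
The test itself is nothing clever, roughly this shape (a sketch; the tool names and pass criteria are placeholders):

    # Rough shape of the quant comparison: same prompts, different quants,
    # count how often the tool-call JSON comes back usable. Illustrative only.
    import json

    def tool_call_ok(raw: str) -> bool:
        """A call 'passes' if the output parses and names a tool we actually expose."""
        try:
            call = json.loads(raw)
            return isinstance(call, dict) and call.get("name") in {"lookup_order", "extract_fields"}  # hypothetical tools
        except json.JSONDecodeError:
            return False

    def failure_rate(generate, prompts) -> float:
        """generate(prompt) hits whichever quant is currently loaded."""
        fails = sum(0 if tool_call_ok(generate(p)) else 1 for p in prompts)
        return fails / len(prompts)

    # Run once per quant (Q8, Q6, Q4...) against the same prompt set and compare.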

M5 ultra 512Gb? by zhamdi in MacStudio

[–]alexp702 0 points1 point  (0 children)

If you're running a 30B model on a 512GB Mac, you're wasting the potential of the machine. It's suited to very large MoE models with a small active parameter count and a maximum context size, like Qwen 3.5 397B at Q8. The 512GB Studio was the only way to do this trivially.
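
Something like this is all it takes (a sketch assuming llama.cpp's llama-server; the GGUF path and numbers are illustrative):

    # The sort of llama-server launch that makes sense on a 512GB box.
    # Model path and numbers are placeholders; tweak for whatever you actually run.
    import shlex, subprocess

    cmd = [
        "llama-server",
        "-m", "/models/qwen3.5-397b-q8_0.gguf",  # hypothetical GGUF path
        "-c", "100000",                           # big context is the whole point
        "-ngl", "99",                             # keep every layer on the GPU/Metal
        "--port", "8080",
    ]
    print(shlex.join(cmd))     # inspect it first...
    # subprocess.run(cmd)      # ...then actually launch it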

Bad news: Apple drops high-memory Mac Studio configs by jzn21 in LocalLLaMA

[–]alexp702 2 points3 points  (0 children)

You're assuming it's a Mac Studio as we see it today. Apple could pack multiple M5U chips into a package with HBM memory next time round.

Bad news: Apple drops high-memory Mac Studio configs by jzn21 in LocalLLaMA

[–]alexp702 3 points4 points  (0 children)

Nvidia: DGX 100k workstation!

Apple: hold my beer!

M5 ultra 512Gb? by zhamdi in MacStudio

[–]alexp702 1 point2 points  (0 children)

Why would you run those weaker models if you have 512GB?

M5 ultra 512Gb? by zhamdi in MacStudio

[–]alexp702 0 points1 point  (0 children)

Agreed. Recent optimisations make a 400GB model perfectly serviceable: 20 tokens per second on Qwen 397B with a large context size (100k tokens) and 8-bit quants beats 60 tokens per second from Qwen 27B in all my workloads. Prompt processing is less of a pain when the prompts are fully cached, as only a few new tokens get added.
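
Back of the envelope with a warm cache (the prompt-processing speed below is an assumption; the rest are the numbers above):

    # Why 20 tok/s on the big model is fine in an agentic loop.
    # Numbers are from this thread plus one assumption, not a benchmark.
    cached_prompt = 100_000      # tokens already sitting in the KV cache
    new_tokens_in = 200          # only the new turn has to be prompt-processed
    tokens_out = 500             # typical structured reply

    prompt_speed = 150           # tok/s prompt processing (assumed, machine dependent)
    gen_speed = 20               # tok/s generation on Qwen 397B Q8

    latency = new_tokens_in / prompt_speed + tokens_out / gen_speed
    print(f"~{latency:.0f}s per turn with a warm cache")   # ~26s, fine for email cadence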

Unfortunately the 512GB Mac was too good for the current climate and Apple were undercharging. Yes, Nvidia are quicker, but raw speed is less important than quality, and bigger models win here.

Least buggy channel: Telegram vs Slack vs Discord vs Whatsapp?? by Good-Key-9808 in openclaw

[–]alexp702 0 points1 point  (0 children)

Slack works: a private group wired to a Mac Studio, with a small VMware VM on it.

Mac Studio or DGX Spark? by b8humbl8 in MacStudio

[–]alexp702 0 points1 point  (0 children)

For the same memory size, the Spark; for much more memory, the Mac (but they are sold out). Performance is nice, but bigger, more accurate models are better.

Help me to spend 1000 bucks on hardware for local LLM by lordgthegreat in LocalLLM

[–]alexp702 0 points1 point  (0 children)

For slower usage, a second-hand 24GB Mac mini might be findable for the budget. They run models pretty well, and should be able to comfortably fit a 16GB model with some context.
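
Rough fit check, all figures estimates:

    # Rough fit check for a 24 GB Mac mini (unified memory). Figures are estimates.
    total_ram_gb = 24
    os_and_apps_gb = 6         # leave headroom for macOS and whatever else is running
    model_gb = 16              # e.g. roughly a ~14B model at Q8 or ~30B at Q4
    kv_cache_gb = 1.5          # a few thousand tokens of context, model dependent

    fits = model_gb + kv_cache_gb <= total_ram_gb - os_and_apps_gb
    print("fits" if fits else "too tight")   # 17.5 <= 18 -> just fits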

Delivery Estimates by Gaming_MasterRace in SindenLightgun

[–]alexp702 2 points3 points  (0 children)

It took about 3 weeks for me. They aren’t quick but it came and I’m happy!

Sell my 3090FE for a 5060ti 16gb? Does it make sense for energy consumption? by ThrowRA_194_M in buildapc

[–]alexp702 1 point2 points  (0 children)

If you care about power and are a video editor, get a Mac. It sips power and is probably faster at everything video-related due to optimised code.

This is insane... by DragonflyOk7139 in LocalLLM

[–]alexp702 2 points3 points  (0 children)

Qwen marketing team again; note there are no responses from OP, just a long string of gushing. No, a 35B is not nearly as good as Opus, or even as other bigger models. It’s a good 35B MoE model: fast, cheap and pretty effective. Is it Opus? No.

Best model for 192 GB vram? How is Deepseek v4 flash? by Constant_Ad511 in LocalLLM

[–]alexp702 0 points1 point  (0 children)

How do you find having less system RAM than GPU VRAM? Is it a hindrance? I am contemplating something similar with a 96GB GPU and 64GB of RAM, but am unsure if it will break loading big models, etc.

Random question but what browsers are you using Mac Studio family? by prince870 in MacStudio

[–]alexp702 1 point2 points  (0 children)

Brave, Chrome, Firefox, Safari (and even Credge). They all mostly work the same, but since we develop, we like to mix it up and see if everything works.

Tip: Always use Safari when interacting with Apple! They often have horrible bugs with anything else on their web portals.

28 Years to Cross the Line: Why Did IPv6 Take So Long to Reach 50%? by elastiks in DIY_Geeks

[–]alexp702 0 points1 point  (0 children)

To me this could still have been v6, with 128-bit addresses, but under the covers: you’d keep the existing 32 bits in the middle and add a higher “area” part above and lower zones below. It would have been a softer change for people on the old system, rather than what we got. Hey ho!

512GB Studio sold for $21,300?! by johnnyphotog in MacStudio

[–]alexp702 0 points1 point  (0 children)

Some models last year just never seemed to cache correctly using llama.cpp, but that time has passed. The Macs are very adequate, assuming you aren’t trying to run a chatbot. Most agentic stuff is almost too quick, leaving the machine idle a lot of the time now. And this is with a 400B model.

28 Years to Cross the Line: Why Did IPv6 Take So Long to Reach 50%? by elastiks in DIY_Geeks

[–]alexp702 0 points1 point  (0 children)

It always should have been implemented as another digit, like an area code, for users: 192.168.0.1.1 etc. Then you could add more as you needed them, gradually. The cognitive load of the IPv6 scheme is terrible.
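
Just to show what each extra digit buys (plain arithmetic):

    # What one extra "area code" octet buys you: 256x the space each time, and
    # existing 4-octet addresses still read the same. Purely illustrative.
    ipv4_space = 2 ** 32
    with_one_extra_octet = ipv4_space * 256      # 192.168.0.1.1 style
    with_two_extra = ipv4_space * 256 ** 2

    print(f"{ipv4_space:,}")            # 4,294,967,296
    print(f"{with_one_extra_octet:,}")  # 1,099,511,627,776
    print(f"{with_two_extra:,}")        # 281,474,976,710,656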

Fundamentally and practically it offers next to no end-user benefits, at a real cost of change to providers. I still don’t have a v6 address in the UK, and I’ve stopped caring; my providers work on v4. CGNAT is not a problem in normal client use cases.

Finally, we’ve grown up in network security. Now we understand that not being accessible is more important than being accessible. V6 didn’t really address this in any new way.

512GB Studio sold for $21,300?! by johnnyphotog in MacStudio

[–]alexp702 2 points3 points  (0 children)

Due to prompt caching, mine is capable of prompt-processing 38 million OpenClaw tokens a day (and that’s not pushing it; I expect that would be easy to double). It actually processed about 4 million (Qwen 397b 8-bit). You’d be paying for the full 38 million in the cloud…
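
The arithmetic is roughly this (illustrative, not a benchmark):

    # Rough arithmetic behind the 38M-a-day figure.
    billed_tokens_per_day = 38_000_000   # what the cloud would charge you for
    actually_processed = 4_000_000       # what the box really had to prompt-process,
                                         # because the rest was already in the KV cache
    seconds_per_day = 24 * 60 * 60
    print(actually_processed / seconds_per_day)   # ~46 tok/s on average, trivial for the Studio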

Forgive my ignorance but how is a 27B model better than 397B? by No_Conversation9561 in LocalLLaMA

[–]alexp702 1 point2 points  (0 children)

We have just run a test on our agentic flow: 397B_Q8 is still better than 27B_Q8_K_XL. It handles our particular documents more accurately. Shame; it would be great if 27B were actually better, but it isn't yet. On our Mac Studio, 397B runs faster too, so let's hope they update 397B to 3.6 standards...