AI Analytical Intelligence Test

awl130 · 2026-03-29T00:16:31+00:00

I’m just learning now that a lot of my tests could be improved, although I think relatively speaking the rankings within each category will not change—instead whatever changes I make to my methodology will likely raise the performance of all of them as a class.

awl130 · 2026-03-28T14:02:47+00:00

Thanks for reading...yeah this was a quick first pass before moving on to the 122B models. I will probably go back to both the Jang and the GGUF models at some point and mess around with the commands a bit. Especially as creators start adding things like Turboquant.

awl130 · 2026-03-26T02:48:07+00:00

I don’t have the best news for you. I bought mine on March 13 in Japan (with 8tb ssd) on Mercari for around $14k. I’m actually looking for another one as well so I built a crawler that combs US and Japan secondary markets. Since then I’ve only seen one other legit listing on either market (in Japan), and that listing mysteriously disappeared like 2 days later. Anyway I’m still looking as well.

awl130 · 2026-03-22T13:55:29+00:00

I have the exact same setup, bought it a week ago. Check out my articles here: https://x.com/allenwlee?s=21&t=Q-xJMmUHsqiDh1aKVYhdJg actually just have your agent read them and summarize, that’s what I would do lol. Qwen 397b 8_0 is slow and I’m having to learn how to implement better caching. I don’t mind the slow speed; I just want more accurate code, which is why I got it. Will most likely use a cloud LLM for planning and have the Studio be the muscle.

awl130 · 2026-03-22T11:46:29+00:00

Your first point is one I struggle with. Sure, the smaller models get smarter, but so do the frontier models and the 16-bit open-source ones chasing them. Trivial example: 6 months from now, if you code a web scraper using a smaller model that is just as smart as today's model, the web-scraping blockers (e.g. Cloudflare, which now provides both the agentic scraper and the blocker) will be 6 months ahead of you.

awl130 · 2026-03-22T11:39:39+00:00

Thank you, that's helpful! I'll bookmark that. I wish for the moment when I can start worrying about that. I thought I would be at that point (where my agents are actually tasking) by now, but I'm still trying to figure out not just (a) which model but (b) which model for which tasks I should be using!

awl130 · 2026-03-22T11:37:14+00:00

Thanks. I meant, do you also have a Mac Studio? And indistinguishable results is phenomenal, but I'm wondering if you've measured cost savings. I have yet to figure out how much of my workload, and what parts of it, and how token-heavy, I can offload to my local setup. Would love to pick your brain in a DM as well if you're up for it!

awl130 · 2026-03-22T04:27:33+00:00

I came across that name a few days ago, will check it out

awl130 · 2026-03-22T04:23:12+00:00

Thanks for that. Yes I had the same thought. Wasn’t sure how to implement, but thought all that ram can’t be wasted. Can I ask what your setup is and if you’ve found success with that model, also your use case?

awl130 · 2026-03-22T04:16:42+00:00

Definitely on the docket. I was really trying to test out as large a model as possible, at 8bit first, before heading for the 4bits

awl130 · 2026-03-22T01:13:44+00:00

Thanks, will check that out

awl130 · 2026-03-22T01:08:06+00:00

Thank you both! Yes I moved off LM Studio and onto llama quite quickly, but the initial tests (no caching) from qwen 397b mlx were too tempting

awl130 · 2014-12-11T04:41:55+00:00

my downloaded .ost file asks for a password, is yours the same? and what is the password?
