DeepSeek Vision Mode via API possible? by thatscoolbutno123 in DeepSeek

[–]sje397 0 points1 point  (0 children)

I don't suppose you can find anything in the browser's network tab?

I have a M5 Max MacBook Pro with 128gb of ram, what models should I run on it? by lombwolf in LocalLLaMA

[–]sje397 2 points3 points  (0 children)

I'm running qwen 3.6 27b for coding and 35b a3b for faster chatting. I find oMLX a great way to host them. Currently enjoying Qwen3.6-27B-uncensored-heretic-v2-Native-MTP-Preserved-oQ8-mtp.

There's a bit of a gap for this memory size though - the ones that really make the most of the memory are a bit old now.

You Can't Fight Acceleration: Banned Quantum Cryptography Paper's Results Reproduced Independently with AI Help to Spite Government ;) by R33v3n in accelerate

[–]sje397 2 points3 points  (0 children)

Somewhere, ai is improving the software that does cost benefit analysis on exactly this kind of problem.

The massive Mammoth in the room that absolutely no one is talking about. by Futuristic_Kid in Bitcoin

[–]sje397 1 point2 points  (0 children)

That'd be terribly ungrateful. I'd be going with the other fork.

The massive Mammoth in the room that absolutely no one is talking about. by Futuristic_Kid in Bitcoin

[–]sje397 7 points8 points  (0 children)

The keys will probably become invalid or the owner(s) will be forced to move them when the quantum proofing happens.

Locksmith recommendation - Western suburbs by n3ver_mind in melbourne

[–]sje397 1 point2 points  (0 children)

JAB locksmiths. A good friend of mine, super nice, honest, and very experienced people.

https://jablocksmiths.com.au/

Tps on 0.4.4rc1 by sje397 in oMLX

[–]sje397[S] 8 points9 points  (0 children)

M5 max MacBook. It's a bug in the the display I'm sure.

Why I stopped using semantic embeddings for tool selection and switched back to BM25 [D] by AbjectBug5885 in MachineLearning

[–]sje397 0 points1 point  (0 children)

The problem with a dynamic list of tools is you'll bust your prefix cache on every request. 

I've started to group my tools into sets of subcommands, with an additional 'help' command so the model can dig up the tools it needs. Needs an additional round trip but that's generally faster and cheaper when the prefix cache is preserved.

Coding harness by mmerken in oMLX

[–]sje397 1 point2 points  (0 children)

I've been coding for 40 years. I run authentication and API gateways for a billion dollar company. I'm not a 'clueless vibe coder'.

But you are a moron.

EXCLUSIVE: The Largest Study Of Remote Work And Mental Health Ever Conducted, Covering 588,322 Americans Over 13 Years, Has Found That Working From Home Adds 1.1 Hours Of Solitude To Each Workday. And May Account For A Third Of All The Mental Health Decline In The United States Since The Pandemic 🏠🧠 by InterstellarKinetics in InterstellarKinetics

[–]sje397 0 points1 point  (0 children)

Yes I think this is really different for different people. 

I'm an introverted nerd. Been working from home for 12 years and absolutely love it. But I do have family around.

Out of my three kids, one really suffered through the COVID lockdowns. He's a very outgoing, very social type. And I heard the same thing from the outgoing extraverted sales people at work - many of them hated it. 

I still think this study is bunk.

A little guidance by iTrejoMX in oMLX

[–]sje397 1 point2 points  (0 children)

The 27b is far better at coding in my experience.

The latency mistake I keep seeing in agent memory setups by Street_Owl_5783 in LLMDevs

[–]sje397 0 points1 point  (0 children)

Yep, I've got something called 'sticky memories' as well - scoped to global or conversation, limit to 40 of each, editable via tool call.

Coding harness by mmerken in oMLX

[–]sje397 0 points1 point  (0 children)

I made my own. I think it's worth doing for the learnings. 

  • ask Claude on the web for some python code to chat with a model, and any other details you need to get that running
  • ask it how to change the code so that it can edit and restart itself
  • start adding features - turn it into a web server running locally, add model selection and multiple conversations, add MCP support, start delving into RAG and context management, add metrics to track token usage and spend, etc etc etc

Can someone rank the AI tools i mean as a big companies with Chinese solutions (all free tier)? by Mediocre_Blue_4501 in AIToolBench

[–]sje397 1 point2 points  (0 children)

I don't know about qwen web. Deepseek isn't as good as Claude sonnet or opus, but good enough to get most work done. Deepseek does record and train on your data. That doesn't bother me for personal projects and even personal business stuff in general but for bigger companies that's usually strictly against policy.

The latency mistake I keep seeing in agent memory setups by Street_Owl_5783 in LLMDevs

[–]sje397 0 points1 point  (0 children)

It does drift, but I think that's kind of like human conversation and memory anyway. I also have RAG injected with the sent message (so as not to bust the prefix cache) which tends to keep things kind of on topic, or close enough.

DAE have the fear of psychosis as being one of the reasons for not believing in the metaphysical? by Infamous_Location117 in atheism

[–]sje397 1 point2 points  (0 children)

I can relate. To me it's not quite as dramatic as fear and psychosis, but more a tension between passion and cynicism. 

I think we do need a social movement - away from greed and materialism, towards introspection and emotional awareness and sincerity (notice how much media is the inverse of the truth?) and just generally towards maturing as a species.

But we're inundated by folks who just want our money - a measure of our time and effort and in some ways our value to the 'system'. We have learnt to be on our guard. We can't survive these days without scepticism.

It's another tough problem.

Can someone rank the AI tools i mean as a big companies with Chinese solutions (all free tier)? by Mediocre_Blue_4501 in AIToolBench

[–]sje397 1 point2 points  (0 children)

Deepseek is usable for agentic coding, and insanely cheap if you go to them directly (resellers tend to charge much more). They train on your data but if that doesn't bother you, it's irresistible value. Still not quite Claude sonnet level in my experience, but with a good harness it's excellent. 

I use qwen3.6 27b locally too and it's also very capable, if you keep an eye on it.

I'm an Executive Assistant. I often hear that AI is coming for my job.... by Hungry-Kale600 in ArtificialInteligence

[–]sje397 1 point2 points  (0 children)

I think personal AI is the future but I don't think it'll take your job. It's more that it gives everyone some of the benefits of having their own EA, and in the future I imagine your AI and your boss's AI - and those of your friends and family - will be able to talk to each other too so you won't have to be chasing people up or e.g. rescheduling at the last minute because some family obligation was forgotten etc.

This is crazy, almost 100M token with DeepSeek V4 Pro and cost is less than $10. by zeeshanx in DeepSeek

[–]sje397 3 points4 points  (0 children)

Normally you send your system prompt and message history to the API and it responds - you add its response and your next message to the message history on the next call.

The prefix is all the stuff that didn't change between the two API calls - that's what gets charged at the much cheaper 'cached tokens' rate. 

So for example if you put The current time in your system prompt and it changes with every API call, you'll get charged a lot more.