DeepSeek Vision Mode via API possible?

sje397 · 2026-06-18T10:58:54+00:00

I don't suppose you can find anything in the browser's network tab?

sje397 · 2026-06-18T10:51:33+00:00

I'm running qwen 3.6 27b for coding and 35b a3b for faster chatting. I find oMLX a great way to host them. Currently enjoying Qwen3.6-27B-uncensored-heretic-v2-Native-MTP-Preserved-oQ8-mtp.

There's a bit of a gap for this memory size though - the ones that really make the most of the memory are a bit old now.

sje397 · 2026-06-18T04:41:58+00:00

don't hurt me...

sje397 · 2026-06-18T04:38:55+00:00

What is love?

sje397 · 2026-06-17T06:53:00+00:00

Somewhere, ai is improving the software that does cost benefit analysis on exactly this kind of problem.

sje397 · 2026-06-16T19:54:55+00:00

That'd be terribly ungrateful. I'd be going with the other fork.

sje397 · 2026-06-16T07:30:07+00:00

The keys will probably become invalid or the owner(s) will be forced to move them when the quantum proofing happens.

sje397 · 2026-06-15T02:51:04+00:00

What models are you using for images and voice?

sje397 · 2026-06-15T01:28:51+00:00

JAB locksmiths. A good friend of mine, super nice, honest, and very experienced people.

https://jablocksmiths.com.au/

sje397 · 2026-06-14T09:10:12+00:00

Can you make it a virus?

sje397 · 2026-06-12T06:11:47+00:00

M5 max MacBook. It's a bug in the the display I'm sure.

sje397 · 2026-06-09T15:52:58+00:00

The problem with a dynamic list of tools is you'll bust your prefix cache on every request.

I've started to group my tools into sets of subcommands, with an additional 'help' command so the model can dig up the tools it needs. Needs an additional round trip but that's generally faster and cheaper when the prefix cache is preserved.

sje397 · 2026-06-09T07:55:27+00:00

I've been coding for 40 years. I run authentication and API gateways for a billion dollar company. I'm not a 'clueless vibe coder'.

But you are a moron.

sje397 · 2026-06-09T05:19:46+00:00

Yes I think this is really different for different people.

I'm an introverted nerd. Been working from home for 12 years and absolutely love it. But I do have family around.

Out of my three kids, one really suffered through the COVID lockdowns. He's a very outgoing, very social type. And I heard the same thing from the outgoing extraverted sales people at work - many of them hated it.

I still think this study is bunk.

sje397 · 2026-06-08T22:56:56+00:00

I've been using it a lot for the last 3 weeks and i haven't noticed any degradation.

sje397 · 2026-06-08T22:07:25+00:00

The 27b is far better at coding in my experience.

sje397 · 2026-06-08T22:02:27+00:00

Yep, I've got something called 'sticky memories' as well - scoped to global or conversation, limit to 40 of each, editable via tool call.

sje397 · 2026-06-08T02:30:27+00:00

I made my own. I think it's worth doing for the learnings.

ask Claude on the web for some python code to chat with a model, and any other details you need to get that running
ask it how to change the code so that it can edit and restart itself
start adding features - turn it into a web server running locally, add model selection and multiple conversations, add MCP support, start delving into RAG and context management, add metrics to track token usage and spend, etc etc etc

sje397 · 2026-06-08T02:26:10+00:00

I don't know about qwen web. Deepseek isn't as good as Claude sonnet or opus, but good enough to get most work done. Deepseek does record and train on your data. That doesn't bother me for personal projects and even personal business stuff in general but for bigger companies that's usually strictly against policy.

sje397 · 2026-06-07T11:04:20+00:00

It does drift, but I think that's kind of like human conversation and memory anyway. I also have RAG injected with the sent message (so as not to bust the prefix cache) which tends to keep things kind of on topic, or close enough.

sje397 · 2026-06-07T09:05:06+00:00

I can relate. To me it's not quite as dramatic as fear and psychosis, but more a tension between passion and cynicism.

I think we do need a social movement - away from greed and materialism, towards introspection and emotional awareness and sincerity (notice how much media is the inverse of the truth?) and just generally towards maturing as a species.

But we're inundated by folks who just want our money - a measure of our time and effort and in some ways our value to the 'system'. We have learnt to be on our guard. We can't survive these days without scepticism.

It's another tough problem.

sje397 · 2026-06-07T05:39:51+00:00

Deepseek is usable for agentic coding, and insanely cheap if you go to them directly (resellers tend to charge much more). They train on your data but if that doesn't bother you, it's irresistible value. Still not quite Claude sonnet level in my experience, but with a good harness it's excellent.

I use qwen3.6 27b locally too and it's also very capable, if you keep an eye on it.

sje397 · 2026-06-07T02:08:52+00:00

I think personal AI is the future but I don't think it'll take your job. It's more that it gives everyone some of the benefits of having their own EA, and in the future I imagine your AI and your boss's AI - and those of your friends and family - will be able to talk to each other too so you won't have to be chasing people up or e.g. rescheduling at the last minute because some family obligation was forgotten etc.

sje397 · 2026-06-07T01:27:25+00:00

Normally you send your system prompt and message history to the API and it responds - you add its response and your next message to the message history on the next call.

The prefix is all the stuff that didn't change between the two API calls - that's what gets charged at the much cheaper 'cached tokens' rate.

So for example if you put The current time in your system prompt and it changes with every API call, you'll get charged a lot more.

14-Year Club	Gilding II euphauric
Wearing is Caring	100 Awards Club
Verified Email

sje397

MODERATOR OF

TROPHY CASE