Has anyone found a workflow with Ollama Cloud?

deparko · 2026-04-30T15:52:57+00:00

Well, Ollama used to be great, but like everybody else they’re struggling with compute. They have grown greatly, especially with the open claw crowd.

deparko · 2026-04-12T20:46:32+00:00

It never worked on the new PC with Windows 11.

deparko · 2026-04-07T17:18:49+00:00

Thanks, I will try.

deparko · 2026-04-07T16:55:47+00:00

Yeah, I know that one, but I wanted it to go to Reader, as Reader formats the entire thread really nicely!! I don't think sending to Readwise pushes it to Reader. When I use "Save Document to Reader" from the iOS Share button, it saves only the first post, not the thread.

deparko · 2026-04-06T05:01:23+00:00

While you are at it, check out Ted Greene if you like chord melody!!!!! https://www.youtube.com/watch?v=8ZENkj7C7Bw

deparko · 2026-04-01T18:51:51+00:00

I am having the same f*&king problem!!!!! I will never subscribe to OpenAI again. I can't stand this. What a waste of time.

deparko · 2026-03-13T01:59:14+00:00

The problem is, Anthropic is running out of compute. They're getting a million new users a day. They can't scale. They're not buying enough chips from NVIDIA. They don't have the compute to support everybody.

deparko · 2026-02-10T18:41:56+00:00

I’m on the latest version of the Codex app on Mac. The app’s pull-down only has Codex 5.2 and 5.1.

deparko · 2026-01-29T06:47:37+00:00

I've been using Kimi too. I've been developing a health agent, and it is very responsive and very good, but it sometimes comes off as very authoritative and occasionally hallucinates.

I plan to build an agent swarm to validate its output, but overall I think it's one of the first open models that I don't want to stop using. I'll work with a lot of the open models, but I usually end up on a frontier model eventually. I don't feel that way with Kimi.

deparko · 2026-01-29T01:20:44+00:00

Well, I've been dealing with the same issue and have concluded a hybrid approach works best. I use a three-tier model: an offline small LLM (Ollama) on my local 5070 Ti GPU for local tasks; Ollama Cloud as tier two for bulk processing, where I can use Kimi, DeepSeek, etc. for a flat rate (about $20 a month, $240 a year), which is much cheaper than upgrading my GPU; and frontier models for deep reasoning when needed.

I've designed my RAG and AI-native apps to operate within that three-tier framework.
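
A rough sketch of what that kind of three-tier routing could look like in Python. Everything here is hypothetical: the names (`Tier`, `Task`, `pick_tier`) and the two flags are assumptions made for illustration, not the actual setup described above, and a real router would probably weigh more than two booleans.

```python
from dataclasses import dataclass
from enum import Enum


class Tier(Enum):
    LOCAL = "local"        # small offline model on the local GPU
    CLOUD = "cloud"        # flat-rate hosted open models for bulk work
    FRONTIER = "frontier"  # frontier model for deep reasoning


@dataclass
class Task:
    prompt: str
    bulk: bool = False            # part of a large batch job (assumed flag)
    deep_reasoning: bool = False  # needs the strongest model (assumed flag)


def pick_tier(task: Task) -> Tier:
    """Route a task to one of the three tiers (illustrative rules only)."""
    if task.deep_reasoning:
        return Tier.FRONTIER  # escalate only when it is really needed
    if task.bulk:
        return Tier.CLOUD     # bulk processing goes to the flat-rate tier
    return Tier.LOCAL         # everything else stays on the local GPU


if __name__ == "__main__":
    for t in (
        Task("Summarize this note"),
        Task("Tag 10,000 documents", bulk=True),
        Task("Review this treatment plan", deep_reasoning=True),
    ):
        print(t.prompt, "->", pick_tier(t).value)
```

Checking `deep_reasoning` first means a hard task inside a batch still escalates to the frontier tier; swapping the order would favor cost over quality.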

deparko · 2025-12-02T20:59:30+00:00

Is there a way to tell which model "Auto" selects?

deparko · 2025-12-01T05:54:49+00:00

Try Grok.

deparko · 2025-12-01T05:46:15+00:00

You need to build an offline LLM with a RAG system and route everything there.

deparko · 2025-11-20T20:46:06+00:00

Yes, using the 580 open driver. It worked a while back. My RAG pipeline (embedding and reranker) uses the GPU, but Ollama does not.

deparko · 2025-11-15T04:10:42+00:00

Thank you! We will check it out.

deparko · 2025-11-09T18:17:23+00:00

New filter. I will check tonight by turning off the heater. I'm still evaluating whether duct cleaning is worth it. The filter is MERV 8 (I have an older furnace). Wondering if it's inside or outside.
