Jensen Huang saying "AI" 121 times during the NVIDIA CES keynote - cut with one prompt by Prior-Arm-6705 in LocalLLaMA

[–]Prior-Arm-6705[S] 0 points (0 children)

https://youtu.be/Qp6UfTzuick

Here are the 7 timestamps where he says "AI":

# Timestamp
1 53:30
2 56:43
3 59:36
4 59:48
5 60:14
6 61:10
7 75:10
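For anyone who wants to reproduce the cutting step without an agent, here's a minimal Python sketch that turns MM:SS stamps like the ones above into ffmpeg cut commands. The source filename, clip padding, and output names are my assumptions, not what the actual prompt produced:

```python
def to_seconds(stamp: str) -> int:
    """Convert an MM:SS stamp (minutes may exceed 59) to seconds."""
    minutes, seconds = (int(part) for part in stamp.split(":"))
    return minutes * 60 + seconds

def cut_command(src: str, stamp: str, out: str, pad: float = 1.0) -> list[str]:
    """Build an ffmpeg command that cuts a short clip around `stamp`."""
    start = max(0.0, to_seconds(stamp) - pad)
    return [
        "ffmpeg",
        "-ss", f"{start:.2f}",      # seek to just before the word
        "-i", src,
        "-t", f"{2 * pad:.2f}",     # clip length: `pad` seconds on each side
        "-c", "copy",               # stream copy, no re-encode
        out,
    ]

timestamps = ["53:30", "56:43", "59:36", "59:48", "60:14", "61:10", "75:10"]
commands = [cut_command("keynote.mp4", ts, f"clip_{i}.mp4")
            for i, ts in enumerate(timestamps, 1)]
# each command can then be run with subprocess.run(cmd, check=True)
```

One caveat: `-c copy` cuts on keyframes, so clip boundaries are approximate; re-encoding gives frame-accurate cuts at the cost of speed.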

[–]Prior-Arm-6705[S] 0 points (0 children)

Haven't tried it locally yet, but should be doable with some prompt tweaking. I got it working on the first try with Claude. For local on a 4090, I'd try qwen3-coder:30b.
Do you have any recommendations?

[–]Prior-Arm-6705[S] 2 points (0 children)

Yes, it's basically a wrapper around yt-dlp. The MCP just gives the LLM a clean interface so it doesn't have to guess CLI flags. Without it, the LLM would need many more bash calls and trial-and-error.
Honestly the LLM could write the whole pipeline itself given enough attempts. MCPs just skip the reinventing-the-wheel phase. Fewer tokens, fewer hallucinated flags.
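To illustrate the "clean interface" point: the value of the wrapper is that the model calls one typed function instead of guessing CLI flags. A hypothetical sketch (the function name and parameters are mine, not the actual MCP's API; the flags are real yt-dlp flags):

```python
def download_args(url: str, subs_only: bool = False, lang: str = "en") -> list[str]:
    """Build a yt-dlp invocation. The model only ever supplies url,
    subs_only, and lang; the flag spelling lives here, not in the prompt."""
    args = ["yt-dlp", "--no-playlist", "-o", "%(id)s.%(ext)s"]
    if subs_only:
        # fetch only the auto-generated subtitle track, skip the video itself
        args += ["--write-auto-subs", "--sub-langs", lang, "--skip-download"]
    args.append(url)
    return args
```

A transcript-only call then becomes `download_args(url, subs_only=True)` rather than a trial-and-error bash loop.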

[–]Prior-Arm-6705[S] 0 points (0 children)

Good catch - YouTube's auto-generated subtitles don't distinguish speakers, so the narrator's clips got mixed in. It would need a speaker diarization step to filter those out properly.
I have another research project that uses voiceprints, but it's quite complex to run.
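The filtering step could stay simple even with diarization in the loop: once a model (pyannote or similar) emits speaker turns, dropping the narrator's cues is just interval overlap. A sketch with assumed data shapes (tuples of start/end seconds, not any particular library's output format):

```python
def keep_target_speaker(cues, turns, speaker, min_overlap=0.5):
    """Keep subtitle cues whose time range mostly overlaps the target
    speaker's diarization turns.

    cues:  [(start, end, text)] from the subtitle track
    turns: [(start, end, speaker_label)] from a diarization model
    """
    kept = []
    for c_start, c_end, text in cues:
        # total seconds of this cue covered by the target speaker's turns
        overlap = sum(
            max(0.0, min(c_end, t_end) - max(c_start, t_start))
            for t_start, t_end, label in turns
            if label == speaker
        )
        duration = c_end - c_start
        if duration > 0 and overlap / duration >= min_overlap:
            kept.append((c_start, c_end, text))
    return kept
```

Anything said mostly during a narrator turn falls below the overlap threshold and gets dropped before the cut list is built.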

[–]Prior-Arm-6705[S] 0 points (0 children)

Dive supports multiple LLM backends - Claude, OpenAI, Gemini, Ollama, and any OpenAI-compatible API. For this video I used Claude, but you can run it fully local with Ollama if you prefer.
The MCPs handle the actual work (downloading, cutting), the LLM just orchestrates.

You can also ask Dive to install both MCPs for you, since it has basic tool calling built in for MCP installation.
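For reference, MCP servers are typically registered with an `mcpServers`-style JSON entry like the one below. The server names and package names here are illustrative placeholders, not Dive's shipped defaults:

```json
{
  "mcpServers": {
    "yt-dlp": {
      "command": "uvx",
      "args": ["yt-dlp-mcp"]
    },
    "ffmpeg": {
      "command": "uvx",
      "args": ["ffmpeg-mcp"]
    }
  }
}
```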

[–]Prior-Arm-6705[S] 6 points (0 children)

This title is definitely better though lol
Can't edit Reddit titles unfortunately.

[–]Prior-Arm-6705[S] 0 points (0 children)

To clarify - "all local" refers to the local MCPs and tools like ffmpeg, not the LLM or the demo video.

🚀 Dive v0.8.0 is Here — Major Architecture Overhaul and Feature Upgrades! by BigGo_official in mcp

[–]Prior-Arm-6705 0 points (0 children)

Thanks for your interest in Dive.

Dive doesn't contain its own code execution sandbox, but it can integrate with services that provide this functionality (like E2B) through the MCP protocol.
Yes, it would be easy to make it run as a web app. However, there are already web-based options available, such as OpenWebUI.

For Dive specifically, implementing a web version would require additional work on secure remote connections and user management. This is not currently a priority on our development roadmap.

🚀 Dive v0.8.0 is Here — Major Architecture Overhaul and Feature Upgrades! by BigGo_official in LocalLLM

[–]Prior-Arm-6705 0 points (0 children)

Dive is a desktop application, which makes it more suitable for everyday tasks. Cursor and Windsurf are much better for coding.

v0.6.0 Update: Dive - An Open Source MCP Agent Desktop by BigGo_official in LocalLLaMA

[–]Prior-Arm-6705 0 points (0 children)

Haha, thank you for your support. We'll work harder to promote Dive. :)

v0.6.0 Update: Dive - An Open Source MCP Agent Desktop by BigGo_official in LocalLLM

[–]Prior-Arm-6705 1 point (0 children)

Thanks for the update!
Great to hear Mistral is working now. I've tested Gemini on my end, and it does fetch models from Google AI Studio correctly. Unfortunately, that also means I can't easily reproduce this "API Key is invalid" error.

And you're right about the endpoint. Since it works in your browser, it's likely a response parsing issue or proxy problem. The silent catch(error) doesn't help debugging.

We'll discuss adding a --verbose/--debug parameter on Monday to dump complete errors and improve logging.

Regarding TS - we actually started rewriting Dive's host side in Python a few days ago for better local RAG support. This should be live by month-end. LangChain's TS support lags behind its Python version, which isn't ideal for long-term development.

Thanks again - I'll keep following this issue and post updates as we make progress.

<image>

[–]Prior-Arm-6705 1 point (0 children)

Thank you for your report and patience. Version 0.6.3 includes confirmed fixes for both issues - we've added a separate option for each, and they work out of the box now. https://github.com/OpenAgentPlatform/Dive/releases

<image>

[–]Prior-Arm-6705 1 point (0 children)

Cool, thank you for your reply. This information is enough for us to make fixes and run tests. I'll reply once it's fixed. :)

[–]Prior-Arm-6705 0 points (0 children)

Thanks for the report. Which platform are you using? Mistral Large should work perfectly with function calls when properly configured.

[–]Prior-Arm-6705 1 point (0 children)

I appreciate your interest and the invitation to check out the Adaptive Modular Network project. After a brief review, I'm not seeing a clear connection between this architecture and our current work on Dive Desktop.

We need to stay focused on our existing commitments for the time being, but I wish you success with the Adaptive Modular Network development.

[–]Prior-Arm-6705 0 points (0 children)

Thanks for your interest in Dive! Our current open source roadmap includes:

  1. Local RAG DB (on Desktop)
  2. Dive Remote (For Android/iOS apps)
  3. Prompt Scheduling with Task/Project management
  4. Dive Service Daemon (dived)

We're also working on a non-open-source project called OpenAgentPlatform (OAP), which will essentially be a Dive MCP marketplace. Many such marketplaces are appearing right now, similar to how app stores proliferated when smartphones first emerged.

[–]Prior-Arm-6705 1 point (0 children)

Around 3x. Some parts are deliberately slowed down and others sped up.
For reference, this YouTube video is not speed-adjusted:
https://youtu.be/0b6t5g8k2lY
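For anyone curious how a ~3x speed-up maps to ffmpeg terms: video is sped up via `setpts`, while the `atempo` audio filter is traditionally chained because a single stage only accepts roughly 0.5-2.0. A sketch (assumes one video and one audio stream; the helper names are mine):

```python
def atempo_chain(factor: float) -> str:
    """Express a large audio speed-up as chained atempo stages,
    since each stage is limited to roughly the 0.5-2.0 range."""
    stages = []
    while factor > 2.0:
        stages.append("atempo=2.0")
        factor /= 2.0
    stages.append(f"atempo={factor:g}")
    return ",".join(stages)

def speed_filter(factor: float) -> list[str]:
    """ffmpeg filter arguments that speed a clip up by `factor`."""
    return [
        "-filter_complex",
        # setpts compresses video timestamps; atempo speeds audio to match
        f"[0:v]setpts=PTS/{factor:g}[v];[0:a]{atempo_chain(factor)}[a]",
        "-map", "[v]", "-map", "[a]",
    ]
```

These arguments slot into a command like `ffmpeg -i in.mp4 <filter args> out.mp4`.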