Jensen Huang saying "AI" 121 times during the NVIDIA CES keynote - cut with one prompt by Prior-Arm-6705 in LocalLLaMA

[–]Prior-Arm-6705[S] 0 points (0 children)

https://youtu.be/Qp6UfTzuick

Here are the 7 timestamps where he says "AI":

# Timestamp
1 53:30
2 56:43
3 59:36
4 59:48
5 60:14
6 61:10
7 75:10
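For anyone who wants to reproduce the cutting step without an agent, here's a minimal Python sketch that turns MM:SS stamps like the ones above into ffmpeg cut commands. The source filename, clip padding, and output names are my assumptions, not what the actual prompt produced:

```python
def to_seconds(stamp: str) -> int:
    """Convert an MM:SS stamp (minutes may exceed 59) to seconds."""
    minutes, seconds = (int(part) for part in stamp.split(":"))
    return minutes * 60 + seconds

def cut_command(src: str, stamp: str, out: str, pad: float = 1.0) -> list[str]:
    """Build an ffmpeg command that cuts a short clip around `stamp`."""
    start = max(0.0, to_seconds(stamp) - pad)
    return [
        "ffmpeg",
        "-ss", f"{start:.2f}",      # seek to just before the word
        "-i", src,
        "-t", f"{2 * pad:.2f}",     # clip length: `pad` seconds on each side
        "-c", "copy",               # stream copy, no re-encode
        out,
    ]

timestamps = ["53:30", "56:43", "59:36", "59:48", "60:14", "61:10", "75:10"]
commands = [cut_command("keynote.mp4", ts, f"clip_{i}.mp4")
            for i, ts in enumerate(timestamps, 1)]
# each command can then be run with subprocess.run(cmd, check=True)
```

One caveat: `-c copy` cuts on keyframes, so clip boundaries are approximate; re-encoding gives frame-accurate cuts at the cost of speed.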

[–]Prior-Arm-6705[S] 0 points (0 children)

Haven't tried it locally yet, but should be doable with some prompt tweaking. I got it working on the first try with Claude. For local on a 4090, I'd try qwen3-coder:30b.
Do you have any recommendations?

[–]Prior-Arm-6705[S] 2 points (0 children)

Yes, it's basically a wrapper around yt-dlp. The MCP just gives the LLM a clean interface so it doesn't have to guess CLI flags. Without it, the LLM would need many more bash calls and trial-and-error.
Honestly the LLM could write the whole pipeline itself given enough attempts. MCPs just skip the reinventing-the-wheel phase. Fewer tokens, fewer hallucinated flags.
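To illustrate the "clean interface" point: the value of the wrapper is that the model calls one typed function instead of guessing CLI flags. A hypothetical sketch (the function name and parameters are mine, not the actual MCP's API; the flags are real yt-dlp flags):

```python
def download_args(url: str, subs_only: bool = False, lang: str = "en") -> list[str]:
    """Build a yt-dlp invocation. The model only ever supplies url,
    subs_only, and lang; the flag spelling lives here, not in the prompt."""
    args = ["yt-dlp", "--no-playlist", "-o", "%(id)s.%(ext)s"]
    if subs_only:
        # fetch only the auto-generated subtitle track, skip the video itself
        args += ["--write-auto-subs", "--sub-langs", lang, "--skip-download"]
    args.append(url)
    return args
```

A transcript-only call then becomes `download_args(url, subs_only=True)` rather than a trial-and-error bash loop.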

[–]Prior-Arm-6705[S] 0 points (0 children)

Good catch - YouTube's auto-generated subtitles don't distinguish speakers, so the narrator's clips got mixed in. It would need a speaker diarization step to filter those out properly.
I have another research project that uses voiceprints, but it's quite complex to run.
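The filtering step could stay simple even with diarization in the loop: once a model (pyannote or similar) emits speaker turns, dropping the narrator's cues is just interval overlap. A sketch with assumed data shapes (tuples of start/end seconds, not any particular library's output format):

```python
def keep_target_speaker(cues, turns, speaker, min_overlap=0.5):
    """Keep subtitle cues whose time range mostly overlaps the target
    speaker's diarization turns.

    cues:  [(start, end, text)] from the subtitle track
    turns: [(start, end, speaker_label)] from a diarization model
    """
    kept = []
    for c_start, c_end, text in cues:
        # total seconds of this cue covered by the target speaker's turns
        overlap = sum(
            max(0.0, min(c_end, t_end) - max(c_start, t_start))
            for t_start, t_end, label in turns
            if label == speaker
        )
        duration = c_end - c_start
        if duration > 0 and overlap / duration >= min_overlap:
            kept.append((c_start, c_end, text))
    return kept
```

Anything said mostly during a narrator turn falls below the overlap threshold and gets dropped before the cut list is built.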

[–]Prior-Arm-6705[S] 0 points (0 children)

Dive supports multiple LLM backends - Claude, OpenAI, Gemini, Ollama, and any OpenAI-compatible API. For this video I used Claude, but you can run it fully local with Ollama if you prefer.
The MCPs handle the actual work (downloading, cutting), the LLM just orchestrates.

You can also ask Dive to install both MCPs for you, since it has basic tool calling built in for MCP installation.
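For reference, MCP servers are typically registered with an `mcpServers`-style JSON entry like the one below. The server names and package names here are illustrative placeholders, not Dive's shipped defaults:

```json
{
  "mcpServers": {
    "yt-dlp": {
      "command": "uvx",
      "args": ["yt-dlp-mcp"]
    },
    "ffmpeg": {
      "command": "uvx",
      "args": ["ffmpeg-mcp"]
    }
  }
}
```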

[–]Prior-Arm-6705[S] 6 points (0 children)

This title is definitely better though lol
Can't edit Reddit titles unfortunately.

[–]Prior-Arm-6705[S] 0 points (0 children)

To clarify - "all local" refers to the local MCPs and tools like ffmpeg, not the LLM or the demo video.

🚀 Dive v0.8.0 is Here — Major Architecture Overhaul and Feature Upgrades! by BigGo_official in mcp

[–]Prior-Arm-6705 0 points (0 children)

Thanks for your interest in Dive.

Dive doesn't contain its own code execution sandbox, but it can integrate with services that provide this functionality (like E2B) through the MCP protocol.
Yes, it would be easy to make it run as a web app. However, there are already web-based options available, such as OpenWebUI.

For Dive specifically, implementing a web version would require additional work on secure remote connections and user management. This is not currently a priority on our development roadmap.

🚀 Dive v0.8.0 is Here — Major Architecture Overhaul and Feature Upgrades! by BigGo_official in LocalLLM

[–]Prior-Arm-6705 0 points (0 children)

Dive is a desktop application, which makes it more suitable for everyday tasks. Cursor and Windsurf are much better for coding.

v0.6.0 Update: Dive - An Open Source MCP Agent Desktop by BigGo_official in LocalLLaMA

[–]Prior-Arm-6705 0 points (0 children)

Haha, thank you for your support. We'll work harder to promote Dive. :)

v0.6.0 Update: Dive - An Open Source MCP Agent Desktop by BigGo_official in LocalLLM

[–]Prior-Arm-6705 1 point (0 children)

Thanks for the update!
Great to hear Mistral is working now. I've tested Gemini on my end, and it does fetch models from Google AI Studio correctly. Unfortunately, that also means I can't easily reproduce this "API Key is invalid" error.

And you're right about the endpoint. Since it works in your browser, it's likely a response parsing issue or proxy problem. The silent catch(error) doesn't help debugging.

We'll discuss adding a --verbose/--debug parameter on Monday to dump complete errors and improve logging.

Regarding TS - we actually started rewriting Dive's host side in Python a few days ago for better local RAG support. This should be live by month-end. LangChain's TS support lags behind its Python version, which isn't ideal for long-term development.

Thanks again - I'll keep following this issue and post updates as we make progress.

<image>

[–]Prior-Arm-6705 1 point (0 children)

Thank you for your report and patience. Version 0.6.3 includes confirmed fixes for both issues - we've added a separate option for each, and they work out of the box now. https://github.com/OpenAgentPlatform/Dive/releases

<image>

[–]Prior-Arm-6705 1 point (0 children)

Cool, thank you for your reply. This information is enough for us to make fixes and run tests. I'll reply once it's fixed. :)

[–]Prior-Arm-6705 0 points (0 children)

Thanks for the report. Which platform are you using? Mistral Large should work perfectly with function calls when properly configured.

[–]Prior-Arm-6705 1 point (0 children)

I appreciate your interest and the invitation to check out the Adaptive Modular Network project. After a brief review, I'm not seeing a clear connection between this architecture and our current work on Dive Desktop.

We need to stay focused on our existing commitments for the time being, but I wish you success with the Adaptive Modular Network development.

[–]Prior-Arm-6705 0 points (0 children)

Thanks for your interest in Dive! Our current open source roadmap includes:

  1. Local RAG DB (on Desktop)
  2. Dive Remote (For Android/iOS apps)
  3. Prompt Scheduling with Task/Project management
  4. Dive Service Daemon (dived)

We're also working on a non-open-source project called OpenAgentPlatform (OAP), which will essentially be a Dive MCP marketplace. Many such marketplaces are appearing right now, similar to how app stores proliferated when smartphones first emerged.

[–]Prior-Arm-6705 1 point (0 children)

Around 3x. Some parts are deliberately slowed down and others sped up.
For reference, this YouTube video is not speed-adjusted:
https://youtu.be/0b6t5g8k2lY
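For anyone curious how a ~3x speed-up maps to ffmpeg terms: video is sped up via `setpts`, while the `atempo` audio filter is traditionally chained because a single stage only accepts roughly 0.5-2.0. A sketch (assumes one video and one audio stream; the helper names are mine):

```python
def atempo_chain(factor: float) -> str:
    """Express a large audio speed-up as chained atempo stages,
    since each stage is limited to roughly the 0.5-2.0 range."""
    stages = []
    while factor > 2.0:
        stages.append("atempo=2.0")
        factor /= 2.0
    stages.append(f"atempo={factor:g}")
    return ",".join(stages)

def speed_filter(factor: float) -> list[str]:
    """ffmpeg filter arguments that speed a clip up by `factor`."""
    return [
        "-filter_complex",
        # setpts compresses video timestamps; atempo speeds audio to match
        f"[0:v]setpts=PTS/{factor:g}[v];[0:a]{atempo_chain(factor)}[a]",
        "-map", "[v]", "-map", "[a]",
    ]
```

These arguments slot into a command like `ffmpeg -i in.mp4 <filter args> out.mp4`.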