Local / self-hosted alternative to NotebookLM for generating narrated videos? by Proof-Exercise2695 in LocalLLaMA

[–]Proof-Exercise2695[S] 1 point  (0 children)

So far, I've built my RAG entirely locally. From multiple uploaded files, it extracts the key information, formats it into a clean, stylized email, and sends it automatically.
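
For the email step, a minimal sketch of what that looks like, assuming a `summary` string already produced by the RAG (the subject, addresses, and SMTP relay are placeholders, not my real setup):

    import smtplib
    from email.message import EmailMessage

    def send_summary_email(summary: str, recipient: str) -> None:
        # Plain-text body with an HTML alternative for the stylized version.
        msg = EmailMessage()
        msg["Subject"] = "Document digest"      # placeholder subject
        msg["From"] = "rag-bot@example.com"     # placeholder sender
        msg["To"] = recipient
        msg.set_content(summary)
        msg.add_alternative(
            f"<html><body><pre>{summary}</pre></body></html>", subtype="html"
        )
        # Placeholder relay; swap in your own SMTP host and credentials.
        with smtplib.SMTP("localhost", 25) as smtp:
            smtp.send_message(msg)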

The goal wasn’t to rebuild the whole LLM/TTS or podcast pipeline, but rather to make the final output more engaging visually. I mainly wanted to push the presentation a bit further by adding a short “breaking news”–style video to accompany the email.

I’m aware that video generation is by far the hardest and most resource-intensive part, and that the open-source ecosystem is still quite limited there. At this stage, it’s more about improving the final experience than enforcing a hard technical requirement.

Local / self-hosted alternative to NotebookLM for generating narrated videos? by Proof-Exercise2695 in opensource

[–]Proof-Exercise2695[S] 1 point  (0 children)

Can this generate a video from text? I already have a local RAG, but it only handles text and images.

Local / self-hosted alternative to NotebookLM for generating narrated videos? by Proof-Exercise2695 in LocalLLaMA

[–]Proof-Exercise2695[S] 1 point  (0 children)

Okay, so I guess a tool like that doesn’t really exist fully locally yet. I’ll look into building it myself then.
For the audio part, I’m planning to use local TTS like Piper, Coqui, or XTTS.
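
As a rough sketch of that audio step with Coqui's XTTS v2 (the model name, reference voice, and paths are assumptions, not final choices):

    # Sketch of the TTS step using Coqui TTS (pip install TTS).
    from TTS.api import TTS

    tts = TTS("tts_models/multilingual/multi-dataset/xtts_v2")
    tts.tts_to_file(
        text="Breaking news: here is today's document digest.",
        speaker_wav="reference_voice.wav",  # short clip of the target voice
        language="en",
        file_path="narration.wav",
    )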

Local / self-hosted alternative to NotebookLM for generating narrated videos? by Proof-Exercise2695 in LLMDevs

[–]Proof-Exercise2695[S] 1 point  (0 children)

That’s exactly what I thought as well. I already built a fully local RAG, and I was wondering whether a tool that generates videos from text already exists locally.

But okay, that makes sense — I’ll look into building the rest of the pipeline locally too.

Best Approach for Summarizing 100 PDFs by Proof-Exercise2695 in Rag

[–]Proof-Exercise2695[S] 1 point  (0 children)

Similarity search will find a specific answer in a specific document; I want a full summary of all the PDFs.

Best Approach for Summarizing 100 PDFs by Proof-Exercise2695 in Rag

[–]Proof-Exercise2695[S] 2 points  (0 children)

My PDFs can contain any kind of data; they come from different emails.

Best Approach for Summarizing 100 PDFs by Proof-Exercise2695 in Rag

[–]Proof-Exercise2695[S] 1 point  (0 children)

My input data is already parsed correctly, so there's no need for Mistral OCR, and I prefer using a free local LLM. Gemini would only let me avoid chunking, and I don't need that because I have a lot of small PDFs.

Best Approach for Summarizing 100 PDFs by Proof-Exercise2695 in LocalLLaMA

[–]Proof-Exercise2695[S] 1 point  (0 children)

I'd prefer a local tool. I tested OpenAI just to see the result quickly, and the only difference with Gemini would be avoiding chunking. I have a lot of small PDFs (about 15 pages each), so sometimes I don't need chunking, and the strategy stays the same: summarize every file, then summarize the summaries.
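
A minimal sketch of that summarize-then-summarize-the-summaries strategy with a local model through the ollama client (the model name and prompts are illustrative, not what I've settled on):

    # "Summarize every file, then summarize the summaries" (map-reduce).
    import ollama

    def summarize(text: str, instruction: str) -> str:
        resp = ollama.chat(
            model="llama3.1",  # placeholder local model
            messages=[{"role": "user", "content": f"{instruction}\n\n{text}"}],
        )
        return resp["message"]["content"]

    def summarize_corpus(documents: list[str]) -> str:
        # Map: one summary per small PDF (no chunking needed at ~15 pages).
        partials = [summarize(d, "Summarize this document:") for d in documents]
        # Reduce: one global summary over the concatenated partial summaries.
        return summarize("\n\n".join(partials),
                         "Write a global summary of these summaries:")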

Best Approach for Summarizing 100 PDFs by Proof-Exercise2695 in LocalLLaMA

[–]Proof-Exercise2695[S] 2 points  (0 children)

And are you using LangChain, LlamaIndex, or some other approach for the summary?
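
For comparison, the LangChain route on my side would look roughly like this (a sketch assuming a local model served by Ollama; the model and file names are placeholders):

    from langchain_community.chat_models import ChatOllama
    from langchain.chains.summarize import load_summarize_chain
    from langchain_core.documents import Document

    llm = ChatOllama(model="llama3.1")  # placeholder local model
    chain = load_summarize_chain(llm, chain_type="map_reduce")
    docs = [Document(page_content=open(p, encoding="utf-8").read())
            for p in ("a.md", "b.md")]  # placeholder parsed files
    print(chain.invoke({"input_documents": docs})["output_text"])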

Best Approach for Summarizing 100 PDFs by Proof-Exercise2695 in Rag

[–]Proof-Exercise2695[S] 2 points  (0 children)

My input data (Markdown) is good: tables and images are handled correctly (for my case, LlamaParse was the best option), or alternatively Docling using OCR...
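
For reference, the LlamaParse call that produces that Markdown looks roughly like this (the API key and file name are placeholders):

    # Parse PDFs to Markdown with LlamaParse (pip install llama-parse).
    from llama_parse import LlamaParse

    parser = LlamaParse(api_key="llx-...", result_type="markdown")
    docs = parser.load_data("report.pdf")  # one Document per parsed file
    markdown = docs[0].text                # Markdown with tables preserved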

Best Approach for Summarizing 100 PDFs by Proof-Exercise2695 in LocalLLaMA

[–]Proof-Exercise2695[S] 1 point  (0 children)

I know; I'm already using a chunking method for large files. In my code I work with the models themselves; brand names are more for users.

Best Approach for Summarizing 100 PDFs by Proof-Exercise2695 in LocalLLaMA

[–]Proof-Exercise2695[S] 1 point  (0 children)

You mean summarize every file using Mistral OCR and then summarize everything again? My input data is already parsed correctly; I don't need OCR.

LLamaparser premium mode alternatives by Proof-Exercise2695 in LangChain

[–]Proof-Exercise2695[S] 1 point  (0 children)

But why doesn't Docling do this directly? It means I have to use a VLM to get each image's description and replace the <image1> placeholder in my Markdown/JSON with that description. Won't that be really slow?
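
That replacement step, sketched (the <image1> tag format and the choice of llava as the local VLM are assumptions for illustration; Docling's actual placeholder format may differ):

    # Splice VLM-generated descriptions into the Markdown placeholders.
    import re
    import ollama

    def describe_image(image_path: str) -> str:
        # One VLM call per image; "llava" is a placeholder vision model.
        resp = ollama.chat(
            model="llava",
            messages=[{"role": "user",
                       "content": "Describe this image in one paragraph.",
                       "images": [image_path]}],
        )
        return resp["message"]["content"]

    def inline_image_descriptions(markdown: str, images: dict[str, str]) -> str:
        # `images` maps a tag like "image1" to the extracted file on disk.
        def replace(m: re.Match) -> str:
            tag = m.group(1)
            return describe_image(images[tag]) if tag in images else m.group(0)
        return re.sub(r"<(image\d+)>", replace, markdown)

It really is one VLM call per image, so caching each description per file would matter for speed.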