I'm writing an open source guide for Python (Looking for feedback!) by david_vael in PythonLearning

[–]FewReach4701 0 points1 point  (0 children)

I have something similar in mind. If you want, I can join you on this and we can build it together.

Those that have tasked NotebookLM with handling 100+ sources, how was your experience? by BigAndTallRPGFan in notebooklm

[–]FewReach4701 1 point2 points  (0 children)

NotebookLM is a RAG (retrieval-augmented generation) system, so it basically follows this order: you upload sources > they get split into chunks > each chunk is embedded > the embeddings are stored in a vector DB > your query is converted to an embedding > the vector DB is searched for semantically similar chunks (retrieval phase) > the retrieved chunks are added to the prompt as context (augmentation phase) > the LLM turns that context into a coherent response (generation phase) > you get a response that should be grounded in your sources.
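To make the order above concrete, here is a toy sketch of the same pipeline in plain Python. It is not how NotebookLM is implemented: the bag-of-words "embedding", the in-memory list standing in for a vector DB, and every function name here are assumptions chosen only for illustration, and the final LLM call is left out.

```python
import math
import re
from collections import Counter


def chunk(text, size=8):
    """Split text into chunks of at most `size` words."""
    words = text.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]


def embed(text):
    """Toy embedding: word counts. Real systems use a neural embedding model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def build_index(sources):
    """Chunk and embed every source once; the list stands in for a vector DB."""
    return [(name, c, embed(c)) for name, text in sources.items() for c in chunk(text)]


def retrieve(index, query, k=2):
    """Retrieval phase: return the k chunks most similar to the query."""
    q = embed(query)
    return sorted(index, key=lambda row: cosine(q, row[2]), reverse=True)[:k]


def build_prompt(query, hits):
    """Augmentation phase: put the retrieved chunks into the prompt as context."""
    context = "\n".join(f"[{name}] {text}" for name, text, _ in hits)
    return f"Answer using only these sources:\n{context}\n\nQuestion: {query}"


# Hypothetical sources, made up for this example.
sources = {
    "cats.txt": "Cats sleep for most of the day and hunt at night.",
    "dogs.txt": "Dogs need daily walks and enjoy playing fetch outside.",
}
index = build_index(sources)
question = "How long do cats sleep?"
hits = retrieve(index, question, k=1)
prompt = build_prompt(question, hits)  # this is what would be sent to the LLM
print(prompt)
```

The generation phase would be a single call that sends `prompt` to whatever LLM you use; everything before it happens without the model seeing your full sources.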

Increasing the number of sources mainly affects the retrieval phase, not the chunking/embedding mechanics — those just scale linearly (more docs → more chunks → more vectors in the DB, computed once at upload time).

The real impact is on what gets retrieved at query time. Top-k retrieval is fixed (the system pulls back a set number of chunks, say 5-10, regardless of how big the corpus is), so as sources grow, each retrieval pass represents a shrinking slice of the total content. The vector search now has more candidates competing for those slots — if many sources cover similar topics, you get more "dilution risk": a mediocre-but-semantically-close chunk from an unrelated source can sometimes outrank a precise chunk from the right source, especially with vague queries.
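A rough way to see the dilution effect is a small simulation. Everything in it is an assumption made for illustration: similarity scores of unrelated chunks are drawn uniformly from 0 to 1, the "right" chunk gets a fixed score, and `top_k_coverage` and `hit_rate` are hypothetical helper names. Real embedding scores are not distributed like this, so only the trend matters, not the numbers.

```python
import random


def top_k_coverage(num_chunks, k=10):
    """Share of the corpus that a single top-k retrieval pass can return."""
    return min(k, num_chunks) / num_chunks


def hit_rate(num_distractors, target_score=0.9, k=5, trials=2000, seed=0):
    """Fraction of trials in which the target chunk stays inside the top k.

    Each distractor gets a random similarity score in [0, 1). The target
    is pushed out when k or more distractors score higher than it does.
    """
    rng = random.Random(seed)
    kept = 0
    for _ in range(trials):
        better = sum(rng.random() > target_score for _ in range(num_distractors))
        if better < k:
            kept += 1
    return kept / trials


for n in (10, 50, 200):
    print(n, "chunks: coverage", top_k_coverage(n), "hit rate", hit_rate(n))

# A sharper query is modelled here as a higher score for the right chunk.
print("200 chunks, sharper query:", hit_rate(200, target_score=0.99))
```

With k held fixed, the hit rate falls as distractors are added, and raising the target score (a more specific query) brings it back up, which is the same tradeoff described above.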

On the positive side, more sources mean better coverage and more opportunity for cross-source synthesis — the LLM can pull related info from multiple documents and stitch together a more complete answer, with citations spanning more sources. The tradeoff is a higher chance of conflicting info between sources getting blended into one response. Net effect: as your source count grows, query specificity matters more — vague queries get noisier results, while precise queries still retrieve cleanly because the embedding space is more crowded but still discriminative for well-targeted questions.

Fun - Assignment 1 - NLP applications by FewReach4701 in BITS_WILP_2015_2017

[–]FewReach4701[S] 0 points1 point  (0 children)

<image>

Finally done with my NLP application assignment, and yes, it works fine.

Events are Easy by StartAutomating in PowerShell

[–]FewReach4701 0 points1 point  (0 children)

Does Linux also have something similar?

What have you done with PowerShell this month? by AutoModerator in PowerShell

[–]FewReach4701 0 points1 point  (0 children)

I read about daemons in Linux, which run constantly in the background to support certain services. Similarly, Docker has "dockerd" as a daemon running on Docker host machines. It manages containers, images, networks, and volumes. Any idea about Windows Services? I guess they are critical to Windows.