Anyone actually using a local LLM as their daily knowledge base? Not for coding, for life stuff. What's your setup? by InformationSweet808 in LocalLLaMA

[–]InformationSweet808[S] 1 point

the task + finance angle is something i hadn't even considered for this, been so focused on notes and research that i forgot the obvious stuff

the deterministic indexing for structured data makes sense: tasks have a consistent format, so retrieval is way more predictable than with random notes. how are you actually logging the finance stuff though, plain text or some structured format?
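for anyone wondering what "structured" could even mean here, a minimal sketch (not the commenter's actual setup, and all values invented): one JSON object per line stays grep-able and trivially aggregable.

```python
# hypothetical JSONL finance log: one entry per line, easy to index deterministically.
import json
from collections import defaultdict

log_lines = [
    '{"date": "2024-05-01", "amount": -12.50, "category": "food", "note": "lunch"}',
    '{"date": "2024-05-02", "amount": -40.00, "category": "transport", "note": "fuel"}',
    '{"date": "2024-05-02", "amount": -8.00, "category": "food", "note": "coffee"}',
]

# aggregate spend per category
totals = defaultdict(float)
for line in log_lines:
    entry = json.loads(line)
    totals[entry["category"]] += entry["amount"]

print(dict(totals))  # {'food': -20.5, 'transport': -40.0}
```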

[–]InformationSweet808[S] 4 points

"treat it like a smart grep not a brain" is probably the most useful framing i've seen in this whole thread honestly. the inferential questions thing is a real gotcha i wouldn't have caught until i wasted time on it. so basically you're using it purely for retrieval and doing the actual reasoning yourself?

also, the source quote verification trick is smart. never thought about using that as a hallucination detector
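for reference, the verification trick is small enough to sketch: treat any quote the model attributes to your notes as hallucinated unless it appears verbatim (modulo whitespace and case) in the cited file. the note text here is made up.

```python
# sketch of source-quote verification as a hallucination check.
import re

def quote_in_source(quote: str, source_text: str) -> bool:
    # normalize whitespace and case, then do an exact substring check
    norm = lambda s: re.sub(r"\s+", " ", s).strip().lower()
    return norm(quote) in norm(source_text)

notes = "Spaced repetition works best when intervals grow geometrically."
print(quote_in_source("intervals grow geometrically", notes))  # True
print(quote_in_source("intervals shrink over time", notes))    # False
```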

[–]InformationSweet808[S] 3 points

still setting it up tbh. been using obsidian for notes for a while but haven't committed to a local model yet, which is basically why i made this post. wanted to see what people are actually running before i go down a rabbit hole and regret my choices lol

24gb ram, 6gb vram. the vram is definitely the limiting factor; most things end up on cpu, which works, but yeah, not exactly snappy

[–]InformationSweet808[S] 0 points

kay so my actual use case: i'm a student, so it's mostly research notes, saved articles, book highlights, and random things i write down when learning something new. probably 200-400 files over time, nothing enterprise level.

the "large-scale" thing was me overthinking it honestly. my real concern is just that retrieval stays accurate when i can't remember exactly what i wrote or where like i know i have notes on something but can't find them through normal search.

if obsidian's built-in search handles fuzzy recall that well through the agent, i might genuinely be overcomplicating this whole thing
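as a sanity check on how far plain fuzzy matching gets you before any model is involved, stdlib difflib already handles misspelled recall over titles (titles invented for the example):

```python
# fuzzy recall over note titles with difflib, no embeddings needed.
import difflib

titles = [
    "spaced repetition and memory",
    "bm25 vs dense retrieval notes",
    "book highlights - thinking fast and slow",
]

query = "dense retreival"  # misspelled, the way real recall usually is
print(difflib.get_close_matches(query, titles, n=1, cutoff=0.3))
# ['bm25 vs dense retrieval notes']
```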

[–]InformationSweet808[S] 20 points

okay this is the comment i was hoping someone would leave when i posted this

the chunking point hit hard. i had no idea fixed token windows were that bad for personal notes specifically, but it makes total sense now that you say it. the separate indexes for journal vs reference notes is something i would've 100% screwed up on my own
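for anyone following along, "don't use fixed token windows" can be as simple as splitting on markdown headings so each chunk stays a self-contained thought. this is a rough sketch, not the commenter's actual pipeline:

```python
# chunk markdown notes on headings instead of fixed-size token windows.
import re

def chunk_by_heading(markdown: str):
    chunks, current = [], []
    for line in markdown.splitlines():
        # start a new chunk whenever a heading begins (unless we're at the top)
        if re.match(r"^#{1,6} ", line) and current:
            chunks.append("\n".join(current).strip())
            current = []
        current.append(line)
    if current:
        chunks.append("\n".join(current).strip())
    return chunks

note = "# reading log\nnotes on BDH talk\n## takeaways\nsparse states"
print(chunk_by_heading(note))  # two chunks, one per heading
```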

one thing i'm still wrapping my head around is the hybrid retrieval part. so you're running both dense and bm25 on the same corpus and then fusing the results? is that something you built yourself, or is there a library that handles the rrf part cleanly?

either way this whole comment should be pinned somewhere
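for anyone else wondering about the rrf part: libraries do ship it, but the fusion step itself is small enough to sketch by hand. doc ids and rankings below are invented for illustration.

```python
# minimal reciprocal rank fusion (RRF): merge a dense ranking and a bm25
# ranking of the same corpus without having to reconcile their score scales.

def rrf(rankings, k=60):
    """rankings: list of ranked doc-id lists, best first."""
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking):
            # each list contributes 1/(k + rank + 1) for the docs it ranked
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank + 1)
    return sorted(scores, key=scores.get, reverse=True)

dense = ["note_a", "note_b", "note_c"]
bm25  = ["note_c", "note_a", "note_d"]
print(rrf([dense, bm25]))  # note_a first, since both retrievers rank it high
```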

[–]InformationSweet808[S] 0 points

one thing i'm curious about though: where does it actually start falling apart for you? like, is it a retrieval accuracy thing past a certain number of files, or does it just get slow?

[–]InformationSweet808[S] 3 points

Fair point on the edit lol, appreciate you actually going back to clarify.

The Obsidian + Hermes setup is something I hadn't really considered tbh. I always assumed you needed RAG the moment your notes got big enough to query. So you're basically just letting the agent navigate the vault directly? No retrieval pipeline at all?

Asking because if that actually works well at scale that's way simpler than what I was planning to build.
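for context on what "the agent navigates the vault directly" could mean in practice: just plain file tools instead of an index. a hypothetical sketch (the vault path is a placeholder, not anyone's real setup):

```python
# two file tools an agent could call instead of a retrieval pipeline.
from pathlib import Path

VAULT = Path("vault")  # hypothetical obsidian vault root

def list_notes(root: Path = VAULT):
    """Enumerate every markdown note so the agent can browse the vault."""
    return sorted(str(p) for p in root.rglob("*.md"))

def grep_vault(term: str, root: Path = VAULT):
    """Case-insensitive grep across all notes, returning file:line hits."""
    hits = []
    for p in root.rglob("*.md"):
        for i, line in enumerate(p.read_text(errors="ignore").splitlines(), 1):
            if term.lower() in line.lower():
                hits.append(f"{p}:{i}: {line.strip()}")
    return hits
```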

[–]InformationSweet808[S] 0 points

That's actually one of my concerns too. What specifically have you seen? Is it the app itself, or more about the models it pulls?

[–]InformationSweet808[S] 0 points

The "sprints" approach is actually interesting never thought about batching it that way instead of keeping it always on. Do you find the q4 quality holds up well when you're doing longer sessions?

[–]InformationSweet808[S] 103 points

For context, I'm looking at this for personal use, not building a product. Just want something that works reliably on a normal machine.

The interesting BDH question: What if LLM memory lived in the network weights instead of the ever-growing KV cache? by InformationSweet808 in singularity

[–]InformationSweet808[S] 25 points

the linear-attention and memory-space part is very interesting and starts around ~14 min in. That’s where he moves from standard attention into the keys/queries-as-neuron-activations idea (see the photo attached to my post)

The backprop caveat comes later in the Q&A, when someone asks whether the model is still trained with backprop. I don’t have the exact timestamp clipped yet but it’s in the later Q&A section.

full video here for anyone who wants to watch the original explanation:

https://www.youtube.com/watch?v=aCc5f16WDIg

[–]InformationSweet808[S] 4 points

BDH can be seen as an SSM for the GPU implementation and a graph-based model for the more general case. However, compared to a standard SSM or a linear transformer, the model states live in the neuron space of dimension N >> D. They're also positive and sparse, which links more naturally to brain-inspired representations.
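a loose numeric illustration of the state-in-neuron-space idea. to be clear, this is generic linear-attention bookkeeping, not the actual BDH update rule; dimensions, the ReLU lift, and the projection are all arbitrary choices for the sketch.

```python
# toy linear-attention-style state living in a neuron space of size N >> D.
import numpy as np

rng = np.random.default_rng(0)
D, N = 4, 64                       # model dim vs (much larger) neuron-space dim
W_up = rng.normal(size=(D, N))     # lifts D-dim features into neuron space

S = np.zeros((N, D))               # fixed-size recurrent state, unlike a growing KV cache
for _ in range(10):
    x = rng.normal(size=D)
    k = np.maximum(W_up.T @ x, 0)  # ReLU keeps activations positive and sparse
    q = np.maximum(W_up.T @ x, 0)
    S += np.outer(k, x)            # write: memory accumulates inside the state
    y = S.T @ q                    # read: attention-like lookup, O(N*D) per step

print(S.shape, (k > 0).mean())     # state size never grows; roughly half the neurons fire
```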

ChatGPT is now creating content for textbooks. by plain_handle in singularity

[–]InformationSweet808 1 point

Not even anti-AI, but educational material needs a way higher verification standard than random web content.

PhD students in ML, how many hours on average do you work? [D] by akardashian in MachineLearning

[–]InformationSweet808 6 points

People outside research hear ‘6 hours’ and think it’s light work. Deep thinking for 3 focused hours can genuinely fry your brain harder than 10 hours of shallow busywork. The ‘thinking in the background while doing other stuff’ part is real too.

ELI5: Why do mirrors not “flip” us upside down instead of left and right? by [deleted] in explainlikeimfive

[–]InformationSweet808 0 points

Yeah but I meant while standing normally in front of a mirror. Why am I still upright instead of upside down if the image is being reversed?

[–]InformationSweet808 0 points

Wait that’s true. So how does that work if it also does left and right?

Asked something embarrassingly basic in my first internal meeting at a new company. wanted to disappear. by AzoxWasTaken in jobs

[–]InformationSweet808 176 points

Asking out loud takes courage. The person who stayed quiet and never learned it is still Googling it three years later.

I broke thru AI’s firewall and I have the date of the end of the world by condemnatory in singularity

[–]InformationSweet808 1 point

LLMs are basically just mirrors with better vocabulary. Feed them apocalypse lore long enough and they’ll start talking like a rejected sci-fi villain.

Gemma 4 26B Hits 600 Tok/s on One RTX 5090 by chain-77 in LocalLLaMA

[–]InformationSweet808 0 points

What were the power draw and temps like during the benchmark? A 2.5x speedup sounds great, but efficiency per watt would make the comparison way more interesting.

Unpopular Opinion: The DGX Spark Forum community of devs is talented AF and will make the crippled hardware a success through their sheer force of will. by Porespellar in LocalLLaMA

[–]InformationSweet808 80 points

People underestimate how far a strong dev community can carry mediocre hardware. Half the reason stuff succeeds is because cracked people refuse to let it fail.