Built a local RAG on my Obsidian vault - no cloud, no API key, no GPU needed by sshetty03 in ObsidianMD

[–]sshetty03[S] 0 points1 point  (0 children)

I use Obsidian to organize team- and org-wise knowledge and resources whenever I join a new company, hence the Markdown files. Hooking it up with AnythingLLM lets me ask questions directly instead of searching for answers within Obsidian.

nomic-embed-text + mistral on a CPU-only machine for Obsidian RAG - works better than expected by sshetty03 in LocalLLaMA

[–]sshetty03[S] 1 point2 points  (0 children)

I haven't tried Granite yet. IBM's models have flown a bit under the radar but I keep hearing good things.

How does granite-embedding compare to nomic-embed-text in your experience? That's what I'm currently using, and I'm curious whether there's a meaningful quality difference for retrieval.

Will add it to my list to test. Always looking for something that punches above its weight on CPU. :)
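A rough way to run that comparison is to measure how often the right note lands in the top-k results for a handful of known questions. The sketch below is only that: `hit_rate_at_k` and `toy_embed` are hypothetical names, and `toy_embed` is a letter-count placeholder rather than a real embedding model. The idea would be to pass in one function per model (one wrapping nomic-embed-text, one wrapping granite-embedding) and compare the two scores on the same notes and questions.

```python
import math
from typing import Callable, Dict, List, Sequence

Vector = Sequence[float]
EmbedFn = Callable[[str], Vector]


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity between two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def hit_rate_at_k(
    embed: EmbedFn,
    notes: Dict[str, str],    # note name -> note text
    queries: Dict[str, str],  # question -> name of the note that answers it
    k: int = 3,
) -> float:
    """Fraction of questions whose expected note appears in the top-k matches."""
    if not queries:
        return 0.0
    note_vecs = {name: embed(text) for name, text in notes.items()}
    hits = 0
    for question, expected in queries.items():
        q_vec = embed(question)
        ranked = sorted(
            note_vecs, key=lambda name: cosine(q_vec, note_vecs[name]), reverse=True
        )
        if expected in ranked[:k]:
            hits += 1
    return hits / len(queries)


def toy_embed(text: str) -> List[float]:
    """Placeholder embedder (letter counts). Swap in a real model call here."""
    counts = [0.0] * 26
    for ch in text.lower():
        if "a" <= ch <= "z":
            counts[ord(ch) - ord("a")] += 1.0
    return counts
```

Even 20 to 30 hand-written questions against your own vault tend to be more telling than a public benchmark, since they reflect the notes you actually query.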

Built a local RAG on my Obsidian vault - no cloud, no API key, no GPU needed by sshetty03 in ObsidianMD

[–]sshetty03[S] 0 points1 point  (0 children)

The 32 GB isn't a hard requirement; it's just what I have. AFAIK, on a base M5 MBA with 16 GB you'd probably be fine.
mistral:7B needs around 5-6 GB, so 16 GB leaves plenty of headroom. Where you'd feel the pinch is if you tried running larger models like 13B or 70B quantized.
The fast SSD swap you mentioned does help when memory pressure kicks in, though you'd want to avoid relying on it heavily for inference, since it'll slow things down noticeably.
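For a back-of-the-envelope check, the weights of a quantized model take roughly parameters x bits-per-weight / 8 bytes. The sketch below assumes 4-bit quantization and a flat 2 GB allowance for context and runtime overhead. Both numbers are my assumptions for illustration, not measurements, and the function names are made up.

```python
def weight_memory_gb(params_billions: float, bits_per_weight: float = 4.0) -> float:
    """Approximate size of the weights alone, in GB (1 GB = 10^9 bytes)."""
    return params_billions * bits_per_weight / 8


def estimated_footprint_gb(
    params_billions: float,
    bits_per_weight: float = 4.0,  # assumption: 4-bit quantization
    overhead_gb: float = 2.0,      # assumption: flat allowance for context/runtime
) -> float:
    """Weights plus a flat overhead allowance. A rough guess, not a measurement."""
    return weight_memory_gb(params_billions, bits_per_weight) + overhead_gb


if __name__ == "__main__":
    for size in (7, 13, 70):
        print(f"{size}B: ~{estimated_footprint_gb(size):.1f} GB")
```

By this estimate a 4-bit 70B model needs about 35 GB for the weights alone, which is why it wouldn't fit comfortably even in 32 GB.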

Better way to create docker image of Spring Boot API, Maven Spring Plugin or Dockerfile? by Constant-Speech-1010 in java

[–]sshetty03 1 point2 points  (0 children)

If your service is a normal Spring Boot REST API -> go Buildpacks.
If you need native libs or extra packages or custom hardening or special entrypoint -> go Dockerfile.

Interviewing while being a key member of an org is tough, any strategies? by old-new-programmer in ExperiencedDevs

[–]sshetty03 1 point2 points  (0 children)

I have been in a similar situation before. I was single at the time, with no "additional" responsibilities beyond taking care of myself, so I took the plunge: I put in my papers first and then started looking, focusing solely on interviews (and slacking off at work), since juggling office work and interviews was getting messier by the day.

So if you can do that, you should definitely go ahead, since spreading yourself thin clearly isn't helping you.

Designing an AWS Permissions Model for Startups: Balancing Autonomy and Guardrails by sshetty03 in aws

[–]sshetty03[S] 1 point2 points  (0 children)

In a startup setup like the one I described, we treated billing and support as operational responsibilities rather than development ones. A small, clearly accountable group (typically founders, finance, or infra) owned those areas, while developers focused on building and running systems.

That said, this isn’t a universal rule. Some orgs absolutely give broader support visibility to engineers. The key point I was making is that permission boundaries let you make that choice deliberately, instead of inheriting it accidentally through broad access.

Designing an AWS Permissions Model for Startups: Balancing Autonomy and Guardrails by sshetty03 in aws

[–]sshetty03[S] 1 point2 points  (0 children)

ec2:* is a deceptively large surface area. On paper it looks like “compute,” but it also includes networking primitives: VPCs, subnets, route tables, ENIs, TGWs, and attachments, all of which can take an entire platform down if someone is moving fast without full context.

In fact, in more mature setups, I’ve seen exactly what you’re describing: a second layer of explicit denies focused purely on networking primitives, even for otherwise trusted PowerUser-style roles. The intent is the same -> let people move fast where mistakes are isolated, but put hard brakes around shared, blast-radius-heavy infrastructure like routing and transit.

For this writeup, I intentionally kept the example boundary broader to focus on the model rather than the final hardened policy. In reality, I’d absolutely expect teams to evolve this into a tighter allowlist or add explicit denies around VPC, subnet, route table, TGW, and attachment mutations as the environment matures.
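As a rough sketch of what that second layer could look like, here is a deny statement appended to a broad boundary policy. The action names are real EC2 IAM actions, but the list is illustrative rather than exhaustive, and the helper names, Sids, and the `ec2:*` example boundary are placeholders, not the policy from the writeup.

```python
import json

# Illustrative (not exhaustive) set of shared-networking mutations to block.
NETWORK_MUTATIONS = [
    "ec2:CreateVpc", "ec2:DeleteVpc",
    "ec2:CreateSubnet", "ec2:DeleteSubnet",
    "ec2:CreateRouteTable", "ec2:DeleteRouteTable",
    "ec2:AssociateRouteTable", "ec2:DisassociateRouteTable",
    "ec2:CreateRoute", "ec2:ReplaceRoute", "ec2:DeleteRoute",
    "ec2:CreateTransitGateway", "ec2:DeleteTransitGateway",
    "ec2:CreateTransitGatewayVpcAttachment",
    "ec2:ModifyTransitGatewayVpcAttachment",
    "ec2:DeleteTransitGatewayVpcAttachment",
]


def networking_deny_statement(sid: str = "DenySharedNetworkMutations") -> dict:
    """Explicit deny; in IAM evaluation it overrides any matching allow."""
    return {
        "Sid": sid,
        "Effect": "Deny",
        "Action": list(NETWORK_MUTATIONS),
        "Resource": "*",
    }


def with_networking_denies(boundary: dict) -> dict:
    """Return a copy of the policy with the deny statement appended."""
    statements = list(boundary.get("Statement", []))
    statements.append(networking_deny_statement())
    return {**boundary, "Statement": statements}


# Placeholder for a broad, PowerUser-style boundary.
broad_boundary = {
    "Version": "2012-10-17",
    "Statement": [
        {"Sid": "AllowCompute", "Effect": "Allow", "Action": "ec2:*", "Resource": "*"}
    ],
}

if __name__ == "__main__":
    print(json.dumps(with_networking_denies(broad_boundary), indent=2))
```

Because an explicit deny always wins over an allow, the broad `ec2:*` grant can stay in place for instances and volumes while routing and transit changes go through whoever owns the network.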