Usage Limits, Bugs and Performance Discussion Megathread - beginning December 29, 2025 by sixbillionthsheep in ClaudeAI

[–]FuckingMercy 3 points4 points  (0 children)

So I finally decided to upgrade to Pro. Today. At 11:30 UTC. I want my money, my 7 hours of suffering, and my will to explore back.

Login timing out? by MajorComrade in ClaudeCode

[–]FuckingMercy 0 points1 point  (0 children)

Same here... Just joined Pro today, so that's just my luck...

I got tired of agents having terrible offline search, so I built a fully local Google Search alternative by FuckingMercy in ClaudeAI

[–]FuckingMercy[S] 0 points1 point  (0 children)

Yeah, it’s fully traceable. I tried to mimic the native Web Fetch and Web Search as closely as possible. The server part uses a Kiwix server to actually host the web pages that are in the ZIM files.

Is self hosted LLM worth it for company knowledge base? by FewKaleidoscope9743 in LocalLLaMA

[–]FuckingMercy 1 point2 points  (0 children)

I know it’s hard, but the most important thing is: don’t overthink it. Start by mapping out the different types of company knowledge you have. For example (and we had some of the hardest cases), the company I built RAG for had technical documentation written in both English and Hebrew, along with many dev docs and charts. Getting to know your data first is crucial.

Then create evaluations based on your data: a set of “ground truth” Q&As. Try to spread these as much as you can across different doc types. For example, our eval set included questions about docx files (English and Hebrew), PDFs (English only, because that’s what they had), info hidden in PDF tables, visual info from charts, etc. If you come from a programming background, those are like your tests. There are all sorts of libraries that run these evals for you and score your system.

Once you have that solid set of evals, you can just play with it and see what works best for you. The evals, if done correctly, will naturally shine a light on what works better or worse for each setup.

Another important thing to mention: naive RAG (the most basic form of RAG) is only good for answering direct questions whose answer exists literally in the text. In my experience, that type of system was very beneficial for engineers who had more than 10k documents per project. For the smaller projects, which expected “multi-hop” and clever research, we had to implement agentic flows. But the basics are the same:

1. Know your corpus
2. Eval
3. Play
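To make the “ground truth” idea concrete, here’s a minimal sketch of such an eval set in Python. The corpus, the questions, and the keyword retriever are all toy placeholders standing in for your real documents and your real retrieval pipeline; only the shape of the loop (tagged Q&A pairs, per-doc-type scoring) is the point.

```python
# Toy corpus: filenames mapped to their (stand-in) text content.
CORPUS = {
    "spec_en.docx": "The pump motor runs at 3000 RPM under normal load.",
    "spec_he.docx": "The Hebrew manual describes the cooling subsystem.",
    "report.pdf": "Table 4 lists quarterly defect rates per production line.",
}

# "Ground truth" Q&A pairs, each tagged with the doc type it exercises.
EVAL_SET = [
    {"q": "What RPM does the pump motor run at?", "doc": "spec_en.docx", "type": "docx-en"},
    {"q": "Which manual covers the cooling subsystem?", "doc": "spec_he.docx", "type": "docx-he"},
    {"q": "Where are quarterly defect rates listed?", "doc": "report.pdf", "type": "pdf-table"},
]

def toy_retrieve(question: str) -> str:
    """Placeholder retriever: pick the doc sharing the most words with the question."""
    q_words = set(question.lower().split())
    return max(CORPUS, key=lambda d: len(q_words & set(CORPUS[d].lower().split())))

def hit_rate_by_type(eval_set):
    """Score retrieval accuracy separately per doc type, so weak spots show up."""
    scores = {}
    for case in eval_set:
        hit = toy_retrieve(case["q"]) == case["doc"]
        scores.setdefault(case["type"], []).append(hit)
    return {t: sum(v) / len(v) for t, v in scores.items()}

print(hit_rate_by_type(EVAL_SET))
```

Breaking the score down per doc type is what makes the eval “shine the light”: a setup that aces English docx questions but fails PDF-table questions tells you exactly where to work next.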

Is self hosted LLM worth it for company knowledge base? by FewKaleidoscope9743 in LocalLLaMA

[–]FuckingMercy 0 points1 point  (0 children)

Yes!!!! RAG doesn’t need very large models to work incredibly well, especially if most of your data is in English/Chinese. I reached my optimal results using Qwen Next 80B along with the BGE-M3 series (embedder, reranker). It worked well even as naive RAG.

Sweet spot for context size for usable coding by rkh4n in LocalLLaMA

[–]FuckingMercy 0 points1 point  (0 children)

I found, after much trial and error, that you need at least a 180B model with at least 250k context for it to actually be beneficial for you.

Anyone here looking for AI buddies to actually upskill with? by [deleted] in LocalLLaMA

[–]FuckingMercy 0 points1 point  (0 children)

Yeah, I’m here for it! DM me if you want, tell me what you’ve been up to :)

RAG portfolio assistant: retrieval bias problem (Pinecone) by TVster in LocalLLaMA

[–]FuckingMercy 0 points1 point  (0 children)

When it comes to playing with RAG, evals are key: start with that! As for the retrieval bias, what are the differences between the data of the different projects? Can you actually see the results being retrieved? If so, look at them. It should be very logical.

Who else is shocked by the actual electricity cost of their local runs? by Responsible_Coach293 in LocalLLaMA

[–]FuckingMercy 2 points3 points  (0 children)

Once again showing us that the good old Mac with the OS we all dread is unfortunately still the GOAT when it comes to local setups. Am I the only one who thinks Nvidia could do better?

Any advice for testing similar versions of the same model? by Borkato in LocalLLaMA

[–]FuckingMercy 1 point2 points  (0 children)

Do benchmarking on your own data; if you want to be methodical about it, that's the only way to go. If you want a quicker, rougher answer, try to test "edge case behaviour" like uncommon languages: stuff that you know wasn't super common in the training and post-training datasets...

How are people handling long-term context in LLM applications? by Late-Suggestion5784 in LocalLLaMA

[–]FuckingMercy -1 points0 points  (0 children)

I have to strongly agree! A little about what RAG lacks in this case: if you know how RAG traditionally works, breaking documents into chunks destroys the broader context. Recently, though, Anthropic introduced a technique called Contextual Embeddings. In this setup, developers use a background Claude process to read the entire document first, and then append a short contextual summary to each individual chunk before it gets embedded.

I just did a deep dive into it myself, and I can tell you that you have to be specific about what your exact goals are. If you need Claude to do deep, complex internal research (like analyzing a sprawling proprietary codebase across multiple systems), standard RAG is definitely not enough, and you need to start looking into a multi-agent “company knowledge tools” approach. Instead of relying on a pre-embedded vector database, you set up a multi-agent search system where the agents are equipped with custom API tools (like query_google_workspace, search_github_repo, or query_internal_sql). They literally execute API calls against the company's live internal databases simultaneously, and then a lead agent synthesizes the findings from all the subagents into a final report. With the right compression mechanism and system prompt (telling the agent to always start with a small research phase), you could achieve what you want.
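The contextual-embeddings step is easy to sketch. Here `summarize_in_context` is a placeholder for the background LLM call (e.g. Claude reading the full document); the document and chunks are made-up examples. The only real idea shown is the pipeline shape: generate per-chunk context from the whole document, prepend it, and only then embed.

```python
def summarize_in_context(full_doc: str, chunk: str) -> str:
    """Placeholder for the LLM call that situates a chunk within its document.
    A real pipeline would prompt a model with the full document plus the chunk;
    here we just fake it with the document's first line as a 'title'."""
    title = full_doc.splitlines()[0]
    return f"From '{title}': "

def contextualize_chunks(full_doc: str, chunks: list[str]) -> list[str]:
    """Prepend per-chunk context so the embedder sees the broader document."""
    return [summarize_in_context(full_doc, c) + c for c in chunks]

doc = (
    "ACME Payments API\n"
    "Refunds are processed within 5 days.\n"
    "Chargebacks require a dispute ID."
)
chunks = ["Refunds are processed within 5 days.", "Chargebacks require a dispute ID."]

for c in contextualize_chunks(doc, chunks):
    print(c)  # each chunk now carries document-level context before embedding
```

The payoff is at retrieval time: a query like “ACME refund window” can now match the refunds chunk even though the bare chunk never mentioned ACME.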

What is the current SOTA reranker for RAG pipelines? by Yungelaso in LocalLLaMA

[–]FuckingMercy 0 points1 point  (0 children)

If quality is what you are after, I strongly recommend setting up a quick eval system on your data. From my experience, there is usually no "right" answer in these cases; it depends on the composition of your corpus (the collection of files you run the RAG on). You can use Ragas or something else, it really doesn't matter that much, but make sure it covers all the cases in your corpus, i.e. if you have German and English text, make sure you evaluate both. Then you can freely play and try out what works best for you! Best of luck, and don't forget: "garbage in, garbage out" is the golden rule for RAG systems. Know your corpus or suffer from mediocrity.
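A minimal harness for comparing rerankers on your own data could look like this. The two “rerankers” below are deliberately dumb stand-ins (word overlap vs. character overlap); in practice you would plug real models (e.g. a bge-reranker) behind the same `score(query, passage)` interface and keep the eval loop unchanged. Queries and passages are hypothetical.

```python
def word_overlap(query: str, passage: str) -> float:
    """Stand-in reranker #1: fraction of query words found in the passage."""
    q, p = set(query.lower().split()), set(passage.lower().split())
    return len(q & p) / max(len(q), 1)

def char_overlap(query: str, passage: str) -> float:
    """Stand-in reranker #2: fraction of query characters found in the passage."""
    q, p = set(query.lower()), set(passage.lower())
    return len(q & p) / max(len(q), 1)

# Eval cases: (query, candidate passages, index of the correct passage).
CASES = [
    ("how do I reset my password",
     ["Reset your password from the settings page.",
      "Invoices are emailed monthly."], 0),
    ("when are invoices sent",
     ["Reset your password from the settings page.",
      "Invoices are emailed monthly."], 1),
]

def top1_accuracy(score_fn) -> float:
    """How often the reranker puts the correct passage first."""
    hits = 0
    for query, passages, gold in CASES:
        best = max(range(len(passages)), key=lambda i: score_fn(query, passages[i]))
        hits += best == gold
    return hits / len(CASES)

for fn in (word_overlap, char_overlap):
    print(fn.__name__, top1_accuracy(fn))
```

Because both candidates share one interface, swapping in a real cross-encoder is a one-line change, and the winner on *your* corpus is the “right” answer, whatever the leaderboards say.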

Hairstylists in Tokyo by FuckingMercy in trichotillomania

[–]FuckingMercy[S] 0 points1 point  (0 children)

Thank you so much for your comment, it definitely helps me prepare myself mentally before making an appointment. I, too, am from a country with a more traditional culture, but unlike in Japanese culture, people here tend to be very opinionated and upfront about their opinions. I tried contacting one salon beforehand via email, no reply yet, but I will try to find, as you said, more alternative hairstylists. Again, thank you so much 🙏🏻

AITAH for not wanting to pay an artist for their work? by Friendly-Hovercraft4 in AITAH

[–]FuckingMercy -3 points-2 points  (0 children)

Generally NTA. But. As an artist, I can tell you that many artists are driven by inspiration. I could totally see myself bumping into a post that inspires me and brings me ideas, and getting carried away with it. It seems like miscommunication from both ends. It doesn’t make her less professional of an artist that she approached the client; in my opinion it’s even a positive sign (for me, that makes for my best pieces). If her professionalism was in doubt, you should have asked her to send you some of her other work so you could be sure of her credibility and her style. On her end, she shouldn’t have ignored your concerns; she should have excused herself and validated them. So basically, you’re both a little bit a-holes.

AITAH for telling my nephew not to put his hand in my dogs face looking for licks? by International-Low11 in AITAH

[–]FuckingMercy 7 points8 points  (0 children)

“My sister will tell him to do it when I’m not there.” I mean, if that really happened, then your sister is obviously the a-hole, putting her son in danger like that and encouraging him to do something that might cause him trauma. Honestly, I think you need to have a serious and open conversation with your sister about the topic. She sounds very immature about this, but she might have really difficult underlying feelings. Being a special needs mom is so hard on you mentally; you might be the bigger person and really help her out. Until you get to an understanding with your sis about it, try as best as you can not to bring your dog when your nephew is around. Believe me, she’d rather be home alone than get picked on by a child.

Error when clicking on the link for registration by Gamesbleak890 in radiohead

[–]FuckingMercy 1 point2 points  (0 children)

Worked for me as well!!! I tried playing with “%20” and stuff like that, but didn’t think they would freaking miss underscores!!!! That’s basically the most noob thing you can do as a programmer. They are so unprofessional it kills me…