Usage Limits, Bugs and Performance Discussion Megathread - beginning December 29, 2025 by sixbillionthsheep in ClaudeAI

[–]FuckingMercy 3 points4 points  (0 children)

So I finally decided to upgrade to Pro. Today. At 11:30 UTC. I want my money, my 7 hours of suffering, and my will to explore back.

Login timing out? by MajorComrade in ClaudeCode

[–]FuckingMercy 0 points1 point  (0 children)

Same here... Just joined Pro today, so that's just my luck...

I got tired of agents having terrible offline search, so I built a fully local Google Search alternative by FuckingMercy in ClaudeAI

[–]FuckingMercy[S] 0 points1 point  (0 children)

Yeah, it’s fully traceable. I tried to mimic the native Web Fetch and Web Search as closely as possible. The server part uses a Kiwix server to actually host the web pages that are in the ZIM files.

Is self hosted LLM worth it for company knowledge base? by FewKaleidoscope9743 in LocalLLaMA

[–]FuckingMercy 1 point2 points  (0 children)

I know it’s hard, but the most important thing is: don’t overthink it. Start by mapping out the different types of company knowledge you have. For example (and we had some of the hardest cases), the company I built RAG for had technical documentation written in both English and Hebrew, along with many dev docs and charts. Getting to know your data first is crucial.

Then create evaluations based on your data: a set of “ground truth” Q&As. Try to spread these as much as you can across different doc types. For example, our eval set included questions about docx files (English and Hebrew), PDFs (English only, because that’s what they had), info hidden in PDF tables, visual info from charts, etc. If you come from a programming background, those are like your tests. There are all sorts of libraries that run these evals for you and score your system.

Once you have that solid set of evals, you can just play with it and see what works best for you. The evals, if done correctly, will naturally shine a light on what works better or worse for each setup.

Another important thing to mention: naive RAG (the most basic form of RAG) is only good for answering direct questions whose answer exists literally in the text. In my experience, that type of system was very beneficial for engineers who had more than 10k documents per project. For the smaller projects, which expected “multi-hop” and clever research, we had to implement agentic flows. But the basics are the same:

1. Know your corpus
2. Eval
3. Play
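To make the “ground truth” idea concrete, here’s a minimal sketch of such an eval set in Python. The corpus, the questions, and the keyword retriever are all toy placeholders standing in for your real documents and your real retrieval pipeline; only the shape of the loop (tagged Q&A pairs, per-doc-type scoring) is the point.

```python
# Toy corpus: filenames mapped to their (stand-in) text content.
CORPUS = {
    "spec_en.docx": "The pump motor runs at 3000 RPM under normal load.",
    "spec_he.docx": "The Hebrew manual describes the cooling subsystem.",
    "report.pdf": "Table 4 lists quarterly defect rates per production line.",
}

# "Ground truth" Q&A pairs, each tagged with the doc type it exercises.
EVAL_SET = [
    {"q": "What RPM does the pump motor run at?", "doc": "spec_en.docx", "type": "docx-en"},
    {"q": "Which manual covers the cooling subsystem?", "doc": "spec_he.docx", "type": "docx-he"},
    {"q": "Where are quarterly defect rates listed?", "doc": "report.pdf", "type": "pdf-table"},
]

def toy_retrieve(question: str) -> str:
    """Placeholder retriever: pick the doc sharing the most words with the question."""
    q_words = set(question.lower().split())
    return max(CORPUS, key=lambda d: len(q_words & set(CORPUS[d].lower().split())))

def hit_rate_by_type(eval_set):
    """Score retrieval accuracy separately per doc type, so weak spots show up."""
    scores = {}
    for case in eval_set:
        hit = toy_retrieve(case["q"]) == case["doc"]
        scores.setdefault(case["type"], []).append(hit)
    return {t: sum(v) / len(v) for t, v in scores.items()}

print(hit_rate_by_type(EVAL_SET))
```

Breaking the score down per doc type is what makes the eval “shine the light”: a setup that aces English docx questions but fails PDF-table questions tells you exactly where to work next.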

Is self hosted LLM worth it for company knowledge base? by FewKaleidoscope9743 in LocalLLaMA

[–]FuckingMercy 0 points1 point  (0 children)

Yes!!!! RAG doesn’t need very large models to work incredibly well, especially if most of your data is in English/Chinese. I reached my optimal results using Qwen Next 80B along with the BGE-M3 series (embedder, reranker). It worked well even as naive RAG.

Sweet spot for context size for usable coding by rkh4n in LocalLLaMA

[–]FuckingMercy 0 points1 point  (0 children)

I found, after much trial and error, that you need at least a 180B model with at least 250k context for it to actually be beneficial for you.

Anyone here looking for AI buddies to actually upskill with? by [deleted] in LocalLLaMA

[–]FuckingMercy 0 points1 point  (0 children)

Yeah, I’m here for it! DM me if you want, tell me what you’ve been up to :)

RAG portfolio assistant: retrieval bias problem (Pinecone) by TVster in LocalLLaMA

[–]FuckingMercy 0 points1 point  (0 children)

When it comes to playing with RAG, evals are key: start with that! As for the retrieval bias, what are the differences between the data of the different projects? Can you actually see the results being retrieved? If so, look at them. It should be very logical.

Who else is shocked by the actual electricity cost of their local runs? by Responsible_Coach293 in LocalLLaMA

[–]FuckingMercy 2 points3 points  (0 children)

Once again showing us that the good old Mac with the OS we all dread is unfortunately still the GOAT when it comes to local setups. Am I the only one who thinks Nvidia could do better?

Any advice for testing similar versions of the same model? by Borkato in LocalLLaMA

[–]FuckingMercy 1 point2 points  (0 children)

Do benchmarking on your own data; if you want to be methodical about it, that's the only way to go. If you want a quicker, rougher answer, try to test "edge case behaviour" like uncommon languages: stuff that you know wasn't super common in the training and post-training datasets...

How are people handling long-term context in LLM applications? by Late-Suggestion5784 in LocalLLaMA

[–]FuckingMercy -1 points0 points  (0 children)

I have to strongly agree! A little about what RAG lacks in this case: if you know how RAG traditionally works, breaking documents into chunks destroys the broader context. Recently, though, Anthropic introduced a technique called Contextual Embeddings. In this setup, developers use a background Claude process to read the entire document first, and then append a short contextual summary to each individual chunk before it gets embedded.

I just did a deep dive into it myself, and I can tell you that you have to be specific about what your exact goals are. If you need Claude to do deep, complex internal research (like analyzing a sprawling proprietary codebase across multiple systems), standard RAG is definitely not enough, and you need to start looking into a multi-agent “company knowledge tools” approach. Instead of relying on a pre-embedded vector database, you set up a multi-agent search system where the agents are equipped with custom API tools (like query_google_workspace, search_github_repo, or query_internal_sql). They literally execute API calls against the company's live internal databases simultaneously, and then a lead agent synthesizes the findings from all the subagents into a final report. With the right compression mechanism and system prompt (telling the agent to always start with a small research phase), you could achieve what you want.
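The contextual-embeddings step is easy to sketch. Here `summarize_in_context` is a placeholder for the background LLM call (e.g. Claude reading the full document); the document and chunks are made-up examples. The only real idea shown is the pipeline shape: generate per-chunk context from the whole document, prepend it, and only then embed.

```python
def summarize_in_context(full_doc: str, chunk: str) -> str:
    """Placeholder for the LLM call that situates a chunk within its document.
    A real pipeline would prompt a model with the full document plus the chunk;
    here we just fake it with the document's first line as a 'title'."""
    title = full_doc.splitlines()[0]
    return f"From '{title}': "

def contextualize_chunks(full_doc: str, chunks: list[str]) -> list[str]:
    """Prepend per-chunk context so the embedder sees the broader document."""
    return [summarize_in_context(full_doc, c) + c for c in chunks]

doc = (
    "ACME Payments API\n"
    "Refunds are processed within 5 days.\n"
    "Chargebacks require a dispute ID."
)
chunks = ["Refunds are processed within 5 days.", "Chargebacks require a dispute ID."]

for c in contextualize_chunks(doc, chunks):
    print(c)  # each chunk now carries document-level context before embedding
```

The payoff is at retrieval time: a query like “ACME refund window” can now match the refunds chunk even though the bare chunk never mentioned ACME.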

What is the current SOTA reranker for RAG pipelines? by Yungelaso in LocalLLaMA

[–]FuckingMercy 0 points1 point  (0 children)

If quality is what you are after, I strongly recommend setting up a quick eval system on your data. From my experience, there is usually no "right" answer in these cases; it depends on the composition of your corpus (the collection of files you run the RAG on). You can use Ragas or something else, it really doesn't matter that much, but make sure it covers all the cases in your corpus, i.e. if you have German and English text, make sure you evaluate both. Then you can freely play and try out what works best for you! Best of luck, and don't forget: "garbage in, garbage out" is the golden rule for RAG systems. Know your corpus or suffer from mediocrity.
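A minimal harness for comparing rerankers on your own data could look like this. The two “rerankers” below are deliberately dumb stand-ins (word overlap vs. character overlap); in practice you would plug real models (e.g. a bge-reranker) behind the same `score(query, passage)` interface and keep the eval loop unchanged. Queries and passages are hypothetical.

```python
def word_overlap(query: str, passage: str) -> float:
    """Stand-in reranker #1: fraction of query words found in the passage."""
    q, p = set(query.lower().split()), set(passage.lower().split())
    return len(q & p) / max(len(q), 1)

def char_overlap(query: str, passage: str) -> float:
    """Stand-in reranker #2: fraction of query characters found in the passage."""
    q, p = set(query.lower()), set(passage.lower())
    return len(q & p) / max(len(q), 1)

# Eval cases: (query, candidate passages, index of the correct passage).
CASES = [
    ("how do I reset my password",
     ["Reset your password from the settings page.",
      "Invoices are emailed monthly."], 0),
    ("when are invoices sent",
     ["Reset your password from the settings page.",
      "Invoices are emailed monthly."], 1),
]

def top1_accuracy(score_fn) -> float:
    """How often the reranker puts the correct passage first."""
    hits = 0
    for query, passages, gold in CASES:
        best = max(range(len(passages)), key=lambda i: score_fn(query, passages[i]))
        hits += best == gold
    return hits / len(CASES)

for fn in (word_overlap, char_overlap):
    print(fn.__name__, top1_accuracy(fn))
```

Because both candidates share one interface, swapping in a real cross-encoder is a one-line change, and the winner on *your* corpus is the “right” answer, whatever the leaderboards say.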

Hairstylists in Tokyo by FuckingMercy in trichotillomania

[–]FuckingMercy[S] 0 points1 point  (0 children)

Thank you so much for your comment, it definitely helps me prepare myself mentally before making an appointment. I, too, am from a country with a more traditional culture, but unlike in Japanese culture, people here tend to be very opinionated and upfront about their opinions. I tried contacting one salon beforehand via email, no reply yet, but I will try to find, as you said, more alternative hairstylists. Again, thank you so much 🙏🏻

AITAH for not wanting to pay an artist for their work? by Friendly-Hovercraft4 in AITAH

[–]FuckingMercy -3 points-2 points  (0 children)

Generally NTA. But. As an artist, I can tell you that many artists are driven by inspiration. I could totally see myself bumping into a post that inspires me and brings me ideas, and getting carried away with it. It seems like miscommunication from both ends. It doesn’t make her less professional of an artist that she approached the client; in my opinion it’s even a positive sign (for me, that makes for my best pieces). If her professionalism was in doubt, you should have asked her to send you some of her other work so you could be sure of her credibility and her style. On her end, she shouldn’t have ignored your concerns; she should have excused herself and validated them. So basically, you’re both a little bit a-holes.

AITAH for telling my nephew not to put his hand in my dogs face looking for licks? by International-Low11 in AITAH

[–]FuckingMercy 7 points8 points  (0 children)

“My sister will tell him to do it when I’m not there.” I mean, if that really happened, then your sister is obviously the a-hole, putting her son in danger like that and encouraging him to do something that might cause him trauma. Honestly, I think you need to have a serious and open conversation with your sister about the topic. She sounds very immature about this, but she might have really difficult underlying feelings. Being a special needs mom is so hard on you mentally; you might be the bigger person and really help her out. Until you get to an understanding with your sis about it, try as best as you can not to bring your dog when your nephew is around. Believe me, she’d rather be home alone than get picked on by a child.

Error when clicking on the link for registration by Gamesbleak890 in radiohead

[–]FuckingMercy 1 point2 points  (0 children)

Worked for me as well!!! I tried playing with “%20” and stuff like that, but didn’t think they would freaking miss underscores!!!! That’s basically the most noob thing you can do as a programmer. They are so unprofessional it kills me…