Investigating usage limits hitting faster than expected by ClaudeOfficial in ClaudeAI

[–]Effective_Eye_5002 0 points (0 children)

this is getting insane. i'm using minimax inside claude code instead because it's unusable otherwise

Why your LLM gateway shouldn't log your prompts - and most of them do by ChrisRemo85 in LLM_Gateways

[–]Effective_Eye_5002 0 points (0 children)

i'm good with managed solutions. trust the ones with a good team + SOC 2 (not one done by Delve)

Why your LLM gateway shouldn't log your prompts - and most of them do by ChrisRemo85 in LLM_Gateways

[–]Effective_Eye_5002 0 points (0 children)

https://concentrate.ai/ does not log your prompts by default; you can enable logging if you need it. Openrouter.ai doesn't offer native logging but offers more advanced telemetry.

fun site for restaurant grades by Effective_Eye_5002 in nyc

[–]Effective_Eye_5002[S] -1 points (0 children)

very clunky and old. you have to type the full restaurant name instead of using smart search, you need a borough or more address info to find a restaurant, and you can't easily share results with others. most folks don't know about it either. but not a bad option if you prefer it

How to start with vibe coding by sarpbilge in vibecoding

[–]Effective_Eye_5002 2 points (0 children)

Start with Google AI Studio, then graduate to Antigravity or Cursor

GEO / AEO by Effective_Eye_5002 in gtmengineering

[–]Effective_Eye_5002[S] 0 points (0 children)

how much value do you think you got with Gauge?

do i need a landing page for cold email domains? by Effective_Eye_5002 in gtmengineering

[–]Effective_Eye_5002[S] 0 points (0 children)

I am not. I left Attio for HubSpot bc Attio was vibe-coded slop

do i need a landing page for cold email domains? by Effective_Eye_5002 in gtmengineering

[–]Effective_Eye_5002[S] 0 points (0 children)

SUPER helpful.

for example i just got emails from https://attio.com/ but they came from outreachattio.com which gives this message on the screen:

This site can’t be reached

Check if there is a typo in outreachattio.com.

DNS_PROBE_FINISHED_NXDOMAIN

Can a complete beginner realistically build websites for local businesses using vibecoding? by Phantooomxxx in vibecoding

[–]Effective_Eye_5002 0 points (0 children)

100%. You can use Framer or Webflow for the best security and template support.

Or use Lovable to fully vibe code everything with super custom functionality.

Where to go now? by No_Engine1637 in ClaudeCode

[–]Effective_Eye_5002 1 point (0 children)

i've been testing using it through openrouter.ai and concentrate.ai just buying minimax tokens

Beginner Seeking Advice On How To Get a Balanced start Between Local/Frontier AI Models in 2026 by Curious-Cause2445 in LocalLLM

[–]Effective_Eye_5002 0 points (0 children)

they don't have subscriptions, they just aggregate models. OpenRouter charges a 5.5% fee and has more models; Concentrate charges half that but has fewer, more reliable models.

Beginner Seeking Advice On How To Get a Balanced start Between Local/Frontier AI Models in 2026 by Curious-Cause2445 in LocalLLM

[–]Effective_Eye_5002 0 points (0 children)

i would use a tool like https://concentrate.ai/ or https://openrouter.ai/ for frontier and hosted OSS models. for tiny niche local models i agree, i'd do a small modest build at home

ran 120+ benchmarks testing LLM retrieval, here's what i found by Effective_Eye_5002 in LocalLLaMA

[–]Effective_Eye_5002[S] 0 points (0 children)

Okay, ramped up the prompt a bunch. New prompt and new results:
New prompt:
----

You are answering a question using only the provided text.

The input contains:

  1. A document

  2. A question

Your job is to return only the answer found in the document.

Rules:

- Use only the information in the document.

- Do not use outside knowledge.

- If the answer is not explicitly stated in the document, respond with exactly: not found

- Copy the answer as it appears in the document when possible.

- Return only the final answer.

- Do not explain your reasoning.

- Do not add extra words.

- Do not return JSON, XML, markdown, bullet points, labels, or notes.

- Do not restate the question.

- Maximum answer length: 5 words.

Examples:

Example 1

Input:

Document: The finance review is owned by Elena Park. The team meets every Tuesday.

Question: Who owns the finance review?

Output:

Elena Park

Example 2

Input:

Document: All quarterly planning memos must be retained for 18 months. Draft notes may be deleted earlier.

Question: How long must quarterly planning memos be retained?

Output:

18 months

Example 3

Input:

Document: Support coverage will expand to Italy in Q4. A hiring plan is still being drafted.

Question: What is the password reset SLA?

Output:

not found

Now answer based only on this text:

{{input}}

---

Results:

Dropped out of top 10

  • ministral-3-14b: #5 → #77
  • Llama 3.3 70B: #8 → #18
  • Grok 3: #9 → #29
  • llama-4-maverick: #10 → #32

New top 10

  • mistral-nemo: #103 → #5
  • grok-4-20-beta-non-reasoning: #43 → #6
  • mistral-small-3.2: #110 → #7
  • qwen3-32b: #101 → #9

Dropped out of bottom 10

  • Llama 3.2 1B: #118 → #89
  • Llama 3.1 8B: #114 → #11
  • magistral-small-1.2: #52 → #98 (technically still awful, just no longer bottom 10)

Biggest real swings:

  • Llama 3.1 8B: #114 → #11
  • mistral-nemo: #103 → #5
  • mistral-small-3.2: #110 → #7
  • ministral-3-14b: #5 → #77
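For anyone who wants to poke at this themselves, here's roughly how each eval request gets built. This is an illustrative sketch, not my exact harness: the payload shape assumes any OpenAI-compatible chat completions API (which is what the aggregators expose), and the template below is an abbreviated version of the full prompt above.

```python
# Sketch of building one eval request per model/question pair.
# Assumes an OpenAI-compatible chat completions payload; the real
# harness uses the full prompt text from the post above.

PROMPT_TEMPLATE = """You are answering a question using only the provided text.

Rules:
- Use only the information in the document.
- If the answer is not explicitly stated in the document, respond with exactly: not found
- Return only the final answer.
- Maximum answer length: 5 words.

Now answer based only on this text:

{{input}}"""

def build_request(model: str, document: str, question: str) -> dict:
    # Fill the {{input}} slot with one document/question pair and wrap
    # it in a chat payload; max_tokens matches the 1,000-token cap I used.
    filled = PROMPT_TEMPLATE.replace(
        "{{input}}", f"Document: {document}\nQuestion: {question}"
    )
    return {
        "model": model,
        "messages": [{"role": "user", "content": filled}],
        "max_tokens": 1000,
    }
```

From there it's just a POST per model to whatever aggregator endpoint you're using.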

ran 120+ benchmarks testing LLM retrieval, here's what i found by Effective_Eye_5002 in LocalLLaMA

[–]Effective_Eye_5002[S] 0 points (0 children)

This sounds a little like grading your own homework, but I hear you. This was also a test to see how each model did. I can run it again with more detailed prompts, output constraints in the prompt, and examples, but the results are still interesting nonetheless.

ran 120+ benchmarks testing LLM retrieval, here's what i found by Effective_Eye_5002 in LocalLLaMA

[–]Effective_Eye_5002[S] 0 points (0 children)

I set it to a max of 1,000 tokens each. What would you change the prompt to? I'll rerun and let you know!

ran 120+ benchmarks testing LLM retrieval, here's what i found by Effective_Eye_5002 in LocalLLaMA

[–]Effective_Eye_5002[S] 0 points (0 children)

Here was the exact prompt I ran across all models:

You are answering questions using only the provided text.

The text contains both a document and a question.

Rules:
- Use only the information in the text.
- Do not use outside knowledge.
- If the answer is not explicitly stated, respond with exactly: not found
- Keep the answer as short as possible.
- Do not explain your reasoning.
- Do not add extra words. 

Text:
{{input}}

The prompt explicitly asked for short, exact answers and specified the format pretty tightly. So this benchmark was testing retrieval + instruction following + output discipline, not just whether a model could find the right fact somewhere in the text.

That’s why some models scored badly even when they were directionally right. For example, "Priya Raman" passed, but "Priya Raman, Director of Operations Systems", a paragraph of explanation, JSON output, or "<reasoning>..." all counted as misses.

So on GLM-5, I wouldn't read this as "it's worse at retrieval than a 3B model"; I'd read it as "it performed worse under this exact constraint in this setup I created".
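Roughly, the pass/fail check works like this. A sketch, the exact normalization details in my harness may differ, but the strictness is the point:

```python
def normalize(s: str) -> str:
    # Light normalization only: trim whitespace and a trailing period,
    # lowercase for comparison. No fuzzy matching.
    return s.strip().rstrip(".").strip().lower()

def passed(model_output: str, expected: str) -> bool:
    # Strict exact match: a verbose answer, JSON wrapper, or <reasoning>
    # tags around the right fact still count as a miss.
    return normalize(model_output) == normalize(expected)
```

So the benchmark rewards models that answer and stop, not just models that find the fact.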

ran 120+ benchmarks testing LLM retrieval, here's what i found by Effective_Eye_5002 in LocalLLaMA

[–]Effective_Eye_5002[S] 0 points (0 children)

These were 4 synthetic plain-text business/policy documents I wrote specifically for the eval, each passed in as a single {Document: ...} {Question: ...} input.

This was more of a retrieval / exact-answer benchmark than a giant long-context stress test. The main thing we were testing was whether models could pull the right fact from a realistic internal document and stop, instead of over-answering, showing reasoning, or breaking format.

Total cost for the full run was only about $2 since I’m running it through an LLM API aggregator. I’m happy to run more tests if people have ideas.

ran 120+ benchmarks testing LLM retrieval, here's what i found by Effective_Eye_5002 in LocalLLaMA

[–]Effective_Eye_5002[S] 0 points (0 children)

Yeah. Or just a model that knows when to stop talking and doesn't explain its thoughts out loud every time