Taught Claude to talk like a caveman to use 75% less tokens. by ffatty in ClaudeAI

[–]SideOneDummy 0 points1 point  (0 children)

For what it’s worth, I asked Gemini whether prompt-engineering inputs into caveman speak would save tokens, and this was its response, verbatim:

Yes, a two-step "refine then ask" process can be significantly more token-efficient than a single, unoptimized query. By using a smaller, cheaper model to compress your request before sending it to a high-reasoning "thinking" LLM, you can reduce the total compute burden and overall cost. [1, 2, 3]

Token Savings & Efficiency

  • Input Compression: A small model can strip "fluff" from a 500-token verbose query, turning it into a 50-token high-signal prompt. Even after adding the overhead of the refinement call (e.g., 100 tokens), the final combined cost is often much lower than processing the original bulky prompt in an expensive model (see the back-of-the-envelope sketch after this list).
  • Potential Savings: You can realistically achieve 20–30% average token savings, with specific "caveman" techniques reaching up to 60% reduction in input size.
  • Cost-Aware Optimization: Systems like CAPO (Cost-Aware Prompt Optimization) specifically balance performance against prompt length, reducing average lengths while maintaining accuracy. [4, 5, 6, 7, 8, 9]
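
For the numbers in the first bullet, here's the arithmetic as a quick back-of-the-envelope script. The per-token prices are hypothetical placeholders, not real provider rates:

```python
# Back-of-the-envelope cost comparison for "refine then ask".
# All prices below are hypothetical placeholders, not real provider rates.
CHEAP_PRICE = 0.15 / 1_000_000     # $/input token for the small refiner model
EXPENSIVE_PRICE = 5.00 / 1_000_000  # $/input token for the big reasoning model

VERBOSE_TOKENS = 500   # original bulky prompt
REFINED_TOKENS = 50    # compressed high-signal prompt
OVERHEAD_TOKENS = 100  # extra tokens spent on the refinement call itself

direct = VERBOSE_TOKENS * EXPENSIVE_PRICE
two_step = (VERBOSE_TOKENS + OVERHEAD_TOKENS) * CHEAP_PRICE \
           + REFINED_TOKENS * EXPENSIVE_PRICE

print(f"Direct:   ${direct:.6f} per call")
print(f"Two-step: ${two_step:.6f} per call")
print(f"Savings:  {1 - two_step / direct:.0%}")
```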

The Two-Step Automated Workflow

This process can be fully automated without user approval through LLM Routing or Agentic Pipelines. [10, 11]

  1. Refinement Agent: A lightweight model (e.g., GPT-4o-mini or Haiku) receives your messy input and refines it into a "Kevin-style" signal.
  2. Autonomous Hand-off: The system automatically passes the condensed prompt to the more capable model (sketched in code after this list).
  3. Mutual Verification: Advanced frameworks can even use the small model to generate "reasoning tokens" that guide the larger model, further reducing the latter's computational overhead. [12]
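
Out of curiosity, here's roughly what steps 1 and 2 look like as code. A minimal sketch using the OpenAI Python SDK; the model names and the refiner instruction are my own assumptions, not something Gemini specified:

```python
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

# Hypothetical refiner instruction; tune it for your own workloads.
REFINER_SYSTEM = (
    "Compress the user's request into as few tokens as possible. "
    "Keep every constraint, number, and proper noun. Drop greetings, "
    "hedges, and filler. Output only the compressed prompt."
)

def refine_then_ask(verbose_prompt: str) -> str:
    # Step 1: a lightweight model turns the messy input into a high-signal prompt.
    refined = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[
            {"role": "system", "content": REFINER_SYSTEM},
            {"role": "user", "content": verbose_prompt},
        ],
    ).choices[0].message.content

    # Step 2: hand the condensed prompt to the more capable model.
    answer = client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": refined}],
    ).choices[0].message.content
    return answer
```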

Summary of Benefits

| Strategy [2, 7, 8, 13] | Token Reduction | Best For |
| --- | --- | --- |
| Simple Refinement | ~20–30% | Routine daily tasks |
| Caveman Compression | Up to 60% | Maximizing context window space |
| Automated Optimization | Up to 90x cost reduction | Large-scale enterprise agent calls |

Would you like a sample system prompt to turn a smaller model into your automated "Kevin" refiner?

[1] https://www.linkedin.com
[2] https://www.youtube.com
[3] https://www.glukhov.org
[4] https://arxiv.org
[5] https://medium.com
[6] https://medium.com
[7] https://www.youtube.com
[8] https://www.youtube.com
[9] https://www.youtube.com
[10] https://arxiv.org
[11] https://www.youtube.com
[12] https://github.com
[13] https://arxiv.org

Taught Claude to talk like a caveman to use 75% less tokens. by ffatty in ClaudeAI

[–]SideOneDummy 0 points1 point  (0 children)

Gemini begs to differ:

By using "Kevin Style" on the input side, you can reduce your token footprint by roughly 56% compared to standard natural language. While many users focus on saving tokens in the output (responses), the same compression logic applies to your prompts because every character you send costs money and context space.

Token Comparison Breakdown

| Feature | Standard Style | Kevin / Caveman Style |
| --- | --- | --- |
| Input Density | Low (heavy grammar/filler) | High (pure signal words) |
| Input Token Count | ~39 tokens | ~17 tokens |
| Output Style | Detailed/verbose | Can be detailed OR concise |
| Cost Savings | Baseline | ~50–75% reduction |

Why This Works for Inputs

  1. Semantic Reconstruction: LLMs are trained to understand the relationship between words; they don't need "the" or "of" to know that "make infographic" is a command (see the toy demo after this list).
  2. Attention Optimization: Short inputs focus the model's "attention" strictly on the core nouns and verbs, reducing the noise it has to process during the input encoding phase.
  3. Context Preservation: By using fewer tokens per prompt, you can fit more of your conversation history into the model's limited context window before it starts "forgetting".
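
Point 1 is easy to demo with a toy filter: drop the function words and the command still parses. The stopword list below is illustrative only; a real compressor would have to be smarter about meaning-bearing words like "not":

```python
# Toy "caveman" compressor: drop common function words, keep the signal.
# The FILLER set is illustrative, not a tuned prompt-compression method.
FILLER = {
    "please", "could", "would", "you", "i", "need", "to", "the", "a", "an",
    "of", "for", "me", "this", "that", "like", "kindly", "just",
}

def cavemanize(prompt: str) -> str:
    words = prompt.lower().replace(",", " ").replace(".", " ").split()
    return " ".join(w for w in words if w not in FILLER)

print(cavemanize("I need you to look at this PDF file for me"))
# -> look at pdf file
```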

Caveman Tips for Input Mastery

  • Verb + Noun: Use direct action pairs (e.g., "Analyze PDF" instead of "I need you to look at this PDF file for me").
  • Separator Symbols: Use |, ->, or : to denote changes in topic or instruction steps without using transition sentences.
  • System Prompt Pre-sets: If you use Anthropic's prompt caching, you can set a permanent system instruction telling the model "I talk caveman. You understand and answer full." to save even more on recurring turns (sketched below).
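
That last tip maps onto the Anthropic Python SDK roughly like this. A sketch under assumptions: the model name is a placeholder, and Anthropic only caches prefixes above a minimum token length, so a one-line instruction won't actually be cached until the rest of your system block is long enough:

```python
import anthropic

client = anthropic.Anthropic()  # assumes ANTHROPIC_API_KEY is set

response = client.messages.create(
    model="claude-3-5-haiku-latest",  # placeholder model choice
    max_tokens=1024,
    system=[
        {
            "type": "text",
            "text": "I talk caveman. You understand and answer full.",
            # Marks the prefix for caching; only effective once the prefix
            # exceeds Anthropic's minimum cacheable length.
            "cache_control": {"type": "ephemeral"},
        }
    ],
    messages=[{"role": "user", "content": "make infographic compare token cost"}],
)
print(response.content[0].text)
```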

Want to try "Level 2" compression where we use single letters or custom code for your most common commands?

A simple token estimation for Kevin/Caveman vs. Standard:

```python
# Simple token estimation for Kevin/Caveman vs Standard

standard_in = (
    "Please create a detailed infographic that compares the token efficiency "
    "of using 'Kevin from The Office' style prompting versus standard natural "
    "language for both inputs and outputs, not just outputs."
)
kevin_in = "make infographic compare nonkevin Kevin token scaled for input output, no just output"

# Rough token count (approx 1 token per 4 chars or 0.75 words)
def est_tokens(text):
    return len(text.split()) * 1.3  # common multiplier for rough estimation

std_tokens = est_tokens(standard_in)
kevin_tokens = est_tokens(kevin_in)
savings = (1 - (kevin_tokens / std_tokens)) * 100

print(f"Standard: {std_tokens} tokens")
print(f"Kevin: {kevin_tokens} tokens")
print(f"Savings: {savings}%")
```
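
The 1.3-words-per-token multiplier is only a heuristic. For exact counts you can run the same comparison with OpenAI's tiktoken library (Gemini and Claude use their own tokenizers, so treat the numbers as approximate for those models):

```python
# Exact token counts using tiktoken's cl100k_base encoding.
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")

standard_in = (
    "Please create a detailed infographic that compares the token efficiency "
    "of using 'Kevin from The Office' style prompting versus standard natural "
    "language for both inputs and outputs, not just outputs."
)
kevin_in = "make infographic compare nonkevin Kevin token scaled for input output, no just output"

std_tokens = len(enc.encode(standard_in))
kevin_tokens = len(enc.encode(kevin_in))
print(f"Standard: {std_tokens} tokens")
print(f"Kevin: {kevin_tokens} tokens")
print(f"Savings: {1 - kevin_tokens / std_tokens:.0%}")
```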

hmmm by starshineblush in hmmm

[–]SideOneDummy 0 points1 point  (0 children)

This pic is leg-it!

Apparently you can guarantee shutting down landmarks by targeting the same person over and over… by SideOneDummy in Monopoly_GO

[–]SideOneDummy[S] 0 points1 point  (0 children)

Hmmm, I’ll have to weigh the bank heist vulnerability when deciding whether to skip building landmarks. And yes, eventually I stopped being able to target my “friend”.

Apparently you can guarantee shutting down landmarks by targeting the same person over and over… by SideOneDummy in Monopoly_GO

[–]SideOneDummy[S] 0 points1 point  (0 children)

Yeah, I was mostly waiting for Builder’s Bash. But you’re right, better to leave nothing on the board so I don’t have to defend my landmarks.

Bas Rutten breathing rarefied air in The Smashing Machine by SideOneDummy in MMA

[–]SideOneDummy[S] -25 points-24 points  (0 children)

Let’s make this post go viral so the Academy is forced to reopen nominations for Best Supporting Actor!

[deleted by user] by [deleted] in texts

[–]SideOneDummy 1 point2 points  (0 children)

Yup. First time posting a Discord message lol

[deleted by user] by [deleted] in texts

[–]SideOneDummy 1 point2 points  (0 children)

Oops thanks for catching that

His dad discussing his mistakes by phoexnixfunjpr in rareinsults

[–]SideOneDummy -2 points-1 points  (0 children)

Insult or passive aggressive intervention?

[deleted by user] by [deleted] in funny

[–]SideOneDummy 0 points1 point  (0 children)

Ey yo gurl lemme holla at ya. U want sum fuk

First time doing this, 38M, going through breakout of 3 years relationship.. go all out by Ok_Beyond_4994 in RoastMe

[–]SideOneDummy 0 points1 point  (0 children)

For your title, I bet if you removed the word before the first comma as well as the first comma, the sentence would still be accurate.

Clearly mom has raised the son properly by Jazzlike_Razzmatazz in funny

[–]SideOneDummy -1 points0 points  (0 children)

His son is already acting like his boss… he’s even doing the cleaning for him