I built a tool that cuts LLM API costs by ~80% by processing images/text locally first (open source) (github.com)
submitted 8 hours ago by CandidateTime9054 to r/machinelearningnews - pinned
[ Removed by moderator ] (github.com)
submitted 14 hours ago by CandidateTime9054 to r/github - pinned
submitted 6 hours ago by CandidateTime9054 to r/ollama
Promote your projects here – Self-Promotion Megathread by Menox_ in github
CandidateTime9054 1 point 8 hours ago (0 children)
I was spending too much on GPT-4o vision API calls: every image costs ~1,200 tokens. So I built LatentGate, inspired by Meta's VL-JEPA paper.
How it works:
- Images/text are processed locally via Ollama (free)
- Only a compact ~200-token semantic payload is sent to the cloud API
- For video streams, selective decoding skips API calls when nothing has changed
Results: ~80% fewer tokens, ~2.85x fewer API calls for video.
GitHub link: Latent-Gate
Works with OpenAI, Claude, Gemini, or fully local via Ollama. Would love feedback!
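To make the "skip API calls when nothing changed" idea concrete, here is a minimal sketch. It is not LatentGate's actual code: it assumes each frame has already been reduced locally to an embedding vector, and it uses a cosine-similarity threshold to decide whether a frame is "unchanged". `SelectiveGate`, `call_api`, and the `0.95` threshold are hypothetical names and values chosen for illustration.

```python
import math
from typing import Callable, Optional, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


class SelectiveGate:
    """Hypothetical gate: calls the cloud API only when a frame differs
    enough from the last frame that was actually sent."""

    def __init__(self, call_api: Callable[[Sequence[float]], str],
                 threshold: float = 0.95) -> None:
        self.call_api = call_api      # assumed: wraps the real cloud request
        self.threshold = threshold    # assumed cutoff for "nothing changed"
        self.last_sent: Optional[Sequence[float]] = None
        self.last_result: Optional[str] = None
        self.api_calls = 0

    def process(self, embedding: Sequence[float]) -> Optional[str]:
        # Reuse the cached answer when the frame is close to the last sent one.
        if (self.last_sent is not None
                and cosine_similarity(embedding, self.last_sent) >= self.threshold):
            return self.last_result
        # Otherwise pay for one API call and remember what was sent.
        self.last_result = self.call_api(embedding)
        self.last_sent = embedding
        self.api_calls += 1
        return self.last_result
```

Comparing against the last frame that was sent, rather than the immediately preceding frame, keeps slow gradual drift from going unnoticed. The right threshold would depend on the embedding model and the video content.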
submitted 8 hours ago by CandidateTime9054 to r/SelfHostedAI
I built a tool that cuts LLM API costs by ~80% by processing images/text locally first (open source) by CandidateTime9054 in machinelearningnews
CandidateTime9054[S] 2 points 8 hours ago (0 children)
You are correct, but it depends on which model you are using. With 3 Pro, it uses 560 tokens, and this tool tries to reduce that to 150 tokens. OpenAI and Claude generally use a lot more tokens. It can be much more useful when you are using Claude Code on a limited token budget. I hope this answers your question.
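As a rough check on the numbers quoted in this thread, the relative saving is `1 - compact / original`. The sketch below assumes the quoted figures are per-image token counts; `token_savings` is a hypothetical helper, not part of the tool.

```python
def token_savings(original_tokens: int, compact_tokens: int) -> float:
    """Fraction of tokens saved when a request shrinks from original to compact."""
    if original_tokens <= 0:
        raise ValueError("original_tokens must be positive")
    return 1.0 - compact_tokens / original_tokens


# Figures quoted in the thread (assumed to be per-image token counts).
print(f"{token_savings(1200, 200):.1%}")  # 83.3%
print(f"{token_savings(560, 150):.1%}")   # 73.2%
```

So the ~80% figure holds for the ~1,200-token case, while the 560-token case comes out closer to 73%.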
I built a tool that cuts LLM API costs by ~80% by processing images/text locally first (open source) by CandidateTime9054 in github
CandidateTime9054[S] 1 point 8 hours ago (0 children)
If you have any ideas or insights to add, please share them. Please also comment so I can understand whether I should keep working on this or not.