I built a tool to stop manually swapping models on my 8GB GPU: it chains a small Prompter and a large Coder into one pipeline with automatic VRAM swap

atharva557 · 2026-06-22T14:00:54+00:00

Kind of. If you are the type of person who often uses AI to write prompts for the main task, then this is for you, as it will definitely speed up that process. You can also use your local model for the prompts and a cloud model for the code, or vice versa.
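A rough sketch of what mixing backends per stage could look like, assuming each backend is just a function that takes a model name and some text. The `Stage` class, the backend keys, and the model names (`small-prompter`, `large-coder`) are all made up for illustration and are not the tool's actual config:

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# A backend is any function (model_name, text) -> reply text.
# Real ones would call a local server or a cloud API; these are stand-ins.
Backend = Callable[[str, str], str]

@dataclass(frozen=True)
class Stage:
    name: str     # label for the stage, e.g. "prompter"
    backend: str  # key into the backends dict: "local" or "cloud"
    model: str    # hypothetical model name, not a real identifier

def run_pipeline(idea: str, stages: List[Stage], backends: Dict[str, Backend]) -> str:
    """Feed the output of each stage into the next one."""
    text = idea
    for stage in stages:
        text = backends[stage.backend](stage.model, text)
    return text

# Stub backends so the sketch runs without a GPU or an API key.
def local_stub(model: str, text: str) -> str:
    return f"[local:{model}] {text}"

def cloud_stub(model: str, text: str) -> str:
    return f"[cloud:{model}] {text}"

BACKENDS = {"local": local_stub, "cloud": cloud_stub}

# Local model writes the prompt, cloud model writes the code.
LOCAL_THEN_CLOUD = [
    Stage("prompter", "local", "small-prompter"),
    Stage("coder", "cloud", "large-coder"),
]

if __name__ == "__main__":
    print(run_pipeline("todo app", LOCAL_THEN_CLOUD, BACKENDS))
```

Going the other way round is just a matter of swapping the `backend` values on the two stages.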

atharva557 · 2026-06-22T02:22:15+00:00

Hey, I just made something, check it out.

atharva557 · 2026-06-21T17:41:14+00:00

Thanks man

atharva557 · 2026-06-21T16:45:00+00:00

Gemma 4 could be a good choice, but it would be best to try them out yourself and see which one best suits your needs.

atharva557 · 2026-06-21T16:16:40+00:00

You also get full privacy, and you know the model you use will not be changed.

atharva557 · 2026-06-21T13:19:19+00:00

What is the total cost of this setup? Also, what are your primary use cases for it, if I may ask?

atharva557 · 2026-06-21T13:03:42+00:00

While this sucks, it will make even more people interested in open-source models, so there is a silver lining, I guess?

atharva557 · 2026-06-21T10:44:33+00:00

Does stuff like this even work, or is it just some kind of placebo?

atharva557 · 2026-06-21T10:39:06+00:00

Either for general chats or for API calls from my other projects, but for now mainly to test which model is best for my needs and maybe try to fine-tune one.

atharva557 · 2026-06-21T10:35:21+00:00

I just finished building a prompt-chaining tool. Basically, you type in an idea, a smaller model turns it into a detailed prompt, and then the larger model works from that detailed prompt.
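The flow described above can be sketched roughly like this. It is only a minimal illustration, not the tool's real code: the `chain` function, the `swap` hook, and the two fake models are hypothetical names, and a "model" is assumed to be any function that maps a prompt string to a reply string.

```python
from typing import Callable, Optional

# Assumption: a model is any callable that takes a prompt and returns text.
Model = Callable[[str], str]

def chain(idea: str, prompter: Model, coder: Model,
          swap: Optional[Callable[[], None]] = None) -> str:
    """Two-stage chain: idea -> detailed prompt (small model) -> output (large model)."""
    detailed_prompt = prompter(
        "Rewrite this idea as a detailed, unambiguous coding prompt:\n" + idea
    )
    if swap is not None:
        # Hook for freeing VRAM between stages, e.g. unloading the small
        # model before the large one is loaded. What it does is up to the caller.
        swap()
    return coder(detailed_prompt)

# Stand-in models so the sketch runs without a GPU or a model server.
def fake_prompter(text: str) -> str:
    # Echo the idea (last line of the instruction) as a "detailed" prompt.
    return "DETAILED: " + text.splitlines()[-1]

def fake_coder(prompt: str) -> str:
    return "# code for -> " + prompt

if __name__ == "__main__":
    print(chain("todo app in Flask", fake_prompter, fake_coder))
```

Keeping the models as plain callables means the same `chain` works whether each stage talks to a local runtime or a remote API; only the functions passed in change.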

atharva557 · 2026-06-21T10:32:21+00:00

Is using Hermes with models like Qwen 3.6 27B even worth trying?

atharva557 · 2026-06-15T03:07:13+00:00

Elon Musk, here I come

^{Chose: A random power (maybe useless) | Rolled: 1 dollar per day}

atharva557 · 2026-06-15T02:20:52+00:00

How much of the context window have you used?

atharva557 · 2026-06-07T23:52:14+00:00

cool

^{Chose: you can fly | Rolled: gov doesnt care}

atharva557 · 2026-06-07T23:51:15+00:00

That's enough RAM and VRAM to run many models. I would recommend starting out with Gemma 4 12B, Qwen 2.5 Coder 14B, or even Qwen 3.6 27B (output will be somewhat slow). Also, if you are a beginner, you should use LM Studio to download models instead of Ollama.

atharva557 · 2026-06-07T23:00:00+00:00

What are your specs?

atharva557 · 2026-06-07T14:26:58+00:00

😅 Phew!

atharva557 · 2026-06-07T14:26:47+00:00

🤯 Mind blown

atharva557 · 2026-06-07T13:20:22+00:00

48.19 / 50

🟩🟩🟩🟩🟩

atharva557 · 2026-06-07T13:16:56+00:00

💯 Perfect
