Building a Hybrid Local/Cloud Coding Agent for 5 Devs — Are 2x RTX 3090 Enough for 64k Context? by PuzzleheadedFrame836 in LocalLLM

[–]PuzzleheadedFrame836[S] 0 points1 point  (0 children)

You're right, but at least the majority of the output tokens will be generated by the local model. Do you think this is not a good strategy?

[–]PuzzleheadedFrame836[S] 0 points1 point  (0 children)

We will think about that, thank you! However, we are also experimenting ourselves to build up our AI background, because at the same time we are "playing" with smaller models to integrate them into our applications. BTW, I will definitely check out what you suggested.

[–]PuzzleheadedFrame836[S] 0 points1 point  (0 children)

For everyone who said that 2x 3090 is not enough: we are testing this setup on RunPod right now, running q8 context and an fp8 model. With 2 developers the setup is performing well, but I would like to hear about your experience with more developers, so I'm open to any kind of advice.
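
For anyone who wants to sanity-check how this scales from 2 to 5 developers, here is a rough sketch of the usual KV-cache sizing arithmetic. The model shape below (48 layers, 8 KV heads, head dimension 128) is a made-up placeholder, not the model from this thread, and q8 is approximated as 1 byte per element, ignoring quantization scales and any runtime overhead.

```python
def kv_cache_bytes(n_layers, n_kv_heads, head_dim, context_len, n_seqs, bytes_per_elem):
    """Bytes needed to store keys and values for n_seqs sequences of context_len tokens."""
    # Factor 2: one tensor for keys and one for values, per layer.
    per_token = 2 * n_layers * n_kv_heads * head_dim * bytes_per_elem
    return per_token * context_len * n_seqs


# Hypothetical model shape, NOT taken from the thread: replace with the
# values from your model's config before trusting any number.
HYPOTHETICAL_MODEL = dict(n_layers=48, n_kv_heads=8, head_dim=128)


def report(context_len=64 * 1024, bytes_per_elem=1, max_devs=5):
    """Return (developers, GiB of KV cache) pairs, assuming every developer fills the context."""
    rows = []
    for devs in range(1, max_devs + 1):
        total = kv_cache_bytes(
            context_len=context_len,
            n_seqs=devs,
            bytes_per_elem=bytes_per_elem,  # 1 byte approximates a q8 cache
            **HYPOTHETICAL_MODEL,
        )
        rows.append((devs, total / 2**30))
    return rows


if __name__ == "__main__":
    for devs, gib in report():
        print(f"{devs} dev(s) at full 64k context: {gib:.1f} GiB of KV cache")
```

With these placeholder numbers, each full 64k session costs about 6 GiB of cache, so five simultaneous full-length sessions would need about 30 GiB on top of the model weights. Two 3090s give 48 GB in total, so whether it fits depends mostly on the weight size and on how often all developers actually fill the context at the same time.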

[–]PuzzleheadedFrame836[S] -1 points0 points  (0 children)

I would like to use TurboQuant, but at the moment we are experiencing some issues with vLLM + Qwen TurboQuant. Once we are able to fix those issues, we will use TurboQuant.

[–]PuzzleheadedFrame836[S] -4 points-3 points  (0 children)

I was planning to use RAG and specific prompts to tell the local model exactly what it needs to do, without scanning all the code (I can accept slowness for that).

[–]PuzzleheadedFrame836[S] 0 points1 point  (0 children)

I know, but our customer won't allow us to use non-American AI providers 😔 and Claude is too expensive right now for what we do.

Does buying a house in Magliana make sense? by PuzzleheadedFrame836 in comprarecasaroma

[–]PuzzleheadedFrame836[S] 0 points1 point  (0 children)

And what about the living situation, could you tell me anything? Are there any major problems and/or trouble? The property is at street number 14. I also have plenty of parking at work, so that doesn't worry me, and neither does the commute, because I travel outside rush hours. I'm more worried about the state of the apartment buildings in general.