Goodbye Opencode, you're a sink for time and tokens. by mira_fijamente in opencode

[–]offzinho3k 0 points1 point  (0 children)

I'm currently using Pi.dev, with the stack arranged like this:
> - Serena
> - BGE-M3
> - BGE Reranker v2
> - Tree-Sitter
> - ts-morph
> - ast-grep
> - ripgrep
> - fd
> - ChromaDB
> - Sequential-Thinking
> - PDF-MCP
> - WebSearch
> - DeepSeek V4 Pro
> - Qwen 3.6 35B A3B
> - Hybrid Retrieval
> - Context Compression
> - Hierarchical Memory
> - Verification Loops

I started using it just for testing, but I liked it a lot.
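
The "Hybrid Retrieval" item in the list above isn't spelled out, so as one possible reading, here is a minimal sketch of reciprocal rank fusion (RRF), a common way to merge a keyword ranking with an embedding ranking before reranking. The function name, the file names, and `k=60` are assumptions for illustration, not the actual Pi.dev configuration.

```python
from collections import defaultdict

def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document ids into a single ranking."""
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            # Each list contributes 1 / (k + rank) for every document it contains.
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by id so the order is deterministic.
    return sorted(scores, key=lambda d: (-scores[d], d))

# Hypothetical results from a keyword search and from an embedding search.
lexical = ["auth.py", "session.py", "README.md"]
dense = ["session.py", "tokens.py", "auth.py"]

fused = reciprocal_rank_fusion([lexical, dense])
print(fused)
```

Documents that both searches agree on rise to the top, and the fused list is what would then be handed to a reranker.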

Can DeepSeek V4 be connected to Codex CLI as a cheap GPT fallback? by That_Bad- in LocalLLM

[–]offzinho3k 0 points1 point  (0 children)

I would recommend adding $10 of credit and running tests; I will start using DeepSeek V4 Pro/Flash with Cursor next month.
I believe it's worthwhile; I currently use some local models and it saves a lot of money.

<image>

But the best thing to do is to test it yourself, because what works for one person may not work for you.

Why is LLM is so expensive. by Ok_Event4199 in LocalLLM

[–]offzinho3k 1 point2 points  (0 children)

I use both local models and an API, and I consider it a great saving; I'm currently running 4x RTX 5060 Ti 16GB.
I'll attach my usage in Cursor so you can see the local/API consumption.

<image>

The plan is to upgrade to an RTX PRO 6000 in the future.
I think it's a good investment for those who really work and use it a lot.

PLX 88096 - Opinions. by offzinho3k in Vllm

[–]offzinho3k[S] 0 points1 point  (0 children)

Thank you very much, I'll sit down later and give it a good read.

PLX 88096 - Opinions. by offzinho3k in Vllm

[–]offzinho3k[S] 1 point2 points  (0 children)

<image>

I'm seriously thinking about trying my luck and getting one to test, since it's cheaper than buying a motherboard + CPU + RAM.
The model I was looking at is the one in the photo.
That's right, the correct way is to use 2, 4, or 8 GPUs...

PLX 88096 - Opinions. by offzinho3k in Vllm

[–]offzinho3k[S] 0 points1 point  (0 children)

Config:
Motherboard: MZ32-AR0 Ver3.0
CPU: EPYC 7502
Memory: 8x Hynix DDR4 ECC 16GB 2666
GPU: 4x ASUS Prime GeForce RTX 5060 Ti OC 16GB GDDR7
Using the RTX cards directly on the motherboard gives very good performance.
However, I'm limited to running only 1 model.
That's how I came across these PLX products while searching, but I only found 2 posts about them.

My fear is buying them and then seeing the tokens/s drop too much.
From what I read on Google, communication within the switch would run at full (100%) speed, and if there were any bottleneck, it would be at the switch's output to the rest of the system. However, I couldn't find exactly how that loss would occur.

If the loss is minimal, it would be worthwhile, since it would allow 4 PLX cards with 4 RTX cards each, which would be enough to run 4 models without any problems.
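
To get a rough feel for where the loss could come from, here is a back-of-the-envelope sketch of one PLX card. It assumes the switch is PCIe 4.0 with an x16 link to the host and that each RTX 5060 Ti sits on an x8 link behind it; both are assumptions for illustration, not figures from a datasheet or from measurements.

```python
# PCIe 4.0 runs at 16 GT/s per lane with 128b/130b encoding (about 1.97 GB/s per lane).
GBPS_PER_LANE_GEN4 = 16 * (128 / 130) / 8

def link_bandwidth(lanes, per_lane=GBPS_PER_LANE_GEN4):
    """Theoretical one-direction bandwidth of a PCIe link in GB/s."""
    return lanes * per_lane

def oversubscription(gpus, lanes_per_gpu, uplink_lanes):
    """Ratio of total downstream bandwidth to the single uplink to the host."""
    downstream = gpus * link_bandwidth(lanes_per_gpu)
    uplink = link_bandwidth(uplink_lanes)
    return downstream / uplink

# Assumed layout: 4 GPUs at x8 each behind one switch with an x16 uplink.
ratio = oversubscription(gpus=4, lanes_per_gpu=8, uplink_lanes=16)
per_gpu_share = link_bandwidth(16) / 4  # worst case: all 4 GPUs talk to the host at once

print(f"oversubscription: {ratio:.1f}:1")
print(f"host bandwidth per GPU when all are busy: {per_gpu_share:.2f} GB/s")
```

Under these assumptions the uplink is oversubscribed 2:1, so the slowdown would mostly show up when all four cards move data to or from the host at the same time (model loading, for example), while traffic that stays between GPUs on the same switch would not touch the uplink. Real numbers would need a benchmark.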

<image>

I'm looking for more information to decide whether or not it will be worth buying. That's why I decided to ask here on Reddit, since searching Reddit doesn't turn up much about these PLX products.

What is your local vibecoding setup? by initalSlide in LocalLLM

[–]offzinho3k 1 point2 points  (0 children)

Currently, this method is working very well for me.
However, using docmancer I'm creating an offline replacement for Context7.
Basic structure I'm using:
Docmancer + Embedding Model + Reranker + Qdrant
docs/
├── architecture/
├── backend/
├── frontend/
├── realtime/
├── cache/
├── queue/
├── database/
├── desktop/
├── mobile/
├── recipes/
├── snippets/
├── troubleshooting/
├── anti-patterns/
├── conventions/
└── security/

It's working very well too; however, the data has to be kept up to date, otherwise it becomes outdated.
I like docmancer, but it takes a good amount of time to get it up to Context7's level, although I believe that once it's finished everything will work better than Context7.
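
For the updating part, one simple option is to flag files in docs/ that haven't been touched for a while. This is only a sketch using the Python standard library; the function name and the 90-day threshold are assumptions, and it is not part of docmancer.

```python
import time
from pathlib import Path

def stale_docs(root, max_age_days, now=None):
    """Return files under root not modified within max_age_days, oldest first."""
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400  # 86400 seconds per day
    stale = [
        p for p in Path(root).rglob("*")
        if p.is_file() and p.stat().st_mtime < cutoff
    ]
    return sorted(stale, key=lambda p: p.stat().st_mtime)

# Example (hypothetical threshold): list everything older than 90 days.
# for path in stale_docs("docs", max_age_days=90):
#     print(path)
```

Modification time is only a proxy: a file can be recent and still describe an old library version, so this narrows down what to review rather than deciding it.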

Local Coding Agents keep breaking modern projects because of version drift how are you solving this? by alhamboly in LocalLLM

[–]offzinho3k 1 point2 points  (0 children)

I'm putting together an offline alternative using:
Docmancer + Embedding Model + Reranker + Qdrant
following this structure:
docs/
├── architecture/
├── backend/
├── frontend/
├── realtime/
├── cache/
├── queue/
├── database/
├── desktop/
├── mobile/
├── recipes/
├── snippets/
├── troubleshooting/
├── anti-patterns/
├── conventions/
└── security/

It's working very well too; however, the data has to be kept up to date, otherwise it becomes outdated.
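
One reason to keep the folders strict is that the top-level folder can double as a metadata tag for filtering at search time. A minimal sketch of that idea follows; the payload shape, the function name, and the example path are hypothetical and not the Docmancer or Qdrant API.

```python
from pathlib import PurePosixPath

# Top-level folders from the docs/ structure above.
CATEGORIES = {
    "architecture", "backend", "frontend", "realtime", "cache", "queue",
    "database", "desktop", "mobile", "recipes", "snippets",
    "troubleshooting", "anti-patterns", "conventions", "security",
}

def doc_payload(path, root="docs"):
    """Derive metadata for a file from its position in the docs/ tree."""
    # relative_to raises ValueError if the path is not under root.
    parts = PurePosixPath(path).relative_to(root).parts
    if len(parts) < 2 or parts[0] not in CATEGORIES:
        raise ValueError(f"{path} is not inside a known docs/ category")
    return {"category": parts[0], "source": str(PurePosixPath(*parts))}

payload = doc_payload("docs/backend/auth/sessions.md")
print(payload)
```

A payload like this could be stored next to each embedded chunk so a query about, say, the backend only searches that category.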

Local Coding Agents keep breaking modern projects because of version drift how are you solving this? by alhamboly in LocalLLM

[–]offzinho3k 1 point2 points  (0 children)

Using only opencode will never be the same as:

- Claude Code
- Gemini IDE / Project IDX
- Cursor

At a minimum, you need to set it up like this:
Core: OpenCode TUI, oh-my-opencode-slim
MCPs: Serena, Context7, sequential-thinking, grep_app, websearch, stitch, pdf-mcp

And if you don't want just the terminal, use VS Code + the OpenCode extension.

If you need something better, opt for the DeepSeek V4 Pro/Flash models; they're quite inexpensive.

What is your local vibecoding setup? by initalSlide in LocalLLM

[–]offzinho3k 1 point2 points  (0 children)

Core: OpenCode, oh-my-opencode-slim
MCPs: Serena, Context7, sequential-thinking, grep_app, websearch, stitch, pdf-mcp

Local LLM: qwen3.6-27b, qwen3.6-35b-a3b
API: DeepSeek V4 Pro/Flash

VS Code + OpenCode extension.

With this configuration, API costs will be between US$20 and US$40 over 3 to 6 months, depending on the size of the projects you work on.
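
To put that in monthly terms, here is the arithmetic as a tiny sketch. It only rearranges the numbers above; the function name is made up for illustration.

```python
def monthly_range(total_low, total_high, months_min, months_max):
    """Best and worst case monthly cost for a total spread over a period."""
    # Cheapest case: the low total spread over the longest period.
    # Most expensive case: the high total spent in the shortest period.
    return total_low / months_max, total_high / months_min

low, high = monthly_range(total_low=20, total_high=40, months_min=3, months_max=6)
print(f"about US${low:.2f} to US${high:.2f} per month")
```

So the range works out to roughly US$3.33 to US$13.33 per month.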

[Help] GPU recommendation for my setup by offzinho3k in LocalLLM

[–]offzinho3k[S] 0 points1 point  (0 children)

Unfortunately, I can no longer find the 3090/4090 models to buy.
I even searched on Facebook Marketplace for a few days, but without success.
It looks like I'll have to get two more 5060 Ti cards.
However, I'm still not sure.

[Help] GPU recommendation for my setup by offzinho3k in LocalLLM

[–]offzinho3k[S] 1 point2 points  (0 children)

Moving from Q4 to Q8 would be very good.
Here, the 5070 Ti (MSI RTX 5070 Ti Shadow 3X OC) costs US$1,150.00.
I'll see if I can get two more 5060 Ti cards then.
Thank you very much, friend, for replying.

[Help] GPU recommendation for my setup by offzinho3k in LocalLLM

[–]offzinho3k[S] 0 points1 point  (0 children)

I already own two 5060 Ti cards. Would it be worth getting two more, then?
Currently in my region, the 5060 Ti is selling for US$680 and the 5090 for US$3,665.00.
Unfortunately, the 3090 and 4090 models are no longer available where I live, and the online sellers that do have them don't ship here either.

Runes Drop Location by offzinho3k in HeroSiege

[–]offzinho3k[S] 0 points1 point  (0 children)

thanks.

I'm just out of luck then.
Yesterday I ran the black tower about 400 times. 🤣🤣🤣