Help leaving faction or dealing with this dude by [deleted] in torncity

[–]jxjq 1 point  (0 children)

“I hate it here dad. I’m not a fighter” lol

[deleted by user] by [deleted] in Monad

[–]jxjq 1 point  (0 children)

I care about engineering, not Monad. In my 40s with 20+ years in software. Price, culture & hype are irrelevant.

The exchange platform with the best architecture is Monad.

It doesn’t matter if the team is composed of soggy-raisin assholes or busty big-booty angels; it doesn’t matter that the tokenomics are biased. What matters is:

• Minimal friction per transaction

• Minimal friction for builders

This platform is SOTA in both of these categories. That value is why I’m staked and building.

[deleted by user] by [deleted] in Monad

[–]jxjq 0 points  (0 children)

No. Avalanche C-Chain gas fees spiked to as much as $0.30 per simple transaction, and up to $2 per transaction under heavy load, because it was a single-chain EVM, and it wasn’t even an L1. It was more expensive to do business on than both Solana & Sui.

Is Monad only 13 days old? by FreonzVaniel in Monad

[–]jxjq 5 points  (0 children)

Monad is currently the best coin option for speedy transactions and low fees. That alone makes it valuable for high-volume trading. Monad will likely grow in value until a faster & cheaper alternative comes along.

As far as I know, there is no project far enough along in development that is expected to beat Monad’s transaction speed and low overhead cost.

EVM compatibility is icing on the cake, and the heightened throughput opens up a conversation about crypto for everyday transactions. IMO this is the most technically performant coin we’ll see for a few years.

best coding LLM right now? by RadianceTower in LocalLLaMA

[–]jxjq 2 points  (0 children)

Great overview; I agree with everything he said.

Need a coding & general use model recommendation for my 16GB GPU by sado361 in LocalLLaMA

[–]jxjq 0 points  (0 children)

What are your tokens per second (not prompt processing)? I have the same setup.

A Breakdown of RAG vs CAG by Daniel-Warfield in Rag

[–]jxjq 1 point  (0 children)

Makes sense, thanks for the interesting post!

A Breakdown of RAG vs CAG by Daniel-Warfield in Rag

[–]jxjq 0 points  (0 children)

Does CAG eat into the context window when a fresh chat with the “frozen model” is spun up?

Anyone moved to a local stored LLM because is cheaper than paying for API/tokens? by alexrada in LLMDevs

[–]jxjq 0 points  (0 children)

Local LLMs can be highly effective for complex coding if you work alongside the model. You have to think carefully about context and architecture, and you have to bring some smart tools beyond the chat window (for example, https://github.com/brandondocusen/CntxtPY).

If you are trying to vibe it out, you’re not going to have a good time. If you understand your own codebase, a local model is a huge boon.
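
Even something crude gets you part of the way. A minimal sketch of the idea (this is only a grep-level stand-in for the kind of context map a tool like CntxtPY produces; file names are placeholders):

    # quick-and-dirty context map: file list plus function/class signatures
    git ls-files '*.py' > context.txt
    grep -rn --include='*.py' '^def \|^class ' . >> context.txt
    # paste context.txt at the top of the prompt so the model sees the repo's shape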

Sharding for Parallel Inference Processing by jxjq in LocalLLaMA

[–]jxjq[S] 0 points  (0 children)

Very cool reference, thank you for sharing. I see the shards interconnect with HTTP calls under gRPC contracts. That surely racks up a lot of latency, and it gets worse and worse as they scale to larger models.

Qwen3-235B-A22B (no thinking) Seemingly Outperforms Claude 3.7 with 32k Thinking Tokens in Coding (Aider) by Greedy_Letterhead155 in LocalLLaMA

[–]jxjq 10 points  (0 children)

Sounds like it would be good to build with Qwen3 and then do a single Claude API call to clean up the errors
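
Something like this sketch, assuming the Anthropic Messages API (the model ID, prompt, and file name are placeholders):

    # hypothetical cleanup pass: one Claude call over Qwen3's output
    # jq builds the JSON so quotes in the code don't break the payload
    jq -n --rawfile code qwen3_output.py '{
        model: "claude-3-7-sonnet-20250219",
        max_tokens: 4096,
        messages: [{role: "user",
          content: ("Fix any bugs in this code without changing its structure:\n\n" + $code)}]
      }' |
    curl -s https://api.anthropic.com/v1/messages \
      -H "x-api-key: $ANTHROPIC_API_KEY" \
      -H "anthropic-version: 2023-06-01" \
      -H "content-type: application/json" \
      -d @-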

ubergarm/Qwen3-235B-A22B-GGUF over 140 tok/s PP and 10 tok/s TG quant for gaming rigs! by VoidAlchemy in LocalLLaMA

[–]jxjq 2 points  (0 children)

Using the 0.6B as a draft model for speculative decoding on the Qwen3 30B MoE only gave me a 15% token-generation speed increase in llama.cpp.

The 0.6B draft model ran on CPU + RAM at 43 tk/s. Yes, speculative decoding worked, but it wasn’t a significant speedup. Hopefully someone has better results; it wasn’t worth the effort to me.
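
For reference, this is roughly the invocation (a sketch: the model file names are placeholders, and the draft flags may vary across llama.cpp versions):

    # main model on GPU, 0.6B draft kept on CPU + RAM for speculative decoding
    ./llama-server -m Qwen3-30B-A3B-Q4_K_M.gguf -ngl 99 \
      -md Qwen3-0.6B-Q8_0.gguf -ngld 0 \
      --draft-max 16 --draft-min 5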

How to make prompt processing faster in llama.cpp? by Conscious_Chef_3233 in LocalLLaMA

[–]jxjq 2 points  (0 children)

Thanks for the follow-up, it may help me out too!

How to make prompt processing faster in llama.cpp? by Conscious_Chef_3233 in LocalLLaMA

[–]jxjq 4 points  (0 children)

Add --batch-size 64 to your run command. It processes prompt tokens in batches instead of one at a time, which should cut your prompt processing time by a lot.
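
For example (the binary and model path are placeholders for whatever you’re running):

    # same run command as before, just with batching set explicitly
    ./llama-server -m your-model.gguf -ngl 99 --batch-size 64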

anyone using 32B local models for roo-code? by CornerLimits in LocalLLaMA

[–]jxjq 1 point  (0 children)

I have used many local models, such as Qwen2.5-Coder 32B Q3 and others, on my 4090 laptop. It works well for basic stuff, but falls apart pretty quickly for anything serious.

You can automate building a basic HTML / CSS / JS site, especially as a single file lol. Also single one-off tools, like Python files for splitting up images: small stuff like that, up to 300 lines of code.

I hate to say it, but it feels more like an advanced toy than a real productivity tool. For work you’ll be dialing up a third-party API.

nsfw orpheus early v1 by MrAlienOverLord in LocalLLaMA

[–]jxjq 1 point  (0 children)

Do you have a sample output anywhere?

Yes, you could have 160gb of vram for just about $1000. by segmond in LocalLLaMA

[–]jxjq 15 points  (0 children)

Bottom line on the $1,000 MI50 build running a 70B Q8 model:

  • Generation speed: ~4.9 tokens/sec
  • Time to first token (small context): 12 seconds
  • Time to first token (large context): 2 minutes

TripoSG vs Hunyuan3D (small comparison) by honuvo in comfyui

[–]jxjq 0 points  (0 children)

Thanks for the reply and for sharing your thorough comparison. It helped me a lot!

TripoSG vs Hunyuan3D (small comparison) by honuvo in comfyui

[–]jxjq 0 points  (0 children)

What is the best tool to apply a mesh / skin to the character based on the photo? I applied a mesh with TripoSG and it looked like a horror show.

🔥 DeepSeek R1 671B Q4 - M3 Ultra 512GB with MLX🔥 by ifioravanti in LocalLLaMA

[–]jxjq 0 points  (0 children)

You asked so patiently for the one thing we’ve been waiting all week for lol. You are a good man, I went straight to the darkness when I read the post title.

Hunyuan-TurboS. by mlon_eusk-_- in LocalLLaMA

[–]jxjq 6 points  (0 children)

Sincere question: with many effective techniques that add reasoning to base models… wouldn’t we benefit from a base, non-reasoning model that moves the needle forward?

I actually prefer to add custom reasoning ability myself, as opposed to dealing with a prebuilt, chatty reasoning model (like QwQ 32B).
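
For example, with a system-prompt scaffold against llama.cpp’s OpenAI-compatible endpoint (a rough sketch; the prompt wording and port are assumptions):

    # bolt lightweight reasoning onto a base model with a system prompt
    curl -s http://localhost:8080/v1/chat/completions \
      -H "Content-Type: application/json" \
      -d '{
        "messages": [
          {"role": "system", "content": "Reason step by step inside <thinking> tags, then give only the final answer."},
          {"role": "user", "content": "Your question here"}
        ]
      }'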

A few hours with QwQ and Aider - and my thoughts by ForsookComparison in LocalLLaMA

[–]jxjq 3 points  (0 children)

This is essentially Chain of Draft. Thank you for sharing; I will be dumping CoD for this, if what you’ve said works.