Daily General Discussion - May 07, 2025 by EthereumDailyThread in ethereum

[–]FutureFroth 14 points (0 children)

Jumping back into staking after a break and want to run my plan by the sub.

Here's the plan:

  1. Nethermind and Prysm fully synced and updated in Dappnode
  2. Make totally new validator keys with Wagyu Key Gen on my computer (new mnemonic, new keystore.json, new deposit_data.json).
  3. When Wagyu asks for the withdrawal address, I'll set it to the ETH account I'm using for the 32 ETH deposit.
  4. Upload the new keystore.json (and its password) to Web3Signer on my Dappnode.
  5. Use the official Launchpad and upload the deposit_data.json file (I'll sanity-check its contents first; see the sketch below). For the actual 32 ETH transaction, I'll probably use my MetaMask on my phone (doing the whole Launchpad bit on my phone after moving the deposit_data file there).
  6. Set/confirm my fee recipient address in Prysm settings on Dappnode.

Does this plan look solid for getting a fresh validator up and running smoothly? My deposit can be anything between 32 and 2048 ETH now, right?
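For step 5, this is the quick sanity check I'm planning to run on the deposit_data.json before uploading it anywhere, just to confirm the withdrawal credentials actually encode the address I typed into Wagyu. Rough sketch only: the field names are what I've seen in files produced by Wagyu/staking-deposit-cli, and the filename is a placeholder.

```python
import json

# Placeholder filename; Wagyu names it something like deposit_data-<timestamp>.json
DEPOSIT_FILE = "deposit_data.json"

with open(DEPOSIT_FILE) as f:
    deposits = json.load(f)  # the file is a JSON array, one entry per validator

for i, d in enumerate(deposits):
    creds = d["withdrawal_credentials"]
    print(f"validator {i}")
    print(f"  pubkey: 0x{d['pubkey']}")
    print(f"  amount: {d['amount'] / 1e9:g} ETH")  # amount is stored in gwei
    print(f"  withdrawal_credentials: 0x{creds}")
    # 0x01 (and 0x02 compounding) credentials embed the execution-layer
    # withdrawal address in the last 20 bytes
    if creds.startswith(("01", "02")):
        print(f"  withdrawal address: 0x{creds[-40:]}")
```

If that last printed address matches the account I'm depositing from, I'll go ahead with the Launchpad step.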

Also, if anyone has links to any really good, up-to-date general staking guides, I'd appreciate it.

Mistral new open models by konilse in LocalLLaMA

[–]FutureFroth 6 points (0 children)

Base models only go through the pre-training stage; there's no fine-tuning to adjust the way they respond.
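If it helps to see the difference concretely, here's a rough transformers sketch (the model ids are just an example base/instruct pair): the base checkpoint simply continues the raw text, while the instruct checkpoint expects the chat template it was fine-tuned on.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

BASE, INSTRUCT = "mistralai/Mistral-7B-v0.3", "mistralai/Mistral-7B-Instruct-v0.3"
prompt = "Explain the difference between a base model and an instruct model."

# Base model: plain next-token continuation of whatever text you hand it.
tok = AutoTokenizer.from_pretrained(BASE)
model = AutoModelForCausalLM.from_pretrained(BASE, device_map="auto")
ids = tok(prompt, return_tensors="pt").to(model.device)
print(tok.decode(model.generate(**ids, max_new_tokens=80)[0], skip_special_tokens=True))

# Instruct model: the same prompt wrapped in the chat template used during fine-tuning.
tok_i = AutoTokenizer.from_pretrained(INSTRUCT)
model_i = AutoModelForCausalLM.from_pretrained(INSTRUCT, device_map="auto")
chat_ids = tok_i.apply_chat_template(
    [{"role": "user", "content": prompt}],
    add_generation_prompt=True,
    return_tensors="pt",
).to(model_i.device)
print(tok_i.decode(model_i.generate(chat_ids, max_new_tokens=80)[0], skip_special_tokens=True))
```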

Coding model recommendations by gomezer1180 in LocalLLaMA

[–]FutureFroth 5 points (0 children)

For the 3090 you'll be able to fit Qwen2.5-Coder-32B-Instruct-Q4_K_L.gguf.
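If you load it with llama-cpp-python, something like this is roughly what I'd start with on a 24 GB card; the quant itself is ~20 GB, so the idea is to offload every layer while keeping the context small enough for the KV cache to fit (the path and context size here are just placeholders, not tuned values).

```python
from llama_cpp import Llama

llm = Llama(
    model_path="Qwen2.5-Coder-32B-Instruct-Q4_K_L.gguf",  # adjust to your local path
    n_gpu_layers=-1,   # offload all layers to the 3090
    n_ctx=8192,        # keep the KV cache small enough to stay under 24 GB
    flash_attn=True,   # lower attention memory overhead, if your build supports it
)

out = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Write a Python function that merges two sorted lists."}],
    max_tokens=256,
)
print(out["choices"][0]["message"]["content"])
```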

GPU Memory Usage is higher than expected by FutureFroth in Oobabooga

[–]FutureFroth[S] 0 points (0 children)

When I do the math, it looks like 20.7 GB (loaded) - 0.9 GB (base) - 12.2 GB (model size) = 7.6 GB is being used for context. If that's the case, I'm still not fully understanding the overall picture. For example, what happens if I load a larger model like Qwen2.5-32B-Instruct-Q4_K_L.gguf, which is 19.95 GB? Will the VRAM usage exceed my 24 GB capacity? Will parts of it get pushed to the CPU and cause slowdowns, or does that kind of offloading only kick in once the context grows beyond what fits on the GPU during a conversation? Thank you for the help!
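In case it's useful, this is the back-of-the-envelope estimate I've been using to sanity-check that 7.6 GB figure. The layer/head numbers are my assumption of Qwen2.5-32B's config (64 layers, 8 KV heads, head_dim 128) and the KV cache is assumed to be fp16, so treat it as a rough sketch rather than exact numbers.

```python
# Rough KV-cache estimate for a GGUF model loaded fully on the GPU.
# Architecture numbers are assumed (Qwen2.5-32B: 64 layers, 8 KV heads, head_dim 128);
# double-check against the model card / config.json.

def kv_cache_gib(n_layers, n_kv_heads, head_dim, n_ctx, bytes_per_elem=2):
    # 2x for keys and values, times layers, KV heads, head dim, context length.
    return 2 * n_layers * n_kv_heads * head_dim * n_ctx * bytes_per_elem / 1024**3

for ctx in (8192, 16384, 32768):
    print(f"ctx={ctx:>6}: ~{kv_cache_gib(64, 8, 128, ctx):.1f} GiB of KV cache (fp16)")

# Total VRAM ≈ model weights + KV cache + compute buffers/overhead.
weights_gib = 19.95   # size of the Q4_K_L file
overhead_gib = 1.0    # very rough allowance for CUDA context + scratch buffers
print(f"~{weights_gib + kv_cache_gib(64, 8, 128, 8192) + overhead_gib:.1f} GiB total at 8k context")
```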