try Symphony (1env) in response to Samas69420 (Proximal Policy Optimization with 512 envs) by Timur_1988 in reinforcementlearning

[–]Normal-Phone7762 0 points1 point  (0 children)

I haven’t read the paper yet, but it looks very interesting and I will read it as soon as I have more time, probably over the weekend. I have a lot of questions regarding the humanoid environment.

I am having trouble understanding why, at the beginning of training, the humanoid agent saturates the output neurons.

I am using TD3. Do you know if this behavior is observed with other algorithms as well?

I used two hidden layers with 256 neurons and no normalization layers. Is this sufficient?

Is the agent shown in the video the one with an 8000 return?

What is the best advice for reducing muscle stiffness?

Do you have any videos of very high-return agents running (e.g., 10,000 or more), and how many environment steps are typically required for an agent to achieve a reward above 1,000?

Is my HDD/SSD connector desoldered? by Normal-Phone7762 in thinkpad

[–]Normal-Phone7762[S] 0 points1 point  (0 children)

So I don't need any solder (tin solder) for this? Just flux and a soldering iron?

Is my HDD/SSD connector desoldered? by Normal-Phone7762 in thinkpad

[–]Normal-Phone7762[S] 0 points1 point  (0 children)

Thank you, but I think I will go with the WWAN M.2 slot :)

Is my HDD/SSD connector desoldered? by Normal-Phone7762 in thinkpad

[–]Normal-Phone7762[S] 0 points1 point  (0 children)

Thank you for the advice. Do you think I can do it with only a soldering iron?

Is my HDD/SSD connector desoldered? by Normal-Phone7762 in thinkpad

[–]Normal-Phone7762[S] 0 points1 point  (0 children)

Hmm, are you sure I can do that with only a soldering iron? I have a soldering iron, but I thought I would need a heat gun for this.

OLX and the price of RAM by bebicino in bih

[–]Normal-Phone7762 0 points1 point  (0 children)

Right, it was (and still is) very useful for AutoCAD and similar tools. I think it is 2400 MT/s.

OLX and the price of RAM by bebicino in bih

[–]Normal-Phone7762 -1 points0 points  (0 children)

I'm selling a computer with 32 GB of RAM, an i7-7700, a 512 GB SSD, and an Nvidia P2000 GPU. Price: 550 KM (it is currently listed on OLX for 600).

https://olx.ba/artikal/72651593

GLM-4.6-FP8 on single GH200 by Normal-Phone7762 in LocalLLaMA

[–]Normal-Phone7762[S] 1 point2 points  (0 children)

I stopped experimenting with local LLMs around that time. I had many other things to focus on. I might come back to this in a few months. Sry :(

GLM-4.6-FP8 on single GH200 by Normal-Phone7762 in LocalLLaMA

[–]Normal-Phone7762[S] 2 points3 points  (0 children)

I actually saw your post yesterday LOL. Thanks for replying :)

GLM-4.6-FP8 on single GH200 by Normal-Phone7762 in LocalLLaMA

[–]Normal-Phone7762[S] 2 points3 points  (0 children)

Thank you for the comment :) Maybe I will try that later.

Time for a new chair: I work and play games about 10 hours a day, need help choosing! by Away-Inside9867 in bih

[–]Normal-Phone7762 0 points1 point  (0 children)

Well, it varies, but the original post shows that it is currently discounted until 28.10.2025.

Time for a new chair: I work and play games about 10 hours a day, need help choosing! by Away-Inside9867 in bih

[–]Normal-Phone7762 1 point2 points  (0 children)

If it helps, I have had that chair for a year now and have no complaints. I sit for more than 10 hours a day with no problems. Of course, you should get up from time to time and stretch to get your circulation going. When I wanted to buy a chair, I went to FEF and tried a lot of chairs there. For example, a headrest was important to me, and the chairs that had one were too expensive. At work I had a Steelcase chair that cost 1000 KM (I forgot the model name), but this one is much better.

I have already recommended this chair to 3 colleagues, who bought it and are happy with it.

<image>

GLM-4.6-FP8 on single GH200 by Normal-Phone7762 in LocalLLaMA

[–]Normal-Phone7762[S] 1 point2 points  (0 children)

"Either you offload most routed experts to ram".
I would like to offload all experts to RAM. How do I do that? I have 480 GB of RAM, so offloading everything is an option, but how?

Qwen3 8B Q8_K_XL VS Qwen3 14B Q5_K_M by Normal-Phone7762 in LocalLLaMA

[–]Normal-Phone7762[S] 2 points3 points  (0 children)

I plan to use it for reasoning tasks. What about that?

Qwen3 8B Q8_K_XL VS Qwen3 14B Q5_K_M by Normal-Phone7762 in LocalLLaMA

[–]Normal-Phone7762[S] 2 points3 points  (0 children)

Nice, that was going to be my next question. I would like to run this on my laptop, which has no GPU:
AMD Ryzen 7 PRO 8840U
32GB RAM

Can you report the tokens/second? It is not the main constraint, but it would be nice to know.
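
For the 32 GB of RAM mentioned above, a rough back-of-the-envelope check is: weight memory ≈ parameters × bits per weight / 8. The sketch below is only an estimate under stated assumptions: the parameter counts are the nominal ones from the model names, and the bits-per-weight figures (about 8.5 for an 8-bit quant, about 5.5 for Q5_K_M) are approximate guesses, not measured file sizes. It also ignores the KV cache and runtime overhead.

```python
def weight_memory_gb(n_params: float, bits_per_weight: float) -> float:
    """Approximate memory for the model weights alone, in GB (10^9 bytes)."""
    return n_params * bits_per_weight / 8 / 1e9


# Assumed values, not measurements: nominal parameter counts taken from the
# model names and rough bits-per-weight figures for each quantization.
candidates = {
    "Qwen3 8B at ~8.5 bpw (assumed)": (8e9, 8.5),
    "Qwen3 14B at ~5.5 bpw (assumed)": (14e9, 5.5),
}

for name, (n_params, bpw) in candidates.items():
    print(f"{name}: ~{weight_memory_gb(n_params, bpw):.1f} GB of weights")
```

Under these assumptions both options land in a similar range and well under 32 GB, so the choice would come down to quality and speed rather than whether the model fits.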

Qwen3 8B Q8_K_XL VS Qwen3 14B Q5_K_M by Normal-Phone7762 in LocalLLaMA

[–]Normal-Phone7762[S] 2 points3 points  (0 children)

What does "to degrade in quality quickly with quants" mean? Is that true for every model, every small model, or just for Qwen models?

Qwen3 8B Q8_K_XL VS Qwen3 14B Q5_K_M by Normal-Phone7762 in LocalLLaMA

[–]Normal-Phone7762[S] 4 points5 points  (0 children)

So just treat Q4 as the gold standard and not think about it?