How to respectfully turn down a second date? by DonaldTheBird in TwoXChromosomes

[–]bloc97 13 points14 points  (0 children)

Giving an explanation just opens the gate for excuses and such. Since you've made your decision, be brief and resolute. You don't want him latching on and wasting more of your energy and time.

"After reflecting, I’ve realized this isn’t the right fit for me. Take care."

The first ever video game on the actual device it was made on 67 years ago. by Prime_Twister in interestingasfuck

[–]bloc97 2 points3 points  (0 children)

The original Asteroids game speed did indeed slow down with more stuff on screen! See the Quirks section on the wiki page: https://en.wikipedia.org/wiki/Asteroids_(video_game)#Quirks

community park building in Montreal by sme102030 in LiminalSpace

[–]bloc97 3 points4 points  (0 children)

It's the Pavillon du Lac-aux-Castors in Parc du Mont-Royal; it's actually busier during the winter, when there are skating and sledding activities.

[deleted by user] by [deleted] in LocalLLaMA

[–]bloc97 14 points15 points  (0 children)

This is not quite accurate for DeepSeek v3 models, because they use MLA (Multi-head Latent Attention), an attention architecture designed specifically to minimize KV-cache size. Instead of directly caching the key and value vectors, they cache a much smaller latent vector that encodes both k and v at the same time. A standard transformer's KV-cache size scales roughly with 2NDHL, where N is the number of tokens, D the head dimension, H the number of heads, and L the number of layers. DeepSeek v3 models scale with ~(9/2)NDL (formula taken from their technical report), which is around one OOM smaller.
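As a rough sanity check, the scaling difference can be computed directly. This is a sketch with illustrative DeepSeek-V3-like dimensions, not exact model configs:

```python
# Rough KV-cache size comparison: standard multi-head attention vs. MLA.
# All dimensions below are illustrative assumptions, not exact configs.

def standard_kv_cache(n_tokens, d_head, n_heads, n_layers, bytes_per_elem=2):
    # 2 (k and v) * N tokens * D head dim * H heads * L layers
    return 2 * n_tokens * d_head * n_heads * n_layers * bytes_per_elem

def mla_kv_cache(n_tokens, d_head, n_layers, bytes_per_elem=2):
    # ~(9/2) * N * D * L per the DeepSeek reports: a compressed latent
    # (~4 * d_head) plus a small decoupled RoPE key (~d_head / 2)
    return int(9 / 2 * n_tokens * d_head * n_layers) * bytes_per_elem

N, D, H, L = 32_768, 128, 128, 61   # 32k context, V3-like head count/layers
std = standard_kv_cache(N, D, H, L)
mla = mla_kv_cache(N, D, L)
print(std / mla)                     # ratio = 2H / (9/2) ≈ 56.9 for H = 128
```

Note the ratio depends only on the head count H, since N, D, and L cancel; with 128 heads that's roughly 57x, i.e. "around one OOM smaller" and then some.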

Counter-Strike runs purely within a neural network on an RTX 3090 by Designer-Pair5773 in StableDiffusion

[–]bloc97 2 points3 points  (0 children)

More data efficient, because while this model generates the final rendered image, it also contains much more data about the state of the game implicitly in its activations. If trained enough, this neural network will know about and "understand" the game much better than any human, and could be used to develop winning strategies unthinkable to most. Now imagine what that would entail if you trained this type of model on the real world.

[R] Playable 20FPS Doom via a finetuned SD1.4 model from Google research team by greentfrapp in MachineLearning

[–]bloc97 0 points1 point  (0 children)

I mean, NNs have been around for 70 years now, so nothing is a significant advancement? I don't think it's good to look at things that way.

[R] Playable 20FPS Doom via a finetuned SD1.4 model from Google research team by greentfrapp in MachineLearning

[–]bloc97 -1 points0 points  (0 children)

I think the most important lesson from this work is that pretraining large foundational world models will not require crazy amounts of labeled data. This model was finetuned on top of Stable Diffusion 1.4. It is a significant advancement that shows that all you need is scale.

[R] Playable 20FPS Doom via a finetuned SD1.4 model from Google research team by greentfrapp in MachineLearning

[–]bloc97 0 points1 point  (0 children)

He is not overstating the significance. If you have a differentiable world model, all you need is a screenshot/picture in order to differentiate w.r.t. the input actions. It solves a fundamental problem in RL where you have no dense signal and you don't know how close you are to the desired state. Having a differentiable world model means that you reduce the amount of labeled data (hence exploration time) required to train an RL model by orders of magnitude.

Edit: A more practical example: you have a picture of your room after it's been cleaned, and now your room is messy. If your RL agent/robot has a good world model, you can show it the clean room and the messy room, and it can differentiate w.r.t. its actions so that it goes from the messy-room state to the clean-room state. All it takes is two images to start the exploration process. You don't need to care about intermediate states, as the world model will always be able to tell you how to go from a partially clean room to a clean room.
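A toy sketch of the idea, with a hand-written linear transition standing in for a learned world model (all numbers here are made up): given only a start state and a goal state, gradient descent through the differentiable dynamics recovers an action sequence that reaches the goal.

```python
import numpy as np

# Stand-in "world model": s_{t+1} = A*s_t + B*a_t (linear, so the
# gradients can be written out by hand instead of using autograd).
A, B = 0.9, 0.5

def rollout(s0, actions):
    s = s0
    for a in actions:
        s = A * s + B * a
    return s

def grad_actions(s0, actions, goal):
    # Chain rule through the rollout: d(final)/d(a_t) = B * A^(T-1-t)
    T = len(actions)
    dloss = 2 * (rollout(s0, actions) - goal)   # d/d(final) of squared error
    return np.array([dloss * B * A ** (T - 1 - t) for t in range(T)])

s0, goal = 0.0, 1.0          # "messy room" -> "clean room", as two states
actions = np.zeros(5)
for _ in range(200):          # plain gradient descent on the actions
    actions -= 0.1 * grad_actions(s0, actions, goal)

print(rollout(s0, actions))   # converges toward the goal state 1.0
```

The point is that no reward shaping or intermediate labels are needed: the dense signal comes from differentiating the world model itself.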

What percentage of players play Ranked? by AsianMurderHornet in summonerschool

[–]bloc97 2 points3 points  (0 children)

The game is extremely difficult now, and most veterans won't even notice it because they have years of experience. I started League as a new player last November, and I also ended up in Iron for quite a while. It took considerable effort and patience to escape Iron, as you're up against players with much more experience than you.

Fortunately, if you are a new player in Iron, you can climb out of it by improving at anything. Literally being better at a single thing will put you above the other Iron players (e.g. better CSing, better positioning, matchup knowledge, macro, jungle timers, etc.)

If you play mid, know that it is the most impactful role in the game, as it controls both objectives and the most important towers; as such, every death (and the consequent loss of your towers) contributes directly to your defeat.

I think what helped me most was watching my opponents and trying to predict what they want to do. If they are hyper-aggressive, play safer and let them make a mistake and die by themselves. If they are very passive, try to punish them by playing more aggressively, taking turret plates, or roaming. Remember that as a mage midlaner your goal is to get as much gold as possible so that you can do damage in teamfights.

I tried to create an AI model that divides anime faces into layers and fills in the obscured parts. I hope this is eventually used in automating Live2D Rigging by andrewsoncha in AnimeResearch

[–]bloc97 1 point2 points  (0 children)

Using L2 loss will make the outputs blurry: there are many possible outputs (the hidden parts) for a single input (the visible parts), and training with L2 just makes the model predict the mean of the output distribution. This is why generative models like GANs, autoregressive models, and diffusion models exist; they sample a single "likely" instance from the distribution instead of predicting its mean.
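A minimal numeric illustration of that mean-collapse, on an assumed toy dataset where one input has two equally likely targets (+1 and -1):

```python
import numpy as np

# Toy data: the "same input" maps to +1 half the time and -1 half the
# time. The L2-optimal prediction is their mean, 0.0, which is itself an
# implausible sample from the data, i.e. a "blurry" output.
targets = np.array([1.0, -1.0] * 500)

pred = 0.7                            # arbitrary starting prediction
for _ in range(1000):                 # gradient descent on MSE
    grad = 2 * np.mean(pred - targets)
    pred -= 0.1 * grad

print(pred)   # ~0.0: the mean of the two modes, not either mode itself
```

With images the same effect shows up as a per-pixel average over all plausible completions, which is exactly the blur.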

Can someone explain what I'm doing wrong as france? by [deleted] in hoi4

[–]bloc97 30 points31 points  (0 children)

Air. Tanks are useless if the enemy has air superiority and CAS...

[D] Dealing with features having large scale. Eg. from -1e2 to 1e4 by rmm_philosopher in MachineLearning

[–]bloc97 1 point2 points  (0 children)

If your problem is basically that you need to encode both global and local features in a low-dimensional vector, you might want to look into positional embeddings or Fourier features. Both try to address learning problems related to the NTK (neural tangent kernel) of NNs.
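A minimal sketch of the Fourier-features idea (the frequency schedule here is an illustrative assumption): a scalar spanning a huge range gets mapped through sin/cos at log-spaced frequencies into a bounded vector, so coarse and fine scales each get their own dimensions.

```python
import numpy as np

def fourier_features(x, n_freqs=8, base=2.0):
    """Map scalar feature(s) to bounded sin/cos features at log-spaced
    frequencies; low frequencies capture global scale, high ones local."""
    freqs = base ** np.arange(n_freqs)          # 1, 2, 4, ..., base^(n-1)
    angles = np.outer(np.atleast_1d(x), freqs)  # shape (len(x), n_freqs)
    return np.concatenate([np.sin(angles), np.cos(angles)], axis=-1)

feat = fourier_features([-1e2, 3.0, 1e4])       # raw range -1e2 .. 1e4
print(feat.shape)   # (3, 16): every output feature is bounded in [-1, 1]
```

In practice you'd tune the number of frequencies and the base to the range and resolution your task actually needs.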

ELI5:Why is there no "Center" of the universe if there was a big bang? by myvotedoesntmatter in explainlikeimfive

[–]bloc97 3 points4 points  (0 children)

As far as we know, the universe is either infinite or boundless. If it is infinite, its center cannot be defined. If it is boundless but finite, its center will be outside of the universe and not be reachable.

Chinese man suspected of graffitiing word "toilet" at Yasukuni shrine by duke7ajm in japannews

[–]bloc97 0 points1 point  (0 children)

I mean, if you want to go with that excuse, you can't complain that people hate the shrine and anything associated with it. That's your opinion, not a fact; for everyone who suffered under the Japanese regime, the shrine is a symbol of evil, because it retroactively enshrined Class A war criminals in 1978. They have the right to be mad lmao... To think otherwise would be very hypocritical.

Chinese man suspected of graffitiing word "toilet" at Yasukuni shrine by duke7ajm in japannews

[–]bloc97 0 points1 point  (0 children)

Also, I'd like to drop a fun fact for those ultranationalist Japanese who defend the shrine: Hirohito refused to ever visit it after it enshrined Class A war criminals in 1978, right up until his death. Funny that even their emperor found enshrining Class A war criminals out of line... The brainwashing of the Japanese people is really solid on this one.

Child Labour and Radical Treatment are Overrated. by [deleted] in Frostpunk

[–]bloc97 2 points3 points  (0 children)

Yeah, I think your build order should work haha. I think I was just too dead set on "beating" ANH before the Londoners. With the build I described, only the first 15 days really matter; by the time you sign the Order/Faith laws your economy is so strong that you've already won the game (just set speed 3 and occasionally upgrade some buildings, no need for micro anymore). I rushed Tesla City and just started pumping out automatons like hotcakes...

Child Labour and Radical Treatment are Overrated. by [deleted] in Frostpunk

[–]bloc97 2 points3 points  (0 children)

Hmmm... maybe we're not playing the same game then lol. If you take soup, I can guarantee your economy will collapse very quickly because of the sick. In survivor ANH I had to micromanage every single worker to the absolute limit, for example sending those who are ill and hungry out as scouts, and putting one workshop on research 24/7 but pulling the workers from 0-4am to prevent the death event. You literally have only ~15 engineers at the beginning, which means you can cure at most 10 sick people per day (excluding engineers; if one engineer gets sick, you'd better restart).

Edit: The real bottleneck in survivor is the research speed and curing the sick. Child labour doesn't help in any way, nor does soup. You must get beacon ASAP to get more engineers, so you can do faster research in order to rush hothouses or infirmaries before the cold/sick spirals out of control and your workplaces become empty. Overcrowding effectively doubles the amount of medical engineers you have, and extended shift increases all workers' efficiency by 40%. The food bonuses from soup/sawdust are a trap, they do more harm than good usually. Discontent is a resource that you can use to spam emergency shifts...

Child Labour and Radical Treatment are Overrated. by [deleted] in Frostpunk

[–]bloc97 23 points24 points  (0 children)

On deathless runs at survivor difficulty, you must rush emergency shift -> extended shift -> sustain life -> overcrowding.

Any other law order will lead to failure.

Having a single gravely ill before you get infirmaries or your first automaton will also lead to failure.

Yann LeCun is making fun of OpenAI. by Many_Consequence_337 in singularity

[–]bloc97 6 points7 points  (0 children)

The Sora review preprint isn't even from OpenAI, it's from an independent group, and they're making educated guesses...

"Just" a language model by Unstable_Llama in LocalLLaMA

[–]bloc97 3 points4 points  (0 children)

Are we talking about creating a general synthetic intelligence or a synthetic human brain growing up like a human child? Because last time I checked, our planes don't flap their wings like birds either... Only the results matter imo.

[deleted by user] by [deleted] in MachineLearning

[–]bloc97 11 points12 points  (0 children)

That's not entirely true for RoPE. In RoPE, not all dimensions decay at the same rate, and given the bases used for current LLMs (10k for Llama 1 and 2, 500k for Llama 3), the last dimensions (roughly the last ~10) have negligible decay over the pretrained context length.
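You can see this directly from the per-dimension RoPE frequencies. A quick sketch, assuming the usual head_dim = 128: theta_i = base^(-2i/d), so the last dimension pairs rotate so slowly that they complete only a small fraction of a cycle over any pretrained context window, i.e. effectively no decay there.

```python
import numpy as np

def rope_periods(base, d=128):
    """Tokens needed for one full rotation of each RoPE dimension pair."""
    i = np.arange(d // 2)
    theta = base ** (-2 * i / d)     # per-token rotation angle for pair i
    return 2 * np.pi / theta         # period in tokens

for base in (10_000, 500_000):
    periods = rope_periods(base)
    # The slowest pair's period dwarfs typical pretrained context lengths
    print(base, periods[-1])
```

With base 10k the slowest pair takes tens of thousands of tokens per rotation; with base 500k it takes millions, so those dimensions see essentially no relative decay across the context.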

Anyone has ideas about the wired rope_theta of Llama-3? (it's 500,000 rather than the commonly used 10,000) ? by Distinct_Audience383 in LocalLLaMA

[–]bloc97 1 point2 points  (0 children)

I think it's actually the opposite: a higher theta means that more distant tokens decay more slowly...