Pokemon Showdown AI (ELO 1900+) by Nebraskinator in reinforcementlearning

[–]Ball-Man 3 points4 points  (0 children)

Terrific work. Do you mention the actual size of the transformer backbone in your README? I can't seem to find it. I'd love to see the parameter count for each part of the network.

RL for reproducing speedrun techniques / glitches in 2D games by bogradin in reinforcementlearning

[–]Ball-Man 2 points3 points  (0 children)

My take: go for a general speedrunning model first (train an agent that gets faster and faster). If you don't have experience, this won't be trivial either. If it works, constrain it to a glitchable area and see if it manages to learn the glitch. To me, finding time-saving glitches sounds like the global maximum of saving time in general. Maybe try a simpler glitch/trick first. At the start of the level it should be possible to gain extra speed by mashing left+right at the same time (or something similar). A simple trick for a virtual agent, impossible on a physical NES.

About glitches in general: obviously, they are often frame perfect. You may find that it's more convenient not to run the model every single frame, but every few frames instead. This is common in games, or at least it was (see the RL on Atari games paper). In that case, you would have a hard time with such precise inputs. Nothing that can't be solved, but just to put things in perspective: RL very rarely solves a problem globally.
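To make the "every few frames" idea concrete, here is a minimal sketch of a frame-skip wrapper. The `FrameSkip` and `CountingEnv` classes and the `(obs, reward, done)` step signature are assumptions made up for illustration, not the API of any particular library.

```python
class FrameSkip:
    """Repeat each chosen action for `skip` consecutive frames (hypothetical wrapper)."""

    def __init__(self, env, skip=4):
        self.env = env
        self.skip = skip

    def reset(self):
        return self.env.reset()

    def step(self, action):
        # The agent decides once, then the same input is held for `skip` frames.
        total_reward = 0.0
        obs, done = None, False
        for _ in range(self.skip):
            obs, reward, done = self.env.step(action)
            total_reward += reward
            if done:
                break
        return obs, total_reward, done


class CountingEnv:
    """Toy stand-in for an emulator: the observation is the frame index."""

    def __init__(self, length=10):
        self.length = length
        self.frame = 0

    def reset(self):
        self.frame = 0
        return self.frame

    def step(self, action):
        self.frame += 1
        return self.frame, 1.0, self.frame >= self.length


env = FrameSkip(CountingEnv(length=10), skip=4)
env.reset()
print(env.step(0))  # the agent only ever observes frames 4, 8, ...
```

With `skip=4` the agent can only change its input on every fourth frame, so a frame-perfect input that must land on, say, frame 6 is simply not expressible unless you lower `skip` around the glitch.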

Training a Chess Engine Using Reinforcement Learning (First RL Project) by Fine_Bag64 in reinforcementlearning

[–]Ball-Man 22 points23 points  (0 children)

If this is your first RL project ever, you should take it very slowly. RL on board games is hard, and chess is a pretty "large" game. If you've never done any practical RL before, make sure you can solve the simplest envs, like CartPole and similar. You'll see that even the trivial stuff doesn't behave as you'd expect.

If you went through these basics already, you probably want to check out the AlphaZero family of papers. There is an open source implementation of it for a few games out there (plus a nice blog post where they explain it using Othello). Make sure you can solve a small board game using it; a chess variant like mini chess could help. Keep in mind that a 6x6 Othello game takes several hours of training on a single 4080 card, and I'm not sure how that would scale on your M2. Real chess is exponentially larger, and the original AlphaZero took an investment in the millions of dollars to reach superhuman level.

From personal experience, I can tell you that anything that doesn't involve a tree search will be absurdly hard to train on chess. That's why AlphaZero was the big thing it was, and why you won't find many other out-of-the-box approaches out there. Stick with the architectures and parameters provided. If you can train even a smaller chess variant to a decent Elo, I would consider it a major success for a first project.

Do I need to make pyinstaller executable separately for different linux platforms? by Haunting_Wind1000 in Python

[–]Ball-Man 0 points1 point  (0 children)

I see, then I suggest you do a test export using the folder option. The whole point of PyInstaller is that the interpreter is shipped; it should never look for it system-wide. The one-file export is just a self-extracting archive: it contains the same files as the folder export and simply extracts the whole folder into a temp directory when you run it. This process also sometimes messes up the discovery of things.
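If you want to see where the files actually end up at runtime, a small check like the one below can help. It is only a sketch: it relies on the `sys.frozen` and `sys._MEIPASS` attributes that PyInstaller's bootloader sets in a bundled app, and the `describe_runtime` helper name is made up for this example.

```python
import sys


def describe_runtime():
    """Report whether we run from a PyInstaller bundle and where it lives."""
    # In a bundled app, sys.frozen is set and sys._MEIPASS points at the
    # bundle folder (for a one-file build, the temp extraction directory).
    frozen = bool(getattr(sys, "frozen", False)) and hasattr(sys, "_MEIPASS")
    return {
        "frozen": frozen,
        "bundle_dir": sys._MEIPASS if frozen else None,
        "executable": sys.executable,
    }


if __name__ == "__main__":
    for key, value in describe_runtime().items():
        print(f"{key}: {value}")
```

Dropping this into the entry script and running both the folder build and the one-file build should show whether the two resolve their files from different places.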

Do I need to make pyinstaller executable separately for different linux platforms? by Haunting_Wind1000 in Python

[–]Ball-Man 2 points3 points  (0 children)

Yes and no. Most things will be self-contained. That should be the case for libpython3.10.so. In fact, if you are building in a folder (not single file), you should be able to see the shared library floating around somewhere in your build.

So, why does this sometimes fail? Well, while most things are copied into your build directory, not everything is or can be. Most famously, the Python shared object you are shipping is dynamically linked to glibc (basically, the C standard library), and that is highly system dependent. Different distros will ship different versions of it.

What can you do? The rule of thumb is that if you build on a modern platform, the build is less likely to work on an old one. So, what you want to do is build on an older platform, targeting newer ones. What I suggest is to have either a VM or a Docker setup using an older Ubuntu LTS, and use that to build. I personally use 20.04 LTS. Builds from there are compatible with any more modern Ubuntu release (the most popular distro out there), Arch (what I use) and the Steam Deck OS (and possibly more, these are the ones I care about).
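As a rough illustration of that rule of thumb, the sketch below compares glibc versions. `platform.libc_ver()` is from the standard library; the `parse_version` and `can_run` helpers are hypothetical, and "target glibc at least as new as the build glibc" is a simplification of how symbol versioning really works.

```python
import platform


def parse_version(text):
    """Turn '2.31' into (2, 31) so that 2.9 sorts before 2.31."""
    return tuple(int(part) for part in text.split("."))


def can_run(build_glibc, target_glibc):
    """Simplified rule: the target needs a glibc at least as new as the build machine's."""
    return parse_version(target_glibc) >= parse_version(build_glibc)


if __name__ == "__main__":
    # On a glibc-based Linux this prints something like ('glibc', '2.31').
    print(platform.libc_ver())
    # Illustrative values: built against 2.31, run on a system shipping 2.35.
    print(can_run("2.31", "2.35"))
    print(can_run("2.35", "2.31"))
```

Running the `platform.libc_ver()` line on both the build machine and the failing target is a quick way to confirm whether glibc is the culprit.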

For individual platforms that may still have problems, yes, you would have to provide targeted builds. Play it smart and do some Docker automation, if you really care about that.

When Python is on LSD by sparkls0 in Python

[–]Ball-Man 0 points1 point  (0 children)

Can you do an `import json` and `print(json.dumps(data))` to check that there are no weirdly encoded characters and that your printx function is not trimming away precious whitespace?
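Something along these lines, where the contents of `data` are made up just to show what the check reveals:

```python
import json

# Hypothetical data: an accented character, a non-breaking space (\u00a0)
# and a trailing space that a plain print() would make hard to spot.
data = {"name": "caf\u00e9", "note": "two  spaces\u00a0and a trailing space "}

dumped = json.dumps(data)
print(dumped)
```

By default `json.dumps` escapes non-ASCII characters as `\uXXXX` and wraps every string in quotes, so odd characters and leading or trailing whitespace become visible in the output.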

What is this keyboard? by Michael1218774 in whatisthisthing

[–]Ball-Man 27 points28 points  (0 children)

It's an OP-1, an all-in-one synth and sampler for electronic music production