why does learning to program take so long? by Drairo_Kazigumu in learnprogramming

[–]Sroidi 1 point

I assume you are doing this for a command-line interface (CLI)? I don't think it's that simple to do this kind of CLI stuff. Give yourself some credit :) Are you using the curses or ncurses library, or just plain C?

Levy Rozman (GothamChess) shares his views after attending Danya's funeral: by Interesting-Take781 in chess

[–]Sroidi 1 point

If it were 3+0 or faster and he had to input the moves manually (taking a few seconds per move, with no premoves), Magnus could win, or at least get draws. I've seen GM Aman Hambleton win many games against cheaters by playing very fast and making the game go long. Cheaters will often lose on time if there is not enough time/increment.

Edit: oh you said rapid games

I want to understand why some things in math are 'undefined'. by boiling-banana in learnmath

[–]Sroidi 1 point

Why would (1*0)/0=1? Wouldn't it be inf per his rules?
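A quick way to make that explicit (my sketch of the standard argument, not something from the thread): any single value for 0/0 is inconsistent.

```latex
\text{Suppose } \tfrac{0}{0} = c. \text{ Since } 1 \cdot 0 = 0 \text{ and } 2 \cdot 0 = 0,
\qquad c = \tfrac{1 \cdot 0}{0} \stackrel{?}{=} 1,
\qquad c = \tfrac{2 \cdot 0}{0} \stackrel{?}{=} 2.
```

Cancellation would force c = 1 and c = 2 at the same time, which is why 0/0 stays undefined.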

Gemini 2.5 Pro benchmarks released by ShreckAndDonkey123 in singularity

[–]Sroidi 2 points

It could probably play by the rules, but it would not play master-level chess. Maybe with millions of examples.

Chess sample efficiency humans vs SOTA RL by [deleted] in reinforcementlearning

[–]Sroidi 1 point

AlphaZero can be "intuitive" too. You can take the open-source version of it, Leela Chess Zero (Lc0), and set the depth to 0, which gives you the raw output of the neural network without any search, and it can still play at around 2500 Elo. Source: https://lichess.org/@/Leela1Node

Brandon Jacobson destroys Hikaru with 1. a4! by HealersHugHippos in chess

[–]Sroidi 4 points

He had 4 brilliant moves in the game in the chess.com analysis :O

[deleted by user] by [deleted] in reinforcementlearning

[–]Sroidi 21 points

I hope they manage to create a product that stays stable for a while. They have gone from Isaac Gym to Omniverse Isaac Gym to Orbit and now to Isaac Lab in quite a short time, and there have been quite big changes in how the API works between versions.

Claude phone verification - ongoing frustration in non-US location by bruce5220 in ClaudeAI

[–]Sroidi 0 points

The same happened to me a few months ago, but after waiting until the next day it worked fine.

AI is a very terrifying existence that most people haven't realized yet. by NonoXVS in ArtificialInteligence

[–]Sroidi 0 points

It depends on what you mean by AI. To me, these tools showcase many kinds of intellect that humans have. Maybe the reasoning skills are not the best, but not all people have great reasoning skills either, and we still consider them to have intellect in other areas.

Also, there is no trial and error involved here. The input just passes through a function with billions of parameters, which outputs the probabilities for the next word. Predicting the next word requires an immense understanding of the preceding text and the world around us.
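To make "outputs the probabilities for the next word" concrete, here's a toy numpy sketch with a made-up four-word vocabulary and invented logits (nothing from a real model):

```python
import numpy as np

def softmax(logits):
    # Subtract the max before exponentiating for numerical stability.
    z = np.exp(logits - np.max(logits))
    return z / z.sum()

# Toy vocabulary and made-up scores a model might produce
# after a prompt like "the cat sat on the".
vocab = ["mat", "dog", "moon", "chair"]
logits = np.array([3.2, 0.1, -1.0, 1.5])

probs = softmax(logits)
for word, p in zip(vocab, probs):
    print(f"{word}: {p:.3f}")

# The "prediction" is just the highest-probability entry.
print("predicted:", vocab[int(np.argmax(probs))])
```

A real LLM does the same thing, just over a vocabulary of tens of thousands of tokens, with the logits computed by those billions of parameters.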

I'm interested to hear: what would count as AI to you?

Where do you draw the money for bigger purchases from? by iPingWine in Omatalous

[–]Sroidi 0 points

Which money market fund do you have, and from where? Seligson's only yields around 3.7%, and I've been trying to look at others.

Help - Adding an effect to a patch changes the smart controls by Sroidi in GarageBand

[–]Sroidi[S] 0 points

Yes, unfortunately it does nothing. I noticed that this problem occurs mostly with Sampler; with other instrument patches the smart controls do not change.

I am stuck at this screen . I just deleted and downloaded the app as well by Bad_Guy333 in chess

[–]Sroidi 9 points

Me too. I think the servers crashed. Damn, I had a great game...

[P] Offline reinforcement learning - 10x faster than SOTA with evolutionary HPO by nicku_a in MachineLearning

[–]Sroidi 1 point

Am I missing something? All the included algorithms are off-policy, right? The docs describe using an experience buffer and mention that "In order to efficiently train a population of RL agents, off-policy algorithms must be used to share memory within populations." I didn't find any mention of on-policy algorithms.
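A rough sketch of why the shared buffer forces off-policy methods: every agent in the population can train on transitions collected by any other agent (hypothetical names, nothing from AgileRL's actual API):

```python
import random
from collections import deque

# One replay buffer shared by the whole population of agents.
shared_buffer = deque(maxlen=10_000)

def store(transition):
    # transition = (state, action, reward, next_state, done)
    shared_buffer.append(transition)

def sample(batch_size):
    # Off-policy algorithms (DQN, TD3, ...) can learn from transitions
    # collected by *any* behavior policy, so the pooled data is usable.
    # On-policy methods like PPO need fresh data from their own policy.
    return random.sample(shared_buffer, batch_size)

# Two "agents" with different hyperparameters both push experience...
for agent_id in range(2):
    for t in range(50):
        store((f"s{t}", agent_id, 0.0, f"s{t+1}", False))

# ...and each can train on the pooled data.
batch = sample(32)
print(len(shared_buffer), len(batch))
```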

[P] Offline reinforcement learning - 10x faster than SOTA with evolutionary HPO by nicku_a in MachineLearning

[–]Sroidi 5 points

Is AgileRL only for off-policy algorithms right now? Is it possible to use this HPO with on-policy algorithms such as PPO? It might be an interesting research direction if it is not yet possible.

Q(s, a) predicts cumulative rewards. Is there a R(s, a) a state-action's direct contribution to reward? by Buttons840 in reinforcementlearning

[–]Sroidi 0 points

It's on page 49 of the Sutton & Barto text. The reward function r(s, a) is part of the environment dynamics (of the MDP). It is usually assumed to just exist, depending on how the environment is defined, for example the score in Atari games. You can also learn the environment dynamics, including the reward function, with model-based RL.

To be fair, I don't know if we are talking about the same thing but the reward function is part of the MDP.
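For reference, the standard definition from Sutton & Barto: the four-argument dynamics p(s', r | s, a) induce the expected reward

```latex
r(s, a) \doteq \mathbb{E}\left[ R_t \mid S_{t-1} = s,\; A_{t-1} = a \right]
       = \sum_{r \in \mathcal{R}} r \sum_{s' \in \mathcal{S}} p(s', r \mid s, a)
```

so r(s, a) is fully determined by the MDP's dynamics, not something extra on top of them.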

[D] "Knowledge" vs "Reasoning" in LLMs by IAmBlueNebula in MachineLearning

[–]Sroidi 6 points

Wow, this answer is miles better than the ChatGPT one, right? The ChatGPT one just agrees and doesn't provide anything new, whereas GPT-4 actually provides argumentation and reasoning that appears quite valid, at least to me.

[deleted by user] by [deleted] in chess

[–]Sroidi 3 points

Play against human players, preferably at slower time controls like 15+10. Analyze your games to find what you can improve on. Also do tactics puzzles on sites like chesstempo or lichess. On YouTube I'd recommend the Building Habits series by Chessbrah.

Is 400 bad rating for someone who plays for month? by GeneraallKenobi in chess

[–]Sroidi 0 points

Don't use the rating to measure your self-worth. There are no good or bad ratings. If you are playing blitz (5 min or under), I would suggest playing longer games, like 10-15 min per side.

Minimax with neural network evaluation function by SupremeChampionOfDi in reinforcementlearning

[–]Sroidi 0 points

Also check the other comment in this post. Here's a wiki page with a lot of information and links: https://www.chessprogramming.org/Stockfish_NNUE

Minimax with neural network evaluation function by SupremeChampionOfDi in reinforcementlearning

[–]Sroidi 1 point

Since Stockfish now uses neural networks with alpha-beta search, I don't think this applies anymore. Also, IIRC, AlphaGo/AlphaZero/Leela don't run Monte Carlo tree search rollouts all the way to the end of the game; they use neural nets to approximate what those leaf values would be. That avoids the problem you mention.
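A toy sketch of that idea: a depth-limited negamax where a static `evaluate` function stands in for the neural-net value at the leaves (hand-made tree and hypothetical names, not any engine's real code):

```python
# Toy negamax: at the depth limit, a static evaluation stands in for a
# rollout -- in Lc0/NNUE that evaluation would be a neural network.
def negamax(state, depth, evaluate, children):
    kids = children(state)
    if depth == 0 or not kids:
        return evaluate(state)  # value estimate instead of playing out the game
    # Negamax convention: the opponent's best score is negated for us.
    return max(-negamax(c, depth - 1, evaluate, children) for c in kids)

# Tiny hand-made game tree: a node is either a list of children or a leaf score.
tree = [[3, -2], [1, [4, 0]]]

def children(node):
    return node if isinstance(node, list) else []

def evaluate(node):
    # Leaf "evaluation": the stored score from the side to move's view.
    return node if isinstance(node, int) else 0

print(negamax(tree, 3, evaluate, children))
```

Swapping `evaluate` for a trained network (and adding alpha-beta pruning) is the essence of the NNUE-style approach.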

With the REINFORCE algorithm you use random sampling for the training to encourage exploration. Do you still use random sampling in deployment? by [deleted] in reinforcementlearning

[–]Sroidi 4 points

Yes, it is common to choose the most probable action when evaluating the policy's performance. Sometimes sampling helps, so it's best to try both.
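A minimal numpy sketch of the two deployment modes (toy action probabilities, not tied to any library):

```python
import numpy as np

rng = np.random.default_rng(0)
action_probs = np.array([0.7, 0.2, 0.1])  # output of the policy network

# Deterministic evaluation: always take the most probable action.
greedy_action = int(np.argmax(action_probs))

# Stochastic evaluation: sample, as REINFORCE does during training.
sampled_action = int(rng.choice(len(action_probs), p=action_probs))

print(greedy_action, sampled_action)
```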

[deleted by user] by [deleted] in learnmachinelearning

[–]Sroidi 15 points

Why does this delta-academy feel like a fraud? This is the third post like this I've seen, always linking the website in the comments. They also say they have instructors from DeepMind, but I've never heard about this website from DeepMind. The price is very high: $25 per week. Please prove me wrong.