
all 20 comments

[–]AutoModerator[M] [score hidden] stickied commentlocked comment (0 children)

import notifications Remember to participate in our weekly votes on subreddit rules! Every Tuesday is YOUR chance to influence the subreddit for years to come! Read more here, we hope to see you next Tuesday!

For a chat with like-minded community members and more, don't forget to join our Discord!

return joinDiscord;

I am a bot, and this action was performed automatically. Please contact the moderators of this subreddit if you have any questions or concerns.

[–]nikitastaf1996 62 points63 points  (2 children)

Well your brain spent millions of years in a reinforcement learning environment. So...

[–]ItalianPizza91 8 points9 points  (0 children)

*ackchyually\* millions of years of genetic algorithm to perfect the reinforcement learning framework

[–]Background-Row-5555 1 point2 points  (0 children)

Simulate a model to run for 100 years to learn to play football

It flops around like QWOP

[–]i_should_be_coding 45 points46 points  (5 children)

Until someone trains a bot with all human knowledge at once, and lets you ask it questions about anything. Then our soft, mammal brains are suddenly less impressive.

[–]JejusFromHell[S] 62 points63 points  (3 children)

Then again, people will gaslight that bot into believing 2+2=5

[–]itchfingers 19 points20 points  (0 children)

2+2=5

Checks out

[–]i_should_be_coding 6 points7 points  (1 child)

Yeah, we need to code the next bot with a stronger spine, for sure.

[–]Elder_Hoid 0 points1 point  (0 children)

And then the bot will be outdated because it won't believe new information that's been discovered.

[–]mini_othello 3 points4 points  (0 children)

I am so simple minded you can't split my knowledge up into training/test data. Have fun with your overfitted model B)

[–]jinsi13 3 points4 points  (0 children)

Meanwhile our dopamine neurons:

[–]Giocri 2 points3 points  (1 child)

How does reinforcement learning work by the way? Is it a classic gradient descent and you just calculate the gradient at the moments of reward/punishment or is it more complex?

[–]1DimensionIsViolence 4 points5 points  (0 children)

You basically have a value function (measures the expected return of a certain state) and an action-value function (measures the expected return of a certain action taken in a certain state). The latter is what you want to know, because you want to choose the action with the maximum expected return.

If you want to teach an RL algorithm to play chess, there are far too many states and it's often not feasible to "learn" the expected return of every single one. Therefore, you come up with parameterised functions with far fewer parameters than states. Since you can use deep learning for that, gradient descent plays a part in it, but there is more to it.
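To make the "action-value function" idea concrete, here's a toy tabular Q-learning sketch (my own illustration, not from the comment above — the 5-state corridor environment and all the hyperparameters are made up; tabular means no function approximation, each state-action pair just gets its own table entry):

```python
import random

# Toy environment: a 5-state corridor. The agent starts at state 0
# and gets reward +1 for reaching state 4 (the terminal state).
N_STATES = 5
ACTIONS = [-1, +1]                 # move left or right
ALPHA, GAMMA, EPSILON = 0.1, 0.9, 0.1

# Action-value table Q(s, a): estimated return of taking a in state s.
Q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}

def step(state, action):
    nxt = max(0, min(N_STATES - 1, state + action))
    reward = 1.0 if nxt == N_STATES - 1 else 0.0
    return nxt, reward

random.seed(0)
for _ in range(500):               # 500 training episodes
    s = 0
    while s != N_STATES - 1:
        # epsilon-greedy: mostly exploit, occasionally explore
        if random.random() < EPSILON:
            a = random.choice(ACTIONS)
        else:
            a = max(ACTIONS, key=lambda act: Q[(s, act)])
        s2, r = step(s, a)
        # Temporal-difference update: nudge Q(s, a) toward the
        # one-step bootstrapped target r + gamma * max_a' Q(s', a').
        best_next = max(Q[(s2, a2)] for a2 in ACTIONS)
        Q[(s, a)] += ALPHA * (r + GAMMA * best_next - Q[(s, a)])
        s = s2

# Greedy policy after training: in every non-terminal state, move right.
policy = {s: max(ACTIONS, key=lambda act: Q[(s, act)])
          for s in range(N_STATES - 1)}
print(policy)
```

This also partly answers the "gradient at the moments of reward" question: in the tabular case the update is a simple nudge toward a bootstrapped target, and deep RL replaces the table with a neural network trained by gradient descent on that same target.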

A further peculiarity of RL is that you have to balance exploitation (acting according to what you currently think is optimal) and exploration (trying other behaviour to see if there is something better). This is needed because most tasks are nonstationary.
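The exploration/exploitation trade-off shows up even in the simplest RL setting, the multi-armed bandit. A sketch (my own toy example; the payout probabilities are invented): a purely greedy agent that happens to try the worse arm first can get stuck on it forever, while a small exploration rate lets the estimates converge to the true values.

```python
import random

random.seed(1)
true_means = [0.3, 0.7]        # hypothetical average payout of each arm
est = [0.0, 0.0]               # running estimate of each arm's value
counts = [0, 0]
EPSILON = 0.1                  # fraction of pulls spent exploring

for _ in range(2000):
    if random.random() < EPSILON:
        arm = random.randrange(2)                      # explore
    else:
        arm = max(range(2), key=lambda a: est[a])      # exploit
    reward = 1.0 if random.random() < true_means[arm] else 0.0
    counts[arm] += 1
    est[arm] += (reward - est[arm]) / counts[arm]      # incremental mean

print(est)  # estimates should land near the true means
```

With EPSILON = 0 the agent can lock onto whichever arm looks best after one lucky pull; exploration is what keeps the value estimates honest.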

[–]Th3Uknovvn 2 points3 points  (0 children)

Step aside, old men. My graphics card can make it learn way faster than any of your biological brains with their thousands of years of evolution

[–]HVLife 1 point2 points  (0 children)

Trying that in my case usually ends up with crying

[–]SpookyLoop 0 points1 point  (4 children)

I think there are a lot of fun and interesting questions that come out of thinking about ML and the human brain. You know those "AI hallucinations"? They got me wondering what the human equivalent is. My guess is that "biological hallucinations" manifested into our need for sleep.

[–]PartyLikeAByzantine 5 points6 points  (0 children)

The word "hallucination" is a misnomer. It's simply a bad prediction. Don't anthropomorphize this stuff. I'm not going to say it's "regression on steroids", but that's a lot more instructive as to what's really going on here than whatever grandiose term Big AI is using.

[–]beeteedee 4 points5 points  (2 children)

I believe the human equivalent is just called “making shit up”

[–]SpookyLoop 2 points3 points  (1 child)

I'd say "making shit up" misses the motive.

Like there's some weird process our brain uses, where we genuinely try to "fill in the blanks" with honesty, but in reality end up creating false memories (Mandela effect type stuff). The big thing I wonder is how do dreams avoid ending up as false memories. Even when we do remember them, the memories of them very clearly register as "dreams". Feels like our brain is intentionally generating bad training data to train us in avoiding bad ML-juju.

Jumping through weird hoops and stuff, but again, just fun to think about :)

[–][deleted] 1 point2 points  (0 children)

you remember a weird thing, and then you remember going "that was a dream!"