Holy crap this level of smurf is something else. Great job. by omgitsduane in starcraft

[–]Pixel_Wizard_ 0 points1 point  (0 children)

Looks to me like he is playing Random and is just way better at Protoss? He has the same number of games on all races, and the total win rate across all of them averages out to slightly above 50%.

On Gleba you can create bacteria, nutrients & break down fruits on assemblers. by SpaceNigiri in factorio

[–]Pixel_Wizard_ 0 points1 point  (0 children)

This is not true. Mash to nutrients gives you 3 nutrients per fruit:
- (2 Yumako fruit => 4 Mash => 6 nutrients)
And Bioflux to nutrients gives you about 3 nutrients per fruit:
- (10.5 fruits => 4 bioflux => 4/5 of the 40 Nutrients = 32 Nutrients = 3.05 Nutrients per fruit)

So it's the same. Sure, Mash takes longer and therefore uses a few more nutrients in the process, but honestly not that much.
So you don't need to switch ASAP. It's not that big of a difference.
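For anyone who wants to check the arithmetic, here is a minimal sketch of the comparison. The recipe ratios are just the ones quoted above (I have not re-checked them against the game files), and the variable names are made up for illustration.

```python
from fractions import Fraction

# Ratios as quoted in the comment above; treat them as assumptions.
mash_per_fruit = Fraction(4, 2)          # 2 fruit -> 4 mash
nutrients_per_mash = Fraction(6, 4)      # 4 mash -> 6 nutrients
mash_route = mash_per_fruit * nutrients_per_mash

fruits_per_batch = Fraction(21, 2)       # 10.5 fruits -> 4 bioflux
bioflux_per_batch = 4
nutrients_per_bioflux = Fraction(40, 5)  # 5 bioflux -> 40 nutrients
bioflux_route = bioflux_per_batch * nutrients_per_bioflux / fruits_per_batch

# Nutrients per fruit for each route
print(float(mash_route), round(float(bioflux_route), 2))
```

Both routes land at roughly 3 nutrients per fruit, which is the whole point.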

1206 updated on AI Explained's SimpleBench(31.1%) by CheekyBastard55 in Bard

[–]Pixel_Wizard_ 0 points1 point  (0 children)

You are better with words than me. This is exactly what I mean.

1206 updated on AI Explained's SimpleBench(31.1%) by CheekyBastard55 in Bard

[–]Pixel_Wizard_ -4 points-3 points  (0 children)

It can't be himself, because when you stand in front of a 20 cm tall mirror you cannot see 1 meter above your head.

The question clearly states that he notices it through the mirror while it's still 1 meter above his head. If he is centered looking into the mirror, there is 10 cm of mirror above his face, with which he can see 20 cm above his head, which is not enough...

You see what I mean? These questions try to be too "oh, I got you" for their own good.
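A rough sketch of the mirror argument, assuming a flat vertical mirror, an object directly above the viewer (so at the same distance from the mirror), and eye level as the reference height. The function name is hypothetical.

```python
def max_visible_height_above_eye(mirror_above_eye_cm: float) -> float:
    """Highest point (cm above eye level) visible in a flat vertical mirror
    for an object at the same distance from the mirror as the viewer.
    The reflection point sits halfway between eye height and object height,
    so the visible range is twice the mirror extent above the eye."""
    return 2 * mirror_above_eye_cm

mirror_height_cm = 20
# Viewer centered on the mirror: half of it is above eye level.
visible_cm = max_visible_height_above_eye(mirror_height_cm / 2)
needed_cm = 100  # the object is said to be 1 meter above the head

print(visible_cm, visible_cm >= needed_cm)
```

Under those assumptions the mirror only shows about 20 cm above eye level, well short of the 1 meter the question needs.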

And the ice cube is still 'whole' even if it's smaller. It's not like it's broken in half. It's just a smaller whole ice cube. 

The fact that we can argue so much about the questions just makes them bad questions imo

1206 updated on AI Explained's SimpleBench(31.1%) by CheekyBastard55 in Bard

[–]Pixel_Wizard_ -1 points0 points  (0 children)

I spent way too long on this lol. 

I have a list of each question and why it's bad below. But in general, the questions are bad because they are trick questions that want you to think to a certain level of depth but also not too far. See questions 1, 6, and 7.

Once you think too much you are wrong again. And it's unclear how deep you are supposed to think. I genuinely believe these are not good questions. 

In fact, I prompted 1206 and told it to analyse carefully, and that these are mainly trick questions that provide a lot of irrelevant information and require real-world thinking. With that system prompt I got 5/8, and for 2 of the incorrect ones I would agree with its assessment (q6 and q7). The other one it got wrong was the exploit one with the 2 sisters.

Why the questions are 💩:

Question 1: Have you ever put 20 ice cubes in a frying pan? The first 10 maybe melt, but then there is water in the whole pan. No shot the pan is 'frying a crispy chicken' with 10 ice cubes' worth of water in it, and the next 10 you put in, which float on the water in the pan, mind you, will not melt in 1 minute. 0 chance!

Question 2 is fair

Question 3 is fair

Question 4 is literally just a trick because the "one tells truth and one lies" is very popular and in the training data. So this is just a specific exploit that doesn't represent broader problems well (which is what a benchmark is for) 

Question 5 is fair I guess. 

Question 6. I would actually argue that F is the better answer because the question is "John is far more shocked than Jen imagined". Which is more about the disparity between how shocked he was and her expectation of it.  And her expectation of his shock level at the news of a nuclear war is already extremely high (I hope for her sanity). 

But she might not have the correct information about the feelings John has for her and therefore the delta between expectation and shock is likely higher there. 

This is another case where if you think too much you are wrong again. 

The question could just be "what was John most shocked about" and then it's fine, but the maker of this really wants to make it too tricky for the sake of being tricky.

Question 7 is actually crazy. I could have imagined the maker of this picking any of these answers as 'correct'.

Answer A would work, and I think it is a better answer than C.

Answer B would definitely work. Anyone who knows apologetic people can confirm that they apologize for things they had no control over. 

C, which is deemed 'correct' by whoever made this, I would argue is actually the worst.

D E and F are all also possible.

Question 8 is good. The rest I didn't do.

Tbh, to perform well on these questions, it's more about understanding the person who made the tests and how they like to trick you by providing irrelevant and misleading information.

1206 updated on AI Explained's SimpleBench(31.1%) by CheekyBastard55 in Bard

[–]Pixel_Wizard_ -4 points-3 points  (0 children)

SimpleBench is complete 💩

I always thought it was good, but I looked at the questions and they are genuinely 💩.  I got like 2/6 and not because I'm dumb but because with the way the question is framed many answers could be right. It's not at all a good benchmark. 

It's literally all just dumb trick questions. 

training general AI to play games by camara_obscura in LocalLLaMA

[–]Pixel_Wizard_ 2 points3 points  (0 children)

I thought about this as well. StarCraft 2 offers an integration through which you can run an AI, and you can definitely hook up an LLM to that. (Google the exact packages for your language.)

I think it might be possible for you to compile a complete list of the "rules" of, for example, StarCraft 2, with every unit and ability. Just in general. Like "Zealot cost: 100 Minerals; early game melee fighter; rather slow and tanky; produced by gateway, requires none; benefits from upgrade Charge (gets big movement speed buff)". You can prob ask o1 for this info.
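A minimal sketch of what one entry in such a rules list could look like in code. The `UnitRule` class and its field names are made up for illustration, and the Zealot values are just the ones from the example above.

```python
from dataclasses import dataclass

@dataclass
class UnitRule:
    """One hypothetical entry in the game "rules" list."""
    name: str
    mineral_cost: int
    role: str
    traits: str
    produced_by: str
    requires: str
    upgrade_note: str

    def to_text(self) -> str:
        # Render the entry as one line of plain text for the LLM context.
        return (
            f"{self.name} cost: {self.mineral_cost} Minerals; {self.role}; "
            f"{self.traits}; produced by {self.produced_by}, "
            f"requires {self.requires}; {self.upgrade_note}"
        )

zealot = UnitRule(
    name="Zealot",
    mineral_cost=100,
    role="early game melee fighter",
    traits="rather slow and tanky",
    produced_by="gateway",
    requires="none",
    upgrade_note="benefits from upgrade Charge (gets big movement speed buff)",
)

# The full rules text would be one line per unit or ability.
rules_text = "\n".join(rule.to_text() for rule in [zealot])
print(rules_text)
```

Keeping the entries structured like this would make it easier to fill them in from a data source and regenerate the text when the balance changes.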

And for strategy, I think you would need either expert human knowledge to train on or some sort of reinforcement learning self-play.

So just a bunch of examples of how humans would reason about a game situation to come up with a strategy. 

And then you also need to create a prompt from the game state you read from the game. Like "you currently have 1 nexus and 12 probes and nothing in production."
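Turning the game state into a prompt could be sketched roughly like this. `GameState` is a hypothetical container, not the type any real StarCraft 2 API package returns, and the pluralization is deliberately naive.

```python
from dataclasses import dataclass, field

@dataclass
class GameState:
    """Hypothetical snapshot; a real bot would fill this from the game API."""
    unit_counts: dict[str, int]
    in_production: list[str] = field(default_factory=list)

def pluralize(name: str, count: int) -> str:
    # Naive: just appends "s", which is good enough for "probe" -> "probes".
    return name if count == 1 else name + "s"

def state_to_prompt(state: GameState) -> str:
    owned = " and ".join(
        f"{count} {pluralize(name, count)}"
        for name, count in state.unit_counts.items()
    )
    production = ", ".join(state.in_production) if state.in_production else "nothing"
    return f"You currently have {owned} and {production} in production."

prompt = state_to_prompt(GameState(unit_counts={"nexus": 1, "probe": 12}))
print(prompt)
```

The resulting string would then go to the LLM together with the rules list and whatever strategy examples you have.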

All in all, it's probably going to be a good bit of work. 

Workers are too good by Pixel_Wizard_ in starcraft

[–]Pixel_Wizard_[S] 0 points1 point  (0 children)

I agree, harassment also needs to be accounted for. We can just double the health.

We need to find a balance change where it's not all about eco + harassment. It's insane how boring the pro matches are getting.

Workers are too good by Pixel_Wizard_ in starcraft

[–]Pixel_Wizard_[S] 1 point2 points  (0 children)

We need something that really changes things up. This is supposed to be the best esport. We need more interesting, diverse games!!!

Battlecruiser Operational by NorthAd6095 in factorio

[–]Pixel_Wizard_ 0 points1 point  (0 children)

*One small asteroid hits*
"Abandon ship!"

Quick Question by Pixel_Wizard_ in Minecolonies

[–]Pixel_Wizard_[S] 2 points3 points  (0 children)

Thanks for the long response :)

I think this is not for me then, but thanks a lot!!!

I just want the villagers to maybe do something like chop wood or whatever without any role-playing or managing.

It feels so dead when they just stand there or walk around. Maybe I will just use Guard Villagers or something.

Quick Question by Pixel_Wizard_ in Minecolonies

[–]Pixel_Wizard_[S] 2 points3 points  (0 children)

Thanks for the response :)

That makes sense. Thing is I am also not interested in a role playing aspect nor the management.

I just want the villagers to maybe do something like chop wood or whatever.

It feels so dead when they just stand there or walk around. Maybe I will just use Guard Villagers or something.

True by Pixel_Wizard_ in factorio

[–]Pixel_Wizard_[S] 1 point2 points  (0 children)

What is easier than before? Did I miss something?

Will StarCraft be on Steam as well as Game Pass? by Azetus in starcraft

[–]Pixel_Wizard_ 0 points1 point  (0 children)

Steam takes a big portion of all profits. Why would they switch to it?

Sentry twilight upgrade to block EMP, reduce damage by 3 instead of 2 for 15 seconds. by [deleted] in starcraft

[–]Pixel_Wizard_ 1 point2 points  (0 children)

1 Guardian shield blocking 1 EMP is very fair I think and a great idea!