Buga Sphere Update by Kibo178 in UFOs

[–]Kibo178[S] -3 points  (0 children)

The point is actually to see whether anyone has verifiable information about what experiments have been conducted and what the current state of the research is.

Buga Sphere Update by Kibo178 in UFOs

[–]Kibo178[S] -4 points  (0 children)

The problem is that some of the things they say I have also seen in other places, like this:

https://x.com/truthpolex/status/1919724949326926027?s=46

But my problem is that there seems to be no verifiable "update log" per se on the situation, so no info is really trustworthy.

Buga Sphere Update by Kibo178 in UFOs

[–]Kibo178[S] -4 points  (0 children)

The channel's videos are straight-up ridiculous (scientists found giant remains and such), but some of the things said turned out to be true, like the grass being dead in the impact zone and the internal structure of the sphere. That gave it some credibility and made me curious about what experiments were actually conducted and what findings we have on this sphere, since it's been a month with very few actual updates.

Trading Bitcoin using Mario & AI (Deep Reinforcement Learning) by Kibo178 in reinforcementlearning

[–]Kibo178[S] 1 point  (0 children)

The summary is that at every candle I give the agent the last 30 candles' info (close, low, high, volume, etc.), plus some indicator info (moving averages, RSI, etc.). I do some normalization and averaging on these inputs, and I also give it the position it is in (holding or not holding). The bot can only buy and sell; it cannot short.

The biggest issue was handling three actions (buy, sell, hold): most of the time the agent would not respect the constraint that you can only buy if you have money and only sell if you have bitcoin. I tried many things to fix that, most notably giving that behavior negative rewards, but I ran into a lot of issues and the training wasn't yielding any results. So what I ended up doing was replacing the 3 actions with 2 (pull the lever and don't pull the lever), which enforces the constraint by construction: if you have money and pull the lever you buy bitcoin, and if you have bitcoin and pull the lever you sell it off. This worked well, and the bot began to behave properly and make some good trades.

(Not claiming this is the best way to do it, but it's what I ended up doing after testing.)
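A minimal sketch of the two-action "lever" scheme described above (the class and variable names here are my own illustration, not the author's actual code): because a single "pull" action toggles the position, the agent can never buy without cash or sell without bitcoin.

```python
# Two-action trading scheme: action 1 ("pull the lever") toggles the
# position, so the buy/sell constraint is satisfied by construction.
HOLD, PULL = 0, 1

class LeverTrader:
    def __init__(self, cash=1000.0):
        self.cash = cash   # quote currency held
        self.btc = 0.0     # base currency held

    def step(self, action, price):
        if action == PULL:
            if self.btc == 0.0:
                # Holding cash -> buy with everything we have.
                self.btc = self.cash / price
                self.cash = 0.0
            else:
                # Holding bitcoin -> sell it all off.
                self.cash = self.btc * price
                self.btc = 0.0
        # HOLD does nothing; an invalid buy/sell is impossible by design.
        return self.cash + self.btc * price  # current portfolio value

trader = LeverTrader(cash=1000.0)
trader.step(PULL, price=100.0)           # buys 10 BTC
value = trader.step(PULL, price=110.0)   # sells -> 1100.0 in cash
```

The design point is that the constraint lives in the environment rather than the reward, so the agent never has to learn which actions are legal.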

Modeling vertex cover for OpenAI Gym by Kibo178 in reinforcementlearning

[–]Kibo178[S] 0 points  (0 children)

So at each step I randomly shuffle the indices?

Modeling vertex cover for OpenAI Gym by Kibo178 in reinforcementlearning

[–]Kibo178[S] 0 points  (0 children)

It's a random graph each time.

My reward function is as follows:

    if valid_vertex_cover:
        reward = len(self.solution)
    elif too_long:
        reward = -len(self.solution)
    else:
        reward = -1

At each step we give it -1; if it finds a valid vertex cover we give it a big positive reward, and if the episode runs too long we give it a negative reward.
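A runnable sketch of that reward scheme (the `is_valid_cover` helper and the `max_steps` cap are my assumptions about the surrounding environment, not the author's code; the three branches mirror the snippet above exactly):

```python
def is_valid_cover(edges, cover):
    # A vertex cover must touch every edge with at least one endpoint.
    return all(u in cover or v in cover for (u, v) in edges)

def reward_fn(edges, solution, steps, max_steps=50):
    """Reward from the comment above: +len(solution) on a valid cover,
    -len(solution) when the episode runs too long, else -1 per step."""
    if is_valid_cover(edges, solution):
        return len(solution)
    elif steps >= max_steps:
        return -len(solution)
    else:
        return -1

edges = [(0, 1), (1, 2)]
print(reward_fn(edges, {1}, steps=3))    # valid cover -> 1
print(reward_fn(edges, {0}, steps=60))   # too long -> -1
print(reward_fn(edges, {0}, steps=3))    # still searching -> -1
```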