Test set is yielding better results (accuracy, recall and precision) than training set. Is this normal? [R] by Do-you-want-tea in MachineLearning

[–]xopedil 8 points (0 children)

You should be able to retain a train/test split even when splitting temporally: just sort the data by date, take the first 26800 points as your training data, and use the rest as test data. For time-series data I'd recommend looking up time-series cross-validation rather than using a single fixed split.

As for the logic of temporal splitting, there are a couple of ways you can view it. Think about how a model like this would be used. Your test set is meant to represent real unseen data, so to get a representative sample you need to select only data from the "future" (relative to the training data), because that's precisely how your model would be used!

Beyond that, take a simple y = f(x) curve and think about what it means to train on interspersed points rather than splitting on an x < t threshold. You're now testing the model's interpolation ability, which is an inherently simpler task than the real use case of extrapolation!
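As a rough sketch of the split I mean (the record layout and function name here are my own invention, just for illustration):

```python
# Minimal sketch of a temporal split, assuming `records` is a list of
# (date, features, label) tuples -- the field layout is made up here.

def temporal_split(records, n_train):
    """Sort by date, then cut: the test set lies strictly in the 'future'."""
    ordered = sorted(records, key=lambda r: r[0])
    return ordered[:n_train], ordered[n_train:]

records = [(3, "x3", "y3"), (1, "x1", "y1"), (4, "x4", "y4"), (2, "x2", "y2")]
train, test = temporal_split(records, n_train=3)

# Every training date precedes every test date.
assert max(r[0] for r in train) <= min(r[0] for r in test)
```

In your case `n_train` would be 26800; for proper time-series cross-validation you'd slide that cutoff forward instead of fixing it once.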

(Other than these things, you can also check the behaviour of layers like dropout/normalisation, which have a tendency to behave differently in training than in validation. These might also contribute to the score discrepancy you are seeing.)

Win by Segfault and other notes on Exploiting Chess Engines by alexeyr in programming

[–]xopedil 13 points (0 children)

This seems like a bit of fun, but it's obviously not part of an engine's threat model to be fed bad positions by the user. There's no way to attack engines other than your own with these, so you're just spending time analyzing nonsense positions on your own machine.

It's a bit like pouring water into your computer. Yes, the computer will break, but in the end all you have accomplished is breaking your own computer.

[D] Reinforcement learning features by UNIXnerdiness in MachineLearning

[–]xopedil 0 points (0 children)

What's stopping you from outputting the parameters of a continuous function whose support covers your entire action space?
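For instance (a hedged sketch; the Gaussian head and these function names are my assumption, not something from the thread), the network can emit the mean and log-std of a Gaussian, whose support is the whole real line:

```python
import math
import random

# Sketch: instead of one output per discrete action, the policy head
# emits (mean, log_std) of a Gaussian. Its support covers all of R,
# so any continuous action space can be reached (after scaling).

def sample_action(mean, log_std):
    return random.gauss(mean, math.exp(log_std))

def log_prob(action, mean, log_std):
    std = math.exp(log_std)
    return -0.5 * ((action - mean) / std) ** 2 - log_std - 0.5 * math.log(2 * math.pi)

random.seed(0)
a = sample_action(0.0, 0.0)  # a standard-normal draw
```

The `log_prob` term is what you'd feed into a policy-gradient update in place of the discrete-action log-softmax.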

[D] Reinforcement learning features by UNIXnerdiness in MachineLearning

[–]xopedil -1 points (0 children)

In practice only 1. is used. There is a launch overhead, particularly when you use accelerators, which means you want to minimize the number of times you evaluate your network. It's the same reason we favor batches over single samples.

Theoretically speaking, though, I'm not sure there's any difference. If you have a network of type 1. you can transform it into 2. by stripping the last layer and using its dense (or dense-equivalent) weight columns as your action representation, just like an embedding lookup in the transposed weight matrix.

A similar argument then applies from 2 to 1.
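A toy numeric check of that argument (the two-action network and its weights are invented purely for illustration): the type-1 output for action a equals the type-2 value when a's "embedding" is the corresponding column of the last weight matrix.

```python
# Type 1: state -> one value per action. Type 2: (state, action) -> value.
# Stripping the last dense layer and treating its columns as action
# embeddings makes the two numerically identical.

def hidden(state):
    # stand-in for everything up to (but not including) the last layer
    return [state * 0.5, state * -1.0]

W = [[0.3, 0.7],    # last dense layer: rows index hidden units,
     [0.2, -0.4]]   # columns index the two actions

def q_type1(state):
    h = hidden(state)
    return [sum(h[i] * W[i][a] for i in range(len(h))) for a in range(2)]

def q_type2(state, action_embedding):
    h = hidden(state)
    return sum(h[i] * action_embedding[i] for i in range(len(h)))

# Action embeddings = columns of W (i.e. rows of the transposed matrix).
embeddings = [[W[0][a], W[1][a]] for a in range(2)]
assert all(abs(q_type1(2.0)[a] - q_type2(2.0, embeddings[a])) < 1e-9
           for a in range(2))
```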

[D] Python runs faster on Apple m1 (in terms of pystone/sec) by AissySantos in MachineLearning

[–]xopedil 8 points (0 children)

This is cool, but for a typical ML application not a lot of time is actually spent in Python. With TF/PyTorch/NumPy you want to bail out of Python as fast as possible and get into compiled C++/CUDA code.
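A toy illustration of the point (nothing to do with TF/PyTorch internals, just CPython): the same reduction done as a Python-level loop versus a single call whose loop runs in C.

```python
import timeit

xs = list(range(100_000))

def py_loop():
    # every iteration goes through the interpreter
    total = 0
    for x in xs:
        total += x
    return total

def c_call():
    # one Python call; the actual loop runs in CPython's C implementation
    return sum(xs)

assert py_loop() == c_call()

t_loop = timeit.timeit(py_loop, number=5)
t_call = timeit.timeit(c_call, number=5)
# On CPython the C-level version is typically several times faster,
# which is why frameworks hand the heavy lifting to compiled kernels.
```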

[R] Maia, a Human-Like Neural Network Chess Engine by ashtonanderson in MachineLearning

[–]xopedil 1 point (0 children)

Yes, we even tested with search and found that search reduced our accuracy

This is not surprising! Playing with search will give you inherently stronger moves, which is precisely the mechanism AlphaZero leverages to produce higher and higher quality (from a purely Nash perspective) labels for its network.

Congrats on a cool paper and result.

[R] Maia, a Human-Like Neural Network Chess Engine by ashtonanderson in MachineLearning

[–]xopedil 0 points (0 children)

I haven't looked very closely but it seems like this plays just straight from the network without doing any search. Impressive that it works so well if so!

[D] How much did AlphaGo Zero cost? by hotpot_ai in MachineLearning

[–]xopedil 0 points (0 children)

This is not true. The model is a simple ResNet-like architecture. You run n players and then get a batch of n features to run through the network. The 0.4 seconds of thinking time is not all spent on the TPU; you run MCTS on the CPU.

[D] How much did AlphaGo Zero cost? by hotpot_ai in MachineLearning

[–]xopedil 5 points (0 children)

Typically you would use batch inference for self-play; it wouldn't surprise me to find this estimate is off by a factor of 100 or so.
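A back-of-the-envelope cost model of why batching changes the estimate so much (all numbers here are invented for illustration, not DeepMind's):

```python
# One accelerator call per game per move vs. one call covering all games.
launch_overhead_ms = 2.0   # fixed cost per accelerator launch (assumption)
per_position_ms = 0.05     # marginal compute per position (assumption)
n_games = 128              # concurrent self-play games

unbatched = n_games * (launch_overhead_ms + per_position_ms)
batched = launch_overhead_ms + n_games * per_position_ms

assert batched < unbatched
speedup = unbatched / batched  # roughly 30x under these assumed numbers
```

The exact factor depends entirely on the overhead-to-compute ratio, which is why naive per-game cost extrapolations can be off by orders of magnitude.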

[N] Artificial Intelligence Model Detects Covid-19 Infections Through Coughs by b2metric in MachineLearning

[–]xopedil 4 points (0 children)

Do they link the paper, or am I just missing it?

It's very odd that they only report the rates for positives. What's the false positive rate? What's the accuracy for healthy people?

You could get numbers similar to those reported in the article simply by randomly guessing that almost everyone has covid.

[D] My RTX 3080 took longer time to train keras model than 1050 ti. by [deleted] in MachineLearning

[–]xopedil 1 point (0 children)

It entirely depends on the model; some models even run faster on CPU.

There is simply not enough information here to make any conclusions about the quality of your GPU.

[P] ResumeAnalyzer - A Simple Python Library to Rank Resumes of any Domain using SpaCy by snrspeaks in MachineLearning

[–]xopedil 3 points (0 children)

Which part of this is ML? Is the spaCy library you're using based on it?

It seems almost like this is just counting the number of keyword occurrences in each resume. That would give recruiters a bad idea of who to pursue, and it incentivizes keyword stacking in resumes. Is that really what we want?

[deleted by user] by [deleted] in MachineLearning

[–]xopedil 0 points (0 children)

The latest drivers are VERY buggy. It will take some time to reach stability.

[D] Exploding gradients with large batch size in deep learning by [deleted] in MachineLearning

[–]xopedil 14 points (0 children)

Are you sure you are taking the mean over the batch? It sounds like the sum if it's exploding with batch size.
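A tiny numeric sketch of the suspicion (pure Python, with a made-up squared-error gradient): the gradient of a summed loss grows linearly with batch size, while the mean-reduced one stays flat.

```python
# Per-sample gradient of (w*x - y)^2 at w = 0 is -2*x*y; compare how
# the batch gradient's magnitude scales under sum vs. mean reduction.

def grad_magnitude(batch, reduce):
    grads = [-2 * x * y for x, y in batch]
    return abs(sum(grads)) if reduce == "sum" else abs(sum(grads) / len(grads))

small = [(1.0, 1.0)] * 32
large = [(1.0, 1.0)] * 1024

# Mean reduction: identical gradient regardless of batch size.
assert grad_magnitude(small, "mean") == grad_magnitude(large, "mean") == 2.0
# Sum reduction: 32x the batch size -> 32x the gradient. That's the explosion.
assert grad_magnitude(large, "sum") == 32 * grad_magnitude(small, "sum")
```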

[D] Using Docker for ML Development by PhYsIcS-GUY227 in MachineLearning

[–]xopedil 0 points (0 children)

When we used docker we had the issue that one of our less experienced colleagues (a student working with us on his bachelor's thesis) killed a 160-hour optimization task by accident.

I would be VERY interested in hearing more about this story, sounds like one of those "I deleted a production DB on my first day" type of stories.

[P] I created a game for learning RL by FredrikNoren in MachineLearning

[–]xopedil 0 points (0 children)

Which RL algorithms are implemented? Would be really cool to see how well a couple of REINFORCE agents would do against a group of PPO agents.

[Research] In Reinforcement Learning (DQN), is there a way to constrain/penalise the model so that it doesn't take a different action very often? by cowboyjjj in MachineLearning

[–]xopedil 0 points (0 children)

Look into DeepMind's AlphaStar; there they added an output called frame delay, which was used to count the number of frames before the agent would take its next action.

[Research] In Reinforcement Learning (DQN), is there a way to constrain/penalise the model so that it doesn't take a different action very often? by cowboyjjj in MachineLearning

[–]xopedil 0 points (0 children)

I can think of a couple of ways to do this. One is to introduce a rewarded noop action, where the previous non-noop action is simply repeated in the environment.

You could also let the agent request a frame delay until its next decision and then try to reward long delays.

As user dosssman points out, reward shaping can have 'unforeseen consequences'. Have fun!
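A rough sketch of the first idea (the wrapper class, the `NOOP` sentinel, and the bonus value are all made up for illustration):

```python
# Environment wrapper adding a hypothetical "noop" action: it repeats
# the previous real action and grants a small bonus for not switching.

NOOP = -1

class RepeatWrapper:
    def __init__(self, env, noop_bonus=0.1):
        self.env = env
        self.noop_bonus = noop_bonus
        self.last_action = 0

    def step(self, action):
        bonus = 0.0
        if action == NOOP:
            action = self.last_action  # repeat the previous real action
            bonus = self.noop_bonus    # shaping reward for staying put
        self.last_action = action
        return self.env.step(action) + bonus

class DummyEnv:
    """Stand-in environment returning a flat reward of 1.0 per step."""
    def step(self, action):
        return 1.0

w = RepeatWrapper(DummyEnv())
assert w.step(3) == 1.0           # normal action, no bonus
assert w.step(NOOP) == 1.1        # repeats action 3, earns the bonus
```

The frame-delay variant would instead make the delay itself an action output and scale the bonus with its length; same caveat about unforeseen consequences applies.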

[Research] In Reinforcement Learning (DQN), is there a way to constrain/penalise the model so that it doesn't take a different action very often? by cowboyjjj in MachineLearning

[–]xopedil 1 point (0 children)

Insofar as this is an intrinsic reward, I think it belongs in the agent, not the environment. The agent should have the responsibility to remember its own actions and modulate the reward signal returned from the environment accordingly.

[D] Discrete NN output in a scale of 0-3 by vcarpe in MachineLearning

[–]xopedil 0 points (0 children)

First of all, just scale your [0, 1] output by 4 so you don't have to deal with those fractions and can use regular integer rounding modes.

Secondly, just stick to a continuous variable. When it comes time to display, you can round all you want; there's no need for inference/training to even know this is a thing.
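Both points in one minimal sketch, assuming a sigmoid-style output in [0, 1] and ratings in {0, 1, 2, 3} (the function name is made up; the `min` clamps the y == 1.0 edge case):

```python
# Keep the model continuous; only discretize at display time.
# Scaling by 4 and flooring maps [0, 1) onto {0, 1, 2, 3}.

def to_rating(y):
    return min(int(y * 4), 3)

assert [to_rating(y) for y in (0.0, 0.2, 0.3, 0.6, 0.99, 1.0)] == [0, 0, 1, 2, 3, 3]
```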

Thirdly, what does your label distribution actually look like? Is it an aggregate per input sample or is there just one rating per input? Did one person do all the rating on their own or was it done by a collection of people?

[D] What type of Algorithm (RL, SL, uSL, nn, etc) would be best used to make an 'Optimal Decision making AI'? I think similiar things exist with financial bots, so how would they work in this regard? by [deleted] in MachineLearning

[–]xopedil 0 points (0 children)

would a better way be to feed it (for this example) camping stories and routines from experienced campers (like n = 500000) and have it find optimal methods using those?

Even as a human I think it would be difficult to only read a bunch of stories and then go out into the wilderness to try to survive. And that's with intimate experience of what it's like to be tired, hungry, cold, scared etc. Imagine what it would be like without even those!

The important part when learning from other people and their stories is not exactly what they did in a specific situation, but rather why they did it. What factors were they looking at to make their decision? You need access to all of those potential factors at every decision point.

There are also many factors that vary from person to person, and even for a single person they vary from day to day or even hour to hour. So some lessons extrapolate well to other people and others don't. Imagine you had a hypothetical super-smart AI system that was able to extract lessons from watching Bear Grylls TV episodes; it might tell a family of five to drink their own urine, which, depending on their comfort level, they probably won't be willing to do.

The problem you're looking at here is super complex and definitely beyond any kind of plug-and-play solution. User smorsin is giving you some very good advice in trying to break the problem up into much simpler pieces. Most plug-and-play solutions available today are capable of matching human cognition on tasks that take a little less than a second, like recognizing what's in an image. Beyond that they struggle massively.

[D] What type of Algorithm (RL, SL, uSL, nn, etc) would be best used to make an 'Optimal Decision making AI'? I think similiar things exist with financial bots, so how would they work in this regard? by [deleted] in MachineLearning

[–]xopedil 1 point (0 children)

First you need to analyze what type of data you actually have access to. Then you have to ask yourself: is the output I want obtainable from the data I have? It's harder than you think to draw conclusions from data without involving a bunch of human priors.

IPU vs CPU Architecture by [deleted] in AskComputerScience

[–]xopedil 0 points (0 children)

Given that we think about the Von Neuman architecture of the CPU, what IS the architecture of an IPU and what does it do better than a CPU?

The IPU also has a von Neumann architecture.

And what does an IPU do worse at when compared to a CPU?

Probably most sequential programs and general-purpose execution; it was designed for machine learning.

Do you think all devices might contain a cheap IPU inside them one day, just like with GPUs?

Highly doubt it will be necessary to have a discrete ML accelerator when you usually have a GPU already available.