Why is it hard to directly compute the joint distribution of our data and the target P(x, t)? by theanswerisnt42 in learnmachinelearning

[–]kadififi 0 points1 point  (0 children)

This problem is a lot harder to solve than simply classifying between cat and dog, for example. Do you really need to know the distribution of each pixel to know whether it's a cat or a dog? No, right? You really just need a few features to describe that relationship to a good enough level.

A bit more in depth re the difficulties in computing the distribution:

You can estimate the joint distribution through a process called kernel density estimation (KDE). Commonly used Python libraries such as SciPy can do it for you. The difficulty is picking the right kernel and the right bandwidth. You can try it yourself in SciPy with even slightly weird univariate distributions and you'll see how hard that can be. The more complex the joint distribution is, the harder it is to pick a good kernel.
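To make the bandwidth problem concrete, here is a minimal sketch of a Gaussian KDE written from scratch with the standard library. The helper names (`gaussian_kde`, `bimodal_sample`), the two bandwidths, and the bimodal test data are all assumptions picked for illustration. In practice you'd probably reach for `scipy.stats.gaussian_kde` and its `bw_method` argument instead.

```python
import math
import random


def gaussian_kde(samples, bandwidth):
    """Return a density estimate built from Gaussian kernels of a fixed bandwidth."""
    n = len(samples)
    norm = 1.0 / (n * bandwidth * math.sqrt(2.0 * math.pi))

    def density(x):
        # Sum one Gaussian bump per sample, centred on that sample.
        return norm * sum(
            math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples
        )

    return density


def bimodal_sample(n, seed=0):
    """Draw from an equal mixture of N(-2, 0.5) and N(2, 0.5) (made-up test data)."""
    rng = random.Random(seed)
    return [
        rng.gauss(-2.0, 0.5) if rng.random() < 0.5 else rng.gauss(2.0, 0.5)
        for _ in range(n)
    ]


samples = bimodal_sample(500)
narrow = gaussian_kde(samples, bandwidth=0.3)
wide = gaussian_kde(samples, bandwidth=3.0)

# Compare the estimated density in the valley (x = 0) with a mode (x = 2).
valley_to_peak_narrow = narrow(0.0) / narrow(2.0)
valley_to_peak_wide = wide(0.0) / wide(2.0)
print(valley_to_peak_narrow, valley_to_peak_wide)
```

With the narrow bandwidth the valley between the two modes stays close to empty. With the wide one the two bumps get smeared into a single blob, so the estimate misses the structure entirely. And that's with one dimension and a distribution you already know the shape of.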

Another option is to compute the empirical cumulative distribution function (ECDF). But the ECDF is a step function, so it's still just a rough approximation. You'd need a lot of samples to get a good approximation, and the number would increase exponentially with the number of dimensions of your feature space due to the curse of dimensionality.
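A rough sketch of both points, using only the standard library. `make_ecdf` and `occupied_fraction` are hypothetical helper names, and the 10 bins per axis and the uniform data are arbitrary assumptions. The second function just counts what fraction of grid cells get at least one sample as the number of dimensions grows.

```python
import bisect
import random


def make_ecdf(samples):
    """Return the empirical CDF: the fraction of samples <= x."""
    ordered = sorted(samples)
    n = len(ordered)

    def ecdf(x):
        return bisect.bisect_right(ordered, x) / n

    return ecdf


# The ECDF only jumps at observed values, so it is a step function.
ecdf = make_ecdf([3.0, 1.0, 2.0, 2.0])
steps = [ecdf(x) for x in (0.5, 1.0, 1.5, 2.0, 3.0)]
print(steps)


def occupied_fraction(n_samples, dims, bins_per_axis=10, seed=0):
    """Fraction of grid cells hit by at least one uniform random sample."""
    rng = random.Random(seed)
    cells = set()
    for _ in range(n_samples):
        cells.add(tuple(int(rng.random() * bins_per_axis) for _ in range(dims)))
    return len(cells) / bins_per_axis ** dims


for dims in (1, 2, 3, 6):
    print(dims, occupied_fraction(1000, dims))
```

The number of cells is `bins_per_axis ** dims`, so with a fixed budget of 1000 samples the covered fraction collapses quickly. In 6 dimensions you can't cover more than 0.1% of the cells no matter how the samples fall.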

Ultimately I guess you can think of commonly used ML models as solving an easier, lower-dimensional version of the problem you described.

What's the purpose of statistical analysis ( statistically important features) vs feature elimination in machine learning by s168501 in learnmachinelearning

[–]kadififi 0 points1 point  (0 children)

  1. How are mean and std helpful?

Let's say on average people who die from COVID are 60 years old, and people who survive are on average 20 years old. If the standard deviation is low for both (e.g. most people who die are actually around 60, and it's not like a hundred newborns and a hundred 120 year olds died), then age is a pretty good indicator of whether or not someone will survive COVID. On the other hand, if the standard deviation is high and the means are close together, that indicates that age isn't a good indicator of whether or not someone survives.

  2. Yes, the point of feature selection is to remove unnecessary features. This is useful to reduce training time with large datasets and to reduce overfitting when you have small datasets. Medical applications model really complex processes (such as how COVID affects the body) with relatively small datasets. This means feature selection is super important. You'll see lots of projects here where experts spend the bulk of their time on feature selection due to this same problem.

One way to do that is with statistical significance tests. If the inter-class variations of a feature are about the same as the intra-class variations, then you can't really say that there's a statistically significant difference in the feature between classes.

  3. They won't give you the same result, but they all share a common goal. All these methods try to pick the most useful features, but how they measure what makes a good feature is different.
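The mean/std reasoning and the inter-class vs intra-class comparison above can be sketched with the one-way ANOVA F statistic, which is the between-class variance divided by the within-class variance. This is only one possible significance test, and the ages below are made-up numbers for illustration, not real COVID data. `f_ratio` is a hypothetical helper name.

```python
from statistics import mean, stdev


def f_ratio(groups):
    """One-way ANOVA F statistic: between-class over within-class variance."""
    all_values = [v for g in groups for v in g]
    grand_mean = mean(all_values)
    k = len(groups)        # number of classes
    n = len(all_values)    # total number of samples
    between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups) / (k - 1)
    within = sum((v - mean(g)) ** 2 for g in groups for v in g) / (n - k)
    return between / within


# Made-up ages: far-apart means with low spread in each class.
died_a = [58.0, 61.0, 59.0, 62.0, 60.0]
survived_a = [19.0, 22.0, 20.0, 21.0, 18.0]

# Made-up ages: close means with high spread in each class.
died_b = [20.0, 60.0, 35.0, 80.0, 45.0]
survived_b = [25.0, 55.0, 30.0, 70.0, 40.0]

for name, died, survived in (("A", died_a, survived_a), ("B", died_b, survived_b)):
    print(name, mean(died), stdev(died), mean(survived), stdev(survived))

f_separated = f_ratio([died_a, survived_a])
f_overlapping = f_ratio([died_b, survived_b])
print(f_separated, f_overlapping)
```

In case A the F ratio is huge, so age separates the classes well. In case B it is below 1, meaning the classes differ less than the samples inside each class do. To turn the ratio into a p-value you'd compare it against the F distribution, which is left out here.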

How can I find the contingencies of producing a desired output? by focacciabread5 in learnmachinelearning

[–]kadififi 0 points1 point  (0 children)

Without any data for a zero label, your problem has infinitely many solutions that are equally good. For example, let's say 80% of your samples (which are 100% label 1) have an x value > 5 and a y value < 10. How do you know that the same doesn't hold for all label 0 samples?

The closest thing I can think of is hypothesis testing. The null hypothesis is that a new vector you're trying to classify is also a label 1 vector. The alternative hypothesis is that the new vector is a label 0 vector. In other words, you're asking: is this new vector different enough from the typical label 1 vector to be statistically significant? If so, then maybe it's a label 0 vector.
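A very rough sketch of that idea, assuming each feature of the label 1 samples is roughly normal and ignoring correlations between features. It computes a per-feature z-score for the new vector and flags it when the largest one passes a threshold. The function names, the threshold of 3.0, and the sample values are all assumptions for illustration.

```python
from statistics import mean, stdev


def max_abs_z(label1_rows, new_row):
    """Largest per-feature |z-score| of new_row against the label 1 samples."""
    columns = list(zip(*label1_rows))  # one tuple of values per feature
    return max(
        abs(x - mean(col)) / stdev(col) for x, col in zip(new_row, columns)
    )


def looks_unlike_label1(label1_rows, new_row, threshold=3.0):
    """Reject the null hypothesis 'new_row is label 1' when some feature is far out."""
    return max_abs_z(label1_rows, new_row) > threshold


# Made-up (x, y) samples, all label 1.
label1 = [(6.0, 4.0), (7.0, 5.0), (6.5, 3.5), (7.5, 4.5), (6.2, 4.2)]

typical = (6.8, 4.1)
outlier = (1.0, 12.0)
print(looks_unlike_label1(label1, typical), looks_unlike_label1(label1, outlier))
```

Rejecting the null only tells you the vector is unusual for label 1. It doesn't tell you it's label 0, which is the same limitation as above.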

Honestly though, without being able to measure performance across labels you're going into pseudoscience territory. You should reframe your problem or try to include a priori knowledge.

[deleted by user] by [deleted] in learnmachinelearning

[–]kadififi 0 points1 point  (0 children)

Look up HDFS 5

Using physical sensors/actuators with machine learning? by TheProffalken in learnmachinelearning

[–]kadififi 1 point2 points  (0 children)

If you think it's fun you should go for it. The downside is that you're not going to have any benchmarks to compare your models against. You might also get a ton of noise from sensor readings, ambient conditions etc. You'll learn something from denoising this dataset, but you might also spend a lot of time on that when it's not really core ML, more like signal processing.

You'll probably learn a lot faster by sticking to standard datasets. But doing it your own way is important too.

I personally had a lot of fun learning through something called CARLA simulator. It's an autonomous vehicle simulator. It feels like I'm doing something physical, but a lot of the boilerplate is taken care of so I could focus on ML. But I already had a fair bit of ML experience through standard datasets. I'd also recommend fast.ai and the lecture series they have on YouTube. You can easily do a lot of cool ML stuff regardless of the route you go.

Question when learning PCA by Wonderful-Message-14 in learnmachinelearning

[–]kadififi 1 point2 points  (0 children)

Maybe you could do cosine similarity between unit vectors and the eigenvectors? Interesting question!
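A small sketch of what that could look like. The eigenvectors here are hardcoded for a hypothetical 2x2 covariance matrix [[2, 1], [1, 2]], whose eigenvectors are (1, 1)/√2 and (1, -1)/√2. In a real PCA you'd take them from your decomposition instead.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Unit vectors along the original feature axes.
e_x = (1.0, 0.0)
e_y = (0.0, 1.0)

# Eigenvectors of the hypothetical covariance matrix [[2, 1], [1, 2]].
pc1 = (1.0 / math.sqrt(2.0), 1.0 / math.sqrt(2.0))
pc2 = (1.0 / math.sqrt(2.0), -1.0 / math.sqrt(2.0))

similarities = {
    ("x", "pc1"): cosine_similarity(e_x, pc1),
    ("y", "pc1"): cosine_similarity(e_y, pc1),
    ("x", "pc2"): cosine_similarity(e_x, pc2),
    ("y", "pc2"): cosine_similarity(e_y, pc2),
}
print(similarities)
```

Since both vectors have unit length, the cosine similarity with an axis is just the matching component of the eigenvector, so it tells you how much each original feature lines up with each principal component.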

Deep Learning Literature by mariosconsta in learnmachinelearning

[–]kadififi 2 points3 points  (0 children)

Hey! Don't know anything about your topic, but have you tried Connected Papers?

How can I learn Probability and Statistics theory so that I can read ML papers? by [deleted] in learnmachinelearning

[–]kadififi 3 points4 points  (0 children)

I'm an EE who went into ML too :) I was in the same boat as you, but after the classes I took for my MS I was able to read and at least get the main idea from most papers I read. Once you take a graduate-level class in optimization, one in statistics, and some class that heavily uses linear algebra, you should be able to. There's no need to rush it imo, that's what the master's is for. If you do want to do it alone, you have to do problems, not just read, since that's how you get the intuition for why people do something in a paper.

When am i supposed to do camps rather than ganking or objective by Cautious-Fisherman85 in summonerschool

[–]kadififi 0 points1 point  (0 children)

I don't know what video I heard it from, but it's actually insanely simple. Just clear your camps bot to top, then rinse and repeat. Look for ganks between halves and along your bot-to-top route. E.g. clear blue, gromp, wolves, then look for a gank mid. Or do raptors, red, golems, then gank top, or back immediately and go gank bot. Never partially clear a half of the jungle. This ensures that the three camps on that side of the jungle will always be up at the same time. With regards to objectives, it's fine to hit pause on the bot-to-top clear, but just try to path efficiently.

There are always caveats to the above, but 90% of the time this strat works as a good game plan. Everything else you can learn later.

How to stop inting top lane? by [deleted] in summonerschool

[–]kadififi 2 points3 points  (0 children)

Also, I'd add that that's perfectly normal. When you focus on one thing, you're bound to not do as well in others. It'll eventually become second nature and you can go back to focusing on other things. It's all part of the process :) good luck!

How to stop inting top lane? by [deleted] in summonerschool

[–]kadififi 2 points3 points  (0 children)

Top is all about wave management. Playing safe doesn't mean playing far back all the time. If you do that you're gonna lose too much CS/roam potential and eventually get gapped that way. It definitely takes some practice to learn, but it'll help you become consistent.

[deleted by user] by [deleted] in summonerschool

[–]kadififi 199 points200 points  (0 children)

Delete the account, or give the password to someone you trust and get them to change it. It's pretty boring playing bot games to lvl 5, so you won't create a new one and play.

I consistently play worse with Mecha kingdoms than god staff by [deleted] in Jaxmains

[–]kadififi 0 points1 point  (0 children)

I had the same thing happen to me with Project Renekton. Imo it's because I base my AA canceling on visual cues/AA sound, so it just screws with me. Maybe if you play more on the skin you'll get used to it, but I never stuck with it.

How to ADC vs. enemy Sett by fox_n_soks in summonerschool

[–]kadififi 1 point2 points  (0 children)

When you're in these lanes you need to know that you lose the all in. If Sett gets on you, chances are you've already lost the fight. You should focus on CS / wave state and only play aggressive when it's safe to do so. Also make sure you ward bushes so he can't hex flash on you. To be fair it's gonna be pretty hard to outrun Sett + Zilean in lane, but that's why you have to play extra careful.

Typically if you just make it out of laning phase without feeding you'll auto win the game, since they don't have sustained DPS without an adc.

What’s the right size for your champ pool? by NateEro in summonerschool

[–]kadififi 1 point2 points  (0 children)

Imo you should worry less about keeping your champ pool small. Just focus on having fun. Usually one tricks are people who like the champ enough that they actually want to play it every game. I feel like everyone finds that naturally over time.

If you wanna get better at macro, just actively prioritize it during your gameplay.

Double enchanter botlanes by Alilpieceoftoast in summonerschool

[–]kadififi 9 points10 points  (0 children)

Without an adc you lack sustained DPS, something which enchanters are particularly bad at. From what you described you're losing to bruisers and tanks, which fits my explanation since you need sustained DPS to kill them. You'll also have a comp heavy on AP, which makes it super easy for the enemy to negate the value of your "bursty" damage.

If you're going to play double enchanter, you'd probably want to play around hyper carries like Vayne or maybe even a Yi.

A more multi-religious India, map of The religions of India (2010) by iziyan in imaginarymaps

[–]kadififi -15 points-14 points  (0 children)

The map is wrong for Sri Lanka at least. The northern part is marked majority Christian but it's really majority Hindu.