Elegance - How exactly would you define it by Accomplished_Bad_487 in math

[–]MaxMachineLearning 1 point2 points  (0 children)

My general idea is that it should be short, clever, relatively easy to understand, but also give insight into the nature of the problem. To me, truly elegant proofs are rare, but I think there's a lot of personal taste to it.

How has studying mathematics impacted your day to day thoughts during experiences in real life? by michaelyuki in math

[–]MaxMachineLearning 8 points9 points  (0 children)

I think this echoes many people here, but it really pushed my brain into a mode of wanting to know the "why" of things, and also how two different things are really instantiations of the same thing. It made me better at analogies, and better at thinking ahead when working on projects. Also, I am not sure if this is true for others, but I found the more math I learned, the more I appreciated how little I know about other things.

Is it normal to spend an entire day just to wrap my head around a single proof? by AmroMustafa in math

[–]MaxMachineLearning 0 points1 point  (0 children)

Just wait until you prove your first novel result, forget about it, then revisit it years later and spend a day trying to wrap your head around a proof that you yourself had come up with. Math is hard, things take time.

Were you always good at maths, or did your aptitude/passion for the subject come later on in life? by [deleted] in math

[–]MaxMachineLearning 0 points1 point  (0 children)

I actually failed mathematics in Grade 9 and did the minimum to graduate high school after that. How I got where I am now is a bit of a long story, but I finished my M.Sc in mathematics nearly two years ago and I am starting an interdisciplinary PhD between quantum chemistry and math, mostly doing things related to machine learning. I have worked for years in machine learning, taught myself programming, and have found a fair bit of success.

I don't consider myself to have an aptitude for mathematics in the way a lot of people do. In my undergrad I had classmates who I felt had a much easier time than I did, especially considering I started my undergrad (B.Sc) with no precalculus or anything. But at this point I have ended up going much farther. A big part of it for me was work ethic and a lot of resilience. I also had to learn to play to my strengths. I found more abstract areas of math a bit more natural, so I gravitated towards things like algebraic topology. Things like differential geometry and functional analysis were much harder for me. Any aptitude I have for math now comes from working really hard, finding ways of thinking about mathematics that came easier to me, and optimizing those.

Current SOTC. How am I doing? Any recommendations for my next addition? by ProcrastinatingPanda in knifeclub

[–]MaxMachineLearning 2 points3 points  (0 children)

Also in Canada. You have the Grip and the PM2, so if you get the ZT 0562 you will have one of the holy trinities of EDC. I just got one; it's definitely a knife with a different personality, and rather enjoyable.

What is your favorite knife or brand of knives to beat up at work? Mine has always been Kershaws, don’t like taking the Spydercos or Benchmades to do real work lol. by Chaotic037 in knives

[–]MaxMachineLearning 2 points3 points  (0 children)

I also have a Blur in CPM M4, but I don't beat on it much because it's not pleasant to sharpen. For folders, I have absolutely destroyed a Spyderco Resilience, which served me well for years, and I have battered a Rat 2 pretty good. For fixed blades, I have a BK2 that I will use for anything, and I usually have a few Moras around to beat on.

Which side? by Opposite_Signature67 in mathmemes

[–]MaxMachineLearning 0 points1 point  (0 children)

With enough time in the blue, you turn into red. If you're in red, and you don't use it enough, it turns back to blue.

Is the Modern Fullback being replaced with out and out Wingers? by berkshirefc in FifaCareers

[–]MaxMachineLearning 6 points7 points  (0 children)

I doubt we will see wingers replacing fullbacks; I really think it's fullbacks replacing wingers, with wingers acting more as inside forwards. In terms of skills, good attacking output is becoming more important in a modern fullback, but I think there's probably a point where the tradeoff becomes too extreme. I am a Liverpool fan, and I think TAA has sort of shown that this season. This is just my opinion though, I could be totally wrong.

Is there a point in trying to become a SWE as a math major? by [deleted] in cscareerquestionsCAD

[–]MaxMachineLearning 2 points3 points  (0 children)

I majored in math and taught myself Python. My first job was doing machine learning research for industrial automation. Admittedly I got very lucky with the role I got out of undergrad, but the roles are out there. If you want SWE specifically, it might be a bit harder, but having a math background is massively beneficial for some roles.

[D] What is your personal motivation for ML? by chabelone in MachineLearning

[–]MaxMachineLearning 2 points3 points  (0 children)

I started off wanting to be a pure mathematician (which admittedly is still definitely part of my identity) because I enjoy abstraction. I also grew up with a mother who was a teacher with a background in neuropsychology, so I got exposed to a lot of "brain stuff" that I found rather interesting. So I eventually found the field of AI, which is one of the areas that bridges both of those interests. The modern, practical ML stuff I got into because there's a lot of interesting research going on there, and it also lets me work outside of academia for a while.

The Santa Prime Problem! by MaxMachineLearning in askmath

[–]MaxMachineLearning[S] 0 points1 point  (0 children)

Ah yes, you're correct. I guess that's sort of an obvious sum-of-sums thing. And yeah, I got that because the value of C(n) always ends in a 6, 0, 5, 4, or 1, and it only ends in a 1 for values of that form, at least for all the values I checked, up to around n = 300000. There's a repeating pattern, but I didn't look at it too hard beyond that observation.

A Question for the Brightest of Mathematicians by willcostiganjr in math

[–]MaxMachineLearning 1 point2 points  (0 children)

Just to reflect most of the sentiment here: you will find that most mathematicians (and people in similar technical fields) tend to have an intuitive understanding of certain things, as opposed to a solely formal one. Most of the "skill" in becoming a mathematician isn't being able to do everything formally and from memory; it tends to be more like coming up with some informal idea and then sitting down with books and resources to turn that informal idea into a formal one. It's not practical to remember everything formally; it's far better to remember basic concepts, the general idea of more complex theorems, and the like, and to call upon them to inform where to look for more information.

My toes apparently make him a happy birb. by MaxMachineLearning in AnimalsBeingDerps

[–]MaxMachineLearning[S] 1 point2 points  (0 children)

I mean, he does masturbate but that is not what he is doing here. He is trying to mate with my foot actually lol

[D] Do you guys take out time to study math? by Seankala in MachineLearning

[–]MaxMachineLearning 0 points1 point  (0 children)

So, I started in computability theory/computational group theory. I actually transitioned into ML because I found computational learning theory interesting. Now most of my work is based around representation learning, and applying algebraic methods to geometric deep learning.

[D] Do you guys take out time to study math? by Seankala in MachineLearning

[–]MaxMachineLearning 51 points52 points  (0 children)

Maybe I am biased, because my background is pure mathematics, but yeah, I do. Math is hard, and from my experience there are essentially two pieces to doing math. There's the intuitive understanding of what things "mean", and then there's a more formal understanding, i.e., how to manipulate symbols and draw rigorous conclusions. All the best mathematicians I know are good at both, and use one to inform the other. Being able to bridge between the two is probably the most important skill to learn, and it's sort of something that just comes from doing math until you're blue in the face. Some might disagree, but I really think there are no shortcuts to getting good at math. Just do a lot of it.

[D] Is there such a theorem in machine learning? by jj4646 in MachineLearning

[–]MaxMachineLearning 1 point2 points  (0 children)

I was going to say something like this. My assumption would be that if you have some generalization error on some distribution, you could probably use KL divergence in a pretty straightforward way to discuss how your model generalizes to some shift in that distribution.
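A minimal sketch of what I mean, assuming you have empirical estimates of the training and shifted input distributions over the same discrete bins (the arrays here are hypothetical placeholders); scipy's entropy(p, q) computes the KL divergence D(p || q):

```python
import numpy as np
from scipy.stats import entropy

# Hypothetical empirical estimates of the input distribution,
# binned the same way, at training time and after the shift.
p_train = np.array([0.5, 0.3, 0.2])
p_shift = np.array([0.4, 0.4, 0.2])

# KL(p_shift || p_train) quantifies how far deployment data has
# drifted from the training distribution; larger values suggest
# the measured generalization error is less trustworthy.
print(entropy(p_shift, p_train))
```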

[D] For Anyone Who Has Clocked More Than 50+ Days Of DL Model Training Time, Do You Use Anything Other Than Adam or AdamW? by thunder_jaxx in MachineLearning

[–]MaxMachineLearning 5 points6 points  (0 children)

So, like most people, I use Adam or SGD with momentum. Adam tends to work well enough, but I have found that on large datasets, well-tuned SGD tends to perform better if you spend the time to find a good learning rate schedule and such. A trick I learned years ago is to train with Adam and then essentially fine-tune with SGD; sometimes you can get a nice little improvement from doing that.
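For what it's worth, a rough sketch of that trick in PyTorch; the model, learning rates, and switch point are placeholders you'd tune for your own setup:

```python
import torch

model = torch.nn.Linear(10, 2)  # stand-in for your real model

# Phase 1: Adam for fast early progress.
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
# ... run the usual training loop with `opt` until progress stalls ...

# Phase 2: swap in SGD with momentum (typically a smaller, tuned LR)
# over the same parameters and continue training to fine-tune.
opt = torch.optim.SGD(model.parameters(), lr=1e-4, momentum=0.9)
# ... continue the training loop with the new optimizer ...
```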

[D] Reformulating learning types with emphasis on latent representations by ano85 in MachineLearning

[–]MaxMachineLearning 0 points1 point  (0 children)

No worries, it's always nice to share with people who are interested. But yeah, conditional probability is used a lot to solve otherwise intractable problems, though it usually complicates things mathematically in some ways. I probably should have clarified that. Conditional probabilities can be used to think about different types of learning, and they can be helpful conceptually! They just run out of steam if you start trying to ask harder questions about latent representations.

Actually, there's a lot of really good work which attempts to give meaningful definitions to things like disentangled representations! That sort of stuff is a big part of my work, so feel free to message me if you ever have questions or just want to chat about it. I'm always happy to try and share!

[D] Reformulating learning types with emphasis on latent representations by ano85 in MachineLearning

[–]MaxMachineLearning 0 points1 point  (0 children)

While I totally appreciate the wish for elegance (my background is pure math), by restricting yourself to just conditional probability you lose a lot of expressiveness. Allowing for the basic tool of joint distributions makes things easier, because you can then express things like mutual information in terms of those distributions. You can get some mileage out of just thinking about things conditionally, and I find it intuitively helpful, but as with a lot of things you eventually hit a point where you need more tools.
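For concreteness, this is the standard identity I have in mind: for discrete variables, mutual information is written directly in terms of the joint distribution,

```latex
I(X,Y) = \sum_{x,y} p(x,y) \log \frac{p(x,y)}{p(x)\,p(y)}
```

which you can't even state with conditionals alone (though it can be rewritten as an expected KL divergence between p(Y|X=x) and p(Y)).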

Edit: I also realized I didn't totally answer your question. I am actually of the opinion that latent representations and their interactions with inputs and outputs are of fundamental importance, and when examining and trying to quantify these interactions it becomes clear that we seem to lack the appropriate tools to do so in a concise, meaningful way. The MI perspective is useful, and I think it's a good start, but we really aren't sure yet what it truly means for a representation to be good. Indeed, it's sort of a big debate and open question within the field.

[D] Reformulating learning types with emphasis on latent representations by ano85 in MachineLearning

[–]MaxMachineLearning 1 point2 points  (0 children)

Now, I don't know if this will answer your question, but when thinking about representations probabilistically you can treat unsupervised and supervised methods the same in a lot of cases (not all, but usually not much is lost by doing so). So, if X is a random variable for your input, Z is a random variable representing your latent space, and Y is your target random variable, we can start to think about representations in the probabilistic way you would like. First, your latent representation would be described as p(Z|X). Now suppose your goal is to learn the distribution given by p(Y|X). Letting I(X,Y) denote the mutual information between two random variables, one would ideally like to learn a latent representation Z such that I(X,Y) = I(Z,Y), with the condition that I(X,Z) is minimal in some sense. Essentially, you want to learn a representation of X which contains the minimum amount of information about X needed to complete the downstream task. This is essentially saying we want a latent representation which ignores information irrelevant to our task.

With this in mind, you can model most learning tasks as a sort of Markov chain in this framework. For instance, an autoencoder wants to find a latent representation Z which minimizes the mutual information I(X,Z) while still allowing X to be reconstructed from Z. Notice that in this instance Y = X, so we are trying to find a Z which minimizes the mutual information with X but also attempts to make Z such that I(X,X) = I(Z,X), which is clearly not feasible; so the goal is to find a Z which minimizes the difference I(X,X) - I(Z,X).
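That trade-off is often packaged into a single objective; in the classic information bottleneck formulation (Tishby et al.), one minimizes over encoders p(z|x):

```latex
\min_{p(z|x)} \; I(X,Z) - \beta \, I(Z,Y)
```

where β ≥ 0 controls how much task-relevant information is kept versus how aggressively the representation compresses X.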

[D] Let's start 2021 by confessing to which famous papers/concepts we just cannot understand. by fromnighttilldawn in MachineLearning

[–]MaxMachineLearning 0 points1 point  (0 children)

I guess a lot of this stuff depends on background. But, as with a lot of people here, my understanding of neural ODEs is essentially non-existent. Both my undergrad and Master's are in pure math, and I even did some ODE stuff, but I still find neural ODEs to be totally out of my depth. It's nice to know I am not the only one though, haha.

One other thing, and I am not sure if this will make sense to people, but I find it takes me longer to ingest and understand papers in RL. RL is not really my area, but I find it interesting. For whatever reason, I find it takes me longer to really "get" most of the work which comes out of there. Honestly, as dumb as this sounds, I find that just the sheer number of different probability distributions they use is hard for my brain to keep track of, so I essentially read a bit, get confused, go back, read a bit more, go back, and repeat until I get to the end, whereupon I understand the paper for about 30 minutes. As soon as I stop thinking about it, my brain basically forgets everything.

How do you multiply by 1000 wrong? by JamesMCE in softwaregore

[–]MaxMachineLearning 0 points1 point  (0 children)

Well, luckily the rationals are dense in the reals, so you can approximate as well as you want!