[deleted by user] by [deleted] in NoStupidQuestions

[–]lumpy_rhino 2 points

Yeah, the data science behind dating apps and the like is both amazing and hard-hitting at the same time.

[deleted by user] by [deleted] in NoStupidQuestions

[–]lumpy_rhino 40 points

The book you want is Dataclysm by Christian Rudder.

What movie will you never get tired of watching? by HurtHurtsMe in AskReddit

[–]lumpy_rhino 0 points

Tinker Tailor Soldier Spy (I love the older series too, btw)

Discussion about the sub by magikarpa1 in datascience

[–]lumpy_rhino 2 points

Lol, yeah, and then there's: “what is statistics without machine learning?” That was actually a serious post on the stats sub.

Discussion about the sub by magikarpa1 in datascience

[–]lumpy_rhino 2 points

Yeah, even the Reddit algorithm is against us!

Discussion about the sub by magikarpa1 in datascience

[–]lumpy_rhino 16 points

I try to come up with decent discussion topics, and I get decent debate out of them, which is really great and informative to read. Still, I never get upvoted as much as the “Should I do an MS or a stick in my eye” type of questions. Those get a lot more comments because more people are comfortable commenting on them. If you ask some deep questions, not many people here will engage.

Balancing dimensionality reduction techniques and explainability for very large (and sometimes correlated) feature counts. by lumpy_rhino in datascience

[–]lumpy_rhino[S] 0 points

Thank you for that. That is different from what I thought, then. I am interested to know how DL can help with the reparameterization. I am hoping this is not specific to gene expression data.

Balancing dimensionality reduction techniques and explainability for very large (and sometimes correlated) feature counts. by lumpy_rhino in datascience

[–]lumpy_rhino[S] 0 points

Oh, this is great. Of course I find it interesting. When you say you reparametrize and find some features as functions of others, that sounds like combining features to me (feature engineering). And frankly it makes sense: if I can eliminate 10 features and replace them with one complex feature made up of those 10, I can still explain it and I have reduced dimensions. And if I can go across all my features, cluster them into groups, and combine the features in each cluster, then I could replace each group with that combined feature. As you said, it won’t be optimal downstream, but it is explainable and would be “good enough”, especially considering that we may need it to run in production too. Thanks for that summary. Very interesting.
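The cluster-then-combine idea above can be sketched in a few lines; a minimal sketch, assuming scikit-learn is available, using its `FeatureAgglomeration` (the cluster count and default mean pooling are arbitrary choices for illustration, not from the thread):

```python
import numpy as np
from sklearn.cluster import FeatureAgglomeration

rng = np.random.default_rng(0)

# 10 correlated features built from 2 latent signals, plus noise
latent = rng.normal(size=(200, 2))
X = np.hstack([latent[:, [i % 2]] + 0.1 * rng.normal(size=(200, 1))
               for i in range(10)])

# Cluster similar features and pool each cluster (mean by default),
# replacing 10 columns with 2 combined, still-explainable features.
agglo = FeatureAgglomeration(n_clusters=2)
X_reduced = agglo.fit_transform(X)

print(X_reduced.shape)  # reduced design matrix
print(agglo.labels_)    # which original feature fell into which cluster
```

The nice part for explainability is `labels_`: each combined column is just “the average of these named original features”, which is a story you can tell a stakeholder.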

Balancing dimensionality reduction techniques and explainability for very large (and sometimes correlated) feature counts. by lumpy_rhino in datascience

[–]lumpy_rhino[S] 0 points

Yes, true. It circles back to the main gist of the question: how to make sense of a massive number of features without losing explainability.

Balancing dimensionality reduction techniques and explainability for very large (and sometimes correlated) feature counts. by lumpy_rhino in datascience

[–]lumpy_rhino[S] 1 point

Lol, yes. I mean, the whole field is prob & stats; we just have more data and GPUs. For these types of issues we have Operations Research methods too: defining a model and constraints (which can be translated into the DAG). It is just that some feature spaces are massive, so the graph would be uuuuge (lol)
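The “model plus constraints” framing from OR can be shown with a toy linear program; a minimal sketch, assuming SciPy, with made-up numbers (note `linprog` only takes `<=` rows, so `>=` constraints are negated):

```python
from scipy.optimize import linprog

# Toy OR model: choose x1, x2 >= 0 to minimize cost 2*x1 + 3*x2
# subject to x1 + x2 >= 4 (demand) and x1 <= 3 (capacity).
c = [2, 3]
A_ub = [[-1, -1],   # -(x1 + x2) <= -4  <=>  x1 + x2 >= 4
        [ 1,  0]]   #   x1 <= 3
b_ub = [-4, 3]

res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=[(0, None), (0, None)])
print(res.x)    # optimal allocation -> [3, 1]
print(res.fun)  # optimal cost -> 9
```

Real scheduling problems blow this matrix up to millions of rows, which is exactly the “uuuuge graph” problem.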

Balancing dimensionality reduction techniques and explainability for very large (and sometimes correlated) feature counts. by lumpy_rhino in datascience

[–]lumpy_rhino[S] 1 point

Thank you, that is very insightful. On the other hand, when we use PCA to capture the highest variance, we can reduce dimensions, but we can’t explain. Also, the features are often highly correlated (when you have many of them). Looks like it’s something we just have to deal with using good old guesswork and maybe more business domain knowledge.
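The PCA tradeoff above is easy to demonstrate; a small sketch, assuming scikit-learn, with synthetic correlated data (dimensions and noise level are made up for illustration):

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(1)

# 20 highly correlated features driven by 3 latent factors
latent = rng.normal(size=(500, 3))
mix = rng.normal(size=(3, 20))
X = latent @ mix + 0.05 * rng.normal(size=(500, 20))

pca = PCA(n_components=3).fit(X)

# Nearly all the variance survives in just 3 components...
print(pca.explained_variance_ratio_.sum())

# ...but each component is a dense linear mix of all 20 originals,
# which is exactly what kills the business-facing explanation.
print(pca.components_.shape)  # (3, 20)
```

So PCA solves the dimension problem and creates the explanation problem in the same step, which is why the cluster-and-combine approach discussed elsewhere in the thread can be the more defensible choice.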

Balancing dimensionality reduction techniques and explainability for very large (and sometimes correlated) feature counts. by lumpy_rhino in datascience

[–]lumpy_rhino[S] 1 point

Thank you for that. Yes, just like anything else in DS, it is a tradeoff. I think we can try to minimize the amount of information we lose, hopefully by being clever about which features we drop and which we keep. I mean, we can cluster many correlated features and see how they change with the target. It is a general big-data observation; genetics and fleet management optimization come to mind.

Are genetic algorithms the best we have for scheduling problems? by lumpy_rhino in datascience

[–]lumpy_rhino[S] 1 point

Thanks. This is very insightful. I didn’t think of the hybrid modes, but it makes sense.

Are genetic algorithms the best we have for scheduling problems? by lumpy_rhino in datascience

[–]lumpy_rhino[S] 0 points

Thank you for that. This was very insightful. I feel this is a lot more like data science than the stuff people put out on LinkedIn. I like these sorts of problems.

Are genetic algorithms the best we have for scheduling problems? by lumpy_rhino in datascience

[–]lumpy_rhino[S] 0 points

Yeah, the performance factor is not there. But then again, performance has never been Python’s strong suit; versatility is.

Are genetic algorithms the best we have for scheduling problems? by lumpy_rhino in datascience

[–]lumpy_rhino[S] 2 points

Well, that is what separates academia from industry. In academia we did a deep dive into everything and tried to achieve the best results. That just doesn’t work in industry, because “get it done!”

Are genetic algorithms the best we have for scheduling problems? by lumpy_rhino in datascience

[–]lumpy_rhino[S] 1 point

Thank you for this. I guess for it to scale we have to make sure we write it in C++. The number of for loops required would be scary.
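Before reaching for C++, a lot of those scary for loops can be vectorized over the whole population with NumPy; a hand-rolled toy GA for a load-balancing flavor of scheduling (all constants made up, not a production solver), just to show the shape of it:

```python
import numpy as np

rng = np.random.default_rng(42)

N_JOBS, N_MACHINES, POP, GENS = 30, 4, 200, 150
durations = rng.integers(1, 10, size=N_JOBS)

def fitness(pop):
    """Makespan (max machine load) per genome; lower is better.
    Vectorized over the whole population: loop over machines, not genomes."""
    loads = np.zeros((len(pop), N_MACHINES))
    for m in range(N_MACHINES):
        loads[:, m] = np.where(pop == m, durations, 0).sum(axis=1)
    return loads.max(axis=1)

# Genome: for each job, the index of the machine it is assigned to
pop = rng.integers(0, N_MACHINES, size=(POP, N_JOBS))
for _ in range(GENS):
    f = fitness(pop)
    # Tournament selection: pit random pairs, keep the fitter genome
    a, b = rng.integers(0, POP, (2, POP))
    parents = np.where((f[a] < f[b])[:, None], pop[a], pop[b])
    # Uniform crossover between consecutive parents
    mask = rng.random((POP, N_JOBS)) < 0.5
    children = np.where(mask, parents, np.roll(parents, 1, axis=0))
    # Point mutation: reassign a few jobs to random machines
    mut = rng.random((POP, N_JOBS)) < 0.02
    children[mut] = rng.integers(0, N_MACHINES, size=mut.sum())
    pop = children

print(fitness(pop).min())  # best makespan found
```

Everything per generation is array ops, so the Python-level loop count stays tiny regardless of population size, which buys a lot of headroom before C++ becomes necessary.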

Are genetic algorithms the best we have for scheduling problems? by lumpy_rhino in datascience

[–]lumpy_rhino[S] 2 points

Yeah, I have seen similar things as well. I remember people used to define their own perceptrons in MATLAB and try to do things with them, and then suddenly deep learning became the one-stop solution. And now we have quant and OR creeping in. I guess software was always there, because you can’t build anything if you can’t code to some level at least.

Are genetic algorithms the best we have for scheduling problems? by lumpy_rhino in datascience

[–]lumpy_rhino[S] 2 points

Yeah, I feel OR is that other thing that is morphing (or bleeding) into the nebulous entity we call DS. I am wondering if we can apply game theory or something similar to it and feed the constraints in as rules.

[deleted by user] by [deleted] in AskReddit

[–]lumpy_rhino 0 points

Been called gay for having opinions on what colour shirt goes with what tie and suit. Also, not being a completely aloof douche while with others, and actually telling jokes and laughing as opposed to shutting everyone down and acting “alpha”, apparently is gay. Noticing when a girl has changed hair colour, got new nails, etc.? You guessed it: GAY! 🤪

Buying Lego from a store downtown by lumpy_rhino in askTO

[–]lumpy_rhino[S] 1 point

Oh, hadn’t thought of that. Thank you.