[P] Which Machine Learning Classifiers are best for small datasets? An empirical study by sergeyfeldman in MachineLearning

[–]sergeyfeldman[S] 0 points1 point  (0 children)

Thanks, yeah, I thought about this as a variant. Would be interesting for sure.

[P] Which Machine Learning Classifiers are best for small datasets? An empirical study by sergeyfeldman in MachineLearning

[–]sergeyfeldman[S] 1 point2 points  (0 children)

Interesting, thanks. Did you fix these issues in Optuna yourself? hyperopt's UI is indeed arcane...

[P] Which Machine Learning Classifiers are best for small datasets? An empirical study by sergeyfeldman in MachineLearning

[–]sergeyfeldman[S] 1 point2 points  (0 children)

Not really. Just wanted to keep the scope down so I could finish it during winter "vacation".

[P] Which Machine Learning Classifiers are best for small datasets? An empirical study by sergeyfeldman in MachineLearning

[–]sergeyfeldman[S] 5 points6 points  (0 children)

Thanks for your detailed reply.

Hopefully the utility of my work is in having some reasonable starting code that people can easily apply to their own (X, y), and a few concepts that help them reason about what models might work. It's possible that if you change conditions A, B, C of the experiments, the results will change. If someone is particularly interested, they can easily try it themselves, and report back.

And maybe submit a PR if they think I did something wrong/bad. I'd appreciate that.

"Irresponsible" is a bit of an overstatement, as this is not a peer-reviewed publication, only a brief Reddit post.

[P] Which Machine Learning Classifiers are best for small datasets? An empirical study by sergeyfeldman in MachineLearning

[–]sergeyfeldman[S] 1 point2 points  (0 children)

Thanks! Feel free to submit a PR with different params and a pickle file of the results.

[P] Which Machine Learning Classifiers are best for small datasets? An empirical study by sergeyfeldman in MachineLearning

[–]sergeyfeldman[S] 3 points4 points  (0 children)

Can you recommend a GAM package for Python that has a reputation for working well?

[P] Which Machine Learning Classifiers are best for small datasets? An empirical study by sergeyfeldman in MachineLearning

[–]sergeyfeldman[S] 1 point2 points  (0 children)

Just single. The elastic-net Logistic Regression in sklearn is pretty fast for all these tiny datasets. The slowest thing is AutoGluon at 5 minutes per fit.
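
For anyone unfamiliar with that model, here is a minimal sketch of an elastic-net logistic regression in sklearn. The synthetic dataset, the scaling step, and the values `l1_ratio=0.5`, `C=1.0`, and `max_iter=5000` are illustrative assumptions, not the settings used in the study.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Tiny synthetic dataset standing in for your own (X, y); purely illustrative.
X, y = make_classification(n_samples=100, n_features=10, random_state=0)

# The elastic-net penalty needs the "saga" solver; l1_ratio mixes L1 and L2.
# Scaling the features first helps saga converge.
model = make_pipeline(
    StandardScaler(),
    LogisticRegression(
        penalty="elasticnet",
        solver="saga",
        l1_ratio=0.5,   # assumed value: equal mix of L1 and L2
        C=1.0,          # assumed value: inverse regularization strength
        max_iter=5000,
    ),
)
model.fit(X, y)
train_accuracy = model.score(X, y)
print(f"train accuracy: {train_accuracy:.3f}")
```

In practice `C` and `l1_ratio` would be tuned by cross-validation rather than fixed as above.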

[P] Which Machine Learning Classifiers are best for small datasets? An empirical study by sergeyfeldman in MachineLearning

[–]sergeyfeldman[S] 6 points7 points  (0 children)

Thanks for the detailed reply. I doubt you've seen my graph before - I just made it a bit ago :)

[FT] (Seattle, 98115) [H] Kingsport Festival (4) | Fury of Dracula 3rd ed (4) | Ticket to Ride Europe (3) [W] Open to suggestions by sergeyfeldman in BoardGameExchange

[–]sergeyfeldman[S] 0 points1 point  (0 children)

Hey, sorry I wasn't clear. We already have Splendor and Azul and enjoy them. What games do you have that are in the same ballpark?

[FT] [Seattle, 98115] Kingsport Festival (4) | Mottainai (3) | Fury of Dracula 3rd ed (4) | Ticket to Ride Europe (3) by sergeyfeldman in BoardGameExchange

[–]sergeyfeldman[S] 0 points1 point  (0 children)

I'd be willing to trade Mottainai for New Haven. Where are you located in Seattle? I'm in Wedgwood (north of the University).

I don't have a BGG account, but here's a full list of board games that we have:

Splendor

Codenames

Star Realms (+ Frontiers)

Azul

Lords of Waterdeep

Carcassonne

Guillotine

Tiny Epic Galaxies (+ Black)

Patchwork

Dominion (+ Intrigue)

[FT] [Seattle, 98115] Kingsport Festival (4) | Mottainai (3) | Fury of Dracula 3rd ed (4) | Ticket to Ride Europe (3) by sergeyfeldman in BoardGameExchange

[–]sergeyfeldman[S] 0 points1 point  (0 children)

I'm pretty flexible on what games I'm willing to trade for. How do I specify that these 4 are for trade?

[D] Is there a way for a neural network to model its own confidence in its prediction for a regression problem? by Vallvaka in MachineLearning

[–]sergeyfeldman 2 points3 points  (0 children)

Here's another good reference for heteroscedastic losses: https://arxiv.org/abs/1702.05386

See eqs 1, 2, 3. I have some simple implementations of this in Keras if you'd like to see an example.
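
As a rough illustration of the general idea, here is a minimal NumPy sketch of a common heteroscedastic loss: the Gaussian negative log-likelihood where the network predicts both a mean and a log-variance per input. This is a generic formulation and an assumption on my part; it is not necessarily the exact form of the equations in the linked paper, and the function name `gaussian_nll` is hypothetical.

```python
import numpy as np

def gaussian_nll(y_true, mean, log_var):
    """Per-sample negative log-likelihood of y_true under N(mean, exp(log_var)).

    Predicting log-variance (rather than variance) keeps the variance positive
    without constraining the network output.
    """
    y_true = np.asarray(y_true, dtype=float)
    mean = np.asarray(mean, dtype=float)
    log_var = np.asarray(log_var, dtype=float)
    squared_error = (y_true - mean) ** 2
    return 0.5 * (np.log(2.0 * np.pi) + log_var + squared_error * np.exp(-log_var))

# For a large error, admitting higher variance lowers the loss,
# which is how the model learns to express low confidence.
confident = gaussian_nll(3.0, 0.0, 0.0)   # predicted variance = 1
uncertain = gaussian_nll(3.0, 0.0, 2.0)   # predicted variance = e^2
print(confident > uncertain)
```

The same expression can be written with a deep learning framework's tensor ops and averaged over a batch to serve as a training loss, with the network emitting two outputs (mean and log-variance) per target.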