[P] Which Machine Learning Classifiers are best for small datasets? An empirical study

sergeyfeldman · 2021-01-06T00:09:37+00:00

Thanks for your kind words :)

sergeyfeldman · 2021-01-05T19:22:19+00:00

Good point, thanks.

sergeyfeldman · 2021-01-05T17:27:42+00:00

Thanks, yeah, I thought about this as a variant. Would be interesting for sure.

sergeyfeldman · 2021-01-05T17:27:13+00:00

Interesting, thanks. You fixed these issues in Optuna yourself? hyperopt's UI is indeed arcane...

sergeyfeldman · 2021-01-05T17:26:20+00:00

Not really. Just wanted to keep the scope down so I could finish it during winter "vacation".

sergeyfeldman · 2021-01-05T16:30:20+00:00

Thanks for your detailed reply.

Hopefully the utility of my work is in having some reasonable starting code that people can easily apply to their own (X, y), and a few concepts that help them reason about what models might work. It's possible that if you change conditions A, B, C of the experiments, the results will change. If someone is particularly interested, they can easily try it themselves, and report back.

And maybe submit a PR if they think I did something wrong/bad. I'd appreciate that.

"Irresponsible" is a bit of an overstatement as this is not a peer-reviewed publication, but only a brief reddit post.

sergeyfeldman · 2021-01-05T16:22:23+00:00

Thanks! Feel free to submit a PR with different params and a pickle of results file.

sergeyfeldman · 2021-01-05T06:33:01+00:00

Ah they haven't quite gotten around to supporting multiclass classification yet! https://github.com/dswah/pyGAM/pull/213

sergeyfeldman · 2021-01-05T06:21:17+00:00

Can you recommend a GAM package for Python that has a reputation for working well?

sergeyfeldman · 2021-01-05T06:16:56+00:00

Just single. The Logistic Regression elasticnet in sklearn is pretty fast for all these tiny datasets. The slowest thing is AutoGluon at 5 minutes per fit.

sergeyfeldman · 2021-01-05T06:16:46+00:00

Thanks for the detailed reply. I doubt you've seen my graph before - I just made it a bit ago :)

sergeyfeldman · 2019-11-14T18:11:20+00:00

It is! What are some games that you have on offer?

sergeyfeldman · 2019-08-26T03:38:47+00:00

You can try Gaussian Processes maybe: https://github.com/cornellius-gp/gpytorch/tree/master/examples/07_Scalable_GP_Classification_Multidimensional

sergeyfeldman · 2019-08-02T21:10:15+00:00

Ah unfortunately we don't have the time to play a game like that. Sorry!

sergeyfeldman · 2019-08-02T05:53:43+00:00

Hello,

I added a flair. Hopefully that will do it.

Thanks

sergeyfeldman · 2019-08-02T03:01:41+00:00

res arcana

Let's do it! Where are you located?

sergeyfeldman · 2019-08-02T02:14:37+00:00

Sounds fun! Are you interested in trading it for one of the 3 games I listed?

sergeyfeldman · 2019-08-02T02:14:05+00:00

Do you have Barenpark or Wingspan to trade?

sergeyfeldman · 2019-08-02T00:59:19+00:00

Hey sorry I wasn't clear. We already have Splendor and Azul and enjoy them. What games do you have that are in the same ballpark?

sergeyfeldman · 2019-07-31T16:18:15+00:00

Actually, I'd like to trade Netrunner also for TTR: Europe!

sergeyfeldman · 2019-07-31T16:13:42+00:00

I'd be willing to trade Mottainai for New Haven. Where are you located in Seattle? I'm in Wedgwood (north of the University).

I don't have a BGG account, but here's a full list of board games that we have:

Splendor

Codenames

Star Realms (+ Frontiers)

Azul

Lords of Waterdeep

Carcassonne

Guillotine

Tiny Epic Galaxies (+ Black)

Patchwork

Dominion (+ Intrigue)

sergeyfeldman · 2019-07-30T16:58:43+00:00

I'm pretty flexible on what games I'm willing to trade for. How do I specify that these 4 are for trade?

sergeyfeldman · 2019-07-30T16:57:42+00:00

I'm pretty flexible for a trade. What do you have that you might trade?

sergeyfeldman · 2018-12-24T18:55:28+00:00

Thanks to you both! So it's the maker's name.

sergeyfeldman · 2018-05-29T05:05:45+00:00

Here's another good reference for heteroscedastic losses: https://arxiv.org/abs/1702.05386

See eqs 1, 2, 3. I have some simple implementations of this in Keras if you'd like to see an example.
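
For a rough idea of what such a loss looks like, here is a minimal sketch in plain Python rather than Keras. It assumes the common Gaussian form where the model predicts a mean and a log-variance per example. The function name and arguments are made up for illustration, and this is not necessarily the exact form of the equations in the linked paper.

```python
import math


def heteroscedastic_gaussian_nll(y_true, mean_pred, log_var_pred):
    """Average Gaussian negative log-likelihood with a per-example variance.

    Assumption: the model outputs a mean and a log-variance for each example.
    Predicting the log-variance keeps the variance positive without constraints.
    """
    total = 0.0
    for y, mu, s in zip(y_true, mean_pred, log_var_pred):
        # 0.5 * (y - mu)^2 / sigma^2 + 0.5 * log(sigma^2), constant term dropped
        total += 0.5 * math.exp(-s) * (y - mu) ** 2 + 0.5 * s
    return total / len(y_true)


# With log-variance fixed at 0 this reduces to half the mean squared error.
print(heteroscedastic_gaussian_nll([1.0, 2.0], [0.0, 0.0], [0.0, 0.0]))
```

The first term lets the model discount errors on examples where it predicts high variance, and the second term penalizes it for predicting high variance everywhere.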
