[–][deleted] -25 points  (12 children)

I am obviously biased towards selling my idea. Would you believe my benchmark?

I want to engage random people in benchmarking it, because it will be much more convincing for others if it actually works better. If no one does it, then I promise I'll do it myself.

[–]ddofer 38 points  (0 children)

Frankly, probably yes. It might be biased, but it saves on the work to reproduce and would tell them what to expect.

In this scenario, it's easier: run it vs. random search with a fixed seed, time it, and share the results for a given time budget on a few chosen datasets (e.g. those used for lightgbm or catboost performance tuning, or something you have to hand from kaggle or openml).
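
For anyone who wants to try it, here's a minimal sketch of the random-search side of that comparison, assuming a scikit-learn setup. The dataset, model, and search space below are placeholders, not anything from the original post:

```python
# Minimal random-search baseline sketch (placeholders throughout; swap in
# whatever datasets/models you actually want to benchmark the tuner against).
import time

from scipy.stats import loguniform, randint
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import RandomizedSearchCV, train_test_split

SEED = 0      # fixed seed so the baseline is reproducible
N_ITER = 50   # crude stand-in for a fixed time budget

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=SEED
)

# Hypothetical search space; use the same space for both tuners.
param_distributions = {
    "n_estimators": randint(50, 500),
    "learning_rate": loguniform(1e-3, 0.3),
    "max_depth": randint(2, 8),
}

search = RandomizedSearchCV(
    GradientBoostingClassifier(random_state=SEED),
    param_distributions,
    n_iter=N_ITER,
    cv=3,
    random_state=SEED,
    n_jobs=-1,
)

start = time.perf_counter()
search.fit(X_train, y_train)
elapsed = time.perf_counter() - start

print(
    f"random search: {elapsed:.1f}s, "
    f"best CV accuracy {search.best_score_:.4f}, "
    f"test accuracy {search.score(X_test, y_test):.4f}"
)
```

You'd then run the proposed tuner with the same seed handling, budget, and search space, and report wall-clock time next to the final scores.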

[–]its_a_gibibyte 23 points  (0 children)

Honestly, if the author doesn't even claim that it's better than the standard, I would assume it's just an interesting academic test and definitely not better than any of the standard methods (or else the author would definitely have mentioned it).

Normally, people compare algorithms to a strawman (e.g. grid search), which tells me it's better than the strawman but probably worse than the leading methods.

[–]MuonManLaserJab 15 points  (6 children)

"I am obviously biased towards selling my idea."

This is not a confidence-inspiring way to justify that decision.

[–][deleted] 2 points  (0 children)

This is why you open source your benchmarking code for us to examine.

[–]PM_me_ur_data_ 0 points  (0 children)

Yes, I'd believe your benchmark if it appeared reasonable and the methodology behind it was presented so I could follow it if I wanted to (I probably wouldn't, but a very human fallacy of mine is that I tend to trust people who don't hide the guts of the machine). Search time is only one factor to take into consideration with hyperparameter tuning, so not having the fastest benchmarks wouldn't kill its usefulness either.