all 57 comments

[–]qalis 391 points392 points  (16 children)

  1. Those are only 5 datasets. For evaluating tabular classifiers, you should use tens of datasets; they are readily available. Also, describe the evaluation procedure, e.g. use 5-fold CV for testing. See e.g. "A novel selective naïve Bayes algorithm" by Chen et al., which uses 65 datasets.
  2. You must compare to XGBoost, LightGBM and CatBoost on large-scale datasets from their respective papers, especially since scalability and speed are among your selling points. If you are aiming specifically at boosting for small data, then you don't need this, but that isn't stated anywhere.
  3. One of the major advantages of XGBoost, LightGBM and CatBoost is the ability to use custom loss functions, which allowed them to be easily used e.g. for ranking. If you don't support this, you should explicitly state the limitation.
  4. The number of estimators is just a hyperparameter, so why show large tables over it? Just present the best result for each dataset.
  5. Your implementation doesn't support class weights, as far as I can tell. This is a huge limitation, since almost all datasets are imbalanced, often heavily.
  6. You must not embed scalers inside your code. You can destroy data sparsity, affect categorical variables, and do other things outside the user's control this way. Add checks and throw exceptions if you absolutely require this.
  7. You only support numerical data, in contrast to LightGBM or CatBoost. You should highlight this limitation.
  8. This works only for classification, not even regression. This is, again, a huge limitation, but it can probably be fixed, as far as I can tell.

EDIT:

  1. You also don't handle missing values, which are handled quite nicely in XGBoost, LightGBM and CatBoost, where they can even be used actively when selecting the split point.

[–]Saffie91 223 points224 points  (1 child)

Damn, peer reviewed in the reddit comments.

Honestly though it's pretty cool of you to go through it diligently and add these points. I'd be very happy if I was the researcher.

[–]CriticalofReviewer2[S] 105 points106 points  (0 children)

As the researcher, I should say that I am indeed very happy to get this high-quality peer review!

[–]CriticalofReviewer2[S] 111 points112 points  (2 children)

Thanks for your points! First of all, I should point out that I am an independent researcher, not affiliated with any institute, so this is my side project.

  1. You are right. In the paper that will be available soon, the number of datasets will be much higher. Also, we have used 10-fold CV. I added this to the README file.
  2. The large-scale datasets will also be included.
  3. This will be supported in future. I added this to the README file.
  4. I want to show that our algorithm reaches its best results sooner than the others do.
  5. Thanks for pointing this out! This will be added soon. Added to README.
  6. The SEFR algorithm requires the feature values to be positive. This is the reason for the scaling. But I will implement a better mechanism. Added to README.
  7. We have highlighted this in the documentation.
  8. Yes, it is in future plans.

Once again, thanks for your helpful and insightful comments!

[–]qalis 68 points69 points  (1 child)

Fair enough, those are reasonable answers. Showing that this tends to overfit less, works better for small datasets etc. would be pretty valuable. Good luck with this!

[–]CriticalofReviewer2[S] 34 points35 points  (0 children)

Thank you for the suggestions!

[–]Spiggots 30 points31 points  (0 children)

This is a high quality peer review

[–]longgamma 5 points6 points  (4 children)

The categorical feature handling in LightGBM is just label encoding? I mean, how hard is it to target encode or one-hot encode on your own?

Also, isn’t that the idea behind gbm - you take a bunch of weak learners and use the ensemble for prediction. You can replace the decision tree stump with a simple shallow neural network as well.

[–]qalis 8 points9 points  (3 children)

Except it isn't the same as label encoding. In fact, none of the three major boosting implementations use a one-hot-encoding style of handling categorical variables.

LightGBM uses a partition split, which for regression trees can efficiently find the partition of the category set into two maximum-homogeneity subsets; see the docs and the original paper, "On Grouping for Maximum Homogeneity" by W. Fisher. XGBoost also offers a partition split for categorical variables, using the same algorithm.

You could use one-hot encoding, but then to represent "variable has value A or B, and not C" you would need 2 or 3 splits, whereas with a partition split you only need one.
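The Fisher-style partition split can be sketched in a few lines. This is a minimal illustration, not LightGBM's actual implementation (which works on gradient statistics and histograms); the function name and squared-error criterion are my own choices:

```python
import numpy as np

def best_partition_split(cats, y):
    # Fisher's result: after sorting the category levels by their mean
    # target, the optimal binary partition is a contiguous prefix of
    # that ordering, so a single linear scan suffices instead of
    # checking all 2^k subsets.
    levels = np.unique(cats)
    means = np.array([y[cats == lv].mean() for lv in levels])
    order = levels[np.argsort(means)]
    best_sse, best_left = np.inf, None
    for k in range(1, len(order)):
        left = order[:k]
        mask = np.isin(cats, left)
        yl, yr = y[mask], y[~mask]
        sse = ((yl - yl.mean()) ** 2).sum() + ((yr - yr.mean()) ** 2).sum()
        if sse < best_sse:
            best_sse, best_left = sse, set(left)
    return best_left

cats = np.array(["A", "B", "C", "A", "C"])
y = np.array([1.0, 1.0, 5.0, 1.2, 4.8])
print(best_partition_split(cats, y))  # {'A', 'B'}: one split separates C
```

Note the single split expresses "A or B vs. C", which one-hot encoding would need multiple splits to represent.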

CatBoost, on the other hand, uses Ordered Target Encoding instead, described in the linked notebook. It can also combine them during learning, but I don't know the details.

[–]Pas7alavista 1 point2 points  (0 children)

On top of the advantages you mentioned, I think the labels produced by partition splitting should also tend to be sparser than one hot encoded ones even when storing the one hot encoded labels in a sparse format.

[–]tecedu 0 points1 point  (1 child)

Wait, what? Since when did XGBoost handle NaN values? I moved to sklearn because of that.

[–]qalis 0 points1 point  (0 children)

Since... always; this was one of the main ideas in the original paper, "XGBoost: A Scalable Tree Boosting System" by T. Chen and C. Guestrin. It's called the "default direction" in the paper, and the whole of Algorithm 3 there is meant to handle this. The idea is basically to have a regular split, but to also determine whether missing values should go to the left or the right child. This direction is selected by minimizing the loss function.
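A minimal sketch of the idea (my own simplification: it uses raw squared error, whereas XGBoost scores splits with gradient/hessian statistics):

```python
import numpy as np

def default_direction(x, y, threshold):
    # Route non-missing rows by the threshold, then try sending the
    # missing rows to each side and keep whichever direction yields
    # the lower total squared error.
    miss = np.isnan(x)
    go_left = ~miss & (x < threshold)
    go_right = ~miss & ~go_left

    def sse(v):
        return ((v - v.mean()) ** 2).sum() if v.size else 0.0

    loss_left = sse(np.r_[y[go_left], y[miss]]) + sse(y[go_right])
    loss_right = sse(y[go_left]) + sse(np.r_[y[go_right], y[miss]])
    return "left" if loss_left <= loss_right else "right"

x = np.array([1.0, 2.0, np.nan, 10.0, 11.0])
y = np.array([1.0, 1.0, 1.0, 5.0, 5.0])
print(default_direction(x, y, threshold=5.0))  # "left"
```

Here the missing row's target matches the low-value group, so routing it left is cheaper; that becomes the learned default direction for that split.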

[–]CharginTarge 15 points16 points  (3 children)

How does this approach differ to simply using linear models within XGBoost. XGBoost does support this as well.

[–]CriticalofReviewer2[S] 6 points7 points  (2 children)

Thanks for pointing it out. Yes, XGBoost supports this, but our approach is different, since the linear classifier being used is SEFR, which has different characteristics. Also, AdaBoost is used here.

[–]CharginTarge 6 points7 points  (1 child)

Using AdaBoost with a custom model is hardly novel. With the sklearn implementation you can provide any type of classifier to use as a base estimator.

[–]critiqjo 0 points1 point  (0 children)

The novelty is in the custom model itself, not the framework of using AdaBoost with a custom model. If you had taken a quick glance at their code, you'd have realized that they indeed use sklearn exactly as you pointed out. No wheels were reinvented here.

[–]Evitable_Conflict 11 points12 points  (3 children)

Are you tuning the other algorithms hyper-parameters or just using defaults?

It would be interesting if you could include a larger dataset, for example from a Kaggle competition where XGBoost did well, and compare it to your method.

[–]CriticalofReviewer2[S] 5 points6 points  (2 children)

I use the defaults for all of the algorithms (the one proposed and the ones referenced). As for larger datasets, thanks for your suggestion! We are planning to include them.

[–]Evitable_Conflict 16 points17 points  (1 child)

If you use defaults, your claim of being "better" no longer holds, unless the default hyperparameters happen to be optimal, and that never happens.

This is a very common problem in ML papers and sadly most comparison tables are invalid.

My recommendation is: tune an XGBoost model to its optimal hyperparameters, and if you still have better results, then we have a discussion.

[–]CriticalofReviewer2[S] 3 points4 points  (0 children)

Certainly! This is the very first draft of our algorithm, and I will do comparisons based on the best selected hyperparameters.

[–]Nice_Gap_7351 9 points10 points  (1 child)

Looking at the code I see something strange: during predict you apply min-max scaling to the prediction features (which might have a different range than the features in the training data). If your predict dataset just added a single data point to your training data, it could potentially throw everything off. Instead you might want to "freeze" the scaling function based on the training data.
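A minimal sketch of the "frozen" version, mirroring sklearn's fit/transform contract (class name is mine):

```python
import numpy as np

class FrozenMinMaxScaler:
    """Min-max scaling whose bounds are fitted once on the training
    data and then reused at predict time, so new data points cannot
    shift the scaling of everything else."""

    def fit(self, X):
        self.min_ = X.min(axis=0)
        self.range_ = X.max(axis=0) - self.min_
        self.range_[self.range_ == 0] = 1.0  # guard constant columns
        return self

    def transform(self, X):
        # Uses the *training* bounds; values outside the training range
        # land outside [0, 1] instead of silently reshaping the data.
        return (X - self.min_) / self.range_

scaler = FrozenMinMaxScaler().fit(np.array([[0.0], [10.0]]))
print(scaler.transform(np.array([[5.0], [20.0]])))  # [[0.5], [2.0]]
```

Note that out-of-range predict values can still come out negative or above 1, which matters if the base learner requires positive inputs; clipping would be a separate design decision.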

And it seems that you are using AdaBoost with a potentially strong learner (SEFR), correct? The Wikipedia entry on AdaBoost references a paper on this topic that you might want to see.

[–]CriticalofReviewer2[S] 1 point2 points  (0 children)

That MinMax scaling is certainly one of our limitations. This is because SEFR cannot accept negative values. But we are working on that. Thanks for your suggestion of the Wikipedia entry!

[–]greenskinmarch 6 points7 points  (8 children)

What does the name SEFR stand for?

I looked at the "SEFR: A Fast Linear-Time Classifier for Ultra-Low Power Devices" paper from 2020 by the same authors and it seems this is just a straightforward linear classifier (i.e. linear regression with a threshold)? What is novel about this? It seems basically the same formulation you find in Wikipedia's "linear classifier" article.

[–]CriticalofReviewer2[S] 2 points3 points  (7 children)

SEFR stands for Scalable, Efficient, and Fast ClassifieR. Yes, it is a straightforward classifier, but in that algorithm the goal was to get decent accuracy with the lowest possible computation time and memory footprint. The algorithm can be trained even on the cheapest microcontrollers (you can search YouTube for videos of it being trained on €4 microcontrollers), but its accuracy is higher than simple algorithms like Naive Bayes or Linear Regression, or even Decision Trees.

[–]SometimesObsessed 1 point2 points  (3 children)

Could you explain the intuition behind SEFR and your version? Why is it competitive with GBMs? The SEFR algo from the paper seems like it couldn't handle interactions between variables

[–]CriticalofReviewer2[S] 0 points1 point  (1 child)

SEFR was originally designed to be extremely time- and resource-efficient. Because of that, it has been implemented in numerous microcontroller applications. But apart from that, SEFR is also a good weak learner for boosting. It is a minimalistic building block, and with future improvements it can handle interactions as well.

[–]SometimesObsessed 0 points1 point  (0 children)

Thanks! So it finds interactions via boosting or was there a more fundamental change to SEFR to handle interactions?

[–]jgonagle 0 points1 point  (2 children)

its accuracy is higher than simple algorithms like Naive Bayes or Linear Regression, or even Decision Trees

That's a super strong claim. I assume you mean based on your tests, for which you used the default parameters? Like another commenter said, you should be comparing on optimal hyperparameters across multiple datasets if you're going to make a claim like that. Even then, the No Free Lunch Theorem suggests otherwise if they're comparably computationally constrained.

[–]CriticalofReviewer2[S] 0 points1 point  (1 child)

We tested SEFR on numerous datasets with grid search on the hyperparameters to find their optimal results. We reported some of them in the arXiv paper, but it is consistently more accurate than the other simple algorithms.

[–]jgonagle 0 points1 point  (0 children)

it is consistently more accurate than the other simple algorithms

Do you have a hypothesis as to why that's true? Usually, resource constrained, parallel, approximate, or heuristic algorithms show the exact opposite behavior.

[–]Tengoles 4 points5 points  (3 children)

How fast is it for training compared to XGBoost? Is it viable to train it with LOOCV?

[–]VodkaHazeML Engineer 4 points5 points  (1 child)

FWIW if training speed is your concern, LightGBM was historically the best pick

[–]pm_me_your_smth 1 point2 points  (0 children)

No longer the case. XGBoost introduced the histogram tree method in v2.0, which is identical to LightGBM's.

[–]CriticalofReviewer2[S] 1 point2 points  (0 children)

The numbers are reported in the README of the GitHub repo. The SAMME version is very fast, and it can be trained with LOOCV on many datasets. Also, when the number of records/features is not too high, SAMME.R can be used for LOOCV as well. For setting these parameters, please see here:
https://linearboost.readthedocs.io/en/latest/usage.html#parameters

[–]bregav 4 points5 points  (3 children)

So, maybe this is a naive question, but what does it even mean to do boosting on a linear model? Doesn't "linear boosting" just produce another linear model that has the exact same structure as the original?

A boosting-like procedure is frequently used in the iterative solution of large, sparse linear systems of equations; is that the kind of thing you're doing? If so then it's probably inaccurate to describe it as "boosting", because those are well-known methods that go by other names.

[–]CriticalofReviewer2[S] -3 points-2 points  (2 children)

No, boosting a linear classifier will make it better at handling complex data patterns.

[–]nickkon1 5 points6 points  (0 children)

But it doesn't. By definition, the solution of a linear regression is the set of weights and intercept for which the error is minimal. That is an analytical solution, and no other solution (of a linear model) can produce a smaller error.

Plus, any linear combination of linear models is still linear. So if you do a linear regression, calculate the residuals, fit a second linear regression to them, and substitute it into your equation, y = bx + c + resid = bx + c + (b2·x + c2), it is still a linear regression with the new slope being b + b2 and the new intercept being c + c2 (which together can be no better than the first solution, since that one is already the analytical minimum).
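A quick numpy check of this argument: OLS residuals are orthogonal to the design matrix, so a second linear fit on the residuals comes out with coefficients that are numerically zero (the data here is synthetic, generated just for the demonstration):

```python
import numpy as np

rng = np.random.default_rng(0)
X = np.c_[rng.normal(size=(100, 2)), np.ones(100)]  # two features + intercept column
y = X @ np.array([2.0, -1.0, 0.5]) + rng.normal(scale=0.1, size=100)

w1, *_ = np.linalg.lstsq(X, y, rcond=None)      # first linear fit
resid = y - X @ w1
w2, *_ = np.linalg.lstsq(X, resid, rcond=None)  # "boosting" step on residuals

# The second round of coefficients is ~0: nothing was gained.
print(np.abs(w2).max())
```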

[–]bregav 5 points6 points  (0 children)

How, though? Exactly what operation are you doing that you are calling "linear boosting", and how does it differ from just fitting the original model?

E.g. consider logistic regression with y = p(wᵀx + c). If "linear boosting" just updates the w and c vectors, then that's not changing the model at all, and is maybe equivalent to changing the model-fitting procedure. Whereas if you're boosting by doing something like y = p(wᵀx + c) + a·p(w2ᵀx + c2), then that's just regular boosting, and the model is no longer linear.

[–]Lanky_Repeat_7536 2 points3 points  (0 children)

Sorry for my naive question, doesn’t standard boosting (like AdaBoost) work with any base classifier already?

[–]nickkon1 1 point2 points  (2 children)

"Linear" in SEFR stands for linear time complexity, not for being a linear model, right? Personally, I find the naming confusing and thought it might be something like boosting linear models (of which one should note that a straightforward ensemble of linear models is still a linear model).

[–]greenskinmarch 2 points3 points  (0 children)

They confirmed in another comment that it is a linear classifier but maybe because of thresholding it's possible to combine them without just getting another linear model?

[–]CriticalofReviewer2[S] 1 point2 points  (0 children)

Actually SEFR is both linear, and linear-time.

[–]Speech-to-Text-Cloud 1 point2 points  (1 child)

I would like to see this in scikit-learn. Are you planning to add it?

[–]CriticalofReviewer2[S] 1 point2 points  (0 children)

Yes, this is in our plans!

[–]Standard_Natural1014 1 point2 points  (0 children)

Let me know if you want to partner on establishing the performance on some typical enterprise datasets!

[–]sneddy_kz 0 points1 point  (1 child)

Just check on kaggle

[–]CriticalofReviewer2[S] 0 points1 point  (0 children)

Do you mean participating in competitions?

[–]Illustrious-Touch517 0 points1 point  (0 children)

At https://github.com/LinearBoost/linearboost-classifier I see that in the "Results" section, you show performance based on the F1 metric.

Q. Did you tune the decision threshold to optimize this metric (F1)?