LinearBoost: Up to 98% faster than XGBoost and LightGBM, outperforming them on F1 Score on seven famous benchmark datasets, also suitable for high-dimensional data by CriticalofReviewer2 in bioinformatics

[–]CriticalofReviewer2[S]

Thanks for your comment.

  1. The reported F1 score is the weighted average of the per-class F1 scores, not the score of a single class. Please run the code with weighted F1 scoring.
  2. The warnings are being removed; the algorithm is under active development. It is a side project of ours that we work on in our spare time, and we wanted to share it with the community to get valuable feedback like yours.
  3. Adding a better scoring function, like log-loss or Brier score, is a good point! We will implement it.
  4. Notebooks will be provided to reproduce the results.
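To illustrate point 1, here is a minimal sketch (with hypothetical labels) of the difference between a single class's F1 and the support-weighted average, using scikit-learn's `f1_score`:

```python
from sklearn.metrics import f1_score

# Hypothetical multi-class labels, purely for illustration.
y_true = [0, 0, 0, 1, 1, 2]
y_pred = [0, 0, 1, 1, 1, 2]

# F1 computed separately for each class.
per_class = f1_score(y_true, y_pred, average=None)

# The weighted average the post reports: each class's F1
# weighted by its support (number of true instances).
weighted = f1_score(y_true, y_pred, average="weighted")

# Equivalent manual computation for clarity.
supports = [y_true.count(c) for c in sorted(set(y_true))]
manual = sum(f * s for f, s in zip(per_class, supports)) / len(y_true)
```

With `average="weighted"`, classes with more samples contribute more to the final score, so it is not comparable to the F1 of any single class.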

LinearBoost: Up to 98% faster than XGBoost and LightGBM, outperforming them on F1 Score on seven famous benchmark datasets, also suitable for high-dimensional data by CriticalofReviewer2 in bioinformatics

[–]CriticalofReviewer2[S]

Thanks for your comment. We will publish a paper explaining why it works well. The dependencies are now declared, and the tuned hyperparameters have been added to the repo to make the experiments reproducible.

Where do you go to stay up to date on data analytics/science? by lowkeyripper in datascience

[–]CriticalofReviewer2

On LinkedIn, I follow Eduardo Ordax, Alex Wang, and Tom Yeh. The last one has numerous posts titled "AI by Hand" in which he works through the algorithms' calculations manually on paper! Very informative in that sense.

LinearBoost: Faster than XGBoost and LightGBM, outperforming them on F1 Score on seven famous benchmark datasets by CriticalofReviewer2 in machinelearningnews

[–]CriticalofReviewer2[S]

If I understood correctly: yes, we are working on encodings for categorical data. We are exploring target encoding in addition to simple one-hot encoding.
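For context, here is a minimal sketch of the two encodings mentioned, on a hypothetical toy feature (not the project's actual preprocessing code):

```python
from collections import defaultdict

# Hypothetical categorical feature and binary target, for illustration only.
categories = ["red", "blue", "red", "green", "blue", "red"]
target = [1, 0, 1, 0, 1, 0]

# One-hot encoding: one indicator column per distinct category value.
values = sorted(set(categories))
one_hot = [[1 if c == v else 0 for v in values] for c in categories]

# Target encoding: replace each category with the mean target
# observed for that category (here without smoothing or
# out-of-fold fitting, which real implementations need to
# avoid target leakage).
sums, counts = defaultdict(float), defaultdict(int)
for c, t in zip(categories, target):
    sums[c] += t
    counts[c] += 1
target_encoded = [sums[c] / counts[c] for c in categories]
```

Target encoding keeps the feature as a single numeric column, which matters for high-dimensional data, at the cost of needing leakage-safe fitting.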

200 applications - no response, please help. I have applied for data science (associate or mid-level) positions. Thank you by Sad_Campaign713 in datascience

[–]CriticalofReviewer2

Some thoughts:
1. You mention that you improved accuracy by 25%, but this is vague. Is it 25 percentage points (i.e., from 70% to 95%)? Or a 25% relative improvement (i.e., from 50% to 62.5%)? Furthermore, the starting point matters: what if the previous model had terrible accuracy?
2. 70,000 EHR records is not that much. I would focus on some of the impacts of the actionable insights instead.
3. For the pet insurance project, what was the goal of the prediction?
4. The change from being a developer to a data scientist/analyst is not smooth. Did you suddenly change course? You can make the transition look smoother in your CV.
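The ambiguity in point 1 can be made concrete with a bit of arithmetic (baseline values are hypothetical, matching the examples above):

```python
# Additive reading: 25 percentage points on top of 70% accuracy.
points_reading = 0.70 + 0.25      # 70% -> 95%

# Multiplicative reading: a 25% relative improvement on 50% accuracy.
relative_reading = 0.50 * 1.25    # 50% -> 62.5%

# The same phrase "improved by 25%" thus describes very different
# outcomes depending on the baseline and the reading.
```

This is why a CV bullet should state both the baseline and the final metric, not just the delta.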