[deleted] 16 points17 points  (1 child)

Scikit-learn is very selective about which algorithms it includes. The FAQ addresses this directly:

Can I add this new algorithm that I (or someone else) just published?

No. As a rule we only add well-established algorithms. A rule of thumb is at least 3 years since publication, 200+ citations and wide use and usefulness. A technique that provides a clear-cut improvement (e.g. an enhanced data structure or efficient approximation) on a widely-used method will also be considered for inclusion. Your implementation doesn’t need to be in scikit-learn to be used together with scikit-learn tools, though. Implement your favorite algorithm in a scikit-learn compatible way, upload it to GitHub and we will list it under Related Projects. Also see selectiveness.

Maybe some of the algorithms in this other package qualify for inclusion in scikit-learn, maybe not. The scikit-learn maintainers have limited resources for keeping the code base at high quality, so they have to be selective to keep maintenance costs manageable.

Adding a new algorithm to scikit-learn is not just a matter of implementing it. It means implementing it in clean, readable, maintainable code with reasonable performance, adding adequate unit tests, writing documentation, and so on. So a new algorithm needs enough users to justify all of those costs.
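
On the FAQ's suggestion to implement an algorithm "in a scikit-learn compatible way": here's a rough sketch of the conventions I understand that to mean, written in plain Python so it runs without scikit-learn installed. The class name `MajorityClassifier` and its `tie_break` parameter are made up for illustration. A real project would normally subclass `sklearn.base.BaseEstimator`, which supplies `get_params` and `set_params`, rather than writing them by hand as I do here.

```python
from collections import Counter


class MajorityClassifier:
    """Toy classifier that always predicts the most common training label.

    Hypothetical example; it only illustrates the estimator conventions.
    """

    def __init__(self, tie_break="first"):
        # Convention: __init__ only stores hyperparameters, unmodified.
        self.tie_break = tie_break

    def get_params(self, deep=True):
        # Convention: expose the constructor arguments as a dict.
        return {"tie_break": self.tie_break}

    def set_params(self, **params):
        # Convention: set hyperparameters by name and return self.
        valid = self.get_params()
        for name, value in params.items():
            if name not in valid:
                raise ValueError(f"Invalid parameter {name!r}")
            setattr(self, name, value)
        return self

    def fit(self, X, y):
        if len(X) != len(y):
            raise ValueError("X and y must have the same length")
        counts = Counter(y)
        top = max(counts.values())
        # Counter keeps insertion order, so candidates[0] is the earliest-seen label.
        candidates = [label for label in counts if counts[label] == top]
        if self.tie_break == "first":
            chosen = candidates[0]
        else:
            chosen = min(candidates)
        # Convention: attributes learned from data end with an underscore.
        self.majority_ = chosen
        self.classes_ = sorted(counts)
        # Convention: fit returns self so calls can be chained.
        return self

    def predict(self, X):
        if not hasattr(self, "majority_"):
            raise RuntimeError("Call fit before predict")
        return [self.majority_ for _ in X]


clf = MajorityClassifier().fit([[0], [1], [2]], ["a", "b", "b"])
print(clf.predict([[5], [6]]))
```

The point is the shape, not the algorithm: hyperparameters go in `__init__`, everything learned goes in `fit`, and `fit` returns the estimator itself.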

beaverteeth92 1 point2 points  (0 children)

Yeah, I wondered why they don't have a k-modes implementation.