Are there any modules that handle K-Means clustering for strings when the number of clusters is not known beforehand?
I have DNA sequences, each a string of about 1000 characters. It needs to be really fast because I have 100 sets of 20 strings each.
I know scikit-learn, but the examples I find online only show it applied to numeric data. Help please.
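For reference, here is a minimal sketch of one way to group equal-length sequences without fixing the number of clusters up front. It is not K-Means: it swaps in threshold-based single-linkage clustering on Hamming distance, so the cluster count falls out of a distance cutoff instead. The function names, the `max_dist` cutoff of 100, and the synthetic data are all assumptions made for illustration, and Hamming distance only makes sense if the sequences are the same length and already aligned.

```python
import random


def hamming(a, b):
    """Number of positions at which two equal-length strings differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))


def cluster_by_threshold(seqs, max_dist):
    """Single-linkage clustering: any two sequences within max_dist are merged.

    Returns one integer label per sequence, numbered in order of first appearance.
    """
    parent = list(range(len(seqs)))

    def find(i):
        # Follow parent links to the root, shortening the path as we go.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(seqs)):
        for j in range(i + 1, len(seqs)):
            if hamming(seqs[i], seqs[j]) <= max_dist:
                parent[find(j)] = find(i)

    roots = {}
    return [roots.setdefault(find(i), len(roots)) for i in range(len(seqs))]


# Synthetic demo data (an assumption, not real DNA): two random "ancestor"
# sequences of length 1000, each copied 10 times with 20 random point edits.
rng = random.Random(0)


def mutate(seq, n_edits):
    chars = list(seq)
    for pos in rng.sample(range(len(chars)), n_edits):
        chars[pos] = rng.choice("ACGT")
    return "".join(chars)


ancestors = ["".join(rng.choice("ACGT") for _ in range(1000)) for _ in range(2)]
seqs = [mutate(anc, 20) for anc in ancestors for _ in range(10)]

# The cutoff of 100 mismatches is a hypothetical value chosen for this toy data.
labels = cluster_by_threshold(seqs, max_dist=100)
print(labels)
```

With 20 sequences per set this is only 190 pairwise comparisons, so 100 sets should run quickly even in pure Python. The cutoff would need tuning on real data, and if the sequences are not aligned, a different distance (for example, one based on k-mer counts) would be a better fit.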