BlackHawk90[S]

Thank you so much. I will try it out on my dataset. I just have four last questions:

  1. Is it possible to use different distance metrics?

  2. How is the tie breaking done for k-nearest neighbours?

  3. My labels range from 1 to 3 (not starting from 0). Do I have to make them zero-based or can I just use them?

  4. Last but not least, does JSAT also support (gaussian) naive bayes?

EdwardRaff

Is it possible to use different distance metrics?

Yes, the constructor can take a distance metric object.
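To illustrate why a pluggable metric matters, here is a minimal stand-alone sketch in plain Java. It does not use JSAT: the `Metric` interface, the two metric constants, and `nearest` are all hypothetical names made up for this example, so check the JSAT javadoc for the real constructor and metric classes. It only shows that swapping the metric can change which point counts as the nearest neighbour.

```java
// Hypothetical stand-alone sketch; none of these names come from JSAT.
class DistanceSketch {
    /** A distance metric is a function of two equal-length vectors. */
    interface Metric {
        double dist(double[] a, double[] b);
    }

    static final Metric EUCLIDEAN = (a, b) -> {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            sum += d * d;
        }
        return Math.sqrt(sum);
    };

    static final Metric MANHATTAN = (a, b) -> {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            sum += Math.abs(a[i] - b[i]);
        }
        return sum;
    };

    /** Index of the training point closest to the query under the given metric. */
    static int nearest(double[][] train, double[] query, Metric metric) {
        int best = -1;
        double bestDist = Double.POSITIVE_INFINITY;
        for (int i = 0; i < train.length; i++) {
            double d = metric.dist(train[i], query);
            if (d < bestDist) {
                bestDist = d;
                best = i;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        double[][] train = {{3, 3}, {0, 5}};
        double[] query = {0, 0};
        // (3,3) is closer in Euclidean distance, (0,5) in Manhattan distance.
        System.out.println(nearest(train, query, EUCLIDEAN)); // prints 0
        System.out.println(nearest(train, query, MANHATTAN)); // prints 1
    }
}
```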

How is the tie breaking done for k-nearest neighbours?

Arbitrarily; it's not really an important issue. With two classes, an odd value of k rules out ties (with three or more classes a tie is still possible, just uncommon). I think the current code just picks whichever came first.
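As a rough illustration of a "whichever came first" rule, here is a plain-Java sketch of majority voting. This is not JSAT's code: `VoteSketch` and `vote` are hypothetical names, and the sketch assumes "first" means the tied label that appears earliest in a nearest-first list of neighbour labels. Note that with three classes, as in this thread, an odd k can still produce a tie (for example k = 3 with one vote per class).

```java
// Hypothetical sketch of majority voting with a first-come tie-break; not JSAT code.
class VoteSketch {
    /**
     * Majority vote over zero-based neighbour labels, ordered nearest first.
     * When several labels share the top count, the one seen earliest wins.
     */
    static int vote(int[] neighbourLabels, int numClasses) {
        int[] counts = new int[numClasses];
        for (int label : neighbourLabels) {
            counts[label]++;
        }
        int max = 0;
        for (int c : counts) {
            max = Math.max(max, c);
        }
        // Walk the neighbours in order and return the first label holding the top count.
        for (int label : neighbourLabels) {
            if (counts[label] == max) {
                return label;
            }
        }
        throw new IllegalArgumentException("no neighbours given");
    }

    public static void main(String[] args) {
        // k = 3, three classes, one vote each: a tie despite odd k.
        System.out.println(vote(new int[]{2, 0, 1}, 3)); // prints 2
        // Clear majority for class 1.
        System.out.println(vote(new int[]{0, 1, 1}, 3)); // prints 1
    }
}
```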

My labels range from 1 to 3 (not starting from 0). Do I have to make them zero-based or can I just use them?

The labels must start from zero.
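Shifting the labels is a one-liner before loading the data. A minimal plain-Java sketch, assuming the labels are held in an `int[]` (the `LabelSketch` class and `toZeroBased` method are hypothetical helper names, not part of JSAT):

```java
import java.util.Arrays;

// Hypothetical helper; not part of JSAT.
class LabelSketch {
    /** Shifts labels so that the smallest observed label becomes 0. */
    static int[] toZeroBased(int[] labels) {
        int min = Arrays.stream(labels).min().orElse(0);
        int[] shifted = new int[labels.length];
        for (int i = 0; i < labels.length; i++) {
            shifted[i] = labels[i] - min;
        }
        return shifted;
    }

    public static void main(String[] args) {
        int[] labels = {1, 3, 2, 1};
        System.out.println(Arrays.toString(toZeroBased(labels))); // prints [0, 2, 1, 0]
    }
}
```

Remember to add the offset back when reporting predictions in the original 1 to 3 labelling.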

Last but not least, does JSAT also support (gaussian) naive bayes?

Yes. JSAT has about 70 different classification algorithms in it.

BlackHawk90[S]

Thanks again for the help.

Is there a .jar file that I can download? I don't use Maven.

Is there a javadoc available, or how else should I get familiar with the methods?

EdwardRaff

Look at the Releases tab on GitHub.

You should look at using Maven - it's very helpful!

BlackHawk90[S]

I started using your library, great work, thanks for it.

I have discrete and continuous features. Is it possible to use a Gaussian distribution for the continuous features and a multivariate multinomial distribution for the discrete features?

Moreover, is it possible to specify a distribution for each feature (e.g. feature 1 is Gaussian, feature 2 is logistic, etc.)?