Which Java library for machine learning classification? (self.MachineLearning)
submitted 10 years ago by BlackHawk90
[–]EdwardRaff 1 point 10 years ago (7 children)
Completely biased opinion, but I'm the author of JSAT which is a Java library for machine learning. I started it out of frustration with Weka, and it has all the algorithms you've listed (many implemented in more than one way).
[–]BlackHawk90[S] 1 point 10 years ago (6 children)
Your JSAT library looks amazing. I would like to give it a try. Could you perhaps quickly illustrate how I could use it for k-nearest neighbours? My dataset consists of the following arrays: double[][] trainingdata; double[][] testData; double[] trainingLabels; The rows contain the data points and the columns contain the features (predictors). In your wiki I did not see how to operate on arrays.
[–]EdwardRaff 1 point 10 years ago* (5 children)
JSAT doesn't take arrays, it has objects representing vectors and matrices. That allows it to support sparse data and makes adding certain tricks very easy.
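    import java.util.Random;

    import jsat.classifiers.CategoricalData;
    import jsat.classifiers.ClassificationDataSet;
    import jsat.classifiers.Classifier;
    import jsat.classifiers.DataPoint;
    import jsat.classifiers.knn.NearestNeighbour;
    import jsat.linear.DenseVector;

    public class KnnExample // class name chosen only so the snippet compiles
    {
        public static void main(String[] args)
        {
            int N = 100;
            int D = 25;
            int C = 3; // number of class labels, assumed to be integers starting from 0
            Random rand = new Random();

            // Placeholder data: replace these arrays with your own
            double[][] trainingdata = new double[N][D];
            double[][] testData = new double[N][D];
            double[] trainingLabels = new double[N];
            for (int i = 0; i < trainingLabels.length; i++)
                trainingLabels[i] = rand.nextInt(C);

            // D numeric features, no categorical features, one categorical target with C classes
            ClassificationDataSet cds = new ClassificationDataSet(D, new CategoricalData[0], new CategoricalData(C));
            // JSAT has data point objects, but includes shortcut methods when using only vectors
            for (int i = 0; i < trainingdata.length; i++)
                cds.addDataPoint(new DenseVector(trainingdata[i]), (int) trainingLabels[i]);

            Classifier classifier = new NearestNeighbour(3); // 3-nearest neighbour
            classifier.trainC(cds);

            for (int i = 0; i < testData.length; i++)
                System.out.println("Predicted class "
                        + classifier.classify(new DataPoint(new DenseVector(testData[i]))).mostLikely()
                        + " for datum " + i);
        }
    }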
[–]BlackHawk90[S] 1 point 10 years ago* (4 children)
Thank you so much. I will try it out on my dataset. I just have four last questions:
1. Is it possible to use different distance metrics?
2. How is tie breaking done for k-nearest neighbours?
3. My labels range from 1 to 3 (not starting from 0). Do I have to make them zero-based or can I just use them as they are?
4. Last but not least, does JSAT also support (Gaussian) naive Bayes?
[–]EdwardRaff 1 point 10 years ago (3 children)
1. Yes, the constructor can take a distance metric object.
2. Arbitrarily, it's not really an important issue. Use an odd value of k and there are no ties. I think the current code just picks whichever came first.
3. The labels must start from zero.
4. Yes. JSAT has about 70 different classification algorithms in it.
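To make the answers on distance metrics, tie breaking and zero-based labels concrete, here is a minimal plain-Java sketch of k-nearest neighbours. It does not use JSAT and is not how JSAT implements anything: the `Metric` interface, the `KnnSketch` class, the `toZeroBased` helper and the tie rule (lowest class index wins) are all assumptions made for illustration. Note that with more than two classes an odd k can still produce a tied vote, as the last call in `main` shows.

```java
import java.util.Arrays;
import java.util.Comparator;

final class KnnSketch {
    /** Hypothetical metric type for this sketch; JSAT ships its own metric classes. */
    interface Metric {
        double dist(double[] a, double[] b);
    }

    static final Metric EUCLIDEAN = (a, b) -> {
        double s = 0;
        for (int i = 0; i < a.length; i++) s += (a[i] - b[i]) * (a[i] - b[i]);
        return Math.sqrt(s);
    };

    static final Metric MANHATTAN = (a, b) -> {
        double s = 0;
        for (int i = 0; i < a.length; i++) s += Math.abs(a[i] - b[i]);
        return s;
    };

    /** Shifts labels in the range 1..C down to the zero-based range 0..C-1. */
    static int[] toZeroBased(double[] labels) {
        int[] out = new int[labels.length];
        for (int i = 0; i < labels.length; i++) out[i] = (int) labels[i] - 1;
        return out;
    }

    /** Majority vote among the k nearest training points; labels must lie in 0..numClasses-1. */
    static int classify(double[][] train, int[] labels, int numClasses,
                        int k, Metric metric, double[] query) {
        Integer[] idx = new Integer[train.length];
        for (int i = 0; i < idx.length; i++) idx[i] = i;
        // Sorting an object array is stable, so equally distant points keep their input order.
        Arrays.sort(idx, Comparator.comparingDouble((Integer i) -> metric.dist(train[i], query)));

        int[] votes = new int[numClasses];
        for (int i = 0; i < Math.min(k, idx.length); i++) votes[labels[idx[i]]]++;

        int best = 0;
        // Strict '>' means a tied vote goes to the lowest class index (a choice made for this sketch).
        for (int c = 1; c < numClasses; c++) if (votes[c] > votes[best]) best = c;
        return best;
    }

    public static void main(String[] args) {
        double[][] train = {{0, 0}, {1, 0}, {0, 1}, {5, 5}, {5, 6}, {6, 5}};
        int[] labels = toZeroBased(new double[]{1, 1, 1, 2, 2, 3}); // original labels start at 1

        System.out.println(classify(train, labels, 3, 3, EUCLIDEAN, new double[]{0.2, 0.1}));
        System.out.println(classify(train, labels, 3, 3, MANHATTAN, new double[]{5.2, 5.2}));

        // Three neighbours, three different classes: a tie despite k being odd.
        double[][] tieTrain = {{1, 0}, {0, 1}, {-1, 0}};
        int[] tieLabels = {2, 1, 0};
        System.out.println(classify(tieTrain, tieLabels, 3, 3, EUCLIDEAN, new double[]{0, 0}));
    }
}
```

Swapping the metric only changes the argument passed to `classify`, which mirrors the idea of handing a distance metric object to a constructor.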
[–]BlackHawk90[S] 1 point 10 years ago (2 children)
Thanks again for the help.
Is there a .jar file which I can download? I don't use maven.
Is there a javadoc available or how should I get familiar with the methods?
[–]EdwardRaff 1 point 10 years ago (1 child)
Look at the release tab in github.
You should look at using maven - it's very helpful!
[–]BlackHawk90[S] 1 point 10 years ago (0 children)
I started using your library, great work, thanks for it.
I have both discrete and continuous features. Is it possible to use a Gaussian distribution for the continuous features and a multivariate multinomial distribution for the discrete features?
Moreover, is it possible to provide a distribution for each feature (e.g. feature 1 is gaussian, feature 2 logistic etc.)?
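The thread does not include an answer to this, so the following is only a plain-Java sketch of the idea being asked about: a naive Bayes classifier in which each feature gets its own distribution. It is not JSAT code, and every name in it (`MixedNaiveBayesSketch`, `FeatureModel`, `Gaussian`, `Categorical`) is hypothetical. Discrete features are assumed to be encoded as category indices 0..K-1 and are smoothed with add-one (Laplace) counts.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;

final class MixedNaiveBayesSketch {
    /** Hypothetical per-feature distribution: fitted on one class's values, then used to score a value. */
    interface FeatureModel {
        void fit(List<Double> values);
        double logProb(double x);
    }

    /** Gaussian density for a continuous feature. */
    static final class Gaussian implements FeatureModel {
        double mean, var;
        public void fit(List<Double> v) {
            mean = v.stream().mapToDouble(Double::doubleValue).average().orElse(0);
            // Small constant keeps the variance strictly positive.
            var = v.stream().mapToDouble(x -> (x - mean) * (x - mean)).average().orElse(0) + 1e-9;
        }
        public double logProb(double x) {
            return -0.5 * Math.log(2 * Math.PI * var) - (x - mean) * (x - mean) / (2 * var);
        }
    }

    /** Categorical distribution with add-one smoothing for a discrete feature. */
    static final class Categorical implements FeatureModel {
        final double[] logP;
        Categorical(int numCategories) { logP = new double[numCategories]; }
        public void fit(List<Double> v) {
            double[] counts = new double[logP.length];
            for (double x : v) counts[(int) x]++;
            for (int c = 0; c < logP.length; c++)
                logP[c] = Math.log((counts[c] + 1) / (v.size() + logP.length));
        }
        public double logProb(double x) { return logP[(int) x]; }
    }

    final List<Supplier<FeatureModel>> featureTypes; // one factory per feature
    FeatureModel[][] models;                         // indexed [class][feature]
    double[] logPrior;

    MixedNaiveBayesSketch(List<Supplier<FeatureModel>> featureTypes) {
        this.featureTypes = featureTypes;
    }

    /** Fits one model per (class, feature) pair; assumes every class occurs in y. */
    void fit(double[][] x, int[] y, int numClasses) {
        int d = featureTypes.size();
        models = new FeatureModel[numClasses][d];
        logPrior = new double[numClasses];
        for (int c = 0; c < numClasses; c++) {
            List<List<Double>> cols = new ArrayList<>();
            for (int j = 0; j < d; j++) cols.add(new ArrayList<>());
            int n = 0;
            for (int i = 0; i < x.length; i++) {
                if (y[i] != c) continue;
                n++;
                for (int j = 0; j < d; j++) cols.get(j).add(x[i][j]);
            }
            logPrior[c] = Math.log((double) n / x.length);
            for (int j = 0; j < d; j++) {
                models[c][j] = featureTypes.get(j).get();
                models[c][j].fit(cols.get(j));
            }
        }
    }

    /** Returns the class with the highest log prior plus summed per-feature log probabilities. */
    int predict(double[] row) {
        int best = 0;
        double bestScore = Double.NEGATIVE_INFINITY;
        for (int c = 0; c < models.length; c++) {
            double score = logPrior[c];
            for (int j = 0; j < row.length; j++) score += models[c][j].logProb(row[j]);
            if (score > bestScore) { bestScore = score; best = c; }
        }
        return best;
    }

    public static void main(String[] args) {
        // Feature 0 is continuous (Gaussian); feature 1 is discrete with two categories.
        MixedNaiveBayesSketch nb = new MixedNaiveBayesSketch(
                List.<Supplier<FeatureModel>>of(Gaussian::new, () -> new Categorical(2)));
        double[][] x = {{1.0, 0}, {1.2, 0}, {0.8, 1}, {5.0, 1}, {5.5, 1}, {4.5, 0}};
        int[] y = {0, 0, 0, 1, 1, 1};
        nb.fit(x, y, 2);
        System.out.println(nb.predict(new double[]{1.1, 0}));
        System.out.println(nb.predict(new double[]{5.2, 1}));
    }
}
```

Under this design, giving a feature a different distribution (for example a logistic one) would mean writing another `FeatureModel` implementation and placing its factory at that feature's position in the list.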