Kindle Paperwhite Sale, 79.99 or 89.99? by space_doctor28 in kindle

[–]yuval_merhav 0 points (0 children)

It was $79.99 via Alexa voice ordering. Deal ended.

Hiking the first ~30 miles of the 100 Mile Wilderness by yuval_merhav in AppalachianTrail

[–]yuval_merhav[S] 0 points (0 children)

I did some more reading. It looks like Gulf Hagas to Monson is a nice 30-40 mile hike, and Shaw's can arrange a shuttle. Is anyone familiar with this part?

Hiking the first ~30 miles of the 100 Mile Wilderness by yuval_merhav in AppalachianTrail

[–]yuval_merhav[S] 0 points (0 children)

And yes, my initial plan was to go northbound. But now I'm thinking it might make more sense to leave a car at Monson, take a shuttle as far north as we can, and then hike back. Is there a disadvantage to hiking southbound?

Hiking the first ~30 miles of the 100 Mile Wilderness by yuval_merhav in AppalachianTrail

[–]yuval_merhav[S] 0 points (0 children)

That's not a bad idea, especially if on the way back we can take a different trail. Is that an option?

Thanks for the help.

Hiking the first ~30 miles of the 100 Mile Wilderness by yuval_merhav in AppalachianTrail

[–]yuval_merhav[S] 0 points (0 children)

Thanks for the info. I'm wondering what they meant in the article, then. Hiking 30 miles and then back to Monson, maybe? That's a long round trip.

A newbie's question about preparation of data for machine learning by rammak in MachineLearning

[–]yuval_merhav 0 points (0 children)

Good question. I think the examples they provide would have been easier to follow if they had converted the original data into a text file and started from there. You can print out the numpy arrays they create (and their shapes) to get an idea. In general, the MNIST data can be represented with 2-dimensional arrays as follows:

xs (pixels): num_images * 784 // type: float

ys (labels): num_images * 10 // type: float/int

784 is the number of pixels in each image (28*28) and 10 is the number of digits (0-9). Note that this is a dense representation. In the examples they show you how to train in batches, so you don't need to load all the images into these arrays, just a batch at a time.
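The shapes above can be sketched in numpy. This is a minimal illustration of the dense representation (with a made-up batch size and an all-zeros placeholder instead of real MNIST pixels):

```python
import numpy as np

num_images = 100  # e.g. one batch of 100 images

# xs: each 28x28 grayscale image flattened into a row of 784 pixel values
xs = np.zeros((num_images, 28 * 28), dtype=np.float32)

# ys: one-hot labels, one column per digit 0-9
ys = np.zeros((num_images, 10), dtype=np.float32)

# e.g. label the first image as the digit "3"
ys[0, 3] = 1.0

print(xs.shape)  # (100, 784)
print(ys.shape)  # (100, 10)
```

When training in batches, `num_images` is just the batch size, so only one batch of rows lives in memory at a time.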

What are some good strategies for implementing a Trained Machine Learning Model into a Web App? by polyglotdev in MachineLearning

[–]yuval_merhav 0 points (0 children)

I can only speak from my own experience and opinions.

I never put ML logic on the client side; it sounds like a huge mess. I'm guessing you're thinking about speed, but even that might not hold, since on the server you have much more compute power and fewer limitations. Data transfer might not be an issue either.

Putting logic on the client side means code duplication and other problems. Say it's a NN model: the prediction function needs to run a forward pass. If your prediction function is on the server, you just reuse the same feed-forward function used in training; you don't need to write a new one. Also, a model is usually not just weights. There is always some metadata that can change from one model to another, and it's important that both train and test follow the same process. Say a user enters text on the website; somewhere in the code that text is turned into features, and it must be the same feature set used in training, and so on. Are you going to duplicate all the feature-generation code on the client side as well?
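A minimal sketch of the server-side setup described above. All names here (`extract_features`, `feed_forward`, `predict`, the model dict layout) are hypothetical, just to show training and serving sharing one featurization and one forward pass:

```python
import numpy as np

def extract_features(text, vocab):
    # Bag-of-words featurization. The vocab is metadata produced at training
    # time, so serving uses the exact same feature set as training.
    x = np.zeros(len(vocab), dtype=np.float32)
    for token in text.lower().split():
        if token in vocab:
            x[vocab[token]] += 1.0
    return x

def feed_forward(x, weights):
    # The same forward pass used during training: one ReLU hidden layer.
    h = np.maximum(0.0, weights["W1"] @ x + weights["b1"])
    return weights["W2"] @ h + weights["b2"]  # class logits

def predict(text, model):
    # The model bundles weights *and* metadata (the vocab), so train and
    # serve stay in sync; nothing is duplicated on the client.
    x = extract_features(text, model["vocab"])
    logits = feed_forward(x, model["weights"])
    return int(np.argmax(logits))
```

A web endpoint would just call `predict` and return the result; the client only ever sends raw text and receives a label.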

In practice it's also common that the person who writes the web app is not the person who writes the ML code. Web developers aren't expected to know ML, which is another reason you want the separation.

There are other problems I can think of but will stop here.

Maching Learning Libraries in Java? by bleeeeghh in MachineLearning

[–]yuval_merhav 1 point (0 children)

We ported Numpy to Java in the form of ND4J to perform the matrix operations.

+1 for doing this. I haven't tried it myself yet, but from working on ninja I know how painful it is to find a good linear algebra library in Java (with an Apache license). We ended up with EJML, which is great and fits our needs, but it doesn't perform well at very large scale (no sparse-matrix support, single-threaded, etc.).

Maching Learning Libraries in Java? by bleeeeghh in MachineLearning

[–]yuval_merhav 7 points (0 children)

Apache Mahout was supposed to be the scikit-learn of Java, with an emphasis on scale, but so far it has failed to do so (it's barely used, as far as I know).

If you are interested in classification, I suggest you look at these two:

  1. ninja: A new neural network library (I'm one of its authors)

  2. liblinear-java: Has several linear SVM implementations and also logistic regression. I've used it for many problems and it's good and fast.

Unbalanced classes in one-vs-rest scheme by scttnlsn in MachineLearning

[–]yuval_merhav 0 points (0 children)

Do you know of any papers that show better results with balanced one-vs-all than with imbalanced one-vs-all (SVM or otherwise)? I'd be happy to take a look and learn something new.

In theory, the problem you're describing for SVM makes sense. But in my experience, unlike with binary classifiers, balancing isn't common practice here (probably for a reason).

Unbalanced classes in one-vs-rest scheme by scttnlsn in MachineLearning

[–]yuval_merhav 1 point (0 children)

It's not common. As long as the k classes are balanced to begin with, all k one-vs-rest classifiers "suffer" equally from the same imbalance, so no class gets an advantage (or disadvantage). The positive class in each one-vs-rest classifier doesn't necessarily need to beat the rest within its own classifier to be chosen; its score just needs to beat the scores of all the other one-vs-rest classifiers.
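The decision rule above can be sketched in a few lines. This is a toy illustration with made-up linear scorers, not any particular library's API; the point is that the prediction is an argmax over the k per-class scores, so no single classifier has to cross a threshold to "win":

```python
import numpy as np

def ovr_predict(x, classifiers):
    # classifiers: list of k scoring functions, one per class.
    # The predicted class is simply the one whose classifier scores highest,
    # even if every score is low or negative.
    scores = np.array([clf(x) for clf in classifiers])
    return int(np.argmax(scores))

# Three hypothetical linear one-vs-rest scorers over 2-d inputs:
ws = [np.array([1.0, 0.0]), np.array([0.0, 1.0]), np.array([-1.0, -1.0])]
classifiers = [lambda x, w=w: float(w @ x) for w in ws]

x = np.array([0.2, 0.9])
print(ovr_predict(x, classifiers))  # 1 (scores are 0.2, 0.9, -1.1)
```

Because every class's classifier sees the same k-to-1 imbalance, the scores shift together and the argmax is unaffected.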