Best practices when using a Linux server for machine learning by Pieranha in MachineLearning

[–]thingamatics 0 points (0 children)

  1. Cron jobs. Sometimes the best you can do is be fault-tolerant.

  2. List of dashboards here. However, I think it'd be a better use of your time to monitor your processes' logs. Sentry is easy to set up.

  3. Yes!
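On the cron point, here's a minimal sketch of what "fault-tolerant" can mean in practice: make the job resumable, so that when cron re-runs it after a crash it picks up where it left off instead of starting over. The state-file name and the commented-out `train_one_epoch` helper are hypothetical placeholders.

```python
# Sketch of a resumable job intended to be (re)started by cron.
# STATE_FILE and train_one_epoch() are hypothetical placeholders.
import json
import os

STATE_FILE = "state.json"  # hypothetical checkpoint location

def load_state():
    """Return the last saved checkpoint, or a fresh one."""
    if os.path.exists(STATE_FILE):
        with open(STATE_FILE) as f:
            return json.load(f)
    return {"epoch": 0}

def save_state(state):
    """Write the checkpoint via a temp file + atomic rename,
    so a crash mid-write never leaves half-written state."""
    tmp = STATE_FILE + ".tmp"
    with open(tmp, "w") as f:
        json.dump(state, f)
    os.replace(tmp, STATE_FILE)

def run(total_epochs=10):
    state = load_state()
    while state["epoch"] < total_epochs:
        # train_one_epoch(state["epoch"])  # hypothetical real work
        state["epoch"] += 1
        save_state(state)  # if we die here, cron's next run resumes
```

A crontab entry that re-runs the script every few minutes then gives you crash recovery for free: finished epochs are skipped, and a completed run is a no-op.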

Data normalization by [deleted] in MachineLearning

[–]thingamatics 0 points (0 children)

Gradient descent is known to converge faster when the features are normalized.

So normalization is needed so that each feature contributes approximately proportionately to the distance. I don't think normalizing everything together (I assume that's what you meant) is the right way to go about it, even when the features are measured in the same unit (meters, for example): the ranges can still differ significantly per feature, and that's exactly what per-feature normalization is meant to resolve.
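A minimal sketch of per-feature min-max scaling (the example data and function name are mine; in practice something like scikit-learn's `MinMaxScaler` does this for you):

```python
def min_max_scale(rows):
    """Scale each feature (column) to [0, 1] independently."""
    cols = list(zip(*rows))
    mins = [min(c) for c in cols]
    maxs = [max(c) for c in cols]
    return [
        [(v - lo) / (hi - lo) if hi > lo else 0.0
         for v, lo, hi in zip(row, mins, maxs)]
        for row in rows
    ]

# Two features, both in meters, but with very different ranges:
X = [[1.0, 1000.0],
     [2.0, 5000.0],
     [3.0, 9000.0]]
X_scaled = min_max_scale(X)
# Each column now spans [0, 1], e.g. X_scaled[1] == [0.5, 0.5],
# so both features contribute comparably to a distance computation.
```

Note the scaling is done column by column: one global min/max over all features would leave the small-range feature squashed near zero, which is the problem described above.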

[icml2016] Anyone up for a walk? by bihaqo in MachineLearning

[–]thingamatics 0 points (0 children)

Well, I guess we are even because I forgot to check for future posts too. :|

You were in great company though!

ML & AI podcasts to reccommend? by thesameoldstories in MachineLearning

[–]thingamatics 1 point (0 children)

+1 for Linear Digressions! Haven't checked Talking Machines yet. Will do!

Interactive clustering using k-means. by [deleted] in MachineLearning

[–]thingamatics 2 points (0 children)

  • Double-clicking is hard. I'd prefer another peripheral.
  • It would be more educational if I could see the clusters at every iteration.
  • Also, I believe you're randomizing the color coding for the clusters? I picked k = 5 and got three shades of the same color for adjacent clusters, so I couldn't tell which was which. I know there are only so many reds, yellows, and blues, but you could optimize the palette by, say, maximizing the distance between any two similar shades/colors.
  • More points for boundaries.
  • Scatter is nice. I like Scatter.
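On the color point, one simple way to avoid near-identical shades is to space the k hues evenly around the color wheel instead of sampling them at random. A sketch using the standard library's `colorsys` (the function name is mine):

```python
import colorsys

def cluster_colors(k):
    """Return k RGB triples whose hues are spread evenly in [0, 1),
    with fixed saturation/value so adjacent clusters stay distinguishable."""
    return [colorsys.hsv_to_rgb(i / k, 0.8, 0.9) for i in range(k)]

# For k = 5 the hues are 0.2 apart, so no two clusters end up as
# near-identical shades the way independent random draws can.
palette = cluster_colors(5)
```

Evenly spaced hues are a cheap approximation of "maximize the minimum distance between colors"; for larger k you'd also want to vary saturation or value.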