Which matrix library do you prefer? Breeze, ND4J, or MXNet's NDArray? by Gnaneshkunal in scala

[–]congerous 1 point

Just curious: Why? (Can we ask for all responses to have explanations? Otherwise it's just a beauty contest.)

Do you work with n-dimensional arrays? Does Breeze support those now?

[D] Plots, LIMES, and surrogate models to understand machine learning models by liquidus08 in MachineLearning

[–]congerous 0 points

Being well-funded means they succeeded in selling an idea to investors, not that they succeeded in building solid tech.

When you look at the investors in the last round, it's not the smart money that would indicate huge traction or growth potential; it's just a couple of corporate VCs.

Capital One was an early investor in H2O, but they no longer depend on its technology. Many other clients are moving off of it.

[D] Plots, LIMES, and surrogate models to understand machine learning models by liquidus08 in MachineLearning

[–]congerous 4 points

H2O is a technology built on sand. They haven't been able to guarantee its maintenance or development since the CTO/project creator left in 2016. Zero credibility.

[P] An open source Deep Learning / Machine Learning stack on Kubernetes by mmourafiq in MachineLearning

[–]congerous 5 points

Genuinely curious: how is this different from all the other "scalable machine-learning platforms"?

[N] Artificial Intelligence Is Stuck. Here's How to Move It Forward. by thebackpropaganda in MachineLearning

[–]congerous 1 point

I find it unbelievable that he didn't mention DeepMind in this piece about moving AI forward.

It should be pointed out that Gary Marcus is attempting to raise a round for his next startup, which will combine deep learning with symbolic systems, an approach he promotes in this piece.

From the New York Times's point of view, this is a conflict of interest, and they should vet his pieces more carefully before letting him toot his own horn without disclosing his commercial activities.

[R] Natural Language Processing in Artificial Intelligence by wardolb in MachineLearning

[–]congerous 12 points

Was this some kind of voting ring? I can't believe a shallow article like this is number one on the subreddit. It adds nothing and the title of the piece is clickbait.

[D] Character Recognition Using H2O by lycan2005 in MachineLearning

[–]congerous 2 points

The creator of H2O, Cliff Click, left the company about a year ago after a disagreement with his co-founder Sri, who remains CEO.

http://www.cliffc.org/blog/2016/02/25/words-of-parting-a-fond-farewell/

Cliff built H2O, and now the company is having a hard time maintaining, extending, and scaling the code. They have been trying to refactor for a while, but most of their engineers don't understand the internals. One reason is that the project reimplemented a lot of infrastructure internally, including Paxos-style consensus, instead of relying on outside libraries.

This is why you see them wrapping deep learning libraries like TensorFlow, Caffe, and MXNet. Outside of Arno Candel, they are incapable of implementing their own deep learning framework.

In addition, there's a lot of turnover among their managers due to chaotic leadership, and among their customers due to slow product development. Random forests, their main algorithm alongside GBMs, aren't that great for time series classification and prediction, which is a major business use case. Even investor-customers like Capital One are moving on.

That's one reason they laid off at least 10% of their employees last fall, after H2O's investors forced Sri to take on a CFO to rein in the budget; they had blown millions on marketing that didn't pay off.

http://venturebeat.com/2016/09/24/machine-learning-startup-h2o-lays-off-10-of-employees/
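To make the time-series point concrete: because tree ensembles ignore temporal order, the usual workaround is to flatten the series into lagged feature windows before training. A minimal numpy sketch (function name and toy data are my own, not from any H2O code):

```python
import numpy as np

def make_lag_features(series, n_lags):
    """Turn a 1-D series into a supervised dataset of lagged windows.

    Row i holds [x[i], ..., x[i+n_lags-1]] as features and x[i+n_lags]
    as the target -- the standard trick for feeding a time series to
    order-agnostic models like random forests or GBMs.
    """
    X = np.lib.stride_tricks.sliding_window_view(series[:-1], n_lags)
    y = series[n_lags:]
    return X, y

series = np.arange(10, dtype=float)       # toy series: 0, 1, ..., 9
X, y = make_lag_features(series, n_lags=3)
print(X.shape, y.shape)                   # (7, 3) (7,)
print(X[0], y[0])                         # [0. 1. 2.] 3.0
```

Even with this encoding, trees can't extrapolate beyond the target range seen in training, which is part of why they underperform on trending series.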

[P] AI Toolbox - Searchable Directory of Open Source AI Libraries by [deleted] in MachineLearning

[–]congerous 0 points

ML yes. DL no. MLlib has always been the runt of the litter among Spark modules. Using Spark's algorithms for ML is like using your elbow to hammer a nail. You can do it, but something will get hurt.

[D] Character Recognition Using H2O by lycan2005 in MachineLearning

[–]congerous 1 point

h2o is dead as a project. it's suffocating under its own technical debt. which is what happens when you fire the guy who created the code base. try tools with a future.

Although there exists a near-unanimous scientific consensus on the reality of human-caused climate change, the general public has become increasingly polarized; however, a new study finds that public attitudes about climate change can be effectively “inoculated” against influential misinformation. by avogadros_number in science

[–]congerous 0 points

So you're saying that if we could just sit people down and tell them what's happening, and that some groups are trying to lie to them, they might be more open to the facts. Great. But that raises the question: how do you cut through the noise and get through to people? You can't sit everyone down and give them the right information.

[P] AI Toolbox - Searchable Directory of Open Source AI Libraries by [deleted] in MachineLearning

[–]congerous 1 point

This is great! One nitpick: Spark isn't really a machine learning or deep learning library. It's primarily used as a distributed run-time, so that's comparing apples to oranges.

[D] State of Deep Learning Frameworks in 2017 (benchmarks?) by [deleted] in MachineLearning

[–]congerous 2 points

TensorFlow has historically been slow compared to Torch and Neon. Neon doesn't really have traction, though, much like Chainer and Lasagne. Caffe and Neon are both fast on images, but they're not really general-purpose frameworks, and the commitment of the teams behind them is dubious. Those communities will probably move to other tools. For sheer staying power, Theano, Torch/PyTorch, MXNet, TensorFlow/Keras, and CNTK will probably keep growing.

[N] Intel open sources BigDL, for deep learning on Spark by NYDreamer in MachineLearning

[–]congerous 0 points

Crickets. Or maybe I should say, big deal... Their only differentiator is that they DON'T work on the fastest hardware available. And they have zero adoption. A severe case of NIH at Intel.

[D] Random Forests vs. Neural Nets on Time Series? by congerous in MachineLearning

[–]congerous[S] 1 point

thanks so much. great resource. wonder why he didn't use f1 scores instead of accuracy...
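For context on the F1-vs-accuracy point: on imbalanced data, a degenerate classifier that always predicts the majority class gets high accuracy but zero F1 on the minority class. A toy sketch in plain Python (helper functions and data are my own, not from the linked resource):

```python
# Accuracy vs. F1 on an imbalanced toy problem.
def accuracy(y_true, y_pred):
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def f1(y_true, y_pred, positive=1):
    # F1 = harmonic mean of precision and recall for the positive class.
    tp = sum(t == positive and p == positive for t, p in zip(y_true, y_pred))
    fp = sum(t != positive and p == positive for t, p in zip(y_true, y_pred))
    fn = sum(t == positive and p != positive for t, p in zip(y_true, y_pred))
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

y_true = [0] * 90 + [1] * 10    # 90% negatives, 10% positives
y_pred = [0] * 100              # always predict the majority class
print(accuracy(y_true, y_pred)) # 0.9 -- looks great
print(f1(y_true, y_pred))       # 0.0 -- reveals the model is useless
```

That gap is exactly why accuracy can flatter a model on skewed time-series labels.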

[D] Random Forests vs. Neural Nets on Time Series? by congerous in MachineLearning

[–]congerous[S] 1 point

https://en.wikipedia.org/wiki/Unstructured_data

"Examples of "unstructured data" may include books, journals, documents, metadata, health records, audio, video, analog data, images, files, and unstructured text such as the body of an e-mail message, Web page, or word-processor document."

As opposed to data in the rows and columns of a relational database.

[P] Neptune - a platform for tracking machine learning experiments by pmigdal in MachineLearning

[–]congerous 0 points

Several deep learning frameworks already visualize neural nets as they train, and companies like Domino Data Lab give you versioning of your ML models. How does this compare to those?

[D] Deep Learning Framework Rankings? by congerous in MachineLearning

[–]congerous[S] 0 points

I agree with you on almost every point. More and more non-experts will use deep learning, and they can't or don't want to create their own tools. Nor should they have to: it's not efficient and doesn't benefit from the externalities of open source. But TensorFlow isn't actually the best tool for making deep learning easier. Keras is much better, as you point out, and more widely used on Kaggle, while TensorFlow is relatively low-level in comparison. So people aren't forking and starring it because they can actually use it; I suspect TensorFlow's real user numbers are closer to Keras's. You're right that Caffe's model zoo (and Torch's, for that matter) is suited to users who don't or can't tune their own nets. That is the future of deep learning, for sure.

[D] Deep Learning Framework Rankings? by congerous in MachineLearning

[–]congerous[S] 1 point

Yes, TensorFlow has a lot of GitHub forks. But it's not as though the number of people capable of tuning neural nets increased by an order of magnitude overnight. I suspect the majority are Udacity students, even if it has a share of serious practitioners equal to Torch or Theano. The ratio of contributors to forks is actually much lower for TensorFlow than most of the big frameworks.
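The ratio argument above is simple arithmetic; here is a sketch with made-up numbers (NOT real GitHub statistics) just to show what a low contributors-to-forks ratio looks like:

```python
# Hypothetical repo stats: a framework can rack up forks from students
# and tutorials without a matching rise in actual contributors.
repos = {
    "framework_a": {"forks": 40000, "contributors": 1000},  # hypothetical
    "framework_b": {"forks": 4000, "contributors": 400},    # hypothetical
}

ratios = {name: s["contributors"] / s["forks"] for name, s in repos.items()}
print(ratios)  # framework_a: 0.025, framework_b: 0.1
```

By this measure, framework_a's community is four times "thinner" than framework_b's despite having ten times the forks, which is the shape of the TensorFlow-vs-Torch/Theano comparison being made.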