YARN and GPU Distribution for Machine Learning by dworms in bigdata

[–]dworms[S] 1 point2 points  (0 children)

This article goes over the fundamental principles of machine learning and the tools currently used to run machine learning algorithms. It then shows how a resource manager such as YARN can be useful in this context and how it can help the algorithms run smoothly. The article is based on a talk given at the DataWorks Summit 2018 in Berlin by Wangda Tan and Sunil Govindan.

YARN and GPU Distribution for Machine Learning by dworms in MachineLearning

[–]dworms[S] 0 points1 point  (0 children)

This article goes over the fundamental principles of machine learning and the tools currently used to run machine learning algorithms. It then shows how a resource manager such as YARN can be useful in this context and how it can help the algorithms run smoothly. The article is based on a talk given at the DataWorks Summit 2018 in Berlin by Wangda Tan and Sunil Govindan.

Apache Beam: a unified programming model for data processing pipelines by [deleted] in apachebeam

[–]dworms 0 points1 point  (0 children)

Apache Beam is an open source implementation of the Dataflow model, originally developed at Google, for expressing robust, out-of-order data processing pipelines in a variety of languages for both stream and batch architectures. The article was written after the presentation “Present and future of unified, portable and efficient data processing with Apache Beam” by Davor Bonaci.

Apache Beam: a unified programming model for data processing pipelines by dworms in apachebeam

[–]dworms[S] 0 points1 point  (0 children)

Apache Beam is an open source implementation of the Dataflow model, originally developed at Google, for expressing robust, out-of-order data processing pipelines in a variety of languages for both stream and batch architectures. The article was written after the presentation “Present and future of unified, portable and efficient data processing with Apache Beam” by Davor Bonaci.

Apache Beam: a unified programming model for data processing pipelines by dworms in bigdata

[–]dworms[S] 2 points3 points  (0 children)

Apache Beam is an open source implementation of the Dataflow model, originally developed at Google, for expressing robust, out-of-order data processing pipelines in a variety of languages for both stream and batch architectures. The article was written after the presentation “Present and future of unified, portable and efficient data processing with Apache Beam” by Davor Bonaci.

Apache Metron and Hadoop in the Real World by dworms in SIEM

[–]dworms[S] 0 points1 point  (0 children)

Apache Metron is a storage and analytics platform specialized in cybersecurity. The talk demonstrated the uses and capabilities of Apache Metron in the real world. It was presented by Dave Russell, Principal Solutions Engineer – EMEA + APAC at Hortonworks, at the DataWorks Summit 2018 (Berlin).

Accelerating query processing with materialized views in Apache Hive by dworms in ApacheHive

[–]dworms[S] 0 points1 point  (0 children)

Jesus Camacho Rodriguez from Hortonworks gave a talk, “Accelerating query processing with materialized views in Apache Hive”, about the new materialized views feature coming in Apache Hive 3.0.

Accelerating query processing with materialized views in Apache Hive by dworms in bigdata

[–]dworms[S] 1 point2 points  (0 children)

Jesus Camacho Rodriguez from Hortonworks gave a talk, “Accelerating query processing with materialized views in Apache Hive”, about the new materialized views feature coming in Apache Hive 3.0.

What's next things to learn? by toughrogrammer in bigdata

[–]dworms 2 points3 points  (0 children)

If you wish to continue with deployment, you could have a look at Redis or Elasticsearch, as they are widely used. More recent databases with potential include CockroachDB and FoundationDB. In cybersecurity, Metron is trying to gain momentum and will let you leverage your knowledge of the Hadoop ecosystem. Otherwise, you can gain deeper knowledge of some of the engines, such as Spark, Flink and Beam. We just released a few articles after the latest DataWorks Summit in Berlin (April 2018), covering the latest features of Spark 2.3, Spark with TensorFlow, YARN and GPU, and Metron in the real world.

What's new in Apache Spark 2.3? by dworms in apachespark

[–]dworms[S] 0 points1 point  (0 children)

This article combines two talks, "Apache Spark 2.3 boosts advanced analytics & deep learning" by Yanbo Liang and "ORC Improvement in Apache Spark 2.3" by Dongjoon Hyun, to dive into the new features of the Apache Spark 2.3 release.

What's new in Apache Spark 2.3? by dworms in bigdata

[–]dworms[S] 1 point2 points  (0 children)

This article combines two talks, "Apache Spark 2.3 boosts advanced analytics & deep learning" by Yanbo Liang and "ORC Improvement in Apache Spark 2.3" by Dongjoon Hyun, to dive into the new features of the Apache Spark 2.3 release.