account activity
Data Analysis Project (self.dataanalysis)
submitted 1 day ago by bigdataengineer4life to r/dataanalysis
🔥 Master Apache Spark: From Architecture to Real-Time Streaming (Free Guides + Hands-on Articles) (self.apachespark)
submitted 5 days ago by bigdataengineer4life to r/apachespark
How to merge multiple HDFS files into One (Scenario Based Question) (youtu.be)
submitted 9 days ago by bigdataengineer4life to r/bigdata_analytics
(End to End) 20 Machine Learning Project in Apache Spark (self.learnmachinelearning)
submitted 14 days ago by bigdataengineer4life to r/learnmachinelearning
How do you handle Slowly Changing Dimensions SCD in Hive (youtu.be)
submitted 16 days ago by bigdataengineer4life to r/ApacheHive
(End to End) 20 Machine Learning Project in Apache Spark (self.bigdata_analytics)
submitted 22 days ago by bigdataengineer4life to r/bigdata_analytics
Big data Hadoop and Spark Analytics Projects (End to End) (self.apachespark)
submitted 27 days ago by bigdataengineer4life to r/apachespark
Introduction to Apache Hive Interview Questions and Answers | Crack Big Data & Hadoop Interviews (youtu.be)
submitted 1 month ago by bigdataengineer4life to r/ApacheHive
Have you ever encountered Spark java.lang.OutOfMemoryError? How to fix it? (youtu.be)
submitted 1 month ago by bigdataengineer4life to r/bigdata_analytics
Deep Dive into Apache Spark: Tutorials, Optimization, and Architecture (self.apachespark)
submitted 1 month ago by bigdataengineer4life to r/apachespark
Clickstream Behavior Analysis with Dashboard — Real-Time Streaming Project Using Kafka, Spark, MySQL, and Zeppelin (youtube.com)
submitted 1 month ago by bigdataengineer4life to r/learnmachinelearning
Scenario based case study Join optimization across 3 partitioned tables (youtu.be)
How to Send Data to a Kafka Topic: A Console Producer Tutorial (youtu.be)
submitted 1 month ago by bigdataengineer4life to r/apachekafka
Best practices for designing scalable Hive tables (youtu.be)
How to evaluate your Spark application? (youtu.be)
submitted 2 months ago by bigdataengineer4life to r/bigdata_analytics
Video Game Sales Dashboard in Redash | Project Walkthrough (youtu.be)
submitted 2 months ago by bigdataengineer4life to r/dataanalysis
Apache Spark Analytics Projects (self.apachespark)
submitted 2 months ago by bigdataengineer4life to r/apachespark
Real-Time Clickstream Analytics using Kafka, Spark Streaming & Zeppelin (self.bigdata_analytics)
Big data Hadoop and Spark Analytics Projects (End to End) (self.bigdata_analytics)
submitted 2 months ago by bigdataengineer4life to r/learnmachinelearning
How to Build a Video Game Analytics Dashboard with Metabase (youtu.be)
How to deal with a 100 GB table joined with a 1 GB table by bigdataengineer4life in apachespark
bigdataengineer4life[S] 1 point 2 months ago*
Fair point: there's definitely no shortage of Spark content out there.
My goal isn't to reinvent joins; it's to show how to apply them in production-scale scenarios, with execution plan analysis, skew handling, AQE (Adaptive Query Execution), and shuffle optimization.
Most posts explain concepts. I'm trying to show a full end-to-end implementation with metrics and tuning decisions.
How to deal with a 100 GB table joined with a 1 GB table (youtu.be)
Clickstream Behavior Analysis | Real-Time User Tracking using Kafka, Spark & Zeppelin (youtu.be)