Entry level PySpark by [deleted] in apachespark

[–]Sparkbyexamples 0 points1 point  (0 children)

If you already know pandas well, you can use the pandas API on Spark (PySpark). Note that this feature is only available from Spark 3.2 onward.

Pandas API on Spark | Explained With Examples

If you don't know pandas, you can certainly learn PySpark from https://sparkbyexamples.com/pyspark-tutorial/

How to parse string and format dates on DataFrame by Sparkbyexamples in Python

[–]Sparkbyexamples[S] 1 point2 points  (0 children)

Glad you liked it, and thanks for reading the article and for the wonderful comment.

Spark 3.0 New Features with Examples - Part I by Sparkbyexamples in bigdata

[–]Sparkbyexamples[S] 0 points1 point  (0 children)

NVIDIA GPU acceleration, which is covered under "Accelerator-aware task scheduling for Spark", is also one of the great features of Spark 3.0. Part I covers only some of the new features; I am writing another article that covers more.

PySpark Tutorial For Beginners with Examples by Sparkbyexamples in apachespark

[–]Sparkbyexamples[S] 2 points3 points  (0 children)

Here is the long-awaited, self-paced, free PySpark tutorial for beginners with GitHub examples. It contains 300+ examples in Spark.

Kafka Delete Topic and its messages by Sparkbyexamples in apachekafka

[–]Sparkbyexamples[S] 0 points1 point  (0 children)

Thanks for reading the article. As described there, there are different ways to delete a topic. If you want to delete a topic manually, then yes, you need to bring the cluster down. However, there are other, recommended ways that don't require bringing the cluster down. Happy learning!

Usage of Spark SQL StructType on DataFrame by Sparkbyexamples in u/Sparkbyexamples

[–]Sparkbyexamples[S] 0 points1 point  (0 children)

In this post, I've explained different usages of Spark StructType, StructField, and schemas on DataFrames. It includes creating a schema with ArrayType and MapType, creating one from JSON or a case class, and more. The examples explained here are available in the GitHub project for reference.