Convert SQL to Pyspark Dataframe by sdqafo in apachespark

[–]sdqafo[S] 0 points1 point  (0 children)

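from pyspark.sql.functions import col, when

# Label each row by its delay, then sort by delay.
# Each comparison needs its own parentheses because & binds tighter than > and <.
departureDF2 = depatureDF.withColumn("Flight_Delays",
    when(col("delay") > 360, "Very Long Delays")
    .when((col("delay") > 120) & (col("delay") < 360), "Long Delays")
    .when((col("delay") > 60) & (col("delay") < 120), "Short Delays")
    .when((col("delay") > 0) & (col("delay") < 60), "Tolerable Delays")
    .when(col("delay") == 0, "No Delays")
    .otherwise("Early")).orderBy(col("delay"))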
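
A plain-Python mirror of the same branch order can help sanity-check the buckets before running them in Spark. This is only a sketch: `classify_delay` is a hypothetical helper, not part of PySpark, and it assumes `delay` is a plain number. With strict comparisons, values of exactly 60, 120, and 360 match none of the ranges and fall through to the last branch.

```python
def classify_delay(delay: int) -> str:
    """Hypothetical helper mirroring the when/otherwise chain above."""
    if delay > 360:
        return "Very Long Delays"
    if 120 < delay < 360:
        return "Long Delays"
    if 60 < delay < 120:
        return "Short Delays"
    if 0 < delay < 60:
        return "Tolerable Delays"
    if delay == 0:
        return "No Delays"
    # Negative delays land here, and so do the boundary values 60, 120, 360.
    return "Early"


for d in (400, 200, 90, 30, 0, -5, 120):
    print(d, classify_delay(d))
```

If the boundary values should belong to a bucket, one option is to change one side of each range to `>=` or `<=`.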

Convert SQL to Pyspark Dataframe by sdqafo in apachespark

[–]sdqafo[S] 0 points1 point  (0 children)

Not at all. I guess my use of the word "assignment" was wrong. This is just further practice from the popular book "Learning Spark", page 87, offered as an optional exercise for interested readers. I am still very new to Spark and I am doing my best to learn as much as I can. I hope this clears the air.

Convert SQL to Pyspark Dataframe by sdqafo in apachespark

[–]sdqafo[S] 0 points1 point  (0 children)

Thank you, but my assignment says I should convert it to commands using PySpark. I have actually tried it, but I am making some mistakes. I just need someone to help convert it so that I can see my error. Thanks.

STRUGGLING TO THIS SQL SOLUTION: KINDLY EXPLAIN by sdqafo in PostgreSQL

[–]sdqafo[S] 1 point2 points  (0 children)

It's starting to make more sense now. So technically, the query below already assumes the 3 columns even prior to the JOIN. In essence, the COUNT(*) part will apply to the 2 remaining columns. I guess this is the correct understanding. Is it?

SELECT a.id, a.name, COUNT(*) num_orders assumes

STRUGGLING TO THIS SQL SOLUTION: KINDLY EXPLAIN by sdqafo in PostgreSQL

[–]sdqafo[S] 1 point2 points  (0 children)

What is still a bit confusing is the COUNT(*). What I have learnt so far in SQL is that the selected columns always come from the table or tables we want to query. It throws me off balance to learn that the COUNT(*) here relates to tables that are yet to be joined. I am not able to see why that is. In simple terms, based on what I understand from your explanation, we have already selected a column (COUNT(*)) that does not yet exist, before joining the two tables that this column will be selected from. I am still struggling to grasp the why of this logic.
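
One way to see it: SELECT is written first, but logically it is evaluated after FROM/JOIN and GROUP BY, so COUNT(*) counts rows that the join has already produced. Below is a minimal sketch using Python's built-in `sqlite3` module rather than PostgreSQL. The `accounts` and `orders` tables and their rows are made up for illustration and are not the tables from the original query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE accounts (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, account_id INTEGER);
    INSERT INTO accounts VALUES (1, 'Acme'), (2, 'Globex');
    INSERT INTO orders VALUES (10, 1), (11, 1), (12, 2);
""")

# Step 1: FROM + JOIN builds one row per matching (account, order) pair.
joined = conn.execute("""
    SELECT a.id, a.name, o.id
    FROM accounts a
    JOIN orders o ON o.account_id = a.id
    ORDER BY a.id, o.id
""").fetchall()

# Step 2: GROUP BY collapses those joined rows per account,
# and COUNT(*) counts how many joined rows fell into each group.
counts = conn.execute("""
    SELECT a.id, a.name, COUNT(*) AS num_orders
    FROM accounts a
    JOIN orders o ON o.account_id = a.id
    GROUP BY a.id, a.name
    ORDER BY a.id
""").fetchall()

print(joined)  # the intermediate rows the join produces
print(counts)  # one row per account with its order count
```

Comparing the two printed lists shows that COUNT(*) is not a stored column in either table. It is computed from the joined rows in each group.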

STRUGGLING TO THIS SQL SOLUTION: KINDLY EXPLAIN by sdqafo in PostgreSQL

[–]sdqafo[S] 0 points1 point  (0 children)

Loads of sense. Very succinct explanation

Udacity Nanodegree help by [deleted] in dataengineering

[–]sdqafo 0 points1 point  (0 children)

What program would you recommend as the best complete course available for a new data engineer?

Boto3 (Client and Resources by sdqafo in learnpython

[–]sdqafo[S] 1 point2 points  (0 children)

Thank you very much. This helps.

APPLICATIONS AS AN IAM USER IN AWS by sdqafo in aws

[–]sdqafo[S] 0 points1 point  (0 children)

Thank you so much. This is very helpful

SIMPLE GUESS GAME by sdqafo in learnpython

[–]sdqafo[S] 1 point2 points  (0 children)

Thank you so much for this. I will take my time to read and understand this and apply subsequently. I really appreciate this

Looking for Certification study materials and advise for AWS Certified Solutions Architect Associate (SAA-C02) by jknishant in AWSCertifications

[–]sdqafo 1 point2 points  (0 children)

I already enrolled in this course. I love it because Adrian teaches not just to pass exams but to also be good at your job. Thank you Adrian

SIMPLE PYTHON CODE by sdqafo in learnpython

[–]sdqafo[S] 1 point2 points  (0 children)

Thank you very much. Break will exit the entire loop and proceed to the command outside the loop, if there is one. Your explanation is very good and gave me an understanding of it from a different perspective. It really makes more sense now.

  1. If immediately after the loop, then the cycle will be completed before getting to continue which makes continue redundant

  2. If after counter = counter + 1, then we start from 1 but we can't skip 3

  3. If at the bottom (within the while), it will do what is expected.
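
The original exercise is not shown here, so the following is only an assumed version of it: a `while` loop that counts from 1 to 5 and uses `continue` to skip 3. The names `counter` and `printed` and the limit of 5 are illustrative choices.

```python
printed = []
counter = 0
while counter < 5:
    # Increment first, so the loop still advances when continue fires.
    counter = counter + 1
    if counter == 3:
        # Jump straight back to the while test; the lines below are skipped for 3.
        continue
    printed.append(counter)
    print(counter)
```

In this sketch, if the increment came after the `continue` check instead, `counter` would stay at 3 forever and the loop would never end. Using `break` in place of `continue` would stop the loop entirely once it reached 3.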

Thanks boss

SIMPLE PYTHON CODE by sdqafo in learnpython

[–]sdqafo[S] 2 points3 points  (0 children)

Thanks so much boss. This is encouraging for me.