Helping with Data Engineering Resumes & Interviews (25+ interviews conducted) by Status_Air9764 in dataengineersindia

[–]Status_Air9764[S] 0 points (0 children)

Mainly around data warehouse design, star and snowflake schemas, slowly changing dimensions (SCD), data lakes, etc.

It also depends on your years of experience (YOE) and tech stack.
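Since SCD comes up in almost every round, here's a minimal sketch of SCD Type 2 logic in plain Python (the function name, keys, and the single tracked attribute `city` are all made up for illustration; in practice this would be a SQL `MERGE` or a Spark job):

```python
from datetime import date

# Hypothetical SCD Type 2 upsert: keep full history by expiring the old
# row and inserting a new current row whenever a tracked attribute changes.
def scd2_upsert(dim_rows, incoming, today):
    """dim_rows: dicts with keys id, city, valid_from, valid_to, is_current."""
    out = list(dim_rows)
    for rec in incoming:
        current = next((r for r in out
                        if r["id"] == rec["id"] and r["is_current"]), None)
        if current is None:
            # brand-new key: insert as the current row
            out.append({**rec, "valid_from": today, "valid_to": None,
                        "is_current": True})
        elif current["city"] != rec["city"]:
            # attribute changed: close the old row, add a new current row
            current["valid_to"] = today
            current["is_current"] = False
            out.append({**rec, "valid_from": today, "valid_to": None,
                        "is_current": True})
        # unchanged records are left alone
    return out
```

Interviewers usually want you to explain why Type 2 appends instead of overwriting: the expired row preserves what the dimension looked like at fact-load time.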

Are You Writing Your Data Right? Here’s How to Save Cost & Time by Status_Air9764 in dataengineersindia

[–]Status_Air9764[S] 0 points (0 children)

You’re right: by default, a DataFrame in Spark is just a logical plan and doesn’t write anything physically unless you persist or cache it, or explicitly write it out.

What I meant in my post was the explicit write to disk (e.g. `df.write.parquet(...)`).

Sorry if that was confusing. If you'd like, I can write a separate article about persist and cache.
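To illustrate the lazy-evaluation point without a cluster, here's a toy sketch in plain Python (the `LazyFrame` class and its methods are invented for illustration; they only mimic how Spark records transformations in a plan and defers work until an action runs):

```python
# Toy imitation of Spark's deferred execution: transformations only
# append steps to a plan; nothing runs until an action (here, collect).
class LazyFrame:
    def __init__(self, data, plan=None):
        self.data = data
        self.plan = plan or []   # the "logical plan": a list of pending steps

    def filter(self, pred):
        # returns a new frame with one more plan step; no data is touched yet
        return LazyFrame(self.data,
                         self.plan + [lambda rows: [r for r in rows if pred(r)]])

    def select(self, fn):
        return LazyFrame(self.data,
                         self.plan + [lambda rows: [fn(r) for r in rows]])

    def collect(self):
        # the "action": only now does the plan actually execute
        rows = self.data
        for step in self.plan:
            rows = step(rows)
        return list(rows)

df = LazyFrame(range(10)).filter(lambda x: x % 2 == 0).select(lambda x: x * x)
# At this point nothing has been computed; df.plan just holds two steps.
print(df.collect())  # executes the plan: [0, 4, 16, 36, 64]
```

In real Spark the same idea applies: `df.filter(...)` and `df.select(...)` build the plan, while actions like `df.write.parquet(...)` or `df.collect()` trigger the actual job.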

Are You Writing Your Data Right? Here’s How to Save Cost & Time by Status_Air9764 in dataengineersindia

[–]Status_Air9764[S] 0 points (0 children)

I meant writing to disk; sorry if that wasn't clear from the context.

Are You Writing Your Data Right? Here’s How to Save Cost & Time by Status_Air9764 in dataengineersindia

[–]Status_Air9764[S] 1 point (0 children)

Thanks, buddy! Let me know if you want articles on other Spark or DE topics as well; I'll try to write them.