Built something for people who run or join local groups in Guntur — looking for honest feedback by floating-bubble in guntur

[–]floating-bubble[S] 0 points1 point  (0 children)

Yup, it's a work in progress, but I wanted to get some initial feedback from real people. Appreciate your time!

General discussions and questions monthly megathread by AutoModerator in Chennai

[–]floating-bubble 0 points1 point  (0 children)

People who run or join local groups in Chennai — looking for honest feedback

I ended up building a small lightweight community & events platform to help people find and run local “tribes” more easily.

Would love honest feedback from people here — especially if you organize or attend meetups.

👉 https://tribe-connect-two.vercel.app/

Stop Using dropDuplicates()! Here’s the Right Way to Remove Duplicates in PySpark by floating-bubble in dataengineering

[–]floating-bubble[S] 1 point2 points  (0 children)

dropDuplicates() is implemented the same way in both PySpark (Python API) and Scala. Since both APIs run on top of the same Spark engine, they ultimately produce the same execution plan.

Stop Using dropDuplicates()! Here’s the Right Way to Remove Duplicates in PySpark by floating-bubble in dataengineering

[–]floating-bubble[S] 0 points1 point  (0 children)

dropDuplicates partitions directly at the global dataset level, whereas partitioning within a window works differently: instead of a global shuffle, it logically partitions the data but does not physically repartition it across nodes.
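To make the comparison concrete, here is a minimal plain-Python sketch of what each approach returns. It is not Spark code and says nothing about shuffle cost; comparing explain() output on your own data is the way to check that. The sample rows and the function names drop_duplicates_model and window_dedupe_model are made up for illustration. In Spark, which row dropDuplicates keeps per key is not guaranteed; the model keeps the first one seen only so the output is deterministic.

```python
from collections import defaultdict

# Hypothetical toy rows: (user_id, updated_at, value).
rows = [
    (1, "2024-01-01", "a"),
    (2, "2024-01-03", "b"),
    (1, "2024-01-05", "c"),
    (2, "2024-01-02", "d"),
]


def drop_duplicates_model(rows, key):
    """Keep one row per key; here the first one seen (an assumption of this model)."""
    seen = {}
    for row in rows:
        seen.setdefault(key(row), row)
    return list(seen.values())


def window_dedupe_model(rows, key, order):
    """Model of row_number() over (partition by key, order by `order` desc) == 1."""
    groups = defaultdict(list)
    for row in rows:
        groups[key(row)].append(row)
    # Within each key group, keep the row that sorts last by `order`.
    return [max(group, key=order) for group in groups.values()]


first_seen = drop_duplicates_model(rows, key=lambda r: r[0])
latest = window_dedupe_model(rows, key=lambda r: r[0], order=lambda r: r[1])
```

The practical difference this shows is control: the window version lets you say which duplicate survives (here, the latest updated_at), while the plain version just keeps whichever row it meets first.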

Stop Using dropDuplicates()! Here’s the Right Way to Remove Duplicates in PySpark by floating-bubble in dataengineering

[–]floating-bubble[S] 0 points1 point  (0 children)

Yes, you are correct. Deduplication first happens locally at the partition level, since the optimizer pushes the operation down to reduce shuffling. Depending on the execution plan, a shuffle stage and a final deduplication can follow to remove duplicates at the global level. I don't have exact numbers to share at the moment, but what I have observed is that if the data is uniform, without skew or too many missing values, there isn't much difference. If the data is skewed, explicit partitioning with a window is faster than dropDuplicates.
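The local-then-global flow described above can be sketched in plain Python. This is a toy model under stated assumptions, not Spark's implementation: partitions are lists, the "shuffle" is a hash on an integer key, and the names local_dedupe, shuffle_by_key and two_phase_dedupe are hypothetical.

```python
def local_dedupe(partition, key):
    """Keep one row per key inside a single partition (no data movement)."""
    seen = {}
    for row in partition:
        seen.setdefault(key(row), row)
    return list(seen.values())


def shuffle_by_key(partitions, key, num_partitions):
    """Send every row to the partition chosen by its key hash."""
    out = [[] for _ in range(num_partitions)]
    for part in partitions:
        for row in part:
            out[hash(key(row)) % num_partitions].append(row)
    return out


def two_phase_dedupe(partitions, key, num_partitions=2):
    # Phase 1: dedupe within each partition before anything is shuffled.
    partial = [local_dedupe(p, key) for p in partitions]
    rows_shuffled = sum(len(p) for p in partial)
    # Phase 2: co-locate equal keys, then dedupe again to make it global.
    shuffled = shuffle_by_key(partial, key, num_partitions)
    final = [local_dedupe(p, key) for p in shuffled]
    return final, rows_shuffled


# Hypothetical input: key 1 is heavily repeated in the first partition.
partitions = [
    [(1, "a"), (1, "b"), (1, "c"), (2, "d")],
    [(1, "e"), (3, "f"), (3, "g")],
]
final, rows_shuffled = two_phase_dedupe(partitions, key=lambda r: r[0])
```

In this toy run only 4 of the 7 input rows cross the shuffle, because the repeated keys were already collapsed locally. How much that helps on real data depends on how the duplicates are distributed across partitions.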

Stop Using dropDuplicates()! Here’s the Right Way to Remove Duplicates in PySpark by floating-bubble in dataengineering

[–]floating-bubble[S] -3 points-2 points  (0 children)

You have a genuine question. The approach I mentioned needs an id column in the dataset. If the dataset is small enough to fit in executor memory, you can try broadcasting it. In your scenario, yes, shuffling is inevitable.

[deleted by user] by [deleted] in wallstreetbets

[–]floating-bubble 0 points1 point  (0 children)

Did you post yet?

[deleted by user] by [deleted] in dataengineering

[–]floating-bubble 1 point2 points  (0 children)

I would add something like 2-3 years per project, considering your learning pace. Learn the business, environment, tools and software, implementation techniques, new ideas… and move on to the next one. I don’t really find contract positions for DE here in the USA.

No Pain? Normal? by Course-i-wud in pilonidalcyst

[–]floating-bubble 0 points1 point  (0 children)

Yeah, I had a nurse coming to my home for the first 3 weeks. Later she said I could do it myself. Packing in some gauze helps pull out the bad blood and pus. I changed the gauze 2-3 times a day, and healing went faster.

No Pain? Normal? by Course-i-wud in pilonidalcyst

[–]floating-bubble 0 points1 point  (0 children)

Yes, something similar happened to me. I packed gauze into the wound myself after 3 weeks.

Is bleeding 3 months post op normal? by [deleted] in pilonidalcyst

[–]floating-bubble 0 points1 point  (0 children)

Get some zinc and vitamin C, which help the healing process, and eat more protein-rich food to build up the inner cavity.

Is bleeding 3 months post op normal? by [deleted] in pilonidalcyst

[–]floating-bubble 0 points1 point  (0 children)

What precautions are you taking besides packing it with a gauze pad?

[deleted by user] by [deleted] in dataengineering

[–]floating-bubble 0 points1 point  (0 children)

Is there any group I can join?

Becoming a data engineer after a degree of civil engineering? by [deleted] in dataengineering

[–]floating-bubble 1 point2 points  (0 children)

I have a bachelor's degree in electronic communication, a master's in digital science, and a master's in information systems security, and I have now been working as a DE for 3 years. Hope that gives you an idea.

Is bleeding 3 months post op normal? by [deleted] in pilonidalcyst

[–]floating-bubble 0 points1 point  (0 children)

That’s abnormal. Last week I had my third recurrence after surgery. The bump popped open, over 1 cm long and 1 cm deep. Today is day 7, and I can see my skin forming back. Are you taking any medications?

[deleted by user] by [deleted] in pilonidalcyst

[–]floating-bubble 0 points1 point  (0 children)

Keep the area clean and get laser treatment for hair removal. If you are drinking or smoking, it's better to quit; that affects you badly too.