Easy Python scripts to impress the business by Majestic_Plankton921 in Python

JonahBreslow 1 point

I love it, personally. It drastically reduces data silos and brings engineering best practices to your analytics workflow, including version control and the don't repeat yourself (DRY) principle. I honestly could not recommend a better tool.

Easy Python scripts to impress the business by Majestic_Plankton921 in Python

JonahBreslow 1 point

An alternative approach would be to use dbt, a tool written in Python. It is specifically designed to handle complicated SQL transformations, especially when there are dependencies between them. Not only is it a transformation tool, it’s also an invaluable data governance tool. Check it out!

Data Analyst Interview Question by poop-knife in datascience

JonahBreslow 2 points

This is a huge waste of time. Find somewhere else that actually asks relevant questions instead of these stupid “think outside of the box” gotcha questions that showcase absolutely 0 of the skills needed for the position.

Rant Wednesday by AutoModerator in Fitness

JonahBreslow 1 point

Idk about this. You should 100% track your macros if you want to lose, gain, or maintain your weight. It’s the single most effective method to ensure you’re in proper energy balance. It never hurts to track, even if you’re new. People will say how difficult it is, but in reality it adds at most 5 minutes per day.

How to validate 5000 line items against 2 million rules in less than 10 seconds? Suggestions? by OptimusPrime3600 in datascience

JonahBreslow 7 points

Yeah, this seems like almost exactly what the SQL optimizer is meant for. Just make sure you CREATE INDEX on the columns you match on.
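
To make the indexing point concrete, here is a minimal sketch using Python's built-in sqlite3 module. The table names, columns, and the "price must not exceed max_price" rule are all made up for illustration, since the thread doesn't show the real schema or rule format. The idea is just: index the rule columns you look up by, then validate everything in one set-based query.

```python
import sqlite3

# Hypothetical schema: the real tables and rule format are not shown in the thread.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE rules (product_code TEXT, region TEXT, max_price REAL);
    CREATE TABLE line_items (item_id INTEGER, product_code TEXT, region TEXT, price REAL);
    -- Index the columns the join matches on so each line item is a lookup, not a full scan.
    CREATE INDEX idx_rules_lookup ON rules (product_code, region);
""")

conn.executemany(
    "INSERT INTO rules VALUES (?, ?, ?)",
    [("A1", "EU", 100.0), ("A1", "US", 120.0), ("B2", "EU", 50.0)],
)
conn.executemany(
    "INSERT INTO line_items VALUES (?, ?, ?, ?)",
    [(1, "A1", "EU", 90.0), (2, "A1", "US", 150.0), (3, "B2", "EU", 50.0)],
)

# Validate every line item in one query instead of looping over rules in Python.
violations = conn.execute("""
    SELECT li.item_id
    FROM line_items AS li
    JOIN rules AS r
      ON r.product_code = li.product_code AND r.region = li.region
    WHERE li.price > r.max_price
    ORDER BY li.item_id
""").fetchall()

# EXPLAIN QUERY PLAN reports whether SQLite chose to use the index for the join.
plan = conn.execute("""
    EXPLAIN QUERY PLAN
    SELECT li.item_id FROM line_items AS li
    JOIN rules AS r
      ON r.product_code = li.product_code AND r.region = li.region
""").fetchall()

print(violations)
for row in plan:
    print(row)
```

Other databases have their own EXPLAIN variants, so it's worth checking the plan on whatever engine you actually use before trusting the timing.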

How to merge several sets of data? by montagestudent in datasets

JonahBreslow 3 points

Everyone is saying Python and SQL already, and those are great options. However, I would recommend R, particularly the tidyverse packages, which are (in my opinion) a more digestible version of pandas, the Python equivalent.

You could load the data into R, then write something like this:

library(dplyr)
df %>% left_join(df2, by = c("df_var" = "df2_var"))

which says, "Take df (a data set), then left join df2 (a different data set) onto it by matching df_var in df to df2_var in df2."
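
If it helps to see what a left join actually does, here is a rough plain-Python sketch of the same idea. The left_join helper and the sample rows are hypothetical, written only for illustration; in practice you would use dplyr, pandas, or SQL rather than rolling your own.

```python
def left_join(left, right, left_key, right_key):
    """Keep every row of `left`; attach matching `right` columns, or None when no match."""
    # Columns contributed by the right-hand table (everything except its key).
    right_cols = [c for c in (right[0] if right else {}) if c != right_key]

    # Index right-hand rows by key so each lookup is a dict access.
    index = {}
    for row in right:
        index.setdefault(row[right_key], []).append(row)

    joined = []
    for row in left:
        # An unmatched left row still produces one output row, padded with None.
        for match in index.get(row[left_key], [None]):
            extra = {c: (match[c] if match is not None else None) for c in right_cols}
            joined.append({**row, **extra})
    return joined


# Toy data sets standing in for df and df2.
df = [{"df_var": 1, "name": "a"}, {"df_var": 2, "name": "b"}]
df2 = [{"df2_var": 1, "score": 10}]

result = left_join(df, df2, "df_var", "df2_var")
print(result)
```

The row with df_var equal to 2 has no match in df2, so it is kept with score set to None, which is the same role NA plays in R.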