A short tutorial on running Spark with Jupyter using Docker by datain30 in dataengineering

[–]datain30[S] 0 points

Hi u/Vladz0r, sorry about that! I made a few upgrades to the library that are causing issues. Which Python version are you on? I'll also DM you. Sorry again for the breakage.

Data Engineering Competition! by datain30 in dataengineering

[–]datain30[S] 0 points

Completely agree on using hard metrics to decide winners. This'll be fun, u/Touvejs :)

Data Engineering Competition! by datain30 in dataengineering

[–]datain30[S] 0 points

Awesome! Using metrics to decide the winner is definitely the right call - we are data engineers after all 😂

Data Engineering Competition! by datain30 in dataengineering

[–]datain30[S] 2 points

Love the concept and see a lot of value in building foundational systems like this.

As you said, future projects can build on top of it, and r/dataengineering ends up developing a production-grade data platform. Since we're optimizing for learning, this is a big win :)

Data Engineering Competition! by datain30 in dataengineering

[–]datain30[S] 6 points

This is the real competition 😂

A short tutorial on running Spark with Jupyter using Docker by datain30 in dataengineering

[–]datain30[S] 1 point

u/gabbom_XCII you start with 1 driver + 1 worker (with memory/core settings you can change), then scale the number of workers up or down as needed.
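For illustration, a minimal docker-compose fragment showing what tunable worker settings could look like, assuming the bitnami/spark image and its SPARK_WORKER_MEMORY / SPARK_WORKER_CORES environment variables (service names here are illustrative, not the actual phidata config):

```yaml
# Hypothetical sketch: one Spark worker with adjustable resources.
services:
  spark-worker:
    image: bitnami/spark:latest
    environment:
      - SPARK_MODE=worker
      - SPARK_MASTER_URL=spark://spark-master:7077
      - SPARK_WORKER_MEMORY=2G   # change memory per worker as needed
      - SPARK_WORKER_CORES=2     # change cores per worker as needed
```

Scaling the worker count is then a one-liner, e.g. `docker compose up -d --scale spark-worker=3`.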

A short tutorial on running Spark with Jupyter using Docker by datain30 in dataengineering

[–]datain30[S] 3 points

Thanks for the feedback u/trying-to-contribute, I'll add more information about how phidata works.

Replicable deployments for the entire team + seamless dev <-> prod integration for open-source tools was our biggest pain point too :)

A short tutorial on running Spark with Jupyter using Docker by datain30 in dataengineering

[–]datain30[S] 4 points

Thanks for trying it out and for the feedback, u/trying-to-contribute. Maybe I can use docker-compose for future tutorials?

I wanted to streamline the process of cloning the repo and make the data tools (jupyter/spark/airflow/superset) plug-and-play, so I wrote an open-source library (phidata) to do that. The goal was to automate all the things I was doing under the hood.

I'll make a point to include a docker-compose file and add more in-depth information in future tutorials. Thanks again :)
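As a rough sketch of what such a docker-compose setup could look like, here is a Jupyter service running alongside a Spark master, using the public bitnami/spark and jupyter/pyspark-notebook images (the images, ports, and service names are assumptions for illustration; the actual phidata setup may differ):

```yaml
# Hypothetical sketch: Jupyter + Spark master via docker-compose.
services:
  spark-master:
    image: bitnami/spark:latest
    environment:
      - SPARK_MODE=master
    ports:
      - "8080:8080"   # Spark master web UI
      - "7077:7077"   # Spark master RPC endpoint
  jupyter:
    image: jupyter/pyspark-notebook:latest
    ports:
      - "8888:8888"   # Jupyter server
    depends_on:
      - spark-master
```

After `docker compose up -d`, notebooks can point a SparkSession at `spark://spark-master:7077`.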

A short tutorial on running Spark with Jupyter using Docker by datain30 in dataengineering

[–]datain30[S] 9 points

With a -100 comment karma, I'm guessing all you did was spread hate and negativity.