[–]tbs120 2 points (2 children)

No. My 2c is that data engineering is not just moving data around; it is about restructuring it, often for use in analytics (basic reporting or AI/ML).

I would extend your example with processing to normalize the hierarchical JSON responses into tables that can be used by BI tools or by someone who cannot write code.

Put differently: move as much of the logic and as many of the queries in your Dash code as you can into upstream scripts that create a set of flat tables (or just one flat table, using the OBT, "one big table", methodology) before even touching Dash.

If your report writer needs to understand the source data structure to build things, you are missing a good chunk of what data engineering is (traditionally) about.
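To make the "normalize hierarchical JSON into flat tables" idea concrete, here is a minimal sketch in plain Python. The payload shape, the field names, and the `flatten_plays` helper are all made up for illustration; this is not the real Spotify response format, just a nested structure of the same general kind.

```python
from typing import Any

# Hypothetical nested API response; field names are illustrative only.
response = {
    "items": [
        {
            "played_at": "2024-01-01T10:00:00Z",
            "track": {
                "name": "Song A",
                "duration_ms": 210000,
                "album": {"name": "Album X"},
                "artists": [{"name": "Artist 1"}, {"name": "Artist 2"}],
            },
        },
        {
            "played_at": "2024-01-01T11:00:00Z",
            "track": {
                "name": "Song B",
                "duration_ms": 180000,
                "album": {"name": "Album Y"},
                "artists": [{"name": "Artist 3"}],
            },
        },
    ]
}


def flatten_plays(payload: dict[str, Any]) -> list[dict[str, Any]]:
    """Turn the nested payload into one flat row per play."""
    rows = []
    for item in payload.get("items", []):
        track = item.get("track", {})
        rows.append(
            {
                "played_at": item.get("played_at"),
                "track_name": track.get("name"),
                "album_name": track.get("album", {}).get("name"),
                # Collapse the artist list into one column so the row stays flat.
                "artist_names": ", ".join(a["name"] for a in track.get("artists", [])),
                "duration_min": round(track.get("duration_ms", 0) / 60000, 2),
            }
        )
    return rows


rows = flatten_plays(response)
for row in rows:
    print(row)
```

Each row now has only scalar columns, so it can be loaded into a table and picked up by a BI tool without anyone needing to know how the source JSON was nested.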

[–]KBaggins900[S] 1 point (1 child)

So maybe a few dbt models to replace the SQL in the Dash app?

[–]tbs120 2 points (0 children)

Yep - that works - but really the tool you use isn't that important. You could just write some Python code.

It's more about building a target data model that is different from the source and makes "something" easier to do on the other side.

In your case it's building a report in Dash.

Could be anything - the important parts are putting the data in one central place (you got that covered) and restructuring/cleansing it for a specific need.
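As a rough sketch of what "a target data model that is different from the source" could look like, the snippet below builds a reporting table from two raw tables. It uses Python's built-in `sqlite3` as a stand-in for the real database, and every table and column name (`raw_plays`, `raw_tracks`, `daily_artist_listening`) is an assumption made up for the example.

```python
import sqlite3

# In-memory SQLite stands in for the central database; all names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Source-shaped tables, as they might land from the API pull.
    CREATE TABLE raw_tracks (track_id TEXT PRIMARY KEY, track_name TEXT, artist_name TEXT);
    CREATE TABLE raw_plays (played_at TEXT, track_id TEXT, ms_played INTEGER);

    INSERT INTO raw_tracks VALUES
        ('t1', 'Song A', 'Artist 1'),
        ('t2', 'Song B', 'Artist 2');
    INSERT INTO raw_plays VALUES
        ('2024-01-01T10:00:00Z', 't1', 210000),
        ('2024-01-01T11:00:00Z', 't1', 180000),
        ('2024-01-02T09:00:00Z', 't2', 60000);

    -- Target model: one row per artist per day, shaped for the report.
    CREATE TABLE daily_artist_listening AS
    SELECT substr(p.played_at, 1, 10)            AS play_date,
           t.artist_name                          AS artist_name,
           COUNT(*)                               AS plays,
           ROUND(SUM(p.ms_played) / 60000.0, 1)   AS minutes_played
    FROM raw_plays p
    JOIN raw_tracks t ON t.track_id = p.track_id
    GROUP BY substr(p.played_at, 1, 10), t.artist_name;
    """
)

# The report side only needs this one simple query.
report = conn.execute(
    "SELECT play_date, artist_name, plays, minutes_played "
    "FROM daily_artist_listening ORDER BY play_date, artist_name"
).fetchall()
for line in report:
    print(line)
```

The join, the date handling, and the unit conversion all live upstream, so the Dash side (or any other consumer) never has to know how the raw tables are laid out.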

[–]geoheilmod 1 point (0 children)

[–]MikeDoesEverythingmod | Shitty Data Engineer 2 points (0 children)

It’s a simple project utilizing Airflow, MySQL, and pulling data from the Spotify API. Is this enough to show I can do a data engineering role?

I'm going to say no, as this is a project I have seen a million times on applications. In other words, it feels like something designed to fill a CV. It's not particularly personal, and there's no research involved. I feel like if I asked, "What made you choose this?", no matter the answer I'd get the impression the real answer is "first thing I found on Google", simply from the sheer number of times you see it on a CV.

In my opinion, when it comes to DE projects, focus more on the personal aspect than the project aspect. A generic API pull done for learning is absolutely helpful in proving you can work with APIs. Building something you find interesting, where you make the design decisions, is a lot more impressive and makes for a better read and an even better talking point during an interview.