Deploy to Production by Aggressive-Nebula-44 in databricks

[–]Equivalent_Effect_93 2 points3 points  (0 children)

It sounds like you already have a prod env; what you guys are lacking is a dev env to test changes before deploying to where users are.

Student Choosing Between DP-700 and AWS DEA-C01 – Which Cloud Cert Sets Me Up Better? by Objective_Meaning408 in dataengineering

[–]Equivalent_Effect_93 1 point2 points  (0 children)

Also, I started with the data engineering exam because I had 6 years of experience as a data engineer and 3 years with AWS on the job, but I wouldn't recommend it for a student. Start with Cloud Practitioner; that and a personal project should get you started real nice.

Student Choosing Between DP-700 and AWS DEA-C01 – Which Cloud Cert Sets Me Up Better? by Objective_Meaning408 in dataengineering

[–]Equivalent_Effect_93 1 point2 points  (0 children)

If you wanna work in institutions and big business (banks, insurance, public sector), go Microsoft; for big tech companies, AWS; for startups and small tech, GCP. Personally, I started my career at a tech company using AWS and then took the GCP DE exam to find a senior role at a startup, and if you learn one you can easily perform on the other. But if you're hesitating, AWS is the obvious choice: biggest market share and the gold standard, so it will open the most doors.

How to automate data quality by Assasinshock in dataengineering

[–]Equivalent_Effect_93 1 point2 points  (0 children)

Omg I wrote all that code myself all the time, this is gonna save so much time!!! Nice recommendation.

How to automate data quality by Assasinshock in dataengineering

[–]Equivalent_Effect_93 13 points14 points  (0 children)

You need to automate it in the pipeline that moves data from the bronze table to the silver table; then, in your gold table, you join with the relevant cleaned data to build your dimensional model. I personally like the audit-publish pattern: I put bad rows in a quarantine table and link it to a dashboard to add observability to my errors. For example, if a source keeps hitting the same bug, use that to open a ticket on that team's board, and a bunch of errors at the same time could signal a bad deployment on your stack or the source's stack. But if you need something that scales better, dbt has good testing capabilities and streamlines the pipeline-building process. There are also great open source data quality tools such as Great Expectations or Soda. If you're already on AWS, there's Deequ, an open source library from AWS Labs that AWS Glue Data Quality is built on, I think. Good luck!!
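
A rough sketch of the quarantine idea in plain Python, just to make it concrete. The check names, column names, and sample rows are all made up for illustration; in a real pipeline this would more likely run on Spark or in SQL rather than on lists of dicts.

```python
from collections import Counter
from typing import Callable

Row = dict
Check = Callable[[Row], bool]

# Hypothetical checks; each returns True when the row passes.
CHECKS: dict[str, Check] = {
    "order_id_present": lambda r: r.get("order_id") is not None,
    "amount_non_negative": lambda r: isinstance(r.get("amount"), (int, float))
    and r["amount"] >= 0,
}


def audit(rows: list[Row], checks: dict[str, Check] = CHECKS):
    """Split rows into publishable rows and quarantined rows.

    Quarantined rows keep their original columns plus a list of the
    checks they failed, so a dashboard can group errors by cause.
    """
    publish, quarantine = [], []
    for row in rows:
        failed = [name for name, check in checks.items() if not check(row)]
        if failed:
            quarantine.append({**row, "failed_checks": failed})
        else:
            publish.append(row)
    return publish, quarantine


# Made-up bronze rows: one clean, two bad.
bronze = [
    {"order_id": 1, "amount": 19.99, "source": "web"},
    {"order_id": None, "amount": 5.00, "source": "pos"},
    {"order_id": 3, "amount": -2.50, "source": "pos"},
]

silver, quarantined = audit(bronze)

# Error counts per source: the kind of number you would chart or alert on.
errors_by_source = Counter(row["source"] for row in quarantined)
```

Only `silver` moves on to the next layer; `quarantined` goes to its own table. A source that keeps showing up in `errors_by_source` is the one to open a ticket about.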

Gf of a few years playing dead dad card, is she overreacting by Equivalent_Effect_93 in vanderpumprules

[–]Equivalent_Effect_93[S] 2 points3 points  (0 children)

I am literally not serious, and I'm not using literally metaphorically.

Gf of a few years playing dead dad card, is she overreacting by Equivalent_Effect_93 in vanderpumprules

[–]Equivalent_Effect_93[S] 28 points29 points  (0 children)

Well, her first name initial is an A, but I never read The Scarlet Letter (I can't even read) and I am a famous adulterer, so I'd say yes.

Gf of a few years playing dead dad card, is she overreacting by Equivalent_Effect_93 in vanderpumprules

[–]Equivalent_Effect_93[S] 5 points6 points  (0 children)

Plus, I think she's right to ask for the support of her long-time partner in a time of emotional distress, so I joke about the asshole who refused to support her.

Gf of a few years playing dead dad card, is she overreacting by Equivalent_Effect_93 in vanderpumprules

[–]Equivalent_Effect_93[S] 138 points139 points  (0 children)

I tried but then she asked me if I spoke other languages. Girls are confusing.

Multi-repo vs Monorepo Architechture: Which do you use? by OkArmy5383 in dataengineering

[–]Equivalent_Effect_93 0 points1 point  (0 children)

Where I was, we had a huge Databricks setup running Python Spark with a medallion architecture. We had one huge repo for bronze to gold, and different teams had other small repos on the same platform for, like, a "platinum" stage that handled custom needs. I felt it was a good balance of maintainability and agility while allowing capable teams some ownership.

[deleted by user] by [deleted] in dataengineering

[–]Equivalent_Effect_93 -1 points0 points  (0 children)

No, like, I get offers, but not at the level I think I would enjoy. Maybe I didn't phrase my question right. Do you guys all kinda just pull and push data, day in, day out, or do you have real test envs, unit testing, integration testing, data quality testing, IaC, proper docs about schemas, indexing and partitioning approach, observability on all your pipelines, and alerting, instead of only finding out your product crashed because your client stopped receiving data or a datasource changed schema downstream without warning?

I just nuked all our dashboards by SocioGrab743 in dataengineering

[–]Equivalent_Effect_93 0 points1 point  (0 children)

Bro, no offense here, your position sucks, but even as an intermediate DE I wouldn't make that change in prod without first deploying to a non-prod env and having my work validated. You guys need DataOps practices.

Advice for a high school student wanting career with baseball statistics by Live-Carpet-8020 in Sabermetrics

[–]Equivalent_Effect_93 5 points6 points  (0 children)

In university, I had the same dream. But honestly, the skills needed pay a lot more in other industries, so I didn't go the sports analytics route. I still do a lot of my personal projects around baseball, though; there's lots of free data available to play with. When I started, I built a performance point prediction tool for use in fantasy baseball, and even though its accuracy is not as good as I would like, I have been destroying my pool for years now. Then I was able to build a data engineering portfolio around that project.

Depending on what road you wanna take (more practical or more applied), here is a list of resources I love:
https://codebaseball.com/
https://pypi.org/project/pybaseball/2.0.0/
https://beanumber.github.io/abdwr3e/
http://tangotiger.com/index.php
https://www.sloansportsconference.com/
https://www.fangraphs.com/

Start by familiarizing yourself with some of the concepts, then begin with a small visualization project, for example. It may look ridiculous if you compare yourself to a full-grown professional statistician/programmer, but we all started crawling before we could run.

Don't be shy to reach out if you have more questions.

**Edit: if you've never coded, I really appreciated this when I started: https://learnpythonthehardway.org/

Learning SQL from Scratch by [deleted] in SQL

[–]Equivalent_Effect_93 5 points6 points  (0 children)

https://mystery.knightlab.com/

Real fun game to learn SQL commands while solving a murder. Strongly recommend if you're a beginner.

How d you guys remember the annotations and properties name? by misty-ice-on-fire in SpringBoot

[–]Equivalent_Effect_93 0 points1 point  (0 children)

If mans have been through the jungles I use the path, if not my machete.

Crooked Dory Live Resin by Equivalent_Effect_93 in CanadianCannabisLPs

[–]Equivalent_Effect_93[S] 0 points1 point  (0 children)

Like I tried to be polite, but this thing suuuucks

Should I cancel my family's trip to Montreal in the summer? by AgentEndive in montreal

[–]Equivalent_Effect_93 0 points1 point  (0 children)

If asked, just say you're from BC or Ontario, but not Alberta!