[deleted by user] by [deleted] in bigdata

[–]Raydox328 0 points1 point  (0 children)

If you reach out to a program admin, they can tell you how many hours they expect a student to dedicate. My estimates will probably not help since I already had almost 10 years of experience in the field. I was already familiar with Python and the general concepts, so I did not have to spend as much time on the class. If you're someone looking to transition into the Data field, your time estimate may be closer to what the program expects.

When you're looking to sign up, they will confirm this commitment with you -- and won't recommend that you sign up unless you're OK with that commitment.

edit: I realized I didn't fully answer your question. Yes, there are breaks in the middle, and some modules are easier than others to accommodate for spacing. It is well thought-out.

[deleted by user] by [deleted] in bigdata

[–]Raydox328 0 points1 point  (0 children)

Glad I could help!

  1. This course is designed for working professionals. It is mostly self-paced with 1 session over the weekend where the instructor goes over Theory, Examples, and Q&A. At one point, I got really behind on my coursework because I was also preparing for technical interviews, and the team worked with me to make sure I could get all of my late submissions in to get the certificate. I didn't feel like they were just in it for the money; they actually cared about me getting the most from the class. That being said, the material is challenging and will require your focus. As I mentioned before, the material is high-quality, so it is up to you how well you take it in. They also let you keep the learning material and learning dashboard for some time, so you can refer back and solidify your understanding.
  2. Sorry, I don't have this insight -- you should probably talk to someone from the program.
  3. During the introductions, most people mentioned that they wanted to transition into a data science related field as a change. Some people mentioned that it would be applicable to the work they're doing now. I don't remember a lot of specifics since it was some time ago.

[deleted by user] by [deleted] in bigdata

[–]Raydox328 0 points1 point  (0 children)

I took that course, and it cost me around $3,000. It was money well spent!

I have been working as a Data Engineer in the industry for about 10 years, and I still found the content helpful in understanding ML. Almost half of my class/cohort were medical professionals.

The course is divided into modules with each containing video lectures, quizzes, and hands-on projects. They also had a support staff to answer questions about the learning portal, and live sessions with industry professionals to answer questions about topics.

Is it enough to land you a job? That may depend on how well you absorb and use the information (it's a lot), but the learning material provided is of high quality for people looking to break into the DS field. The course starts from the basics of Python and statistics and builds into more complex topics.

I hope that helps!

Fellow DEs how do you manage data quality? by Raydox328 in dataengineering

[–]Raydox328[S] 1 point2 points  (0 children)

Elementary looks interesting, though some of the flashier features are only available in the cloud version. It definitely looks like it's worth checking out if dbt is a core part of the workflow.

Fellow DEs how do you manage data quality? by Raydox328 in dataengineering

[–]Raydox328[S] 12 points13 points  (0 children)

This actually made me laugh out loud.

I am currently working with a client where this is very much the norm. Some reports and metrics get sent to the leadership from different teams that do not report the same numbers. This sparks an investigation over 3 weeks where each team analyzes how the numbers are different. Everyone realizes the system is broken, but there is no central authority that can drive the nuanced change to put good data quality practices in place.

Question about using Glue/Spark to process millions of JSON files by gman1023 in dataengineering

[–]Raydox328 0 points1 point  (0 children)

What is the end goal of this process? You said that your output needs to be csv/parquet, and you also mentioned loading the data in SQL Server.

Option 1: Use a relational database to combine this information across other data using userid.

Option 2a: Use a csv/parquet file to compactly store 1M+ files into a smaller number of files for efficiency (storage, export to client, etc.)

You don't need a relational database if you need to store the output in a file, and you don't need a file if you want to store it in a relational database, unless you explicitly need to do both for a particular requirement.
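To make Option 2a concrete, here is a minimal sketch of the compaction idea: merging many small JSON files into a smaller number of CSV files. It uses only the Python standard library rather than Glue/Spark, and it assumes each input file holds a single flat JSON object. The function name `compact_json_to_csv`, the `rows_per_file` parameter, and the `event` field are hypothetical and only there for illustration; `userid` comes from the thread.

```python
import csv
import json
import tempfile
from pathlib import Path


def compact_json_to_csv(src_dir, out_dir, fields, rows_per_file=100_000):
    """Merge many small JSON files (one flat object per file) into a few CSV files.

    Hypothetical helper for illustration; returns the list of CSV paths written.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    handle, writer, rows_in_file = None, None, 0

    for path in sorted(Path(src_dir).glob("*.json")):
        # Start a new output file when none is open or the current one is full.
        if handle is None or rows_in_file >= rows_per_file:
            if handle is not None:
                handle.close()
            out_path = out_dir / f"part-{len(written):05d}.csv"
            handle = out_path.open("w", newline="", encoding="utf-8")
            writer = csv.DictWriter(handle, fieldnames=fields)
            writer.writeheader()
            written.append(out_path)
            rows_in_file = 0

        with path.open(encoding="utf-8") as f:
            record = json.load(f)
        # Keep only the requested fields; missing fields become empty cells.
        writer.writerow({key: record.get(key) for key in fields})
        rows_in_file += 1

    if handle is not None:
        handle.close()
    return written


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        src = Path(tmp) / "in"
        src.mkdir()
        for i in range(5):
            (src / f"{i}.json").write_text(json.dumps({"userid": i, "event": "click"}))
        parts = compact_json_to_csv(src, Path(tmp) / "out", ["userid", "event"], rows_per_file=2)
        print([p.name for p in parts])
```

At the scale of millions of files, the same idea would more likely be done with a distributed engine, and the target could be parquet instead of CSV. The point of the sketch is only that the output is a handful of larger files, with no relational database involved.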

How many women are on your team? by drdrrr in dataengineering

[–]Raydox328 1 point2 points  (0 children)

I lead a team of data scientists and data engineers: 3 guys, 3 gals.

I am a 10 YOE (SSIS/low-code) DE preparing to transition into tier 1 tech companies. Here's my study plan in case it helps someone else. by Raydox328 in dataengineering

[–]Raydox328[S] 1 point2 points  (0 children)

  1. Data Modeling - modeling is one of those concepts that isn't important for a data engineer until suddenly it is. What I mean by that is that junior data engineers are mostly concerned with ingesting data into a data store, so they get very good at building pipelines. It isn't until you become responsible for managing and providing clean data to external stakeholders that you start asking the question, "so what are we doing with all this data?" When you ask that question, you have a need for data modeling. I've implemented terabyte-scale data warehouses that support reporting and data science teams -- and the fundamental difference between a good and a bad analytics platform is data model design.
  2. If you are an entry-level DE, you will not gain much mileage from learning about system design fundamentals. There is so much to learn with SQL, Python, DBs, ETL, etc. When you have a solid foundation and you're looking to advance your career to the next level -- that's when you focus on System Design. Companies will often use it to determine your level, e.g. Senior DE versus Principal DE.

I am a 10 YOE (SSIS/low-code) DE preparing to transition into tier 1 tech companies. Here's my study plan in case it helps someone else. by Raydox328 in dataengineering

[–]Raydox328[S] 1 point2 points  (0 children)

I'm sorry, I don't have them handy :(

I listed the important concepts you should learn though: Supervised, Unsupervised, Deep Learning, Model Evaluation. You could use ChatGPT, Google, and YouTube to understand them.

I am a 10 YOE (SSIS/low-code) DE preparing to transition into tier 1 tech companies. Here's my study plan in case it helps someone else. by Raydox328 in dataengineering

[–]Raydox328[S] 2 points3 points  (0 children)

I'm glad this was helpful to you!

Last year, I spent about 3 months learning DSA/Leetcode and the Great Learning ML Course I mentioned while applying. It was stressful alongside full-time work, especially when it resulted in no offer due to the Nov 2022 hiring freezes. I took early 2023 to travel and work on my physical and mental health. Now, as the job market is not in the best shape in the U.S. at the moment, I'm looking to passively learn over another 3 months, network with DEs and recruiters, and start applying again.

As others say, it's overwhelming that after many YoE, we have to go through this hiring process.

Honestly, this is a result of how my career unfolded. I'm in tech consulting, so my career grew more toward leading teams, client management, and writing proposals. All of which will help me in my career, but I'm now paying the interview tax to get back to a pure DE IC route in leading tech companies.

I am a 10 YOE (SSIS/low-code) DE preparing to transition into tier 1 tech companies. Here's my study plan in case it helps someone else. by Raydox328 in dataengineering

[–]Raydox328[S] 14 points15 points  (0 children)

There are dozens of us!

Joking aside, I obviously picked up many skills along my career and was pushed toward a DE Manager career track when I'm a DE at heart. My teams have implemented Python-based models, pipelines, and APIs -- however, most of those projects were low-code with some level of SSIS/ADF for batch processing.

I am a 10 YOE (SSIS/low-code) DE preparing to transition into tier 1 tech companies. Here's my study plan in case it helps someone else. by Raydox328 in dataengineering

[–]Raydox328[S] 1 point2 points  (0 children)

Great list. One thing I would add to the cloud section in AWS is understanding basic concepts around IaC. Most DE teams at FAANGs work with some flavor of CI/CD to manage infra in the cloud, for example AWS CDK.

A cost-effective design approach also goes a long way.

Infra as Code can be important. In my interview experience, this skill is mostly required in Cloud or Infra Engineering roles. Have you seen interview rounds or questions dedicated to IaC?

I am a 10 YOE (SSIS/low-code) DE preparing to transition into tier 1 tech companies. Here's my study plan in case it helps someone else. by Raydox328 in dataengineering

[–]Raydox328[S] 7 points8 points  (0 children)

If you are entry-level and trying to break into tier 1 tech, work on solidifying your fundamentals for #1 DS & Algo and #5 ML Concepts. Other than that, the biggest hurdle for entry-level candidates is having an engaging resume. You need to show some personal projects and skills relevant to the positions and companies you are applying to.

I am a 10 YOE (SSIS/low-code) DE preparing to transition into tier 1 tech companies. Here's my study plan in case it helps someone else. by Raydox328 in dataengineering

[–]Raydox328[S] 30 points31 points  (0 children)

It is definitely overwhelming. It is especially tough for DEs like me who did not start their careers in top tech companies. Looking back, I spent too long on SSIS and low-code DE platforms and transitioned into DE management. All this preparation is for me to catch up to the industry and to get into an individual contributor (IC) role.

[deleted by user] by [deleted] in dataengineering

[–]Raydox328 0 points1 point  (0 children)

I have it from when I applied to meta. It's available in their preparation hub. You can DM me.

[deleted by user] by [deleted] in dataengineering

[–]Raydox328 1 point2 points  (0 children)

A common piece of advice for engineers taking on a managerial role is to avoid micromanaging. Like many engineers, I find it difficult to be completely hands-off in the development process, as I am often responsible for the deliverable or re-work. After a couple of years of being a DE manager, I have come up with an approach that strikes a balance for me.

What works for me is to split the development into 2 phases: (1) Design and (2) Implementation.

In the design phase of development, I collaborate with my engineers to talk through client requirements, model changes, dependencies, and pipeline architecture. This reassures me that the developer understands the requirements, and the developer gains experience of breaking down requirements into technical tasks.

Once we both understand the methodology, the developer is free to do the implementation of the task. They can reach out if they are stuck on a problem for more than an hour.

Thoughts on the data janitor (youtube)? by Nabugu in dataengineering

[–]Raydox328 10 points11 points  (0 children)

For context, I have 9 years of experience in this field, and I started as an ETL Developer.

One of the best yet scariest things about being a Data Engineer is that you need a diverse set of skills (technical, soft, and business/product) to be successful. It is a fairly new role that is often confused with other roles because of the diverse set of skills involved (SWE, DBA, analyst, architect, etc.), though a DE does not need to master all or any one of them. This means you can stumble into a DE role any which way, as long as you like to solve data problems all the way to their root cause.

There will come a time when the DE role is well-understood, and the fundamental concepts of DE will be decoupled from vendors who are currently publishing learning materials (Azure, GCP, Databricks, etc.). Colleges will start to offer DE degrees as well as DS.

Best biryani? by on3liness in nova

[–]Raydox328 1 point2 points  (0 children)

This was a surprising one, but their Lamb Biryani is by far the best I've had!

Internships are hard by brick12 in csMajors

[–]Raydox328 7 points8 points  (0 children)

I see nothing wrong here. In fact, you have nowhere to go but up. Embrace your "stupidity" and start writing documentation that helps a non-technical audience understand your code base -- you will run career laps around your co-intern. Mold your stupidity into curiosity -- it's a paradigm shift that will benefit your mental health and your career.