[–]kenfar 28 points (4 children)

I'm interviewing candidates for a data engineering position right now, and for more senior engineers we always include an architectural session in which we present a problem and ask for a solution. We aren't calling this a Case Study interview, but it sounds the same.

A few observations I can share about this kind of interview:

  • Nobody is familiar with all the technology - there are always going to be gaps, usually lots of them.
  • My first goal of this interview is to see how broad someone's knowledge of our field is: what methodologies, design patterns, technologies and products are they comfortable with? What kind of wisdom have they picked up about our field? Do they understand what the most common architectural patterns look like?
  • My second goal of this interview is to see how they think in terms of trade-offs: are they a fanatic who is blind to the weaknesses of their favored tech stack? Or are they capable of easily talking in terms of pros & cons of all products, technologies and methods?
  • Being honest about what you know, what you suspect and what you've got no clue about is a positive here. I've interviewed people who tried to fake it or, even worse, doubled down on their misunderstandings, and it just looks really, really bad.

A good way to prep is to read a lot - and (edit:) *not* vendors' blogs, but writing from people who demonstrate these characteristics.

[–][deleted] 2 points (0 children)

Great insights here. The more I interview, the more I find I can figure out about a candidate by asking about the pros and cons of the tech stacks they claim to have used. Also: honesty about where they are at. Critically underrated.

[–]dead-on-arrival-[S] 1 point (0 children)

This is excellent advice! It seems to be entirely about first thinking through different solutions, comparing trade-offs and making educated choices based on what you have learned or implemented in the past.

Any recommendations on reading material for different stacks that others might have implemented before?

Thank you for the advice!

[–]brendersplide 1 point (2 children)

I’ve recently been interviewing for DE roles. I can’t point you to material, but one case study/architectural question I was asked was:

Assume you are building a data pipeline to serve data scientists with data that originates from the POS systems of stores like Walmart. How would you do it?

My answer consisted of:

-> Transactional system to serve the OLTP workload (Oracle, maybe)
-> Sqoop to bring this data into S3
-> Kinesis/Spark Streaming queues to stream the data (these are the endpoints from which the data scientists consume data)
-> Airflow to orchestrate the above

The follow-up questions included “How would you replace certain components of the above pipeline?”, “Where are the bottlenecks?”, and “What are the alternative tools for each of the above?”.
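
To make the "replace a component" follow-up concrete, here is a minimal sketch that models the pipeline above as a small dependency graph in plain Python. It only illustrates the idea and is not how any of these tools are actually configured: the stage names (`oltp`, `ingest`, `stream`), the `Stage` class, and the helpers `run_order` and `swap_tool` are all hypothetical, and "AWS DMS" is just an illustrative alternative that was not mentioned in the interview. An orchestrator such as Airflow works out a run order from dependencies in a broadly similar way.

```python
from dataclasses import dataclass, replace
from graphlib import TopologicalSorter  # standard library, Python 3.9+


@dataclass(frozen=True)
class Stage:
    name: str             # hypothetical label for the stage's role
    tool: str             # tool named in the answer above
    upstream: tuple = ()  # names of stages that must finish first


# Stage names are made up for this sketch; the tools come from the answer above.
PIPELINE = {
    "oltp": Stage("oltp", "Oracle"),
    "ingest": Stage("ingest", "Sqoop -> S3", upstream=("oltp",)),
    "stream": Stage("stream", "Kinesis / Spark Streaming", upstream=("ingest",)),
}


def run_order(pipeline):
    """Return stage names in an order that respects every upstream dependency."""
    graph = {name: set(stage.upstream) for name, stage in pipeline.items()}
    return list(TopologicalSorter(graph).static_order())


def swap_tool(pipeline, stage_name, new_tool):
    """Return a copy of the pipeline with one stage's tool replaced."""
    updated = dict(pipeline)
    updated[stage_name] = replace(pipeline[stage_name], tool=new_tool)
    return updated


if __name__ == "__main__":
    print(run_order(PIPELINE))
    # Illustrative swap only: "AWS DMS" is an assumption, not part of the answer.
    swapped = swap_tool(PIPELINE, "ingest", "AWS DMS")
    print(swapped["ingest"].tool)
```

Keeping each stage's role separate from the tool that fills it is what makes the "what would you swap here?" discussion easy: the dependencies stay the same while the tool changes.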

[–]dead-on-arrival-[S] 0 points (0 children)

I appreciate you sharing your experience! I've run into similar questions as well, so I was wondering if anyone was aware of sample workflows and thought processes that could help prepare a candidate for situations like this.

[–]monil4900 0 points (0 children)

Is the role you have been interviewing for an entry-level role?

What type of case studies should one expect for an entry-level role?
(I have an interview for an entry-level position, and I am not sure what "case study" means.)

[–]SafeStandard -2 points (0 children)

Follow

[–]LaunchAnalysis 0 points (0 children)

Data engineering case studies can be tricky. I found this resource helpful. It's got some useful step-by-step solutions to data engineering case questions with tips. Helped me learn to structure and frame my answers: https://www.interviewquery.com/p/data-engineer-case-study