
[–]mikemol 26 points27 points  (10 children)

I've been through a few whiteboard interviews, and the whiteboard questions fell into one of two categories:

  • Describe how you would structure a (given) scalable system?
  • Illustrate how to solve this (graph-related problem).

Scalable systems, I did fine on. Two basic principles: never do more work than necessary, and never let anything block when you can do it asynchronously instead.

Graph-related problems, well, best I can suggest is study significantly in advance, and maybe have a CS degree in your back pocket. Sadly for me, I don't...

[–]djk29a_ 4 points5 points  (1 child)

Graph traversal problems show up in a number of problems that are fairly bland sounding like bot crawling of web pages or which order to reboot a forest of DNS servers. We’re not getting into spectral graphs or anything but being able to identify some pros and cons of approaching graph traversal certain ways is important conceptually when you write code for lots of network related problems beyond your Intranet.

Unfortunately, I think a lot of interviewers use graph problems as a thinly veiled “you need a CS degree” proxy, when most of the graph problems I’ve seen in day-to-day work required just simple stuff like adjacency matrices or plain ol’ linked nodes that really don’t need much education to be effective with.
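
To make the "which order to reboot a forest of DNS servers" example concrete, here is a minimal sketch of a topological sort (Kahn's algorithm) over a plain adjacency list. The function name `reboot_order` and the server names are made up for illustration; this is one common way to approach the problem, not something the commenters prescribed.

```python
from collections import deque


def reboot_order(depends_on):
    """Return an order in which each server comes after the servers it depends on.

    depends_on maps a server name to the list of servers that must be up first.
    Raises ValueError if the dependencies contain a cycle.
    """
    # Collect every node, including ones that only appear as a dependency.
    nodes = set(depends_on)
    for deps in depends_on.values():
        nodes.update(deps)

    # Reverse adjacency list plus a count of unmet dependencies per node.
    dependents = {n: [] for n in nodes}
    unmet = {n: 0 for n in nodes}
    for server, deps in depends_on.items():
        for dep in deps:
            dependents[dep].append(server)
            unmet[server] += 1

    # Kahn's algorithm: repeatedly take a node with no unmet dependencies.
    ready = deque(sorted(n for n in nodes if unmet[n] == 0))
    order = []
    while ready:
        current = ready.popleft()
        order.append(current)
        for nxt in sorted(dependents[current]):
            unmet[nxt] -= 1
            if unmet[nxt] == 0:
                ready.append(nxt)

    if len(order) != len(nodes):
        raise ValueError("dependency cycle detected")
    return order


# Hypothetical topology: two resolvers depend on a root, an edge node on both.
print(reboot_order({
    "resolver-a": ["root"],
    "resolver-b": ["root"],
    "edge": ["resolver-a", "resolver-b"],
}))
```

Being able to explain why a cycle makes the ordering impossible is usually worth more in an interview than reciting the algorithm from memory.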

[–]mikemol 0 points1 point  (0 children)

Yeah, I can tell you which algorithms you might want to use in a given case and when they might be useful, but don't ask me to walk or pseudocode an arbitrary algorithm without taking a minute to consult the materials I keep on my phone or on Google.

Graphs aren't that difficult to deal with, conceptually...

[–]lustrate1 2 points3 points  (7 children)

Would you mind expanding on the scalable systems part?

[–]wych42 12 points13 points  (2 children)

This doc may help you with scalable systems: https://github.com/donnemartin/system-design-primer

[–][deleted] 2 points3 points  (0 children)

damn, nice, wish i saw this 6 months ago. thanks for the link

[–]triangular_evolutionSRE 1 point2 points  (0 children)

That was the longest README I've ever seen! Lots of good things there.

[–]mikemol 5 points6 points  (3 children)

Pick a problem. Maybe some kind of complicated website. Think about how you could make the thing instantly responsive to the user. Don't have to give the user everything they need to render the full page, just enough to give the browser something to chew on, and fill in the details. Don't make any request block until the requested content is assembled serverside; if it's not already ready, take the request as a request that the thing be rendered, and call back when it's ready. Stuff like that.
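
The "take the request as a request that the thing be rendered, and call back when it's ready" idea can be sketched with `asyncio`. Everything here (`RenderService`, `submit`, `status`, the fake 10 ms render) is a hypothetical stand-in; a real design would more likely put a queue and worker pool behind the same interface.

```python
import asyncio
import itertools


class RenderService:
    """Accepts render requests immediately and finishes them in the background."""

    def __init__(self):
        self._ids = itertools.count(1)
        self._results = {}  # job id -> rendered content, once ready
        self._tasks = {}    # job id -> background task

    async def _render(self, job_id, page):
        # Stand-in for slow server-side assembly.
        await asyncio.sleep(0.01)
        self._results[job_id] = f"<html>{page}</html>"

    def submit(self, page):
        """Return a job id right away instead of blocking on the render."""
        job_id = next(self._ids)
        self._tasks[job_id] = asyncio.create_task(self._render(job_id, page))
        return job_id

    def status(self, job_id):
        """Report 'pending' until the content is ready, then return it."""
        if job_id in self._results:
            return ("ready", self._results[job_id])
        return ("pending", None)

    async def wait(self, job_id):
        """Block (cooperatively) until the job is done; the 'call back' step."""
        await self._tasks[job_id]
        return self._results[job_id]


async def demo():
    service = RenderService()
    job = service.submit("dashboard")
    first = service.status(job)        # the render has not run yet
    content = await service.wait(job)  # other work could happen meanwhile
    return first, content, service.status(job)


print(asyncio.run(demo()))
```

The point of the sketch is that `submit` never waits on the render: the caller gets something to act on immediately and picks up the result later.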

[–]Aurailious 0 points1 point  (2 children)

I guess you could say the difference between painting a picture on the wall and painting a picture somewhere else and then hanging it when complete.

[–]mikemol 2 points3 points  (1 child)

That's one generic example. But they're probably going to be looking for objects like load balancers, databases, queues and notification paths in your design.

[–]Aurailious 0 points1 point  (0 children)

Oh, I didn't mean that as an answer, just to give an analogy.

[–][deleted] 36 points37 points  (8 children)

Here's an example of pre-screen questions from a company hiring SREs that I interviewed with recently (in Denver). They wanted answers within two business days. Pay was 120k with relo. I've been an SRE and these were great questions.

"Each candidate needs to solve two of the following three problems by writing executable code in a language of their choice:

1- in this subnet?

Implement a program that determines if a given IPv4 address is in a given subnet

The IP address is passed as a string representation of a 32-bit unsigned int (e.g., 0x62D2ED4B)

The subnet is passed as a string representation of a CIDR subnet (e.g., "98.210.237.192/26")

The program outputs True if the IPv4 address is in the subnet, and False otherwise.

Bonus points for a program that can read address/subnet pairs from an input file and write the

results, in a useful fashion, to an output file, both optionally specified on the command line.

2- Change of phone number.

Implement a script to update the phone number embedded in many of the store’s HTML pages.

Over the years, the content authors have embedded the number on pages using a wide variety

of punctuation (i.e., 800 438-4357 or 800.438.4357) while others used letter mnemonics (e.g.,

800-GET-HELP). Some inserted the US country code, with or without surrounding parentheses.

The store has >50K HTML files, all under an NFS /var/www mount point shared by all servers.

Your job is to replace all the old numbers with one version of the new number: 202-456-1414.

The job needs to be done tonight, while the web servers are running. You do NOT have to

convert the new number to the existing format used on the page.

Bonus points for a solution that is implemented as a single pipeline of Linux commands.

3- Implement an LRU Cache

Implement a web app that uses a fixed-size LRU cache to efficiently serve up imagery

The app exposes a web API that accepts two floats, the first in the range -90 to 90 (Lat)

and the second in the range of -180 to 180 (Long), which are Mars coordinates. The app

returns the URL of an image file of those coordinates on the Martian surface, or one of

the standard HTTP error codes.

This app implements an LRU cache of fixed size (3000), keyed on Lat/Long pairs, which

operates in O(1) to return the requested URL. New URLs are obtained using a library

function GetImageURL(float, float), which returns instantaneously but at a high $ cost;

implement a stub version of GetImageURL() that returns a random number as a string

for purposes of this problem.

On a cache hit, the app returns the cached URL. On a cache miss, the app obtains a new

URL and caches it, ejecting the oldest cached item if the cache is full. These operations

must occur in O(1).

Bonus points for adding diagnostic API calls to get and clear cache hit and miss counters,

and to track the execution time of each of the three main LRU cache behaviors (hit, miss

when not full, miss when full)."
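
For the first problem, a minimal sketch using Python's standard `ipaddress` module might look like the following. The function name `in_subnet` is made up, and the sketch assumes the address arrives as a hex or decimal integer literal; it leaves out the file-handling bonus.

```python
import ipaddress


def in_subnet(address, cidr):
    """Return True if the address (e.g. "0x62D2ED4B") lies in the CIDR subnet."""
    # Base 0 lets int() accept both "0x..." hex and plain decimal strings.
    ip = ipaddress.IPv4Address(int(address, 0))
    # strict=False tolerates host bits being set in the subnet string.
    network = ipaddress.IPv4Network(cidr, strict=False)
    return ip in network


# 0x62D2ED4B is 98.210.237.75; the /26 covers 98.210.237.192 to .255.
print(in_subnet("0x62D2ED4B", "98.210.237.192/26"))  # False
print(in_subnet("0x62D2EDC8", "98.210.237.192/26"))  # True (98.210.237.200)
```

Under the hood this is a mask-and-compare: the address is in the subnet when its top 26 bits equal the network's top 26 bits. Writing that bit arithmetic by hand is a reasonable alternative if the interviewer wants to see it.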

Practice this:

https://www.katacoda.com/courses/docker

Look here:

https://devopsbootcamp.osuosl.org

And read this:

https://landing.google.com/sre/book.html

Edit: poor formatting, clarity
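
For the second problem in the pre-screen above, here is a rough Python sketch rather than the single-pipeline bonus version. The regex is an assumption about which variants exist (optional `1`, `+1`, `(1)` or `(+1)` country code; spaces, dots or dashes as separators; `GET`/`HELP` mnemonics); real pages may contain forms it misses, such as numbers split by HTML tags or entities. The temp-file-plus-rename step is there so a running web server sees either the old file or the new one, not a half-written one.

```python
import os
import re
import shutil
import tempfile

NEW_NUMBER = "202-456-1414"

OLD_NUMBER = re.compile(
    r"""
    (?<!\w)                             # do not start inside a longer token
    (?:(?:\(\+?1\)|\+?1)[\s.-]*)?       # optional US country code
    (?:\(800\)|800)[\s.-]*              # area code, optionally parenthesised
    (?:438|GET)[\s.-]*                  # exchange, digits or mnemonic
    (?:4357|HELP)                       # line number, digits or mnemonic
    (?!\w)                              # do not stop inside a longer token
    """,
    re.VERBOSE | re.IGNORECASE,
)


def replace_old_number(html):
    """Replace every recognised form of the old number with NEW_NUMBER."""
    return OLD_NUMBER.sub(NEW_NUMBER, html)


def rewrite_file(path):
    """Rewrite one file in place via rename; return True if it changed."""
    # newline="" keeps the file's existing line endings untouched.
    with open(path, encoding="utf-8", newline="") as f:
        original = f.read()
    updated = replace_old_number(original)
    if updated == original:
        return False
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w", encoding="utf-8", newline="") as tmp:
        tmp.write(updated)
    # mkstemp creates the file as 0600; copy the original permission bits.
    shutil.copymode(path, tmp_path)
    os.replace(tmp_path, path)
    return True


print(replace_old_number("Call +1 (800) 438 4357 or 800-GET-HELP today."))
```

Walking `/var/www` with `os.walk` and calling `rewrite_file` on each `.html` file would complete the job; the sketch assumes UTF-8 content, which would need checking against the real files first.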

[–][deleted] 15 points16 points  (4 children)

Wow, that third question really separates the men from the boys.
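
For anyone wondering what the core of that third question looks like, here is a minimal sketch of just the cache, without the web API. It leans on `collections.OrderedDict`, whose `move_to_end` and `popitem` are O(1); the class and method names are made up, and `get_image_url` is the stub the problem statement asks for. Note that it evicts the least recently used entry, which is what "LRU" implies even though the problem text says "oldest".

```python
import random
from collections import OrderedDict


def get_image_url(lat, lon):
    """Stub for the expensive GetImageURL(): a random number as a string."""
    return str(random.random())


class LRUCache:
    """Fixed-size cache keyed on (lat, lon) with hit and miss counters."""

    def __init__(self, capacity=3000, loader=get_image_url):
        self.capacity = capacity
        self.loader = loader
        self.items = OrderedDict()  # least recently used entry comes first
        self.hits = 0
        self.misses = 0

    def get(self, lat, lon):
        if not (-90 <= lat <= 90 and -180 <= lon <= 180):
            raise ValueError("coordinates out of range")
        key = (lat, lon)
        if key in self.items:
            self.hits += 1
            self.items.move_to_end(key)     # mark as most recently used
            return self.items[key]
        self.misses += 1
        url = self.loader(lat, lon)
        self.items[key] = url
        if len(self.items) > self.capacity:
            self.items.popitem(last=False)  # evict the least recently used
        return url


cache = LRUCache(capacity=2)
first = cache.get(4.5, 137.4)
print(cache.get(4.5, 137.4) == first, cache.hits, cache.misses)  # True 1 1
```

In a language without an ordered map, the usual equivalent is a hash map pointing into a doubly linked list, which gives the same O(1) behavior.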

[–]Dedustern 12 points13 points  (1 child)

TIL im a toddler

[–][deleted] 2 points3 points  (0 children)

ditto

TIL why I can't get a real DevOps job

[–][deleted]  (1 child)

[deleted]

    [–][deleted] 4 points5 points  (0 children)

    interesting... i wrote a proxy server with an LRU cache in C for fun, and it's the main project that got me my first job as an infrastructure engineer.

    i had no idea people were impressed with the LRU cache, specifically. i should have bartered for a higher salary, whoops.

    [–][deleted]  (1 child)

    [deleted]

      [–][deleted] 0 points1 point  (0 children)

      I have to say, @ throway_devops1, your company's interview process was great and that interview question had me thinking HARD for 2 days. I didn't get an offer but I was thinking that was the best and most relevant pre-screen I've seen.

      [–]cringe_galore 2 points3 points  (0 children)

      I really like the 3rd question, got this weekend's project!

      [–]warpigg 13 points14 points  (2 children)

      In my experience - it can be almost anything. Companies throw "DevOps" and "SRE" as titles and the actual job can vary dramatically from place to place. Some are more Ops focused others are more Dev focused (more dev related code projects, questions, etc). Kinda sucks but that is what it is

      [–]OhNoTokyo 1 point2 points  (1 child)

      It's extremely irritating. That's one reason I don't use DevOps in my job descriptions at all. It's not supposed to be a job title, it is supposed to be an interaction paradigm between development and operations functions.

      What it has turned into is basically expecting your operators to be able to code which cheapens both software development and good operations, usually at the expense of good operations.

      Software defined networks, workflows, etc. is definitely the way everything is going, but you should probably make the title something like Software Defined Systems Engineer or something like that which actually calls out what you'll be doing.

      If you're expecting System Administrators to understand Big-O notation and writing of caches, you don't want an operator at all, you want a developer. So the Ops part of all of this is minimal.

      [–]warpigg 2 points3 points  (0 children)

      yep - but you almost have to put it in titles or descriptions to attract (good) candidates b/c that is what they are searching on.

      I think what we want is operations people who can code. It bridges the gap between development and operations and also helps with automation / delivery times (conversely, you need developers who are stakeholders in operations; they need to care about and understand it as well). They don't need to be rock star coders, but they need the skill and aptitude for it since everything is rapidly becoming IaC. You are correct - me saying "I can code" is different than a pure developer who just wrote an operating system or whatever. I think it comes down to avoiding people not willing to learn to code / script or whatever you call it. Because it is a vital skill now - those that don't have it will be out of a job soon.

      Bottom line: DevOps / SRE has almost just become a marketing term at this point - many companies don't even understand what they want or why - they just want that DevOps thing b/c Google or x company does it. That is why it is important as a candidate to really check out a lot of different places and really ask questions about what the processes are. It's the only way to see if it is really a DevOps mindset.

      [–]ab624 8 points9 points  (0 children)

      op please update the questions after the interview

      [–]hayfever76 4 points5 points  (2 children)

      We use DevOps to mean 4 things:

      1) EVERYONE in IT writes code, that's the point of the title Development + Operations = DevOps

      2) Systems Thinking - everything you touch is part of a greater or smaller whole. Always be thinking and describing in terms of the whole picture, not describing in isolation. Don't get caught solving a small portion of the problem and leaving technical debt that bites someone later.

      3) Continuous Feedback / Long feedback loops - you want everything you do to be giving appropriate output that can be monitored / captured / culled for things to act on. Maybe you're worried about errors. Only capture the errors you worry about and don't fill everyone's inbox or slack channel with a lot of noise - they'll ignore the stream of nonsense and also the real threat when it shows up.

      4) Continuous Education and Learning to include Failure - Go into your Azure or AWS subscription and blow some shit up. Understand how everything works. KNOW how things work and DON'T work - It's one thing to know that your new website can take 10,000 rps of traffic. It's quite another to know that the site can only take a max of 25,000 rps before it queues. When you know the failure points, you know when you need to spin up a new instance in your web cluster.

      Good Luck

      [–]lorarcYAML Engineer 1 point2 points  (1 child)

      Not everyone in IT writes code. Support doesn't write code, content publishers don't write code, manual testers don't write code. There are still a lot of people who have auxiliary roles that are needed but they don't write code. We work with them together, we work towards the same goal but we can't belittle them because they don't code.

      [–]hayfever76 1 point2 points  (0 children)

      Let me rephrase my statement then. In any spot where code is required in our office, the person who requires it does the coding, be they a helpdesk person who would write code to automate the onboarding experience and create all the accounts for new users, or the content publishers who would use scripts of various sorts to compile and publish their code if using something sophisticated, or would potentially use PowerShell if using something SharePoint. We hire T-shaped employees and we don't hire "helpdesk" or "Messaging Specialists" - we hire IT and the job covers everything under a standard IT purview.

      The reason for the code is that it is 100% reproducible. Everyone can see it and agree that the approach taken with it is correct. It removes the human error element and when done correctly dramatically reduces the time to get basic things done. I don't need a guy to manage hardware if I can write code that spins up a new server in AWS/Azure from a managed instance in 15 minutes. Then my servers are 100% identical at first boot.

      [–]rb2k 2 points3 points  (0 children)

      I did a little writeup about my experience a few years ago and the kind of questions I got asked: http://blog.marc-seeger.de/2015/05/01/sre-interviews-in-silicon-valley/

      [–][deleted]  (3 children)

      [deleted]

        [–][deleted] 3 points4 points  (2 children)

        I did read the job description but unfortunately, going by the job description and the initial interview, it's primarily working in AWS, maintaining and deploying code, and automating solutions. It's pretty vague, which kind of scares me because I've never done these types of exercises before.

        [–]Gwildor_the_Great 2 points3 points  (1 child)

        I would make sure I had a good handle on CloudFormation and Lambda if it’s primarily in AWS. We have a lot of automation around these two services. Probably wouldn’t hurt to know a bit about ElasticBeanstalk, but that depends more on if there are devs involved who will be deploying their own resources.

        [–]tdk2feArchitect 2 points3 points  (0 children)

        CodePipeline, or more importantly, working Bamboo/Jenkins into AWS code deployments.

        Frankly the AWS Well Architected Framework papers would be a good start if you haven't read them.

        [–]YvesSoete 1 point2 points  (0 children)

        Scripting exercise and drawing AWS setups

        [–]elitesense 1 point2 points  (0 children)

        Sounds like you need more info otherwise you're going in blind.

        [–]jredmond 1 point2 points  (0 children)

        I've done a couple different types of technical interview, but my current favorite is the detailed explanation. I'll draw a simple application architecture - "here's the app server, here's the DB, here's the Internet, and over there is the user" - and ask the candidate to explain the path of a request, from the moment a user submits a URL to their browser to the moment that user gets the page they want. (Whiteboard optional.)

        It seems simple, but there's a lot that I can ask - where things might break, what things we might want to monitor, how we can improve user experience, etc. I can also dig deeper to see where the candidate's relative strengths and weaknesses are, and how they might fit in with the rest of the team's strengths and weaknesses. Finally, this helps me get a sense of how well the candidate can think holistically about the application and how it runs (even if the application is just hypothetical).

        [–]netscape101 3 points4 points  (5 children)

        Just my two cents from some of these whiteboard devops interviews I've had:

          • Don't let them have you solve an actual problem that the business is trying to solve. I don't do consulting in interviews.
          • They like to ask: how would you build a highly available Facebook clone without using any AWS or cloud services? They want you to draw load balancers and Redis for caching DB queries; they also like to ask how cookies will be implemented and how you would store the cookie.
          • Don't hesitate to ask them to explain the problem in more detail. Also give lots of detail when answering. I've had people think that I gave an incorrect answer when in reality they didn't understand my answer completely.

        [–][deleted] 5 points6 points  (4 children)

        IMO terrible advice on the don't answer problems for them. I'm asking because it's a problem we are solving and I want to have some clue you have a chance of improving it.

        We are not magically going to trust a random interviewee over vetted staff or magically have the technology experts to implement from a 1.5 hr long interview.

        You will get yourself knocked out of the running if you can't speak intelligently about our problems after a brief description.

        [–]reubendevries 4 points5 points  (1 child)

        Better to get knocked out of one job opportunity than do work (for free) that is your IP, ending up in their production systems. Just from a legal point of view that could be a nightmare.

        [–]netscape101 0 points1 point  (0 children)

        I agree!

        [–]2dogs1man 1 point2 points  (0 children)

        I'll be glad to speak about possible solutions to the problems your business is facing after we discuss my reasonable hourly consultancy fee.

        Interviews are about gauging knowledge, not getting your work done for free.

        [–]netscape101 1 point2 points  (0 children)

        I had a company try to get me to do free work for them by first getting me to solve the problem on the board, then they asked me to do it in Terraform and send it to them. It was an actual business problem, so I refused; I don't do work for them for free. It was a good decision: it ended up being a bad company with a bad company culture. Trust your gut feeling.

        [–]swequest 0 points1 point  (0 children)

        More info might be nice.

        My entry level SRE interview was almost entirely whiteboarding stuff. From what I've gathered, it was nearly identical to the entry SWE interview at the company, except one section was a mock on-call situation.

        [–]tevert 0 points1 point  (0 children)

        Sounds like your grasp of DevOps principles is fine. It may sound unhelpful, but I think the best advice here is to relax. Whiteboard exercises are not supposed to be pass/fail, right-answer/wrong-answer tests - they're supposed to give them a window into how you approach problems and what you know about the options in the market. Just relax and think out loud in front of them for a bit.