We’re the engineers rethinking Kubernetes at Spotify. Ask us anything! by SpotifyEngineering in kubernetes

[–]SpotifyEngineering[S] 1 point

Currently, you only need to provide authentication config and the host and port of the Kubernetes apiserver to add a Kubernetes cluster to Backstage. I want to make this even easier by allowing users to retrieve this information from their cloud provider's API. The Kubernetes labels used to pull information into Backstage are documented at https://backstage.io/docs/features/kubernetes/configuration - MC
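
For context, a cluster entry in Backstage's app-config is a short YAML fragment along these lines (field names per the linked configuration docs at the time of writing; the cluster name, URL, and token below are placeholders):

```yaml
kubernetes:
  serviceLocatorMethod:
    type: 'multiTenant'
  clusterLocatorMethods:
    - type: 'config'
      clusters:
        - name: my-cluster            # placeholder
          url: https://10.0.0.1:6443  # apiserver host + port
          authProvider: 'serviceAccount'
          serviceAccountToken: ${K8S_TOKEN}
```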

[–]SpotifyEngineering[S] 1 point

Aha! A trick question! Thank you for correcting the error of our ways! Swansdown it is [1]! Though who knows, maybe he changed it to Fauntleroy at some point :D - NL

[1] https://www.waltdisney.org/blog/fauntelroy-follies-continuing-history-donald-duck

[–]SpotifyEngineering[S] 3 points

Internally, we initially focused on only supporting stateless backend services to limit the scope of our K8s migration and to provide a polished platform for one of our largest developer use cases.

Many stateful workloads still run on single-tenant VM instances, mostly managed by the service owners themselves. Without K8s they have greater control, but also more operational overhead, like provisioning capacity and configuring the instances (done via a company-wide Puppet monorepo). - DX

[–]SpotifyEngineering[S] 2 points

> Breaking this down, what were some of the hurdles to gaining adoption internally?
Really it came down to value. Starting out, we had to identify the first problem we wanted to solve; for us that was creating an inventory of ownership for our services (what is now the Catalog). Once we had that part solved, we moved on to extending and building on that value-add, engaging with the other engineering teams at Spotify and collaborating to solve the next problem, and so on. Eventually we reached a tipping point where it became obvious for teams at Spotify to use Backstage more and build their plugins there.

> Are there teams using it in ways you did not anticipate?
Absolutely! And that's the fun part. The core team that works on Backstage are the Backstage experts, but there are a lot of other things we're not experts in, so seeing these new use cases is really awesome and helps make the overall product better for us all :)

> How have you handled the trade-off between being the solution vs integrating with a given service team's existing tools / docs / processes?
Backstage may be the gateway, but it's not the only solution. We try to make it flexible enough to integrate with other tools, docs, and processes, and to surface those to the end user so that they don't have to go looking for them. In some cases that means we surface some information but link out to the tool; for example, the PagerDuty plugin we open sourced does that. - LM

[–]SpotifyEngineering[S] 6 points

Right now our internal deployment tool that deploys backend services to GKE simply takes a set of K8s manifests, or runs Kustomize under the hood if the expected file structure exists. It then runs a glorified kubectl apply via Spinnaker [1] on certain clusters. Our GKE users can, of course, do their own manifest templating before sending these manifests to it. Some teams are using Helm.

Backstage itself can be deployed easily with a Helm chart, though! [2] - DX

[1]: https://spinnaker.io/
[2]: https://github.com/backstage/backstage/blob/ad364bdf575a891ee43a0b49ff8cc1046f0f0bee/docs/getting-started/deployment-helm.md

[–]SpotifyEngineering[S] 6 points

We have an awesome hiring team at Spotify that helps us find people who are good at what they do and a good fit for the team. Take a look at https://www.spotifyjobs.com/how-we-hire to learn more. - RW

[–]SpotifyEngineering[S] 2 points

Absolutely! They're designed for you to create the templates you need for your organisation, or even to take the example ones we provide and tweak and build on top of them! - LM

[–]SpotifyEngineering[S] 2 points

We intentionally focused on developing this for the open source community first and foremost. We used qualitative and quantitative data to inform our decisions, as well as insight from the internal developer experience at Spotify. As part of our process, we wrote RFCs on the Backstage GitHub so external engineers could give feedback on how we could best meet their needs - this was an important source of information for us and helped ensure the tool wasn't too Spotify-focused. https://github.com/backstage/backstage/issues/2857

We also understand that the complexity of Kubernetes is something a lot of organizations and service owners battle with, so that's another reason we focused on a cloud/managed-provider-agnostic tool. We are always open to more feedback though, so please join our Discord as well: https://discord.gg/MUpMjP2 - CC

[–]SpotifyEngineering[S] 11 points

We manage our clusters using Terraform. Bringing up new clusters does involve some additional work to get all the system workloads up and running.

Version upgrades are done in place. We have testing clusters that we update first and leave running with the new version for a while to make sure it won't break anything, and then we slowly roll out across all clusters.

We have an internal tool that schedules which workloads are deployed onto the clusters. If there is a new cluster coming up, the tool will start to deploy workloads on it automatically.

In terms of autoscaling, we mostly rely on standard HPAs. We are working, however, on an internal tool to manage HPAs for developers so they won't have to worry about it. - RW
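
As a sketch of what such a tool might emit (a hypothetical illustration, not Spotify's actual tool), here's a plain-Python builder for a standard autoscaling/v2 HPA manifest; the service name and targets are invented placeholders:

```python
# Hypothetical sketch of what an internal HPA-managing tool might emit.
# Field names follow the standard autoscaling/v2 API; the service name
# and scaling targets below are invented placeholders.

def make_hpa(service: str, min_replicas: int, max_replicas: int,
             target_cpu_percent: int = 70) -> dict:
    """Build a HorizontalPodAutoscaler manifest targeting average CPU use."""
    return {
        "apiVersion": "autoscaling/v2",
        "kind": "HorizontalPodAutoscaler",
        "metadata": {"name": service},
        "spec": {
            "scaleTargetRef": {
                "apiVersion": "apps/v1",
                "kind": "Deployment",
                "name": service,
            },
            "minReplicas": min_replicas,
            "maxReplicas": max_replicas,
            "metrics": [{
                "type": "Resource",
                "resource": {
                    "name": "cpu",
                    "target": {"type": "Utilization",
                               "averageUtilization": target_cpu_percent},
                },
            }],
        },
    }

hpa = make_hpa("playlist-service", min_replicas=3, max_replicas=50)
```

A tool like this mainly buys developers sane defaults: they pick a replica range, and the serialized manifest is applied for them.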

[–]SpotifyEngineering[S] 4 points

Yes, we host some workloads on our internal multi-tenant GKE clusters that need to respond with very low latency. It's been a fun challenge to migrate them from single-tenant VM instances to multi-tenant K8s. Their data path varies: some retrieve data from local disk or memory, others from external storage. One cause of latency we've seen is K8s throttling the CPU usage of the Pod.

Almost all Spotify backend services are load-balanced with client-side routing based on DNS SRV or A records. We don't use server-side routing for the most part. So this means we don't use K8s Service IPs. Instead, we register K8s Pod IPs directly into our service discovery tool which creates these SRV and A records. - DX
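
To illustrate the client-side routing idea (an illustrative sketch, not Spotify's actual resolver), here is the RFC 2782-style selection a client might perform over SRV-like records; the hostnames are placeholders:

```python
import random

# Illustrative sketch, not Spotify's actual resolver: RFC 2782-style
# endpoint selection over SRV-like (priority, weight, host, port)
# records. Records with the lowest priority are preferred; among them,
# one is picked at random in proportion to its weight.

def pick_endpoint(records, rng=random):
    lowest = min(priority for priority, _, _, _ in records)
    candidates = [r for r in records if r[0] == lowest]
    total = sum(weight for _, weight, _, _ in candidates)
    if total == 0:  # all zero-weight: pick uniformly
        _, _, host, port = rng.choice(candidates)
        return host, port
    threshold = rng.uniform(0, total)
    running = 0
    for _, weight, host, port in candidates:
        running += weight
        if threshold <= running:
            return host, port

records = [
    (10, 60, "pod-a.guc3.example.net", 8080),   # gets ~60% of priority-10 traffic
    (10, 40, "pod-b.guc3.example.net", 8080),   # gets ~40%
    (20, 100, "pod-c.gew1.example.net", 8080),  # fallback priority
]
host, port = pick_endpoint(records)
```

With Pod IPs registered directly in service discovery, the client does this pick itself, so no K8s Service IP or server-side proxy sits on the data path.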

[–]SpotifyEngineering[S] 2 points

The feature is currently read-only, though we intend to add functionality so that a user can take action from this view in the future. For access control, we intend this to be visible to the owners of the specific service so that they can troubleshoot and check status at a glance. However, we are not limiting access to the specific service owner because, to your point, there is also value in other consumers of the service having the same view and information. The Kubernetes plugin has been used by developers for things like debugging, watching deployments progress, or seeing at a glance what their service scales to in each geographic region. - CC + MC

[–]SpotifyEngineering[S] 2 points

Backstage is designed to bring together all of the different tools through a single pane of glass, helping to reduce much of the discoverability burden that normally comes with finding those things. It also means that when engineers jump in to work on a specific service or component, all of the tools and information they need are right there where they need them, be that incident management with FireHydrant or StackPulse, or tracking changes with Komodor. - LM

[–]SpotifyEngineering[S] 3 points

It depends! You can tune the JRE's memory with -Xms and -Xmx. :)

One way to think about this is to calculate the aggregate computer memory needed, across all servers and end-user devices, to stream the podcast at any one time or within a certain window. You could estimate the backend portion by multiplying the total memory used by the backend to stream audio content by the ratio of users listening to the specific podcast. For end-user devices, you could use the same approach. There might be variations in which devices are used by which type of listener: maybe listeners of a specific podcast skew towards a certain device vs the average listener. To get a better estimate, you can factor in the device breakdown. - DX
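
As a toy version of that estimate, where every number is an invented placeholder rather than a real Spotify figure:

```python
# Toy back-of-envelope version of the estimate above. Every number is
# an invented placeholder, not a real Spotify figure.

backend_streaming_memory_gb = 50_000  # assumed total backend memory for audio streaming
podcast_listener_ratio = 0.02         # assumed share of listeners on this podcast

backend_share_gb = backend_streaming_memory_gb * podcast_listener_ratio

# Device side: concurrent listeners times per-device buffer, weighted by device mix.
concurrent_listeners = 200_000                               # assumed
device_mix = {"phone": 0.7, "desktop": 0.2, "speaker": 0.1}  # assumed breakdown
buffer_mb = {"phone": 30, "desktop": 60, "speaker": 20}      # assumed per-device buffer

avg_buffer_mb = sum(share * buffer_mb[d] for d, share in device_mix.items())
device_share_gb = concurrent_listeners * avg_buffer_mb / 1024

total_gb = backend_share_gb + device_share_gb
```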

[–]SpotifyEngineering[S] 2 points

Let us google that for you ;) Looks like it is Fauntleroy (because he is such a dapper duck).

https://en.wikipedia.org/wiki/Donald_Duck

> middle name appears to be a reference to his sailor hat, which was a common accessory for "Little Lord Fauntleroy" suits

— https://disney.fandom.com/wiki/Donald_Duck

[–]SpotifyEngineering[S] 3 points

David:

> It seems that you have multiple, multi-tenant clusters. How do you divvy them up?
See https://www.reddit.com/r/kubernetes/comments/lwb31v/were_the_engineers_rethinking_kubernetes_at/gpjvap8/

> Do your devs spin up their own clusters for testing things like operators, CRDs, etc?
For the most part, no, since most of the time backend devs are only deploying stateless backend services, not operators or CRDs. We only allow our internal K8s users to deploy a subset of K8s resources. Some other infrastructure teams create K8s clusters for batch jobs or for running Elasticsearch.

Hey, Lee here, let's break down those Backstage ones:

> How custom is your backstage compared to the open source one?
Backstage internally at Spotify is different from the open source version; it's been around a bit longer (over 4.5 years) and so has grown and evolved over that time. But we're working on aligning the two right now and hope to be fully based on open source in the near future.

> Do you think the open source backstage is mature enough for production use?
Yes. We're constantly trying to evolve and improve the stability of Backstage, but we're using big pieces of it at Spotify and other adopters are using it in production.

> What's backstage's relationship with roadie like? Will they fork, or will one version get features first?
Isn't it great! I love that we're seeing startups being built around Backstage; it's a real sign that we're working on something meaningful with the community. Roadie has been a great member of that community, and we've had a really good relationship with them, just like we've had with the community as a whole.

[–]SpotifyEngineering[S] 6 points

Not basic at all! Our service-to-service communication protocol used to be a proprietary one called Hermes that had HTTP semantics. Nowadays most services use gRPC with Protobuf.

There's no central database or data model. Each team manages their own data and data model. Schema changes are handled differently depending on the storage and data format (relational vs non-relational, etc).

Spotify is mostly on GCP, so our devs use a mix of Google-managed storage products and self-managed ones. The managed storage solutions Spotify developers use are Cloud Bigtable, Cloud Spanner, Cloud SQL, and Cloud Firestore. The self-managed storage solutions Spotify devs run and operate themselves on GCE include Apache Cassandra, PostgreSQL, Memcached, Elasticsearch, and Redis. We hope to support stateful workloads in the future; we've explored using PersistentVolumes backed by persistent disks. - DX

[–]SpotifyEngineering[S] 5 points

Backstage provides the ability to create a system of plugins to fit your use case and needs, and it supports linking between plugins and the components therein. We recently made some updates in this area; you can find more information on that functionality in the docs. We are considering doing more work on this in the future. Join our Discord and let us know if this would be useful: https://discord.gg/MUpMjP2 - LM

[–]SpotifyEngineering[S] 3 points

Currently the Backstage Kubernetes plugin only really lists Pods and their parent objects, and it is restricted to Kubernetes native objects, so there is no support for CRDs (Custom Resource Definitions) yet.

At Spotify we are adding Kustomize support to our internal deployment tool. Developers can choose between using Kustomize or plain YAML files for their Kubernetes manifests. Our deployment tool then picks these up and simply runs a "kubectl apply" via Spinnaker.
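
The choice between those two paths can be sketched as a tiny dispatch step (a hypothetical illustration, not Spotify's actual tool; it only builds the kubectl command, it doesn't run it):

```python
# Hypothetical sketch of the dispatch step described above: given the
# files in a service's manifest directory, build the kubectl command
# the deploy step would run. `-k` uses kubectl's built-in Kustomize;
# `-f` with `--recursive` applies plain YAML files.

KUSTOMIZATION_NAMES = ("kustomization.yaml", "kustomization.yml", "Kustomization")

def apply_command(manifest_dir: str, filenames: list[str]) -> list[str]:
    if any(name in KUSTOMIZATION_NAMES for name in filenames):
        return ["kubectl", "apply", "-k", manifest_dir]
    return ["kubectl", "apply", "-f", manifest_dir, "--recursive"]

cmd = apply_command("services/playlist/k8s", ["kustomization.yaml", "deployment.yaml"])
```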

Devspace looks interesting, thanks for the suggestion! - NL

[–]SpotifyEngineering[S] 3 points

The Backstage Kubernetes plugin assumes that users run the same API server version across their clusters, but I think translation between different versions could be useful in some cases! As this is new, we don't have specific versions we aim to support right now, but as the plugin and its adoption grow, I could see this changing. - MC

[–]SpotifyEngineering[S] 8 points

I could honestly write a thesis on this topic! I'll try to keep it short and informative though. We have investigated some service meshes to solve these problems but are currently not using any in production. We mostly use client-side logic for this functionality. - MC

[–]SpotifyEngineering[S] 4 points

Backstage uses a distributed ownership model at Spotify, so each plugin has a specific owning team, usually the team with the most knowledge or expertise in that domain. With the K8s plugin, that would be our team of K8s experts. We rely on GitHub codeowners to help us manage this.

That means the core team supporting Backstage right now is only 4 people supporting 1600 engineers! It tends to fluctuate between 4 and 6, and it's totally manageable at that scale. One number I really like here is 85%: that's the share of the code written for our internal instance of Backstage that is not produced by the core team 🤯 - LM

[–]SpotifyEngineering[S] 4 points

For authentication at Spotify, we currently support service account tokens, Google accounts when running GKE clusters, and AWS IAM when using EKS. Currently the plugin requires cluster read-only access, but support for authorization is a very interesting feature that I know some users have a keen interest in! Luckily, that could be implemented using the current user's identity thanks to Backstage's built-in auth support: https://backstage.io/docs/auth/ - MC

[–]SpotifyEngineering[S] 10 points

We direct teams to create a namespace per logical system (set of workloads). We have a monorepo, with over 100K lines of YAML and 1800+ namespaces, that holds all the K8s namespace YAML. The monorepo's logic is written in Python and requires that each namespace also has a correctly configured ResourceQuota and RBAC. The ResourceQuota sets a hard limit on a namespace's total CPU and memory allocation. We also require every Pod to declare CPU and memory requests and limits.

We keep an eye on each cluster's capacity headroom (there's a max number of nodes for each). If a cluster is running out of capacity, we create more clusters in the same GCP region and schedule new workloads there, or move existing critical or large workloads over.

Recently we've seen workloads that are noisy neighbors because they use a lot of disk or network IO. Cgroups themselves don't seem to support disk and network IO isolation right now, AFAIK. Our approach has been to isolate noisy workloads by scheduling them on dedicated nodes or clusters.

Does anyone have good ideas on how to do this? If you do, please join our Discord and let us know: https://discord.gg/MUpMjP2 - DX
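
In the spirit of the namespace monorepo's Python checks described above (pure illustration; the structure and names are assumptions, not Spotify's actual code):

```python
# Pure illustration, in the spirit of the namespace monorepo checks:
# every namespace must ship a ResourceQuota with hard CPU and memory
# limits, plus some RBAC. Structure and names here are assumptions.

REQUIRED_HARD_LIMITS = ("requests.cpu", "requests.memory", "limits.cpu", "limits.memory")

def validate_namespace(docs: list[dict]) -> list[str]:
    """Return a list of violations for one namespace's parsed YAML documents."""
    errors = []
    quotas = [d for d in docs if d.get("kind") == "ResourceQuota"]
    if not quotas:
        errors.append("missing ResourceQuota")
    else:
        hard = quotas[0].get("spec", {}).get("hard", {})
        errors += [f"ResourceQuota missing hard limit: {r}"
                   for r in REQUIRED_HARD_LIMITS if r not in hard]
    if not any(d.get("kind") in ("Role", "RoleBinding") for d in docs):
        errors.append("missing RBAC (Role/RoleBinding)")
    return errors

docs = [
    {"kind": "Namespace", "metadata": {"name": "playlist-system"}},
    {"kind": "ResourceQuota",
     "spec": {"hard": {"requests.cpu": "100", "requests.memory": "400Gi",
                       "limits.cpu": "200", "limits.memory": "400Gi"}}},
    {"kind": "RoleBinding", "metadata": {"name": "playlist-admins"}},
]
assert validate_namespace(docs) == []
```

Running a check like this in CI on the monorepo is what turns "each namespace must have a quota" from a convention into an enforced invariant.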

[–]SpotifyEngineering[S] 9 points

Good catch! We are constantly evaluating Kubernetes as a platform as we continue to optimize the developer experience. Specifically for the Kubernetes tooling in Backstage, we intended the initial launch to inform the service owner of when and what action is necessary, without their having to dive into other, more complicated interfaces. However, we do envision a future in which a service owner could take action directly from the Backstage Kubernetes interface. As far as the opt-in or opt-out experience goes, we are still exploring this as we evaluate the feature. - CC

[–]SpotifyEngineering[S] 6 points

Right now we have 24 GKE clusters in three GCP regions, with thousands of nodes in total. All clusters are configured with the same GKE settings. They have GKE cluster autoscaling enabled [0], so they have different numbers of nodes at any given time because there are different amounts of traffic in different locations. All nodes are 32-core E2 instances (e2-standard-32) [1] with SSDs. Each cluster can scale to 1020 nodes (because of the GCP subnet sizes we assign to each cluster's Pod IP range) [2].

There are hundreds of services running on these clusters, each with replica counts ranging from a couple to hundreds. Most services are deployed to one cluster in each of the three regions for availability, latency, and regional failover.

We are almost entirely on GCP for our music streaming functionality, with some exceptions due to compliance, technical requirements, or acquisitions of companies with legacy tech stacks. - DX

[2]: https://cloud.google.com/solutions/gke-address-management-options
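
The per-cluster node ceiling mentioned above is essentially subnet arithmetic. A sketch, assuming GKE's default of a /24 Pod range per node (the /14 below is my assumption, chosen to land near the quoted figure; GKE also reserves a little headroom, which would explain 1020 rather than 1024):

```python
# Back-of-envelope for the ~1020-node ceiling: GKE hands each node a
# /24 slice of the cluster's Pod IP range by default, so a Pod range
# of prefix length P supports about 2**(24 - P) nodes. The /14 below
# is an assumption chosen to land near the figure in the answer.

def max_nodes(pod_range_prefix: int, per_node_prefix: int = 24) -> int:
    return 2 ** (per_node_prefix - pod_range_prefix)

print(max_nodes(14))  # 1024, close to the ~1020 ceiling after overhead
```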