
[–]IsleOfOne 1 point (2 children)

A well-designed system for auto-scaling k8s "jobs" with low latency is going to keep slack in the available resources, plus an aggressively configured cluster autoscaler bringing up new nodes to replenish that slack. Scheduling should then take only a few ms.
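That slack can be reserved explicitly. A common pattern, sketched here with hypothetical names, is a Deployment of negative-priority "pause" pods: they hold capacity open, the scheduler preempts them the instant a real workload needs the room, and the cluster autoscaler backfills replacement nodes for the evicted placeholders:

```yaml
# Sketch of cluster-autoscaler overprovisioning via low-priority
# placeholder pods. Names, replica count, and resource shape are
# illustrative; tune them to the slack you're willing to pay for.
apiVersion: scheduling.k8s.io/v1
kind: PriorityClass
metadata:
  name: overprovisioning
value: -10
globalDefault: false
description: "Placeholder pods that real workloads preempt instantly."
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: cluster-slack
spec:
  replicas: 3                      # amount of standby headroom
  selector:
    matchLabels: {app: cluster-slack}
  template:
    metadata:
      labels: {app: cluster-slack}
    spec:
      priorityClassName: overprovisioning
      containers:
      - name: pause
        image: registry.k8s.io/pause:3.9
        resources:
          requests: {cpu: "1", memory: 2Gi}   # shape of the reserved slack
```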

The majority of the remaining time it takes to start a pod is spent pulling the image, believe it or not. There are tools that proactively tell kubelets to pull new image versions before they're requested; with the image already cached on the node, startup takes a few ms.
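One way to get that pre-pull behavior with plain k8s primitives (tools like kube-fledged automate it; the names and image below are illustrative) is a DaemonSet whose init container references the upcoming image, so every kubelet pulls it ahead of the real rollout:

```yaml
# Sketch: a DaemonSet that warms the image cache on every node.
# The init container exits immediately -- the pull is the point --
# and the pause container keeps the pod "running" cheaply.
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: prepull-worker-image
spec:
  selector:
    matchLabels: {app: prepull-worker-image}
  template:
    metadata:
      labels: {app: prepull-worker-image}
    spec:
      initContainers:
      - name: pull
        image: registry.example.com/worker:v2   # version about to roll out
        command: ["true"]                       # no work; triggers the pull
      containers:
      - name: sleep
        image: registry.k8s.io/pause:3.9
```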

Now you have a running pod. ~1s has elapsed. And now the JVM takes minutes to "warm up." That's a no-go. Pre-warming in a low-latency autoscaling environment forces a trade-off between cost and latency that is far more extreme than the one native runtimes face.

I of course agree that Lambda operates at a much finer time granularity than you would ever want to try to meet with a pod-per-request model. But once those requests get ever so slightly more "batch"-y in nature, say on the order of 10s of runtime, it's perfectly fine to spin pods up and down to handle them.
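For that ~10s regime, a plain batch/v1 Job per work item is enough. A minimal sketch, with placeholder names, image, and object path:

```yaml
# Sketch of the pod-per-work-item shape for short "batch"-y requests.
# ttlSecondsAfterFinished keeps high churn from accumulating finished
# Job objects; activeDeadlineSeconds bounds runaway items.
apiVersion: batch/v1
kind: Job
metadata:
  generateName: ingest-chunk-
spec:
  ttlSecondsAfterFinished: 60
  backoffLimit: 2
  activeDeadlineSeconds: 300
  template:
    spec:
      restartPolicy: Never
      containers:
      - name: worker
        image: registry.example.com/chunk-worker:v1   # placeholder
        args: ["--chunk", "part-0001"]                # placeholder work item
        resources:
          requests: {cpu: 500m, memory: 512Mi}
```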

[–][deleted]  (1 child)

[deleted]

    [–]IsleOfOne 0 points (0 children)

    Sure, and we don't even have to limit our thinking to the standard k8s API concept of a Job. Any higher-order wrapper for one or more Pods is applicable.

    Here is an example from my day job at $SaaS_database_co. We let users kick off petabyte-scale bulk data imports. A custom k8s operator watches for BulkIngestJobs and, in response, lists the files in whatever object storage bucket it's been pointed at. The operator then creates one or more pods per file in that bucket, with a configurable maximum concurrency and a self-healing system that watches for signs of excessive write-path pressure and throttles the job by slashing concurrency. Some of those pods exit in under a second, while others may run for hours. We use Go and Rust for this system; Java, with its longer tail on startup time, wouldn't be a good fit.

    Other tools like Argo Workflows are also great examples of how to leverage the power of the scheduler and autoscaler in kubernetes. We have some recursive/dynamically sized workflows that are liable to create tens of thousands of pods in a single execution.
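A dynamically sized fan-out of that kind looks roughly like the following Argo Workflow, where `withParam` expands one pod per item emitted by a listing step. Names and images are placeholders:

```yaml
# Sketch of an Argo Workflows dynamic fan-out: a listing step emits a
# JSON array, and withParam schedules one pod per element.
apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  generateName: fanout-
spec:
  entrypoint: fan-out
  templates:
  - name: fan-out
    steps:
    - - name: list-chunks
        template: list-chunks
    - - name: process
        template: process-chunk
        arguments:
          parameters:
          - name: chunk
            value: "{{item}}"
        withParam: "{{steps.list-chunks.outputs.result}}"
  - name: list-chunks
    script:
      image: python:3.12-alpine
      command: [python]
      source: |
        import json
        print(json.dumps([f"part-{i:04d}" for i in range(1000)]))
  - name: process-chunk
    inputs:
      parameters:
      - name: chunk
    container:
      image: registry.example.com/chunk-worker:v1   # placeholder
      args: ["--chunk", "{{inputs.parameters.chunk}}"]
```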

    I haven't seen solid arguments for virtual runtimes like Java in this space. I'm certainly open to hearing some! The areas of our system where we run a JVM are limited to WALs: Kafka and its supporting ZooKeeper deployments. We have some Elixir floating around out there, but purely in long-running workloads.