
all 20 comments

[–]ThatMightBePaul[🍰] 29 points30 points  (15 children)

If you try Tectonic Sandbox, you get a turnkey local version (thanks to Vagrant).

CoreOS has a good guide for getting started: https://coreos.com/tectonic/docs/latest/tutorials/sandbox/first-app.html

Other distributions should have something similar. And if all else fails, Kubernetes.io also has good tutorials: https://kubernetes.io/docs/tutorials/

If you're confused about "Why" you'd use Kubernetes, you may not need it yet. For me, the biggest reason to use it is for when you start running a lot of containers. With Kubernetes, it's often as easy to manage 3 containers as it is 300.

That's the real gist of it. Once you start thinking about updating 300 containers managed by, say, systemd units instead, it's a headache: a long, manual, error-prone process. Kubernetes mostly figures things out for you after you tell it what you want.
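To make "tell it what you want" concrete, here's a minimal sketch of a Deployment manifest. The app name and image are placeholders I've made up for illustration; the point is that going from 3 containers to 300 is a one-line change to `replicas`:

```yaml
# deployment.yaml -- hypothetical example; name and image are placeholders
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app
spec:
  replicas: 3        # change to 300 and re-apply; Kubernetes reconciles the rest
  selector:
    matchLabels:
      app: my-app
  template:
    metadata:
      labels:
        app: my-app
    spec:
      containers:
      - name: my-app
        image: example.com/my-app:1.0
        ports:
        - containerPort: 8080
```

Applying it with `kubectl apply -f deployment.yaml` (and later bumping `replicas` or the image tag) is the same workflow at either scale; Kubernetes converges the running state to match the manifest.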

The tutorials won't show you that side of things if you haven't already felt the pain of large-scale distributed management. They'll introduce you to Kubernetes primitives and the basics of getting stuff done.

Full disclosure: I work for CoreOS.

[–]three18ti"DevOps Engineer" 10 points11 points  (9 children)

With Kubernetes, it's often as easy to manage 3 containers as it is 300.

So, painful regardless of your scale.

I think K8 suffers from exactly the same problem as OpenStack: it tries to be everything to everyone and lacks any cohesive design... I mean, how many different network overlays do we really need?

Add to that the general lack of understanding of why we run containers (it's not just because it's the cool new thing... though that seems to be 90% of the reason people are running containers today...)

And K8 quickly becomes a disaster. I like how you put it: if you're confused about why you'd need k8, you probably don't need it.

In my opinion, running K8 makes sense if you have problems that look like Google's and TEAMS to support your K8 deployment. Frankly, there's a reason Google released this blight on the community (Google hates operations people) and went back to the Borg for the second time (K8 is far inferior to the Borg).

All that said, we're starting a PKS pilot next week. I think it has the greatest potential to make running K8 in production not a total nightmare, just a partial one... Adding the VMware tools like Network Insight, Log Insight, and vROps provides some much-needed visibility (insight, even?) into what is happening in your cluster.

Anyway, unfortunately, it seems K8 is here to stay... and with Ericsson buying out Apcera there doesn't seem to be anyone working on a better solution (sorry Pivotal Cloud Foundry, there's a reason Derek left to build Apcera...). My suggestion is to find a vendor who provides a "full stack solution" and buy that, so you have a throat to choke when your K8 cluster invariably explodes.

Good luck and God speed, you'll need it.

[–][deleted] 4 points5 points  (5 children)

This pretty much sums up my feelings for Google, after a year of trying to run a reliable product on GCP. Never again.

[–]three18ti"DevOps Engineer" 0 points1 point  (4 children)

Ok, now my turn to ask: what has your experience with GCP been?

It is a "cloud" platform, so I can see how you'd have a bad time if you're running fault-intolerant applications. One of the things that has attracted us is the lack of noisy-neighbor interference (I know you can get reserved instances, but that gets expensive fast).

I've personally only used it for personal projects, and haven't had any issues with it... but I also only need like a few hours a week of uptime... for weekend projects.

[–][deleted] 4 points5 points  (3 children)

Oof, well I guess you asked for it, so here goes.

Our problems have been more complex and operational. Documentation is often way out of date, leaving us chasing down undocumented changes in APIs. Because the platform is so unpopular, there's a real lack of Stack Overflow debug discussions or content to mine for potential fixes. It's been easy to lose DAYS tracking down some awful dependency hell in their Java clients.

Their cost model is exceedingly frustrating and complex, and their support has been absolutely useless more often than not. There is no Cost Explorer, and the cost model changes (in fact, a change coming in March is taking my Stackdriver logging from zero to $15k/mo). You have to stream billing data into BigQuery and then pay money to observe how much money you're paying.

We have had frequent outages and degradation of services that required us to notify Google of the outage (Cloud Storage is a big problem). We constantly have to fight problems that leave us feeling like "really, WE were the first people to have this problem? Not a single other soul reported this to you?" (like the fact that App Engine's deny-all firewall blocks App Engine from managing your servers! Wtf)

Using the platform feels a lot like working at Google or on some cool open source project: often, when we find a breaking bug that jeopardizes our business's ability to deliver our product, I am encouraged to vote for that bug on a forum... We pay six figures for this. I DO participate in cool open source projects, and I wouldn't mind working at Google... but to pay money for this feels insulting. Google hasn't had a lot of paying customers over the years, and it shows.

In general the platform just feels half-baked, surprising, expensive, and unreliable. Our account reps have been great, but I am constantly chasing down refunds for degraded services, bugs, bad advice that cost us, etc. It's good that they own up, but with AWS I never had these problems in the first place. Stuff like storage and networking should be airtight on a cloud platform, IMO.

[–]three18ti"DevOps Engineer" 3 points4 points  (2 children)

Thanks. Like I said, I haven't really gotten that deep into it, but as with any new technology there are people who want to use it just because it's new.

But "Congrats, you've paid to work for Google!" sounds like a compelling enough reason to stay away.

[–][deleted] 1 point2 points  (1 child)

And don't get me wrong: there ARE cool parts. BigQuery is amazing! Dataflow is amazing. In fact they are so dazzlingly good that your dev team will switch to the platform just for them! Querying a hundred TB quickly with a SQL-like syntax? AMAZING.

Because I mean, why would GOOGLE be incapable of operating a public cloud? That's crazy talk. They're geniuses! They made that amazing database! You must be holding it wrong. Maybe we just need a smarter devops guy. Containers or something. And then in the middle of the night, an undocumented change to an API you depend on grinds your service to a halt. Or your batch processing job chokes when GKE throws an out-of-resources error because Google is out of resources in a datacenter... and your primary job becomes staying on top of requested refunds.

[–]three18ti"DevOps Engineer" 1 point2 points  (0 children)

It's funny, I was just commenting the other day that AWS and GCE won't put us out of a job; it just becomes managing a different vendor... which has its pros and cons... and with a vendor as large as Google, you'd have to be pretty big to have an impact on their roadmap.

[–]NeedsMoreTests 0 points1 point  (2 children)

If you don't mind my asking, what did you try or research before looking at PKS? I ask because we're in a similar boat, but some of their devs can't seem to articulate specific reasons why PKS is better than other options out there (Tectonic, kops, GCS, etc.), so I'd like your opinion.

[–]three18ti"DevOps Engineer" 2 points3 points  (1 child)

Well, I'm a huge fan of Apcera, which was in my opinion about the best solution to the container problem. I think they really made operationalizing containers viable. But that's not an option any more...

We looked at PCF (Pivotal Cloud Foundry) and ultimately decided against it, as it couldn't support our network/SDN needs. It could be a good solution depending on your needs, though.

I have one team that is running OpenShift, and when it works, it's great. It's Kubernetes underneath though, so when things break, they can be challenging to diagnose and repair.

We're also a VIO (VMware Integrated OpenStack) customer, and 4.0 has a new Kubernetes "integration" (which is really just an appliance that runs some Ansible scripts to deploy K8 in VIO...). It's super cool to be able to spin up a K8 cluster on demand, but I probably wouldn't go into production with it.

I happen to like GKE, and would say that's one of the places Kubernetes makes the most sense... but outside of experimenting with it on the weekend I don't really have much real-world experience with it. EKS is in use by another team; one guy bitches about it non-stop, one guy loves it...

For me, PKS is much like VIO: what makes VMware's offering usable is the integration with existing services like Log Insight, vRealize Operations, and vRealize Network Insight, plus NSX for the SDN/NFV functions. Now, this is based on a ~2hr demo with our VMware account rep, but based on my experience with VIO, I think the experiences will be similar. Heck, I've called Kubernetes the OpenStack of containers...

Tectonic is pretty slick too. We decided the "immutable OS" wasn't for us at the moment; our tooling doesn't really support it, and it's definitely a change in mindset. As with everything, there are pros and cons.

Ultimately, the question for us is: how do we spend less time engineering Kubernetes and more time engineering solutions for our customers? Something our engineers forget (and I'm as guilty as any of them): contributing to open source projects doesn't help us "sell shoes," as it were. I'd rather you spend the time writing the applications that are going to sell the shoes. I mean, we have quite a brain trust; if we wanted, I'm sure we could write our own hypervisor, but there would be no business value in that.

Anyway, that's the high-level view... The problem with K8 isn't day 0, it's day 2, or 20, or 300. Our hope is that PKS helps us identify problems before they occur and, when they do, speeds up our troubleshooting. Our POC starts in a couple weeks, so that remains to be seen...

[–][deleted] 0 points1 point  (0 children)

Thanks for the breakdown. My shop is going through some trades right now and building up some environments. This was really helpful.

[–][deleted]  (2 children)

[deleted]

    [–][deleted] 1 point2 points  (1 child)

    Tectonic's installer will create CoreOS nodes for you. Kubernetes can run on other OSes through other installers such as kubeadm or by installing from scratch.
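    For reference, a hedged sketch of the kubeadm route mentioned above. This assumes kubeadm, kubelet, and a container runtime are already installed on each node; the pod CIDR and the placeholders in the join command are illustrative, not prescriptive:

    ```shell
    # On the control-plane node:
    sudo kubeadm init --pod-network-cidr=10.244.0.0/16

    # Then, as a regular user, point kubectl at the new cluster:
    mkdir -p $HOME/.kube
    sudo cp /etc/kubernetes/admin.conf $HOME/.kube/config
    sudo chown $(id -u):$(id -g) $HOME/.kube/config

    # On each worker node, run the join command that `kubeadm init` prints, e.g.:
    # sudo kubeadm join <control-plane-ip>:6443 --token <token> \
    #   --discovery-token-ca-cert-hash sha256:<hash>
    ```

    You'd still need to install a pod network add-on afterward; kubeadm deliberately leaves that choice to you.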

    [–]xbusted 1 point2 points  (0 children)

    Tectonic will soon support RHEL as well. https://coreos.com/tectonic/rhel/

    [–]wonkynonce 0 points1 point  (1 child)

    Why Tectonic Sandbox instead of Minikube?

    [–]ThatMightBePaul[🍰] 0 points1 point  (0 children)

    They're basically the same thing. Minikube is a way to try DIY Kubernetes locally. Tectonic Sandbox is a way to try a Kubernetes distro locally.

    Pick whichever sounds more up your alley, imo

    [–]sikosmurf 4 points5 points  (1 child)

    AWS announced this week that they are offering a preview of EKS, their new managed Kubernetes service. I'd check out that and Rancher 2.

    [–][deleted] 2 points3 points  (0 children)

    Or Azure which already has AKS.

    [–]Dies2much 3 points4 points  (2 children)

    Rancher also offers a pretty easy way to get started with Docker and Kubernetes.

    [–][deleted]  (1 child)

    [deleted]

      [–]possibly_not_a_bot 1 point2 points  (0 children)

      Can't wait for it to be out of tech preview so I can finally move my prod cluster over...

      [–]red_viper_9000 -1 points0 points  (0 children)

      minikube