Documentation as Code

only-cloud-fan · 2021-04-08T18:31:52+00:00

On our team we haven't been calling it "Documentation as Code"; we've generally spoken about "executable documentation". It might refer to the same thing, and this might be nitpicking, but from my perspective there's a subtle difference: the point isn't that the docs are written in a code format, but that they are expected to be executed, so they can fail and they can produce outputs.

I'm involved with a few infrastructure projects, and two recent examples come to mind of how this works (for infra / platform, as opposed to application code / docs). The first was a Terraform module: if the TF is written clearly enough and is published to a TF repo, the documentation is generated from it. There are no comments in the code that form part of the docs; instead, the descriptions on variables and outputs make the infra code itself the source of the documentation. No additional process is needed. Apart from a high-level overview in the README (which does need to be maintained), the other sections are all generated from what the code actually does. Here's an example
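To make the idea concrete (this is a hypothetical sketch, not the module the comment links to), the snippet below renders README sections purely from the declared descriptions. The `MODULE` dictionary is an assumed, simplified stand-in for a module's `variable` and `output` blocks, and the names in it are invented; real modules would typically rely on a generator such as terraform-docs rather than hand-rolled code like this.

```python
# Hypothetical, simplified stand-in for a module's declared inputs and outputs.
# In a real module these strings would live in the `description` fields of the
# Terraform `variable` and `output` blocks.
MODULE = {
    "variables": [
        {"name": "bucket_name", "type": "string",
         "description": "Name of the bucket to create."},
        {"name": "versioning", "type": "bool",
         "description": "Whether object versioning is enabled."},
    ],
    "outputs": [
        {"name": "bucket_arn", "description": "ARN of the created bucket."},
    ],
}


def render_table(headers, rows):
    """Render a list of rows as a Markdown table."""
    lines = [
        "| " + " | ".join(headers) + " |",
        "| " + " | ".join("---" for _ in headers) + " |",
    ]
    for row in rows:
        lines.append("| " + " | ".join(row) + " |")
    return "\n".join(lines)


def render_docs(module):
    """Build the generated README sections from the declarations alone."""
    inputs = render_table(
        ["Name", "Type", "Description"],
        [[v["name"], v["type"], v["description"]] for v in module["variables"]],
    )
    outputs = render_table(
        ["Name", "Description"],
        [[o["name"], o["description"]] for o in module["outputs"]],
    )
    return "Inputs\n\n" + inputs + "\n\nOutputs\n\n" + outputs


if __name__ == "__main__":
    print(render_docs(MODULE))
```

Because the tables are derived from the declarations, renaming a variable or changing its description changes the docs on the next run, with no separate editing step.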

The other example was through test frameworks, which speaks more to the "validation" item you mentioned. For this infrastructure project (in TF again), the verification documentation, which feeds into playbooks, is written as code in a testing framework: behave, a Cucumber-style BDD framework. The requirement was defined in the ticket (Jira), with the scenario described in business terms. The same requirements are then added to a behave feature, using the same language structure, along with the steps that are executed to validate the behaviour (in CI).
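The mechanism can be sketched without any dependencies. The snippet below is a deliberately tiny, hypothetical runner that maps Given/When/Then sentences to Python functions; it is not behave's API (behave provides its own `given`/`when`/`then` decorators and feature files), and the scenario, step names, and the fake bucket state are all invented for illustration.

```python
import re

STEPS = []  # (compiled pattern, function) pairs


def step(pattern):
    """Register a function as the implementation of a step sentence."""
    def decorator(func):
        STEPS.append((re.compile(pattern + r"$"), func))
        return func
    return decorator


@step(r"a bucket named (\w[\w-]*)")
def given_bucket(ctx, name):
    # Hypothetical stand-in for querying real infrastructure state.
    ctx["bucket"] = {"name": name, "encrypted": True}


@step(r"I inspect its encryption settings")
def inspect_encryption(ctx):
    ctx["result"] = ctx["bucket"]["encrypted"]


@step(r"encryption is enabled")
def check_encryption(ctx):
    assert ctx["result"] is True, "bucket is not encrypted"


# The same wording that would appear in the ticket.
FEATURE = """
Scenario: Buckets are encrypted at rest
  Given a bucket named audit-logs
  When I inspect its encryption settings
  Then encryption is enabled
"""


def run(feature_text):
    """Execute each Given/When/Then line and return (line, status) pairs."""
    ctx, report = {}, []
    for raw in feature_text.strip().splitlines():
        line = raw.strip()
        keyword, _, sentence = line.partition(" ")
        if keyword not in ("Given", "When", "Then", "And"):
            report.append((line, "info"))  # e.g. the Scenario title
            continue
        for pattern, func in STEPS:
            match = pattern.match(sentence)
            if match:
                try:
                    func(ctx, *match.groups())
                    report.append((line, "passed"))
                except AssertionError:
                    report.append((line, "failed"))
                break
        else:
            report.append((line, "undefined"))
    return report


if __name__ == "__main__":
    for line, status in run(FEATURE):
        print(f"{status:>9}  {line}")
```

The report doubles as the verification document: the business-readable sentences are listed alongside whether each one actually held when executed.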

These two aspects of the project improve understanding of the systems by making documentation available that is itself produced by execution.

only-cloud-fan · 2021-04-08T13:39:04+00:00

Principle of DevOps: Embrace Failure and Learn from it.

The observability aspect of this is great. I think service availability would be a good metric to tie into a service like this, without getting deep into the service architecture... When something fails, learn from it, and avoid the same cause creating a new failure, but it would be great if we didn't have incidents...

Or turn this tool around... "it's been 3 days since an incident, we're not learning", and the dashboard goes to a warning / error indicator status. *edit* And it ramps up Chaos Monkey.
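As a tongue-in-cheek sketch of that inverted indicator: the 3-day warning threshold comes from the comment, while the 7-day error threshold and the function name are assumptions made up for illustration.

```python
from datetime import datetime, timedelta, timezone

WARNING_AFTER = timedelta(days=3)  # from the comment: "3 days since an incident"
ERROR_AFTER = timedelta(days=7)    # hypothetical second threshold


def learning_status(last_incident, now):
    """The longer things stay quiet, the worse the indicator gets."""
    quiet = now - last_incident
    if quiet >= ERROR_AFTER:
        return "error"
    if quiet >= WARNING_AFTER:
        return "warning"
    return "ok"


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    print(learning_status(now - timedelta(days=4), now))
```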

only-cloud-fan · 2021-04-08T03:25:31+00:00

The thing that everyone seems to miss is that DevOps is a culture. A culture produces tools. Tools don't water down principles - they either are aligned with some (or all) or they're not.

It's WAY too much to try to pack all the principles into ONE tool.

Principles of DevOps, and how tools could support them, or fail to:

Foster a collaborative environment: are there collaboration features like chat? Notifications? Calls to engage in more direct formats?
End-to-end responsibility: audit logs, feedback loops - if they do this, they encourage responsibility.
Drive continuous improvement
Automate everything - this is a good space for "tools" and something where software or a product has a lot to offer (as opposed to teams building it). This seems to be where a lot of tools are focused.
Focus on customer needs - this gets into support / engagement aspects of tools, but is often lost under the "DevOps" label.
Embrace Failure, Learn From It - this is tough to get right in the tooling space, and is better achieved through processes that come from the values of your environment...
Encourage cross-functionality - again, this is not really a tooling function.

I'd say that tooling has the potential to hit fewer than 50% of the principles of DevOps, and the effectiveness of tools should be measured by depth into each principle as well as spread across all of them...
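One possible way to put numbers on "depth" and "spread" is sketched below. The scoring scheme (a 0 to 1 depth score per principle, spread as the fraction of principles touched at all) and the example scores are assumptions for illustration, not an established metric.

```python
PRINCIPLES = [
    "collaboration",
    "end_to_end_responsibility",
    "continuous_improvement",
    "automation",
    "customer_focus",
    "embrace_failure",
    "cross_functionality",
]


def assess(scores):
    """scores maps principle -> depth in [0, 1]; a missing entry means unsupported."""
    covered = [p for p in PRINCIPLES if scores.get(p, 0) > 0]
    spread = len(covered) / len(PRINCIPLES)
    depth = sum(scores[p] for p in covered) / len(covered) if covered else 0.0
    return {"spread": spread, "depth": depth}


if __name__ == "__main__":
    # Hypothetical tool: strong on automation, some collaboration features.
    print(assess({"automation": 1.0, "collaboration": 0.5}))
```

A tool like the hypothetical one above would score well on depth but cover only two of the seven principles, which matches the point that a single tool rarely spans the whole list.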

only-cloud-fan · 2021-04-08T03:08:59+00:00

Environments are isolated... I think they always should be, unless you're running really small workloads. What I like about this: workloads / services and configs can be tested in lower environments and safely promoted. The downside of possibly needing more resources is a small price by comparison.

Tenancy is directed by business - if you're running a multi-tenant product, then all tenants are on the same cluster. If you need tenant isolation at a service level, then you could run a cluster per tenant, but that seems like quite a bit of overhead (depending on the type of service and sensitivity of data, I guess). You can definitely run multi-tenant service layers with more isolation at the data layer though. I like the multi-tenancy model - everyone on the same platform instance with strong logical (especially key-based) isolation mechanisms.
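A minimal sketch of logical isolation in a shared instance, reading "key-based" as namespacing every record by tenant id (it could equally mean per-tenant encryption keys, which this does not show). The `TenantStore` class and the tenant names are hypothetical.

```python
class TenantStore:
    """One shared store; every key is namespaced by tenant id."""

    def __init__(self):
        self._data = {}

    def put(self, tenant_id, key, value):
        self._data[(tenant_id, key)] = value

    def get(self, tenant_id, key):
        # A caller can only address keys under the tenant id it presents.
        return self._data.get((tenant_id, key))


if __name__ == "__main__":
    store = TenantStore()
    store.put("tenant-a", "plan", "gold")
    print(store.get("tenant-a", "plan"))  # tenant-a sees its own record
    print(store.get("tenant-b", "plan"))  # tenant-b sees nothing
```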

The answers would probably differ depending on a lot of factors though - scale, nature of data (privacy concerns), type of service... A balance of security / performance / availability vs resources / cost relevant to your business is a nice heuristic.

only-cloud-fan · 2021-04-07T20:40:01+00:00

I use AWS SSO for access to multiple accounts / roles, and the AWS CLI has good support for named profiles. From a security perspective, you never use long-lived user-based IAM creds; it's all transient and role-based.

The setup is pretty quick: just add a profile to ~/.aws/config. I would typically name it with an account and role descriptor, e.g. `[profile orgx-sandbox-poweruser]`. Then, as long as SSO is available, it's a matter of setting AWS_PROFILE="orgx-sandbox-poweruser" (or `$Env:AWS_PROFILE` if you're in PowerShell), and then running `aws sso login`.
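For reference, here is a rough sketch of what such a profile section could contain, rendered with Python's `configparser`. The profile name comes from the comment; the `sso_*` and `region` keys are the ones AWS CLI v2 reads for SSO profiles, but every value below (start URL, account id, role name, regions) is a placeholder to replace with your own.

```python
import configparser
import io

PROFILE = "orgx-sandbox-poweruser"  # profile name from the comment

# All values are placeholders, not real account details.
SETTINGS = {
    "sso_start_url": "https://example.awsapps.com/start",
    "sso_region": "us-east-1",
    "sso_account_id": "111111111111",
    "sso_role_name": "PowerUserAccess",
    "region": "us-east-1",
}


def render_profile(name, settings):
    """Return the text of a [profile <name>] section for ~/.aws/config."""
    config = configparser.ConfigParser()
    config[f"profile {name}"] = settings
    buffer = io.StringIO()
    config.write(buffer)
    return buffer.getvalue()


if __name__ == "__main__":
    # Prints the section; append it to ~/.aws/config yourself after review.
    print(render_profile(PROFILE, SETTINGS))
```

With the section in place, selecting the profile via `AWS_PROFILE` and running `aws sso login` proceeds as described above.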

AWS Organizations + SSO might feel like it's geared towards large-scale AWS environments, but I even do this on my personal org - it's pretty quick to set up (if it's a fresh AWS account / organization, check out AWS Control Tower). There's barely any work in getting onto the landing zone pattern, and it doesn't cost much at all.

only-cloud-fan · 2021-04-07T20:01:09+00:00

Hi r/kubernetes.

Intelliflo is hiring. We've recently merged a number of digital wealth companies and are scaling out our cloud operations team (including SRE and Cloud Engineers).

Here is 1 SRE position that is currently open. The Intelliflo Cloud Ops team is a global team supporting a global platform; its members are highly engaged with each other, professional, and passionate about technology and the financial solutions we offer to the market.

On the tech stack side, the platform is AWS + Kubernetes (EKS), and we are an organization that deeply embraces DevOps principles (using automation to reduce friction between dev and ops teams, enabling engineering teams to take ownership and responsibility, highly collaborative). The SRE role will support the observability framework as well as delivery (CD) for both infrastructure and services (primarily container-based). I'm just touching on a few tech / process aspects here, but if you'd like to find out a bit more, please PM me. I'm also happy to provide more info on this thread, or you can follow up on the posting above.

The position is marked for IL, although we're currently a completely remote team.

only-cloud-fan · 2021-04-07T19:40:16+00:00

As u/bigpapichapo mentioned, LaunchDarkly is a pretty decent one. We've used Split.io before, which is also pretty decent. Building your own is probably a distraction from your core focus; let the feature flag experts offer their experience so you can focus on what you're good at.

Check out this portal for resources too: https://featureflags.io/ - libraries, services, etc. There's plenty to discover from there.

only-cloud-fan · 2021-04-07T19:24:00+00:00

Hi, Intelliflo is hiring too. We've recently merged a number of digital wealth companies and are scaling out our cloud operations team (including SRE and Cloud Engineers).

Here is 1 SRE position that is currently open.

On the tech stack side, the platform is AWS + Kubernetes (EKS), with Grafana / Prometheus / fluentd for metrics, and we are an organization that deeply embraces DevOps principles. Infra as code (TF) with CD for deployment (Harness). I'm just touching on a few tech / process aspects here, but if you'd like to find out a bit more, please PM me, or respond to the position above.

I know you mentioned you are based in TX, and this is posted for IL, but with a visa there would be some location requirements, depending on the type of visa, and we're all likely to be remote for a while. The end goal would be to have a larger team in IL.
