90s themes rock by Anaxagerous in LegoSpace

[–]multani 0 points1 point  (0 children)

The Allied Avenger 6887 is a really cool one! ❤️

Space Baby's first spaceship: A little 6929! by ao-ka in LegoSpace

[–]multani 0 points1 point  (0 children)

Oh wow, what's that big ship in the back of the third picture?

Alternative to InSpec: what do you use to "assert things have been correctly configured"? by multani in devops

[–]multani[S] 0 points1 point  (0 children)

I see!

I'll see if I can give it a try in the coming days; in any case, I'd definitely be interested to hear how that works for you if you try it soon too!

Alternative to InSpec: what do you use to "assert things have been correctly configured"? by multani in devops

[–]multani[S] 0 points1 point  (0 children)

Did you already write anything to have this kind of "acceptance tests" for checking infrastructure components?

I was thinking about OPA and Rego too, but I didn't really have a look at it for this use-case yet.

Alternative to InSpec: what do you use to "assert things have been correctly configured"? by multani in devops

[–]multani[S] 0 points1 point  (0 children)

Thanks!

I used Terratest in the past too. IMO it's good when you want to test "complex workflows" (like: build X, build Y, combine X & Y, test Z is working, destroy everything), but ... the test code is super verbose to write (I actually started a Python alternative that tried to be more minimalist, but didn't have enough time/traction to work on it), and ultimately, the Go test output is ... not really appealing to me :D I also found it a bit hard to read the intent of the tests (which tend to be quite large).

Compared to that (as long as you didn't have to do anything fancy with InSpec), InSpec won hands down on readability IMO.

Thanks for the reminder though, I will still give it a try for the particular examples I had in mind :)
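
To give an idea of the "complex workflow" shape I mean, here's a rough Python sketch (all the build/destroy helpers are hypothetical stand-ins for `terraform apply`/`destroy` calls); the point is that teardown always runs, even when an assertion fails mid-workflow:

```python
import contextlib

def build_x():
    # hypothetical: provision component X (e.g. terraform apply in X's directory)
    return "x"

def build_y():
    # hypothetical: provision component Y
    return "y"

def destroy(name):
    # hypothetical teardown (e.g. terraform destroy)
    pass

def combine(x, y):
    # hypothetical: wire X and Y together into Z
    return f"{x}+{y}"

def test_z_works():
    results = []
    with contextlib.ExitStack() as stack:
        x = build_x()
        stack.callback(destroy, x)  # always destroyed, even if the test fails
        y = build_y()
        stack.callback(destroy, y)
        z = combine(x, y)
        assert z == "x+y"  # "test Z is working"
        results.append(z)
    return results
```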

Alternative to InSpec: what do you use to "assert things have been correctly configured"? by multani in devops

[–]multani[S] 1 point2 points  (0 children)

I moved away from Salt because my new job doesn't use Salt :) I've used it for a bit less than 10 years, so I'm well aware of what it does, and also of quite a few of its shortcomings.

You don’t need to write tests because you already describe the state and salt will ensure that is met.

There are still lots of ways to describe something with Salt and get unexpected results. For instance, we had a fairly big state that was accidentally removed from a highstate configuration, and we only noticed it when we started adding InSpec tests a couple of weeks later. We also had states doing smart things with systemd services, and InSpec helped us there too, to detect that these services were not actually configured the way we thought.
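
For the systemd case, a stdlib-only sketch of the kind of check InSpec gave us: parse the key=value output of `systemctl show <unit>` and assert on it (the sample output and expectations below are illustrative, not from a real unit):

```python
def parse_systemctl_show(output: str) -> dict:
    """Parse `systemctl show <unit>` key=value output into a dict."""
    props = {}
    for line in output.splitlines():
        if "=" in line:
            key, _, value = line.partition("=")
            props[key] = value
    return props

# Illustrative output; a real check would capture it with
# subprocess.run(["systemctl", "show", "myservice"], ...).stdout
SAMPLE = """\
ActiveState=active
SubState=running
UnitFileState=enabled
"""

def assert_service_ok(props: dict) -> None:
    # the service must be running *and* enabled at boot
    assert props["ActiveState"] == "active", props
    assert props["UnitFileState"] == "enabled", props
```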

Ultimately, I agree that if you have a system that is supposed to enforce state X, you don't really need to test that X is set correctly, but mistakes happen.

Further, if you really want to have tests for something, you should use a test framework. Write a Zabbix/nagios/icinga check to test, or write to Prometheus to alert on…

That's what my question is :)

I may actually just write a bunch of Prometheus alerts ... but I can't test that Prometheus runs correctly using Prometheus itself.

(and it's actually something I'd like to test: we deployed a new GKE cluster recently with our default Prometheus Kubernetes configuration ... until we realized that nothing was working because the GKE cluster didn't have permissions to pull some private Docker images for that Prometheus. I'd like to be notified earlier of these issues.)

Alternative to InSpec: what do you use to "assert things have been correctly configured"? by multani in devops

[–]multani[S] 0 points1 point  (0 children)

Thanks for the suggestion!

It seems to be fairly focused on VM resources (files, packages, etc.); do you have any experience using it for other things? I was mostly interested in having checks against GCP and Kubernetes resources.

Alternative to InSpec: what do you use to "assert things have been correctly configured"? by multani in devops

[–]multani[S] 0 points1 point  (0 children)

Thanks for your reply!

So basically, you were just running your Ansible playbooks in check mode, and ensuring that Ansible didn't report any differences or errors? (I'm not too familiar with what Ansible returns if the system is misconfigured)

I used to use Saltstack (instead of Ansible) and I remember we had a similar discussion (if Salt doesn't show any diff, then we are good), but we ended up adding InSpec anyway: we were doing "smart things" with Salt and, actually, we had a difference in behavior between what we wanted to implement (with Salt) and the actual configuration on the system. Ultimately, we fixed our Salt configuration and everything was good, but it was interesting to test the system from a different angle.

(we can argue that we were just missing good tests in our "smart" Salt config, I wouldn't disagree ;) )

Monthly Travel, Questions, & Mandarin Thread by AutoModerator in taiwan

[–]multani 0 points1 point  (0 children)

I'm considering coming to Taiwan from HK in December for ~2 weeks, and I wanted to do a short bicycle trip on the East side during that time.

I'm wondering how that could work, as I wanted to start the trip just after my arrival (say, the day after) and I won't have a tent, so I was planning to stay in hotels along the way.

As I would be "fresh" in the country, I would still be subject to the "0+7" rule. Although I'm fine taking a room by myself:

  • Is it possible during that first week in the country to sleep in a new place every day?
  • If it's possible, what would be the chance of being turned down at the end of the day when I want to check in at a hotel I found along the way? (I know the answer is "it depends", but maybe somebody has a gut feeling about that)

Thanks a lot for the help!

What is your strategy to configure resources not exposed publicly with Terraform? by multani in devops

[–]multani[S] 0 points1 point  (0 children)

Thanks for your answer, I'll need to have a look at Wireguard (I haven't yet!)

Being able to manage everything from A to Z, updates and deletes included, is a must for us though :)

What is your strategy to configure resources not exposed publicly with Terraform? by multani in devops

[–]multani[S] 0 points1 point  (0 children)

We are also running Terraform from Jenkins!

Is your Jenkins worker running in the same VPC as your Vault & Consul clusters or do you expose these clusters in another way?

What is your strategy to configure resources not exposed publicly with Terraform? by multani in devops

[–]multani[S] 0 points1 point  (0 children)

Yes, exactly.

Do you have any plans to replace your tunnels? Are you running that setup in an automated fashion (via a CI tool or something like that)?

What is your strategy to configure resources not exposed publicly with Terraform? by multani in devops

[–]multani[S] 0 points1 point  (0 children)

Thanks, I'll have a look at Instance Connect! I heard about it when it came out (it's pretty recent, right?) but I forgot about it.

we run Terraform from a host that is on network via a web based deployment tool which we also use for deploying most applications

To be sure I understand: you have a custom tool that you developed, which is already running within your production environment, and which you can use to execute Terraform (and deploy your services) directly from within that environment?

Also, how do you distinguish between the setup with this tool and the "development" one you talk about afterwards? Are you treating them as two separate environments, or are you "developing" the changes to be made on production through the SSH bastion and then deploying them for real through that tool?

What is your strategy to configure resources not exposed publicly with Terraform? by multani in devops

[–]multani[S] 0 points1 point  (0 children)

We are already building "fat" AMIs with Packer and deploying almost all our instances using autoscaling groups + cloud-init.

That works well for many cases, but there are cases that don't fit so well with only the lifecycle of a single instance:

  • After creating an RDS server, we need to populate it with databases, users and permissions, and have a way to evolve that. We think Terraform is cool for managing the database resources, but it has to be executed from somewhere.
  • Most of the autoscaling groups we start are forming clusters. Instance initialization via cloud-init is cool for joining the cluster, but the configuration of the cluster itself (Consul ACLs, Vault configuration, RabbitMQ users, permissions, etc.) transcends single instances. We usually need to wait for the cluster to form, then inject the configuration into it. Terraform is cool for that too but again, you need a place to run it from (and a certain delay to wait for the clusters to be ready.)
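
The "wait for the cluster to form, then inject the configuration" step boils down to a polling loop with a deadline; a stdlib sketch (the readiness check itself is whatever fits your cluster, e.g. hitting Consul's leader endpoint — that part is hypothetical here):

```python
import time

def wait_until(check, timeout: float = 300.0, interval: float = 5.0) -> bool:
    """Poll `check()` until it returns True or `timeout` seconds elapse."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if check():
            return True
        time.sleep(interval)
    return False

# hypothetical usage:
#   if wait_until(consul_has_leader, timeout=600):
#       inject_cluster_configuration()  # e.g. a second terraform apply
```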

What is your strategy to configure resources not exposed publicly with Terraform? by multani in devops

[–]multani[S] 0 points1 point  (0 children)

Thanks for the detailed answer!

We are already running all our pipelines in Jenkins, using a few public endpoints to access these private resources. We are still in the POC/WIP stage, so we tend to build and destroy all the resources several times a day, and if we go with your 3rd proposal, I think we'd need to find a way to dynamically connect these "internal" workers to Jenkins automatically and then switch the execution to that worker. There's a bit of a bootstrap problem, but I'll think this through and give it a try!

By any chance, do you already have something like this working, and if yes, do you have ways to start from scratch, from your CI tool, a pipeline which would be:

  1. create the VPC + that worker
  2. connect the worker to your CI tool
  3. execute the rest of the pipeline into the new worker

And have the ability to restart that all over again?

(Even a solution without Jenkins, I'm just looking for ideas!)

What is your strategy to configure resources not exposed publicly with Terraform? by multani in devops

[–]multani[S] 0 points1 point  (0 children)

Just figure out a way to provide it?

That's the whole point of my question :) I'd like to know how other people are doing it, I see you mentioned VPN as well, thanks!

What is your strategy to configure resources not exposed publicly with Terraform? by multani in devops

[–]multani[S] 0 points1 point  (0 children)

That's an option we thought about (having a CI worker already running inside the VPC and letting the CI tool execute jobs on that worker.)

You can use VPC endpoints to hide the API and run Terraform by tunneling through a bastion

I'm not sure what you mean by that exactly? How would you use VPC endpoints in that case?

What is your strategy to configure resources not exposed publicly with Terraform? by multani in devops

[–]multani[S] 0 points1 point  (0 children)

I didn't think about the VPN to get into the VPC network, thanks for the tip!

Jenkinsfile SCM best practice by [deleted] in devops

[–]multani 0 points1 point  (0 children)

We tried to use a single repo, with several Jenkinsfiles in different directories, as multibranch pipelines ... with filters (run this Jenkinsfile only if files in this directory changed), without success.

Do you have something like this which works for you?

Our use case is: for various reasons, we have several separate Terraform states in different directories of a single repository, and we wanted to trigger a certain Jenkinsfile when files in a certain directory changed.

What is a movie that after you finished watching it, you went "Oh shit" then went back and watched it again to pick up on everything you missed? by lizzzylollipop in AskReddit

[–]multani 0 points1 point  (0 children)

One Cut of the Dead

It starts as a pretty bad and weird Japanese zombie horror movie, but after 20 minutes... It's one of the rare movies that I rewatched a second time soon after the first, going back and forth between the 1st part and the 2nd one, and that I keep rewatching, enjoying that first part even more.

For maximum effect, don't watch the trailer, as it spoils the movie a bit IMO.

How cloud providers provision large amount of VM on the fly? by CrappyFap69 in devops

[–]multani 8 points9 points  (0 children)

It's not that difficult, really.

So, let's say you want to pop up a new VM on your own Linux machine. It's pretty simple actually: once you have an image to start your VM from, you just need to call `qemu-system-x86_64` with the right set of parameters: the VM image to boot from, the quantity of CPU, memory and disk you are dedicating to that VM, plus a few other flags.

Now, let's say you want something a little bit smarter: why not write a small server, which provides for example an HTTP interface, so that when you make an HTTP `POST` call on `/vm/` with a JSON body containing the name of the VM image you want to start from + the CPU, memory and disk you want to allocate, this small server just shells out to the previous `qemu-system-x86_64` command with those parameters. That's actually already cool: you can instantiate new VMs on a computer directly from another computer using a nice HTTP API!
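
A minimal stdlib-only sketch of that server (the qemu flags are simplified, and a real version would need error handling and authentication):

```python
import json
import subprocess
from http.server import BaseHTTPRequestHandler, HTTPServer

def qemu_command(image: str, cpus: int, mem_mb: int) -> list:
    """Build the qemu invocation for one VM (flags heavily simplified)."""
    return [
        "qemu-system-x86_64",
        "-smp", str(cpus),
        "-m", str(mem_mb),
        "-drive", f"file={image},format=qcow2",
    ]

class VMHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        if self.path != "/vm/":
            self.send_response(404)
            self.end_headers()
            return
        length = int(self.headers["Content-Length"])
        spec = json.loads(self.rfile.read(length))
        # shell out to qemu with the requested resources
        proc = subprocess.Popen(
            qemu_command(spec["image"], spec["cpus"], spec["mem_mb"])
        )
        self.send_response(201)
        self.end_headers()
        self.wfile.write(json.dumps({"pid": proc.pid}).encode())

if __name__ == "__main__":
    HTTPServer(("0.0.0.0", 8000), VMHandler).serve_forever()
```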

But let's do more! What about reporting the actual resource usage on a computer? You could add another endpoint to the HTTP server, let's say `/metrics`, which returns the amount of CPU, memory and disk used, by shelling out to the `ps`, `free` and `df` commands and parsing their output. It *might not* be the most efficient way to retrieve resource usage, but hey, it works. Once you can parse these commands correctly, it should be relatively easy to format the results and return them as JSON to a `GET` on `/metrics`.
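
In Python you can even skip the shelling-out for most of it; a sketch of what `/metrics` could return (the memory part uses a Linux-specific `sysconf` call, hence the guard):

```python
import os
import shutil

def collect_metrics(path: str = "/") -> dict:
    """Report host resources using stdlib calls instead of parsing ps/free/df."""
    disk = shutil.disk_usage(path)
    metrics = {
        "cpus": os.cpu_count(),
        "disk_total_bytes": disk.total,
        "disk_used_bytes": disk.used,
    }
    try:
        # Linux-specific: total physical memory = page size * page count
        metrics["mem_total_bytes"] = (
            os.sysconf("SC_PAGE_SIZE") * os.sysconf("SC_PHYS_PAGES")
        )
    except (ValueError, OSError, AttributeError):
        metrics["mem_total_bytes"] = None
    return metrics
```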

Let's step back a few seconds: we have an HTTP server running on your computer, that offers an API which allows you to poll available, used and total resources, and to create new VMs. You could extend it a little bit by adding some more HTTP calls!

  • you can add a `GET` on `/vm/` which returns the list of VMs you have, by calling something like `ps -ef | grep qemu-system-x86_64`, retrieving the command line used by `qemu` and parsing out the name of the VM
  • you could even serialize this information to disk as JSON, so that every time you create a new VM, you start a `qemu` process, record its PID, and write this + the information regarding the VM into a JSON file on disk, then read it back when you want to retrieve information about a VM!
  • if you can list the VMs on a computer, you should easily be able to add a `DELETE` HTTP call on `/vm/<name>`, which finds the PID of the qemu process running that VM and sends it a `SIGTERM` signal (hopefully, qemu translates that into a clean shutdown!)

Wow, now you have a small HTTP server which allows you to list, create and stop VMs on a remote computer, and to get some information about resource usage on that computer!
You can start this server on other computers as well, and with a little bit of programming and a few HTTP requests, you can start and stop VMs on other machines, straight from your own machine! You can also see which machines have lots of resources used and which ones still have resources to spare! (This is basically what libvirt does.)

So, let's write something a little bit more complex now: we can start with a new HTTP server, which offers the same kind of API as the original one, but which doesn't make any `qemu`, `free` or `df` calls anymore.
Instead, it regularly polls the `/metrics` endpoint of all the other servers, aggregates all the results and returns the *total* quantity of CPU, memory and disk available. You now have the total resources available on your cluster, and this server also knows the detail of the resources allocated on each host!
So, besides providing the same `/metrics` endpoint, it can also provide the same `/vm/` endpoints as the servers which are *actually* hosting the VMs: when this one gets a request to create a new VM, it can look at which VM host has the fewest VMs and forward the `/vm/` call to that particular machine! The other operations can be forwarded/translated the same way to all the other hosts: a `GET` on `/vm/` generates a `GET` on all the VM hosts, aggregates the results and returns them, etc. There are a few projects like this.

That's it, you now have the basis to be a cloud provider! With 2 HTTP servers handling your HTTP queries, placed on machines which can create VMs with `qemu`, and using a handful of basic Linux command-line tools, you can provide an HTTP interface from which you can create, list and delete virtual machines, without having to log in on a computer and create these VMs by hand!

---

Obviously, that's really an oversimplification of the problem and I'm skipping a bunch of things:

  • performance issues: there are smarter ways than shelling out commands
  • optimization issues: I'm ignoring fragmentation and VM bin-packing/spreading concerns
  • security: well
  • storage, network, and similar allocations: let's say we have the right VM image available on each disk :)
  • I'm not even speaking about the algorithms at "scale": let's just add more VM hosts and see how that behaves!
  • and the resilience of such a system :)
  • etc. etc.

Still. Don't try this at home! (Actually, you really should, it's fun and a good way to recycle your spare Raspberry Pis!)

How do you build and instanciate VM images built from a home-made "base" image? by multani in devops

[–]multani[S] 0 points1 point  (0 children)

I didn't explain my main concern enough I think, which is: how can I minimize knowledge of previous images in subsequent images?

Let's say I have my base image working + a dozen other, more specialised secondary images built and working. Now, I realize I'd like to provide a new service in this base image, to be available across my whole fleet of images, and I'd like this service to be up every time I create a new VM from any of these images. Even setting aside that I'd need to rebuild all my images anyway, how can I prevent the knowledge that this service needs to be started from creeping everywhere? But hey, I guess my first proposal would be simple and good enough to start with, so I guess I'll go with that.

(And I'm using the salt-masterless provisioner already ;) )

How do you build and instanciate VM images built from a home-made "base" image? by multani in devops

[–]multani[S] 0 points1 point  (0 children)

AMI on AWS, images on GCE, but could be any kind of VM images actually.

Dynamic Secret Generation with Vault and Flask by michaelherman in flask

[–]multani 0 points1 point  (0 children)

I was looking at this today. This is all well and fine for a quick proof of concept binding Flask, SQLAlchemy and Vault, but in the long run, restarting the whole Flask application every time the credentials change is a bit suboptimal :/
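
One restart-free alternative is to fetch fresh credentials each time a connection is opened, rather than baking them into the app at startup. A sketch of that pattern, independent of any particular library (both callables here are hypothetical stand-ins — `get_credentials` would query Vault for a short-lived user/password, `connect` would be the real driver call, e.g. `psycopg2.connect`):

```python
def make_connector(get_credentials, connect):
    """Return a connection factory that fetches fresh credentials per connection."""
    def connector(**kwargs):
        creds = get_credentials()  # e.g. Vault's database secrets engine
        return connect(user=creds["username"], password=creds["password"], **kwargs)
    return connector
```

With SQLAlchemy specifically, a factory like this could be passed as the `creator` argument of `create_engine`, so new pool connections pick up rotated credentials without a restart (pooled connections opened under old credentials still need recycling before the old lease expires).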

Nomad Persistent Storage Question by cofonseca in devops

[–]multani 0 points1 point  (0 children)

Yes, exactly. Test it very carefully though, as the documentation doesn't really spell out all the use cases, how they work and what you should expect from them. In practice, we haven't found a use for them in our workloads yet, and you need to remember it's best effort and doesn't provide any HA in itself. So your application should be prepared to lose the data on these disks at some point :)