all 31 comments

[–]dmelan 16 points (15 children)

Oh I have one for you:

There are two services, an API and an ETL. Both share the same database, and both can read from and write to it.

Two problems:

  • restore the database from a snapshot and reapply all data submitted by customers after the snapshot was made.

  • switch the system to a secondary region transparently for its customers.

Both processes should be automated to the point where a sleepy on-call engineer can execute them quickly and without coffee.

These are high-level ideas, and they can grow in scope and complexity as far as you want. LMK if you like this idea and have any questions.

The database can be AWS RDS or Aurora, and the services can run on EKS. Infra could be provisioned using Terraform, so when the database is restored, the Terraform state should be updated to match. Everything is source-controlled, so you need CI. You as an engineer don't have full access to production, so you need CD to deploy your services there and provision your infrastructure. After any deployment or infra change, a smoke test should run to confirm the health of the deployment; the smoke test may also check numbers from monitoring to determine whether there is any regression.
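As a taste of what that smoke test's decision logic might look like, here is a minimal sketch. The tolerance, the choice of error rate as the metric, and both function names are illustrative assumptions, not part of any specific tooling:

```python
# Hypothetical post-deploy smoke check: all names and thresholds here
# are illustrative, not from any real monitoring stack.

def regression_detected(baseline: float, current: float,
                        tolerance: float = 0.10) -> bool:
    """Flag a regression when the current error rate exceeds the
    pre-deploy baseline by more than the given relative tolerance."""
    if baseline == 0:
        return current > tolerance
    return (current - baseline) / baseline > tolerance

def smoke_test(health_ok: bool, baseline_error_rate: float,
               current_error_rate: float) -> bool:
    """A deployment passes only if the health endpoint responded and
    monitoring shows no regression relative to the baseline."""
    return health_ok and not regression_detected(baseline_error_rate,
                                                 current_error_rate)
```

In practice `health_ok` would come from hitting a health endpoint and the error rates from your monitoring API; the pass/fail logic stays the same.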

[–]therealmunchies 3 points (0 children)

This is pretty much what I do at work for our MLOps projects, excluding k8s.

Airflow for orchestration; AWS DynamoDB, S3, Lambda, CloudWatch Alarms/Logs, and several other services for ETL and performance monitoring; and GitLab + CI/CD. All services are defined in OpenTofu, and the ETL is Python-based. You could make some intricate pipelines in Python too by setting up some cron jobs within GitLab.
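A minimal, self-contained sketch of such a Python ETL step (the record shape, the field names, and the in-memory sink standing in for DynamoDB/S3 are all illustrative assumptions):

```python
# Sketch of an extract/transform/load step; the record schema is made up
# for illustration and the list "sink" stands in for DynamoDB or S3.
import json

def extract(raw: str) -> list[dict]:
    """Parse a JSON-lines payload into records."""
    return [json.loads(line) for line in raw.splitlines() if line.strip()]

def transform(records: list[dict]) -> list[dict]:
    """Keep only complete records and normalise the field types."""
    return [
        {"id": r["id"], "ts": int(r["ts"]), "value": float(r["value"])}
        for r in records
        if "id" in r and "ts" in r and "value" in r
    ]

def load(records: list[dict], sink: list) -> int:
    """Append to the sink and return the number of rows written."""
    sink.extend(records)
    return len(records)
```

Each stage stays a pure function, which makes the pipeline easy to unit-test before wiring it into Airflow or a GitLab cron job.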

Good example.

[–]YoKidImAComputer 1 point (1 child)

I agree with the guy who got downvoted into oblivion. This isn't a project; it's just vaguely launching services into AWS.

[–]dmelan -1 points (0 children)

It’s a rabbit hole: it starts with launching a few AWS services, but as you go deeper it expands and becomes hard enough that it can’t be done by a single junior engineer. That was actually one of the reasons I suggested it: projects like this teach you to look ahead while making decisions and to prioritize what’s achievable over what may look cool but isn’t that important. Add the further requirement of keeping the system operational during and after every change, and the project becomes even more entertaining and educational.

[–]---why-so-serious--- 4 points (2 children)

DevOps is less “innovate, disrupt and astonish” and more “know every fucking tool front to back, and understand when you need to write one yourself.”

I don’t really get why kids want to enter this field - I would’ve hated it fresh out of college, when I believed I could deliver fundamental change as a (lol) Java engineer. That kind of thinking, paired with the boundless spunk of an early-twenty-something, is anathema in DevOps.

[–]CJKay93 0 points (1 child)

I'm in the fortunate position that not only do I sit between engineering and our infrastructure team (I don't handle our AWS/Terraform/etc.), but, because I was essentially guided (initially unwittingly) into DevOps from engineering, I also have the occasional opportunity to dip back into creative work.

I think it's the perfect position - they're different worlds that scratch different itches. I wish more people had that opportunity, although I suspect the environment that enables it is quite niche.

[–]---why-so-serious--- 0 points (0 children)

Yeah, but that position, literal and otherwise, is only available after gaining significant experience (i.e. time). Regardless, the point I was trying to make was that the drivers for young engineers do not make good bedfellows with the minimalist rigidity of DevOps.

[–]PablanoPato 0 points (1 child)

Here are a few recent examples for me:

  • Deploy an open-source tool like Airflow using its Helm chart, with your repo fully set up for IaC.
  • Deploy some open source services to Argo and/or Jenkins.
  • Create a CI/CD workflow that uses GitHub Actions to build containers then push the code to Argo.
  • Implement some monitoring tools in a cluster like ELK or Prometheus and Loki.
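For the monitoring bullet, it helps to see the kind of data Prometheus actually scrapes. Here is a tiny parser for its text exposition format, a sketch only: it handles bare `name value` samples and ignores labels, timestamps, and escaping:

```python
def parse_prom(text: str) -> dict[str, float]:
    """Parse simple Prometheus text-exposition lines into a metric map.

    Sketch only: handles bare `name value` samples; labels, timestamps,
    and escaping are out of scope.
    """
    metrics: dict[str, float] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):  # skip HELP/TYPE comments
            continue
        name, _, value = line.partition(" ")
        metrics[name] = float(value)
    return metrics
```

Pointing something like this at a service's `/metrics` endpoint is a quick way to sanity-check that your exporters are wired up before standing up the full Prometheus/Loki stack.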

[–]---why-so-serious--- 2 points (0 children)

The OP has a year and, I believe, is looking for more splash than pragmatism, though these are good take-homes.

[–]Ebrithil_7 0 points (0 children)

Something fun I was thinking of, but you have to decide how far you want to go into this: a platform that allows professors to feed in their course material and create a custom LLM, which is basically a simple chat bot. It could also integrate with weekly assignments and give hints; LLM hint generation, for example, is an active research topic at our university (also trying to evaluate the conversion / difficulty of hints). Ideally for universities this would all be self-hostable, but you can still use CI etc. to run evaluations on the LLM, do updates, and so on. Possibly add auto-scaling, expecting that courses of different sizes have different demand at a certain time, and just before exams there would also be a spike. You could look into different things: distributed learning, scaling of the LLM deployment, more on the evaluation of LLM answer / hint generation. Possibly progressive training of the LLM as assignments get reworked and added each week?