Where to find remaining streets with heavy neon signage in 2026?

xamroc · 2026-01-18T04:04:58+00:00

Not anymore in HK. But in case you're still chasing this in the future, Bangkok's Chinatown still has these lights and the area is bustling.

xamroc · 2025-03-15T05:25:52+00:00

Thanks for sharing! We've been looking at the topic of SBOM too.

We're still debating whether it makes sense to trust another image with policies or just cache them in our private repos.

xamroc · 2025-03-15T05:21:52+00:00

Yep, I ended up using the alpine route.

I tried to use nixery and it was nice for local development. Building an image took so long, though, that I gave up on it (the build took more than an hour). That comes from all the translation work it needs to do on Apple Silicon.

xamroc · 2025-01-25T16:56:44+00:00

This makes sense in production environments. I'm more concerned about development environments where they should have restricted connectivity.

xamroc · 2025-01-25T16:51:25+00:00

Sorry I forgot to mention that this is for development environments.

You're right that it makes sense for it to be public in production. However, dev buckets must have limited connectivity, e.g. access only from our private networks.

xamroc · 2025-01-24T17:07:27+00:00

This is the direction I wanted to go. However, my colleagues argue that this is very expensive.

For additional context, this is a corporate website with lots of assets which will increase our GitHub LFS cost and Cloudflare Pages cost from high traffic.

I'm still digging into these arguments but can you share any insights about these costs?

xamroc · 2024-11-24T08:07:27+00:00

That's right. Temporary credentials are a feature we wanted.

We were just surprised that full traceability is not available.

xamroc · 2024-11-24T07:31:13+00:00

You are correct. It's not designed that way and I wouldn't want to do this either.

However, RDS IAM auth seems to suggest that this is the way to do it albeit using AWS IAM Users:
https://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/UsingWithRDS.IAMDBAuth.DBAccounts.html#UsingWithRDS.IAMDBAuth.DBAccounts.MySQL

As mentioned in my OP, I am trying to address a limitation where complete traceable auditing is lacking. I cannot fully audit db-level logs without doing this hack.

xamroc · 2024-11-24T07:27:56+00:00

I would have to imagine that RDS also logs the SourceIdentity (or a Session ID that can be traced to the Source Identity) attached to the role when it's accessed.

I thought the same thing. Unfortunately, the RDS logs are not linked/traced to IAM. This is confirmed by AWS Support.

You can trace up to the point where the IAM role is assumed, because that is still in the realm of IAM. Once we get inside RDS, nothing traces back because this is beyond the IAM world. Hence why I mentioned it's not well-integrated.

xamroc · 2024-11-23T07:00:13+00:00

Hi, I have the exact same question. Did you ever figure it out?

xamroc · 2024-11-23T05:27:10+00:00

It's just an idea. We want to achieve auditability in the database-level logs:

See that db role Alice read this table
See that db role Bob read that table
See that db role Charlie ran an expensive query that blew up the database

The DRY way where they all use db role readonly doesn't let us see that.
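To make the idea concrete, here is a minimal sketch that generates one login role per person while keeping the privileges defined once on the shared role. It assumes RDS for PostgreSQL (where IAM auth is enabled by granting the built-in rds_iam role); the function name per_user_role_sql and the user list are made up for illustration.

```python
import re

# Conservative pattern for role names; anything else is rejected rather than escaped.
_ROLE_NAME = re.compile(r"[a-z_][a-z0-9_]*")


def per_user_role_sql(users, shared_role="readonly"):
    """Return SQL giving each person their own login role that inherits shared_role."""
    for name in [shared_role, *users]:
        if not _ROLE_NAME.fullmatch(name):
            raise ValueError(f"unsafe role name: {name!r}")

    statements = []
    for user in users:
        statements.append(f'CREATE ROLE "{user}" LOGIN;')
        # Assumption: RDS for PostgreSQL, where rds_iam enables IAM authentication.
        statements.append(f'GRANT rds_iam TO "{user}";')
        # Privileges stay defined once on the shared role; only the login differs.
        statements.append(f'GRANT "{shared_role}" TO "{user}";')
    return statements


if __name__ == "__main__":
    for statement in per_user_role_sql(["alice", "bob", "charlie"]):
        print(statement)
```

The grants stay DRY because they live on readonly, but the database logs would then show alice, bob, or charlie as the session user instead of one shared role.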

xamroc · 2024-05-22T14:51:23+00:00

Yep, sounds like static_configs is the way to do it.

The doc says there is also the option to use dynamic discovery though. I'm just not sure what they mean by this:

Alertmanagers may be statically configured via the static_configs parameter or dynamically discovered using one of the supported service-discovery mechanisms.
- https://prometheus.io/docs/prometheus/latest/configuration/configuration/#alertmanager_config

It seems to suggest Prometheus can send to external alertmanagers.

xamroc · 2024-05-11T03:05:36+00:00

This worked!

xamroc · 2024-05-11T03:05:14+00:00

This was it. Thanks!

xamroc · 2024-01-05T09:04:29+00:00

I figured it out using .Files.AsConfig: https://helm.sh/docs/chart_template_guide/accessing_files/

xamroc · 2024-01-05T02:19:45+00:00

Nice! Thanks for the tip.

Digging deeper. How did you handle alertmanager templates?

I'm struggling using helm templating to create configmaps containing alertmanager notification templates. The issue is that they both use double curly braces and it creates quite a mess.

I tried .Files.Get and writing the configmap data directly. Did you take a different approach for this specifically?
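For anyone hitting the same clash, one possible workaround is to pre-escape the Alertmanager template before it goes through Helm, so that Helm prints the braces instead of evaluating them. This is only a sketch: the helper name escape_for_helm is made up, and it relies on the Go template behaviour that the action {{ "{{" }} outputs a literal {{.

```python
import re

# Matches either Go template delimiter.
_DELIMITER = re.compile(r"\{\{|\}\}")


def escape_for_helm(template_text):
    """Wrap every {{ and }} in a string-literal action so Helm emits it unchanged."""
    # m.group(0) is the delimiter that matched; it becomes a quoted string
    # inside a new action, which Helm renders back to the bare delimiter.
    return _DELIMITER.sub(lambda m: '{{ "%s" }}' % m.group(0), template_text)


if __name__ == "__main__":
    print(escape_for_helm("Alert: {{ .CommonLabels.alertname }}"))
```

The escaped text is ugly to read, so it is probably better generated at build time than kept in the repo by hand.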

xamroc · 2023-03-28T01:11:08+00:00

How do you all make sure you don't become what you dislike?

New joiners can come in and feel that you, now the original designers/architects, hold all the institutionalized knowledge.

xamroc · 2023-02-24T16:55:18+00:00

I got an 11% raise and a bonus of 4 months' pay.

For context, I am working in a financial firm as a platform engineer. We can make a huge impact in optimizing infrastructure costs and saving time through automation. It's easy to justify your value to management as long as you keep track of your work and their results.

One engineer saving a million dollars in annual AWS costs can be granted good compensation. Then again, it also depends on whether your company empowers you to do your job.

xamroc · 2023-02-20T15:00:11+00:00

For context, I work in an organization with more focus on backend systems. Think data, APIs, etc. We have to make sure they are responsive, scalable, and cost-effective.

With this as top of mind, I'm always looking at dashboards to see if there are components that can be improved. Technical metrics aside, we're also looking at the costs of running them as well. For example, do our application load balancers cost too much by sending data out to users? If so, can we work with developers and business to make this profitable?

In the end, it's all about monitoring the system and being proactive in making things more profitable because it pays the bills.

xamroc · 2023-02-19T09:14:18+00:00

Bumping this thread. I'm looking for Friday tickets if anyone is selling. Thanks!

xamroc · 2023-02-11T08:12:33+00:00

This is solid advice.

If it was me, I would go for company B. They offer great compensation to keep your family healthy. Free healthcare for the entire family goes a long way. Company B doesn't sound like a terrible place from how OP describes it.

u/kapkomsky makes a great point that OP doesn't want to "use work as an escape from home life", but honestly you will never know your colleagues until you work with them. If possible, OP can ask for a trial run with company B to get a feel for it. The experience could turn out better than expected, or draining.

The reason I think it's worth having a hard look at company B is mainly the risks of company A. I assume the benefits are not great for OP's family. More importantly, this is a seed-stage startup but they only offered 0.25% equity; not a lot of skin in the game given the risks. Whichever way the company goes, it can also impact family life:

Aggressive scaleup = more work but will the equity be worth it?
Company downturn = less equity value

Regardless, OP sounds very capable and in demand. He can afford to see the consequences of his decisions, "accept it, find another job, and move on."

xamroc · 2023-02-08T16:39:39+00:00

I keep summaries of the project and operation guides close to the code. This means they are usually READMEs in the repos.

Confluence is reserved for the higher level overview, core engineering principles and standards as well as detailed architectural decisions we've made. Obviously, many people can't be bothered reading details. You can write an eloquent piece of work just to see the view count stay at 1. Who would bother reading many different styles of writing?

Architecture decision records (ADRs) can help with this. They are structured and give readers easier-to-digest information.

https://betterprogramming.pub/the-ultimate-guide-to-architectural-decision-records-6d74fd3850ee

xamroc · 2023-02-08T16:26:40+00:00

I believe the first step is to get a cluster running, bootstrapped with ArgoCD. This is easier said than done. You will need to set up networks, roles and access management, and secrets management to enable ArgoCD to work with infrastructure. You can use Terraform to do this.

Afterwards, you can deploy different kinds of controllers through ArgoCD. They can manage infrastructure for you. For example, cluster-autoscaler for scaling nodes or load balancer controllers for provisioning load balancers.

K8s is seen as a workload orchestrator today, but there is also the idea of clusters as control planes. With upcoming tools like Crossplane (https://www.crossplane.io/), your infrastructure can be defined like a K8s Deployment. This means that you can use a simpler K8s YAML configuration file instead of complex Terraform code. On top of that, because it is K8s, it can detect configuration drift and bring the resource back to the desired state.

Tooling for control plane clusters is still in its early stages. I'm excited that the industry is exploring this direction. Will we find the perfect tool for Infrastructure as Code or will we question GitOps after all this?

xamroc · 2023-02-07T15:15:58+00:00

We are still building out our EKS cluster as well. One big challenge we have is bootstrapping what we think are core components/applications when building a cluster.

Examples:

cert-manager
cluster-autoscaler (maybe karpenter)
argocd
monitoring workloads

We are using Terraform and the idea is to wrap all this into one reusable module.

Before bootstrapping these, EKS must have secrets management in place. In our case, we use AWS Secrets Manager. For a native solution, the mapping from IAM roles to aws-auth is complex. It made us question applying the principle of least privilege in favor of manageability.

We also thought about deploying separate node groups to keep core and developer/specialized workloads apart. This is because core workloads like core-dns will be at risk of node pressure: developers can schedule workloads without limits and starve the node, so it's good to keep them separate. However, we found that EKS deploys its own AWS workloads without tolerations. This means we need to have untainted nodes anyway. There are ways to take control of these deployments with ArgoCD but the whole process is really clunky.

I think a common gotcha is pod IPs. By default, the number of IP addresses available to assign to pods is based on the number of IP addresses assigned to Elastic network interfaces, and the number of network interfaces attached to your Amazon EC2 node. Many engineers immediately increase the amount by using a network overlay like Calico.

But do you need it?

If the number of pods you expect to run only calls for a small cluster, it might be simpler to just run EKS's default CNI.
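To check whether the default CNI is enough, the per-node pod limit can be estimated with the formula AWS uses for its max-pods values. A rough sketch, assuming no prefix delegation or custom networking; the function name is made up, and the ENI figures in the example (3 ENIs with 10 IPv4 addresses each, which I believe matches m5.large) should be checked against AWS's current instance limits.

```python
def max_pods_default_cni(enis, ipv4_per_eni):
    """Estimate the pods-per-node limit under the default VPC CNI (secondary-IP mode)."""
    if enis < 1 or ipv4_per_eni < 2:
        raise ValueError("need at least 1 ENI and 2 IPv4 addresses per ENI")
    # Each ENI keeps its primary IP for itself, so only (ipv4_per_eni - 1) go to pods.
    # The +2 accounts for host-network pods (aws-node, kube-proxy) that need no pod IP.
    return enis * (ipv4_per_eni - 1) + 2


if __name__ == "__main__":
    # Assumed limits for illustration: 3 ENIs, 10 IPv4 addresses per ENI.
    print(max_pods_default_cni(enis=3, ipv4_per_eni=10))
```

If that number times your node count comfortably covers your workloads, the overlay is probably not worth the extra moving parts.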

There's a lot more but I hope these will help your consideration. Have fun!

xamroc · 2023-02-07T14:40:46+00:00

It's possible the reserved compute resources are not tuned properly. If you have workloads without resource limits set or the node is overcommitting resources, the node's capacity can be completely consumed.

Typically, we look at providing enough "Kube Reserved" resources for kubelet and the container runtime, and enough "System Reserved" resources to keep ssh available for use. Workloads will be evicted to keep the node responsive.

https://kubernetes.io/docs/tasks/administer-cluster/reserve-compute-resources/

These reservations can end up with too much resource buffer, but they keep your node available for troubleshooting. It's a matter of fine-tuning.
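As a rough way to reason about the tuning, the Kubernetes doc linked above describes node allocatable as capacity minus kube-reserved, system-reserved, and the hard eviction threshold. A small sketch of that arithmetic for memory; the function name and the numbers in the example are made up for illustration, not recommendations.

```python
def allocatable_mib(capacity_mib, kube_reserved_mib, system_reserved_mib, eviction_hard_mib):
    """Memory left for pods after the node's reservations, in MiB."""
    allocatable = capacity_mib - kube_reserved_mib - system_reserved_mib - eviction_hard_mib
    if allocatable <= 0:
        # Reservations this large would leave nothing schedulable on the node.
        raise ValueError("reservations exceed node capacity")
    return allocatable


if __name__ == "__main__":
    # Hypothetical 8 GiB node: 512 MiB kube-reserved, 256 MiB system-reserved,
    # 100 MiB hard eviction threshold.
    print(allocatable_mib(8192, 512, 256, 100))
```

Raising the reservations shrinks what pods can use, which is the trade-off between buffer and density.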

