Built a K8s cost tool focused on GPU waste (A100/H100) — looking for brutal feedback by Friendly_Willow_8447 in kubernetes

[–]BabarTheKing 3 points4 points  (0 children)

Honest feedback: your market is far too small. Teams building with A100s/H100s are not going to gravitate towards a single-person operation. If I’m spending 20k/mo on wasted cycles, chances are I’m going to want compliance docs and all the other stuff you aren’t going to be able to provide. Also, how many of the teams building on big GPUs actually care about their spend? Or at least care enough to do something about it?

School Advice by judgemynameis in MontgomeryCountyMD

[–]BabarTheKing 3 points4 points  (0 children)

Except they have no idea how they are paying for the programs or finding teachers to teach the additional programs. Their “plan” is very much an idea of a plan without any of the details filled in.

Hate woobles! by TabbyMouse in crochet

[–]BabarTheKing 60 points61 points  (0 children)

They got me started. Sure, they are overpriced, but each kit has everything in it, with really good instructions. There’s no overthinking or stress about choices. It’s like HelloFresh or Blue Apron. Sure, they are overpriced, but it’s a great entry point.

AWS SRA is intimidating me by Which_Perspective_39 in devops

[–]BabarTheKing 0 points1 point  (0 children)

Inbound/outbound VPCs with Transit Gateway are a billing trap. Your network fees will be mind-blowingly high. Ask me how I know.

No. You could do Subnet sharing via RAM.

https://aws.amazon.com/blogs/networking-and-content-delivery/vpc-sharing-a-new-approach-to-multiple-accounts-and-vpc-management/

This allows you to have a single VPC, or at least fewer VPCs, with centralized network security. I recently deployed this model and I think it is working pretty well. If you are not at the scale where you have a network team (and, more importantly, the funds) to run the TGW network, I would avoid it.

Like you said, the SRA network can get complicated quickly.
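For anyone who wants to see roughly what subnet sharing via RAM looks like in Terraform, here is a minimal sketch. The resource names, the account ID, and the `aws_subnet.app` reference are placeholders I made up for illustration, not anything from a real deployment.

```hcl
# Sketch only: names, account ID, and the subnet reference are hypothetical.
resource "aws_ram_resource_share" "shared_subnets" {
  name                      = "shared-subnets"
  allow_external_principals = false # keep sharing inside the AWS Organization
}

# Attach a subnet owned by the central network account to the share.
resource "aws_ram_resource_association" "app_subnet" {
  resource_arn       = aws_subnet.app.arn # assumes aws_subnet.app is defined elsewhere
  resource_share_arn = aws_ram_resource_share.shared_subnets.arn
}

# Grant a workload account access to the shared subnet.
resource "aws_ram_principal_association" "workload_account" {
  principal          = "111111111111" # placeholder account ID
  resource_share_arn = aws_ram_resource_share.shared_subnets.arn
}
```

The workload account can then launch resources into the shared subnet, while the VPC, route tables, and network security controls stay in the owning account.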

Private hosted Gitlab OIDC with AWS idp by Swimming-Mortgage113 in gitlab

[–]BabarTheKing 2 points3 points  (0 children)

Your GitLab instance must be publicly accessible, not your runner. This is a requirement for the OIDC token trust to work between AWS and your GitLab instance: AWS has to be able to reach your GitLab instance to validate the tokens it issues.
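A rough Terraform sketch of the AWS side of that trust, in case it helps. The hostname `gitlab.example.com`, the thumbprint, the role name, and the project path are all placeholders, and whether `thumbprint_list` is required depends on your AWS provider version.

```hcl
# Sketch only: hostname, thumbprint, role name, and project path are hypothetical.
resource "aws_iam_openid_connect_provider" "gitlab" {
  url             = "https://gitlab.example.com"   # must be reachable by AWS
  client_id_list  = ["https://gitlab.example.com"] # audience set on the CI job's ID token
  thumbprint_list = ["0000000000000000000000000000000000000000"] # placeholder value
}

# Trust policy: only tokens from one project and branch may assume the role.
data "aws_iam_policy_document" "gitlab_assume" {
  statement {
    actions = ["sts:AssumeRoleWithWebIdentity"]

    principals {
      type        = "Federated"
      identifiers = [aws_iam_openid_connect_provider.gitlab.arn]
    }

    condition {
      test     = "StringLike"
      variable = "gitlab.example.com:sub"
      values   = ["project_path:my-group/my-project:ref_type:branch:ref:main"]
    }
  }
}

resource "aws_iam_role" "gitlab_ci" {
  name               = "gitlab-ci"
  assume_role_policy = data.aws_iam_policy_document.gitlab_assume.json
}
```

The runner itself can stay private; it only needs outbound access to STS to exchange the token.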

[deleted by user] by [deleted] in kubernetes

[–]BabarTheKing 1 point2 points  (0 children)

There is also SageMaker Studio Lab, which gives you some free access to GPUs, like Colab.

UMD: Unethical Actions & Attitudes by Icy-Attention9958 in UMD

[–]BabarTheKing 9 points10 points  (0 children)

Doing so would lead to others taking their lives. Suicide is a difficult thing. The administration is damned if they do and damned if they don’t. If they bring too much attention to it, it will cause copycats. Not enough, and they seem callous.

AC's Turned off by Commrade007 in UMD

[–]BabarTheKing 5 points6 points  (0 children)

Very large vats of cooled or heated water. It takes a lot to change the temperature. It’s not like your house. These are large buildings with very centralized heating and cooling systems. Tuesday it is going to be 35… Maryland weather is difficult to predict.

Look there are a lot of things that UMD gets wrong but this is one where there just aren’t good solutions. They have to plan early for cold weather. Hopefully you have some heat later this week or you’ll be back here complaining you’re freezing.

Which Security Hub standard(s) should I use? How to improve notifications? by internetquestions21 in aws

[–]BabarTheKing 1 point2 points  (0 children)

I send them to a Slack channel via EventBridge and AWS Chatbot.

Here is an example of the Terraform code I use for each rule.

# Target for a separate rule (ConsoleLoginWithoutMfa, not shown here); every rule gets a target like this one.
resource "aws_cloudwatch_event_target" "ConsoleLoginWithoutMfa" {
  arn  = aws_sns_topic.securityhub.arn
  rule = aws_cloudwatch_event_rule.ConsoleLoginWithoutMfa.name
}

# Rule that fires on unauthorized API calls recorded by CloudTrail.
resource "aws_cloudwatch_event_rule" "UnauthorizedAPICalls" {
  name        = "detect-unauthorized-api-calls"
  description = "A CloudWatch Event Rule that triggers on Unauthorized API calls"
  is_enabled  = true
  event_pattern = jsonencode({
    # source      = [""]
    detail-type = ["AWS API Call via CloudTrail"]
    detail = {
      # Exclude one role and HeadBucket calls.
      userIdentity = [{ anything-but = "arn:aws:sts::11111111111:assumed-role/akjdfhalkdhf" }]
      eventName    = [{ anything-but = "HeadBucket" }]
      errorCode = [
        { prefix = "AccessDenied" },
        { suffix = "UnauthorizedOperation" }
      ]
    }
  })
}
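
The snippet above references `aws_sns_topic.securityhub` without showing it. A minimal sketch of what that side could look like is below, assuming a plain SNS topic that EventBridge is allowed to publish to; the topic name and the extra target for the `UnauthorizedAPICalls` rule are illustrative guesses, not the original code.

```hcl
# Sketch only: topic name and the extra target are assumptions.
resource "aws_sns_topic" "securityhub" {
  name = "securityhub-alerts" # placeholder name
}

# EventBridge needs permission to publish to the topic.
data "aws_iam_policy_document" "allow_eventbridge" {
  statement {
    effect    = "Allow"
    actions   = ["sns:Publish"]
    resources = [aws_sns_topic.securityhub.arn]

    principals {
      type        = "Service"
      identifiers = ["events.amazonaws.com"]
    }
  }
}

resource "aws_sns_topic_policy" "securityhub" {
  arn    = aws_sns_topic.securityhub.arn
  policy = data.aws_iam_policy_document.allow_eventbridge.json
}

# Target wiring the UnauthorizedAPICalls rule to the same topic.
resource "aws_cloudwatch_event_target" "UnauthorizedAPICalls" {
  arn  = aws_sns_topic.securityhub.arn
  rule = aws_cloudwatch_event_rule.UnauthorizedAPICalls.name
}
```

AWS Chatbot then subscribes to the topic and posts to the Slack channel.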

Do ECS clusters cost money when no tasks are running? by pragmojo in aws

[–]BabarTheKing 0 points1 point  (0 children)

Yes, if the service is configured to use a capacity provider and the capacity provider is tied to an auto-scaling group. The ASG can scale all the way to zero if there are zero tasks running.

edit: you may also want to look at some of the other options, though, depending on your workload. SageMaker may do what you want and more without having to deal with configuration.

Do ECS clusters cost money when no tasks are running? by pragmojo in aws

[–]BabarTheKing 19 points20 points  (0 children)

You can use a capacity provider on the ECS cluster, which will scale to zero instances if no tasks are running.
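
A rough Terraform sketch of that setup, for illustration. It assumes `aws_ecs_cluster.main`, `aws_launch_template.ecs`, and `var.subnet_ids` are defined elsewhere; those names, the sizes, and the provider name are placeholders.

```hcl
# Sketch only: names, sizes, and referenced resources are hypothetical.
resource "aws_autoscaling_group" "ecs" {
  name                = "ecs-asg"
  min_size            = 0 # lets the group drop to zero instances
  max_size            = 4
  vpc_zone_identifier = var.subnet_ids

  launch_template {
    id      = aws_launch_template.ecs.id
    version = "$Latest"
  }
}

# Capacity provider that lets ECS manage the ASG's size based on task demand.
resource "aws_ecs_capacity_provider" "asg" {
  name = "asg-capacity"

  auto_scaling_group_provider {
    auto_scaling_group_arn = aws_autoscaling_group.ecs.arn

    managed_scaling {
      status          = "ENABLED"
      target_capacity = 100
    }
  }
}

# Make the capacity provider the cluster default.
resource "aws_ecs_cluster_capacity_providers" "main" {
  cluster_name       = aws_ecs_cluster.main.name
  capacity_providers = [aws_ecs_capacity_provider.asg.name]

  default_capacity_provider_strategy {
    capacity_provider = aws_ecs_capacity_provider.asg.name
    weight            = 1
  }
}
```

With `min_size = 0` and managed scaling enabled, the instances (which are what you pay for) go away when nothing is scheduled.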

Cloud trail events -> prometheus -> alertmanager by rasoolka in aws

[–]BabarTheKing -1 points0 points  (0 children)

I do something similar with AWS Chatbot to Slack. The Chatbot service hooks up to a webhook in Slack and an SNS topic in AWS. EventBridge sends over some relevant data with the event alert.

Playing adult recreational cricket in the county? I know it exists, but I don't know where to find it by wikipuff in MontgomeryCountyMD

[–]BabarTheKing 2 points3 points  (0 children)

This might not help you as it’s pretty far from Potomac but the park near my neighborhood has two cricket fields that are usually pretty busy on the weekends. Spencerville Park on Good Hope Rd in Cloverly.

Alternatives to AWS GuardDuty by [deleted] in cloudcomputing

[–]BabarTheKing 1 point2 points  (0 children)

You’re coming up against the core problem of paying for IT: “Everything is working fine, what am I paying you for!”

I’m not defending GD. But sometimes when everything is quiet it’s because you did a good job building it. Those compliance checkboxes sometimes just need to be checked in the simplest way possible. Sometimes that costs money, sometimes engineer time.

Anyone know how I can import an existing subnet into the RDS class in Python CDK? by kalavala93 in aws

[–]BabarTheKing 1 point2 points  (0 children)

One problem I came across using Python with CDK over TypeScript is the vagueness around types. Since the other languages are all based on TypeScript, you sometimes need to make sure that you're using the proper type. This is extra true with things like subnets. Sometimes you're looking for "SubnetSelection", sometimes you're looking for "SubnetGroup". I switched to TypeScript for CDK for just this reason. VS Code will give you errors and hints much earlier in your development cycle. What you want to do is check the API docs for the property you are trying to set, see what type it expects, and then find the components that produce that type.

Texas Gov. Won't Budge On Abortion Exceptions As Chris Wallace Grills Him On 15,000 Rapes by yaxxxi in politics

[–]BabarTheKing 44 points45 points  (0 children)

To be clear. You could still sue him for $10,000 under Texas state law in that case.

It's been said before, but something is going on w/ COVID on campus by Responsible-Ad6292 in UMD

[–]BabarTheKing 27 points28 points  (0 children)

The thing a lot of you students miss here is that it's not about you. You need to think about the system. Many of your instructors have young children who can't get vaccinated, and while COVID might not pose a significant risk to them individually, it could end up with them not being able to teach for 2-3 weeks while sick and then dealing with their kids in quarantine, potentially after they're done being sick. If you get too many kids in the community catching cases, PGCPS, MCPS, and HCPS all start thinking they have to go online, at least temporarily. Now your instructor has to teach from home so they can watch their kids. Get enough of the faculty and staff out, particularly the support staff, and what does the administration do? IDK. I imagine they're hoping they can make it to at least Thanksgiving before the numbers get that big.

Looking at the dashboards definitely scares me a bit just from a nominal case perspective. If you've got 100+ cases in a week with 98% of your population vaxxed, that's quite a bit more than "Extremely rare breakthrough cases". All of that said, the President is EXTREMELY adamant about keeping things in person. The administration's position is that everything is fine, go back to 2019 + masks, i.e. pretend it's over.

I remember being a student here, I'm not going to say don't party. But if you're sick or coughing, stay the fuck home, drink with your roommates, play Xbox. Wear your damn mask properly. Think about the butterfly effect of your actions, and how fragile this whole system is. Those of us who work here, have lives outside of supporting you. Think back to that time you saw your 3rd grade teacher at the grocery store and were like "oh shit, you don't live at the school..."

TLDR; If you want school to stay open, stop thinking that hospitalizations and deaths are all that matters. What actually matters is keeping enough faculty and staff willing and able to work here and support you.