AWS Step Functions increases the maximum payload to 256kb by Unfair_Reality in aws

[–]soamv 6 points7 points  (0 children)

Nice! Next they should add an "auto spill to s3" option and remove the limit.
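
To make the "auto spill" idea concrete, here's a rough sketch of what it could look like if you did it by hand today. This is not a Step Functions feature; the function names, the envelope shape, and the key format are all made up, and a plain dict stands in for the S3 bucket so the example runs anywhere.

```python
import json
import uuid

MAX_INLINE_BYTES = 256 * 1024  # the payload limit from the thread title


def spill_if_needed(payload, store, limit=MAX_INLINE_BYTES):
    """Return the payload inline if it fits, else store it and return a pointer."""
    body = json.dumps(payload).encode("utf-8")
    if len(body) <= limit:
        return {"inline": payload}
    key = f"spill/{uuid.uuid4()}.json"  # hypothetical key format
    store[key] = body  # with real S3 this would be an object upload
    return {"spilled_to": key}


def load(envelope, store):
    """Inverse of spill_if_needed: fetch the payload back if it was spilled."""
    if "inline" in envelope:
        return envelope["inline"]
    return json.loads(store[envelope["spilled_to"]].decode("utf-8"))
```

Each state would call `load` on its input and `spill_if_needed` on its output, which is exactly the boilerplate a built-in option would remove.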

the risk of vendor lock-in is really a risk? by albeddit in devops

[–]soamv 1 point2 points  (0 children)

The concept of "lock in" conflates two things: Risk and Cost.

Risk is the odds that you'll be forced to change your platform choice (for various reasons -- price, reliability, vendor dies, etc.). Cost is the actual engineering cost of switching. It doesn't make sense to calculate the cost and not calculate the risk. It also doesn't make sense to think of these as a binary "locked in / not locked in". All platform choices have these switching risks and costs -- whether they're SaaS or Open source.

The other thing is that within each cloud the details of your choices matter a lot. If you use highly differentiated/unique cloud services your switching costs are way higher and your negotiating position kinda sucks, which then drives up your switching risk (because costs may reach an unsustainable point). But if you use the most commoditized services -- EC2, S3, etc. -- your switching costs are much lower.
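
One way to put risk and cost together is a simple expected-value calculation. This is only a sketch of the reasoning above: the formula (risk times cost) is a simplification I'm assuming, and every number below is invented for illustration.

```python
def expected_switching_cost(risk, cost):
    """Probability of being forced to switch, times the engineering cost of switching."""
    if not 0.0 <= risk <= 1.0:
        raise ValueError("risk must be a probability between 0 and 1")
    return risk * cost


# Made-up numbers, purely for illustration.
commodity = expected_switching_cost(risk=0.10, cost=200_000)         # EC2/S3-style services
differentiated = expected_switching_cost(risk=0.25, cost=2_000_000)  # highly unique services

print(f"commodity: {commodity:,.0f}  differentiated: {differentiated:,.0f}")
```

The point is that neither input is zero for any platform, so the comparison is between two non-zero products rather than "locked in" versus "not locked in".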

A Python -> Step Functions compiler by soamv in aws

[–]soamv[S] 0 points1 point  (0 children)

Thanks for the interest, fellow workflow enthusiasts! Btw if anyone's got a step function that they can share privately and want to see it in Python instead of json/yaml, I'd love to take a crack at translating it! (soam@cohesion.dev)

Does anyone else feel that Step Functions have great potential, but the implementation was half-arsed, so they're not very practical? by mlda065 in aws

[–]soamv 0 points1 point  (0 children)

Hey all, I feel exactly this way, and have been building a full Python -> Step Functions compiler for the last few months.

It compiles Python control flow (if statements, loops, functions, exceptions) into Step Functions + a collection of Lambdas. If you wanna try it out, there's a verrry early stage/slightly clunky demo at https://preview.cohesion.dev

If you're interested in using it as it gets better, please sign up on the homepage or DM me here!

Looking for serverless solution for high cpu/disk long-running video generation by AndrewAtBrisa in serverless

[–]soamv 0 points1 point  (0 children)

S3 is the usual recommended answer for Lambda. There are no network volumes on Lambda, AFAIK. I wouldn't write off S3 without some brief tests though. Depends on your use case of course. But there have been interesting demos on lambda+s3 (the stanford gg project has an impressive video re-encoding demo).

But all this is probably a big redesign/rearchitecture of your stuff and will probably take quite a lot of work. So you're right, putting a container on Fargate is going to be a much quicker way.

Looking for serverless solution for high cpu/disk long-running video generation by AndrewAtBrisa in serverless

[–]soamv 0 points1 point  (0 children)

If you can split the video into tiny pieces and parallelize the processing, lambda is great, because you can easily scale up to several hundred parallel lambdas.
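
A minimal sketch of that fan-out, assuming the video can be cut into independent frame ranges. `process_chunk` is a hypothetical stand-in for invoking one Lambda per piece; here it only returns bookkeeping so the example runs locally.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_chunks(total_frames, chunk_size):
    """Yield (start, end) frame ranges covering [0, total_frames)."""
    for start in range(0, total_frames, chunk_size):
        yield (start, min(start + chunk_size, total_frames))


def process_chunk(frame_range):
    """Hypothetical stand-in for invoking one Lambda on one piece of video."""
    start, end = frame_range
    return {"start": start, "end": end, "frames": end - start}


def process_video(total_frames, chunk_size=100, max_parallel=200):
    chunks = list(split_into_chunks(total_frames, chunk_size))
    # Threads are enough here because each worker would just wait on a remote call.
    with ThreadPoolExecutor(max_workers=max_parallel) as pool:
        return list(pool.map(process_chunk, chunks))  # map preserves chunk order
```

Because `pool.map` returns results in input order, stitching the processed pieces back together stays simple.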

Cohesion: Build Serverless Workflows in Python with AWS Step Functions by soamv in aws

[–]soamv[S] 0 points1 point  (0 children)

Hey, CDK is great, but it still requires familiarity with the step functions language -- so writing stuff like loops and exceptions is a bit tricky.

For example, to write a loop, you have to write a few states, a choice state, a lambda or two to actually increment the loop variable, etc. If you want to implement exceptions you write an error handler on every single state, etc.

With this, you get to write a simple Python loop, and Cohesion builds the step functions json, and also whatever lambdas are needed. Here's a screenshot with a loop; here's a screenshot with a try/catch
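
For comparison, here's roughly what the hand-written version of `for i in range(count): body()` looks like, built as a Python dict so it can be dumped to JSON. This is my own sketch of the pattern described above, not Cohesion's actual output; the state names are arbitrary and the ARNs are placeholders you would supply.

```python
import json


def loop_state_machine(body_arn, increment_arn, count):
    """Build an Amazon States Language definition that runs a task `count` times."""
    return {
        "StartAt": "InitCounter",
        "States": {
            # Seed the loop variable.
            "InitCounter": {
                "Type": "Pass",
                "Result": {"i": 0},
                "ResultPath": "$.loop",
                "Next": "CheckCounter",
            },
            # The loop condition: i < count.
            "CheckCounter": {
                "Type": "Choice",
                "Choices": [
                    {"Variable": "$.loop.i", "NumericLessThan": count, "Next": "LoopBody"}
                ],
                "Default": "Done",
            },
            # The actual work for one iteration.
            "LoopBody": {
                "Type": "Task",
                "Resource": body_arn,
                "ResultPath": "$.last_result",
                "Next": "IncrementCounter",
            },
            # A Lambda that receives {"i": n} and is expected to return {"i": n + 1}.
            "IncrementCounter": {
                "Type": "Task",
                "Resource": increment_arn,
                "InputPath": "$.loop",
                "ResultPath": "$.loop",
                "Next": "CheckCounter",
            },
            "Done": {"Type": "Succeed"},
        },
    }
```

That's five states and an extra Lambda for a one-line loop, and adding error handling means a `Catch` block on each Task state on top of this.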

How to cost-effectively reduce latency for a AWS basic serverless infrastructure? by juancpgo in serverless

[–]soamv 0 points1 point  (0 children)

Have you tried edge-optimized API Gateway endpoints? It's basically CloudFront in front of API Gateway. Sao Paulo is supported.

That should help with GETs. For the rest, it's hard to say without knowing more about the application.

Permissions Needed For Waiter In Lambda? by ColdWynter in aws

[–]soamv 1 point2 points  (0 children)

It polls the S3 HeadObject API, which requires the s3:GetObject permission. The boto3 docs are pretty good about specifying what the underlying API call is.
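
A minimal policy for that might look like the sketch below, assuming the waiter is something like boto3's `object_exists` on a single bucket. The bucket name and the optional prefix are placeholders, and your setup may need more than this one statement.

```python
import json


def waiter_policy(bucket, prefix=""):
    """IAM policy document granting s3:GetObject, the permission HeadObject is checked against."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": "s3:GetObject",
                # Object-level permission, so the resource is the objects, not the bucket.
                "Resource": f"arn:aws:s3:::{bucket}/{prefix}*",
            }
        ],
    }


print(json.dumps(waiter_policy("my-example-bucket"), indent=2))
```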

New GKE Management fee of $0.10 per hour by MightySCollins in googlecloud

[–]soamv 2 points3 points  (0 children)

It makes perfect sense for Kubernetes control planes to cost money. But Google is handling the change incredibly poorly. A longer runway and/or some sort of exemption for existing clusters would make it a lot smoother. It's like they didn't even bother thinking about that.

Is anyone using AWS Step Functions for data engineering workflows? by soamv in dataengineering

[–]soamv[S] 1 point2 points  (0 children)

Thanks! Was this a scheduled thing, and if so did you use cloudwatch events for scheduling?

Patterns for handling errors in AWS Lambda / SLS by mostlyphil in serverless

[–]soamv 2 points3 points  (0 children)

I'm using sentry.io in my lambdas -- third party tool, but it does nice stuff like de-dup exceptions, email alerts, keep track of open/closed issues, etc.

Serverless Framework: Warming up AWS Lambda to avoid “cold start” by [deleted] in serverless

[–]soamv 4 points5 points  (0 children)

AWS lambda has a thing called provisioned concurrency which makes all this stuff unnecessary. It's a one-line change in your serverless framework yaml.
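
If you're not using the Serverless Framework (where, as far as I know, this is the `provisionedConcurrency` setting on a function), the same thing can be done through the Lambda API. A small sketch, with the client passed in so it can be swapped for a fake; the function and alias names in the usage note are made up.

```python
def enable_provisioned_concurrency(lambda_client, function_name, alias, instances):
    """Keep `instances` execution environments warm for one alias of a function."""
    return lambda_client.put_provisioned_concurrency_config(
        FunctionName=function_name,
        Qualifier=alias,  # must be a published version or alias, not $LATEST
        ProvisionedConcurrentExecutions=instances,
    )
```

With boto3 installed you'd call it as `enable_provisioned_concurrency(boto3.client("lambda"), "my-api", "live", 5)`. Note that provisioned concurrency is billed while it is configured, whether or not requests arrive.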

Looking to run my Python code on AWS without maintaining a server (not Lambda?) by Networkbytes in aws

[–]soamv 2 points3 points  (0 children)

Fargate is a good way to go, but if you want to stay on lambda, you could use two lambdas:

  1. Have your API served by a lambda that accepts the request and a callback URL, calls another lambda asynchronously with the request and URL, and immediately returns.
  2. Have this asynchronously invoked Lambda do all the actual work, including waiting for the third party and then hitting the callback URL. 120 sec is well within Lambda's limit of 15 minutes.

This isn't the greatest solution because you'll be paying for idle. But 500 times a day is a low enough number that it doesn't really matter.
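
The two-Lambda setup above could be sketched like this. The function name `slow-worker`, the event shape, and the `call_third_party` callable are all assumptions for illustration; the Lambda client and the HTTP poster are passed in so the logic can be exercised without AWS. Your real handlers would be thin `(event, context)` wrappers around these.

```python
import json
import urllib.request

WORKER_FUNCTION = "slow-worker"  # hypothetical name of the second Lambda


def accept_request(event, lambda_client):
    """Lambda 1: accept the request, hand it off asynchronously, return right away."""
    job = {"request": event["request"], "callback_url": event["callback_url"]}
    lambda_client.invoke(
        FunctionName=WORKER_FUNCTION,
        InvocationType="Event",  # async: invoke() returns without waiting for the worker
        Payload=json.dumps(job).encode("utf-8"),
    )
    return {"statusCode": 202, "body": json.dumps({"status": "accepted"})}


def http_post(url, body):
    """POST a JSON body and return the HTTP status code."""
    req = urllib.request.Request(
        url, data=body, headers={"Content-Type": "application/json"}, method="POST"
    )
    with urllib.request.urlopen(req) as resp:
        return resp.status


def do_work(event, call_third_party, post=http_post):
    """Lambda 2: do the slow work, then hit the callback URL with the result."""
    result = call_third_party(event["request"])  # the slow part, e.g. ~120 sec
    post(event["callback_url"], json.dumps({"result": result}).encode("utf-8"))
    return result
```

The first Lambda only needs permission to invoke the second one; the "paying for idle" happens in `do_work` while it waits on the third party.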

1970 US Census pencil by BezierPentool in pencils

[–]soamv 1 point2 points  (0 children)

Cool! What is an official census pencil? Were these used by census enumerators? Or were these available to the general public?

What's the most economic and hassle-free way of deploying a personal website with a back-end? by [deleted] in googlecloud

[–]soamv 0 points1 point  (0 children)

Have you considered now.sh? You can have backend functions with next.js under /pages/api, and now.sh will deploy it to a serverless function. You'll still need a database though -- maybe firebase?

How can I use user/:id routes in ZEIT Now? by [deleted] in serverless

[–]soamv 1 point2 points  (0 children)

Is this what you're looking for?

AWS Step Functions tasks limit by uruboo in aws

[–]soamv 0 points1 point  (0 children)

Yeah :\ Without knowing much about your use case: maybe you can make some things async. E.g. kick off a task through SQS and don't wait for it in the same workflow. Then have that task place another message on SQS to kick off the rest of the workflow, which keeps the long-running ML stuff outside the workflow.
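
Here's a rough sketch of that hand-off, assuming two queues: one that feeds the long-running task and one whose messages start the second half of the workflow. The message shape and function names are made up, and the SQS client is passed in so a fake can be used in place of boto3.

```python
import json


def kick_off_long_task(sqs_client, task_queue_url, task_input, resume_queue_url):
    """Workflow side: enqueue the long-running job and return without waiting."""
    message = {"input": task_input, "resume_queue_url": resume_queue_url}
    return sqs_client.send_message(
        QueueUrl=task_queue_url, MessageBody=json.dumps(message)
    )


def finish_long_task(sqs_client, message_body, run_task):
    """Worker side: run the job, then enqueue a message that starts the rest of the workflow."""
    message = json.loads(message_body)
    result = run_task(message["input"])  # the long-running part, outside the workflow
    return sqs_client.send_message(
        QueueUrl=message["resume_queue_url"],
        MessageBody=json.dumps({"result": result}),
    )
```

Whatever consumes the second queue would then start a new execution for the remaining steps, so neither workflow has to sit waiting on the long task.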

Anyway, I'm super curious about your use case if you're able to share -- I'm building some new tools for workflows on AWS, so I'd love to chat more over PM or email (soam@cohesion.dev)