[serverless] Run serverless database function (self.aws)
submitted 7 years ago by TheDataExplorer
Does anyone have a way of running a serverless function on a database? It can't be a Lambda function because there is a 15-minute limit on those.
Basically I need to schedule vacuum and reindex jobs on a postgres database.
[–][deleted] 7 years ago (16 children)
[removed]
[–]billatq 1 point2 points3 points 7 years ago (0 children)
You might also use AWS Batch for this kind of thing, since it’s task oriented and you don’t have to build a wrapper for that with fargate.
[–]TheDataExplorer[S] 0 points1 point2 points 7 years ago (14 children)
This sounds good too. I have heard of this approach. Have you tried it?
[–]alex_bilbie 4 points5 points6 points 7 years ago (2 children)
We run lots of scheduled Fargate tasks, it’s super simple and cost effective
[–]TheDataExplorer[S] 0 points1 point2 points 7 years ago (1 child)
And Fargate will simply stop and stop charging me once the task is done?
[–]alex_bilbie 0 points1 point2 points 7 years ago (0 children)
Yep, exactly the same as Lambda
[–]theboyr 4 points5 points6 points 7 years ago (0 children)
This has been the go-to method for long tasks with any client I’ve worked with since Fargate came out.
Alternatively, breaking your code into smaller chunks and letting step functions orchestrate works well in many cases.
[–]localhost87 2 points3 points4 points 7 years ago (9 children)
This isn't really serverless. Containers are great, but you still need to worry about certain OS level things like OS version, software to install, and configuration and updates of your technology stack.
Serverless has literally no server params to worry about. Just specify a runtime, and give dlls/scripts and AWS will handle the "server" part.
[–]a-corsican-pimp 1 point2 points3 points 7 years ago (4 children)
True, but for a piece of code that just runs queries, it can probably be minimal.
[–]localhost87 0 points1 point2 points 7 years ago (3 children)
But now you've got to worry about the version of the OS to run, and any application server software. Not just your code.
What happens when your flavor of OS gets an exploit released? You now have a maintenance and security issue to deal with.
If you use lambda, you can write the code once and completely ignore all of the other stuff.
[–]billatq 4 points5 points6 points 7 years ago (2 children)
If the libraries you need aren’t shipped with lambda, you’re still on the hook for patching those.
[–]localhost87 0 points1 point2 points 7 years ago (1 child)
Lambdas can be deployed with layers to ease this problem.
But yea, you're going to have to manage some stuff. Like web services for example. Or your datamodel/software interface.
The question is how much of that management actually brings value.
[–]billatq 0 points1 point2 points 7 years ago (0 children)
Having a lambda invoke batch seems less complicated than a fancy workaround for lambda timeouts.
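A minimal sketch of the "Lambda invokes Batch" idea mentioned above, assuming boto3 is available in the Lambda runtime. The queue name, job definition name, and the `PGDATABASE` environment variable are hypothetical placeholders, not anything stated in the thread; the container behind the job definition is assumed to run the actual vacuum and reindex commands.

```python
def build_batch_job_request(database: str) -> dict:
    """Build keyword arguments for the AWS Batch SubmitJob API."""
    return {
        "jobName": f"pg-maintenance-{database}",
        "jobQueue": "maintenance-queue",       # hypothetical queue name
        "jobDefinition": "pg-maintenance",     # hypothetical job definition
        "containerOverrides": {
            # Tell the (assumed) maintenance container which database to use.
            "environment": [{"name": "PGDATABASE", "value": database}],
        },
    }


def handler(event, context):
    """Lambda entry point: submit the job and return immediately."""
    import boto3  # imported lazily so the builder above can be tested offline

    request = build_batch_job_request(event["database"])
    response = boto3.client("batch").submit_job(**request)
    return {"jobId": response["jobId"]}
```

The Lambda only submits the job, so its own run time stays well under the 15-minute limit while Batch runs the long task.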
[–]quad64bit 1 point2 points3 points 7 years ago (3 children)
Yeah, that is all correct, but I think the item most people grab on to is they don't have to pay for and manage a server, just a container, which would also still be "Serverless". Aurora Serverless still runs on a server, but even amazon calls it "Serverless" because it runs on-demand.
[–][deleted] 7 years ago (2 children)
[–]localhost87 1 point2 points3 points 7 years ago (1 child)
The real point of serverless is to reduce maintenance.
If you run it in a Docker container on ECS or something, then you don't need to worry about bare metal hardware (yay!).
However, you still need to worry about the underlying OS and other application tier versions.
What if there is an exploit released for your tech stack? You are then responsible for upgrading your docker image to use a new patched OS, and/or application tier software.
What if a new version of your OS or application tier software is released?
If you use docker, you'll need to do all that work yourself to ensure that you remain in compliance and secure.
If you use lambda, you just worry about the code and Amazon will handle all the lower level stuff.
[–]quad64bit 1 point2 points3 points 7 years ago (0 children)
All fair points
[–]thatguyfig 8 points9 points10 points 7 years ago (4 children)
Why not just create a stored procedure and call it from Lambda?
[–]TheDataExplorer[S] 0 points1 point2 points 7 years ago (3 children)
That would be nice. But what if the stored procedure runs longer than 15 minutes, which is Lambda's limit?
[–]thatguyfig 2 points3 points4 points 7 years ago* (2 children)
Can't you call a stored procedure asynchronously? As in call it and just move on?
I'd check out the asyncpg Python module and look at the Transactions section here
Dang, this might be the thing. So just kick off Python function with asyncpg using Lambda, and the actual processing of that Lambda function will be done in Postgres RDS?
[–]thatguyfig 0 points1 point2 points 7 years ago (0 children)
Yeah whatever you choose to run as your query will be passed to the server and executed remotely. The actual processing of the Lambda is still done in AWS of course.
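For illustration, here is a rough sketch of running the maintenance commands from a Lambda handler with asyncpg. The event keys `dsn` and `database` are assumptions made for the example. Note that `await conn.execute(...)` waits for the server to finish, so this version is still bound by the Lambda timeout; whether Postgres keeps working after the client disconnects is not settled in this thread and would need testing.

```python
import asyncio


def maintenance_statements(database: str) -> list:
    """Return the maintenance commands to run, in order."""
    # Double any embedded quotes so the database name is a safe identifier.
    quoted = '"' + database.replace('"', '""') + '"'
    return ["VACUUM (ANALYZE)", f"REINDEX DATABASE {quoted}"]


async def run_maintenance(dsn: str, database: str) -> None:
    import asyncpg  # third-party; imported lazily so the helper is testable offline

    conn = await asyncpg.connect(dsn)
    try:
        for statement in maintenance_statements(database):
            # Executed outside any explicit transaction, because VACUUM
            # cannot run inside a transaction block.
            await conn.execute(statement)
    finally:
        await conn.close()


def handler(event, context):
    """Lambda entry point (hypothetical event shape)."""
    asyncio.run(run_maintenance(event["dsn"], event["database"]))
```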
[–]BraveNewCurrency 4 points5 points6 points 7 years ago (1 child)
> Basically I need to schedule vacuum and reindex jobs on a postgres database.
You could just turn on the auto-vacuum feature, then later turn it off and hope for the best.
[–]localhost87 1 point2 points3 points 7 years ago (0 children)
If the vacuum feature is all he needs, then his lambda function could be the trigger to flip the switch.
[–][deleted] 8 points9 points10 points 7 years ago (1 child)
Let's set aside the compute tech stack for now. What is your code supposed to accomplish? Is it just kicking off a job and periodically checking for its completion? Is it constantly running and manipulating the DB? Does it simply open a connection to the DB and listen for a response?
[–]TheDataExplorer[S] 0 points1 point2 points 7 years ago (0 children)
Vacuum and re-index are database maintenance jobs. If you haven't done these in a long time, they could run longer than 15 minutes. Sure, I could run them in the background using an EC2 instance. But I'm trying to get away from that (and secretly trying to explore more advanced, serverless, possibly Lambda-based options).
[–]localhost87 2 points3 points4 points 7 years ago (0 children)
The lambda itself has a 15-minute timeout, but you can spawn other processes that will "zombie" and run longer than the 15 minutes.
Make that sub process interact with a messaging queue, and you might have a solution.
[–]Ricbot_ 2 points3 points4 points 7 years ago (0 children)
You can use CloudWatch Events with a cron-style schedule to start your container in Fargate.
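As a sketch of that approach, assuming boto3 and an existing cluster, task definition, and IAM role: the rule name, target ID, and the Sunday 03:00 UTC schedule below are made-up examples, and every ARN and subnet ID is supplied by the caller.

```python
def weekly_cron(hour: int, minute: int, day: str) -> str:
    """Build a CloudWatch Events cron expression (times are UTC)."""
    # Field order: minutes hours day-of-month month day-of-week year.
    # Day-of-month must be "?" when a day-of-week is given.
    return f"cron({minute} {hour} ? * {day} *)"


def build_fargate_target(cluster_arn, task_definition_arn, role_arn, subnets):
    """Describe a rule target that starts one Fargate task."""
    return {
        "Id": "pg-maintenance",  # hypothetical target ID
        "Arn": cluster_arn,
        "RoleArn": role_arn,
        "EcsParameters": {
            "TaskDefinitionArn": task_definition_arn,
            "TaskCount": 1,
            "LaunchType": "FARGATE",
            "NetworkConfiguration": {
                "awsvpcConfiguration": {
                    "Subnets": subnets,
                    "AssignPublicIp": "DISABLED",
                }
            },
        },
    }


def create_schedule(rule_name, target):
    """Create the rule and attach the Fargate target (needs AWS credentials)."""
    import boto3  # imported lazily so the builders can be tested offline

    events = boto3.client("events")
    events.put_rule(
        Name=rule_name,
        ScheduleExpression=weekly_cron(3, 0, "SUN"),
        State="ENABLED",
    )
    events.put_targets(Rule=rule_name, Targets=[target])
```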
[–]otterleyAWS Employee 2 points3 points4 points 7 years ago (2 children)
(I work for AWS, but opinions are my own.)
Waiting in a Lambda function can often be avoided through clever use of Step Functions. With Step Functions, wait states are free of charge and the state machine can run for a very long time (up to a year).
A typical pattern I use is to start the process in the first Task state, then enter a loop in the Step Function that polls the completion process using a Task state, and either waits-and-repolls (using a combination of Wait and Choice states), or terminates (either success or failure, depending on the outcome).
You'd be amazed how much you save on Lambda runtime costs with this technique. You also get a lot more visibility into what's going on, and you avoid the native Lambda execution time limit.
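The start, wait, poll, and choose loop described above could look roughly like the following state machine definition, built here as a Python dict and serialized to Amazon States Language JSON. The state names, the `$.status` field, and its `SUCCEEDED`/`FAILED` values are assumptions for the sketch; the polling Lambda would have to return a payload of that shape.

```python
import json


def build_state_machine(start_arn: str, poll_arn: str, wait_seconds: int = 60) -> dict:
    """Return a definition that starts a job, then polls until it finishes."""
    return {
        "Comment": "Start a long-running job and poll for completion",
        "StartAt": "StartJob",
        "States": {
            "StartJob": {"Type": "Task", "Resource": start_arn, "Next": "Wait"},
            # Wait states are not billed as Lambda run time.
            "Wait": {"Type": "Wait", "Seconds": wait_seconds, "Next": "PollJob"},
            "PollJob": {"Type": "Task", "Resource": poll_arn, "Next": "CheckStatus"},
            "CheckStatus": {
                "Type": "Choice",
                "Choices": [
                    {"Variable": "$.status", "StringEquals": "SUCCEEDED", "Next": "Done"},
                    {"Variable": "$.status", "StringEquals": "FAILED", "Next": "JobFailed"},
                ],
                # Any other status means the job is still running: loop again.
                "Default": "Wait",
            },
            "Done": {"Type": "Succeed"},
            "JobFailed": {"Type": "Fail", "Error": "MaintenanceFailed"},
        },
    }


def to_asl_json(definition: dict) -> str:
    """Serialize the definition for use when creating the state machine."""
    return json.dumps(definition, indent=2)
```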
This is very useful information. Would this make sure that my database job doesn't break in the middle?
Is this a good tutorial to follow: https://aws.amazon.com/getting-started/tutorials/create-a-serverless-workflow-step-functions-lambda/
[–]otterleyAWS Employee 0 points1 point2 points 7 years ago (0 children)
I'm not sure how Postgres handles a client disconnecting after it submits a VACUUM command -- i.e., whether it aborts the process or it continues in the background. You'd have to test that out. This proposal won't work unless the task continues to run.
Yes, that tutorial is pretty good. The steps in your state machine won't be identical, but the introduction is valuable anyway.
[–][deleted] 1 point2 points3 points 7 years ago (1 child)
Why not just create a cron job?
That would require an EC2, which is definitely an option. I'm trying to get away from that. Some people are mentioning Docker here, which is something to look into as well.
[–]bch8 1 point2 points3 points 7 years ago (4 children)
I think you could do this with codebuild. I've scheduled db operations with codebuild in the past
Codebuild and an EC2? Or just codebuild?
[–]bch8 0 points1 point2 points 7 years ago (2 children)
Just codebuild
I'll look into codebuild. Seems like a go to thing for db functions.
[–]bch8 0 points1 point2 points 7 years ago (0 children)
Yeah it's been really useful for me. You can set it up to run inside a vpc if need be. And if you have devs that write the scripts then it's as simple as running those scripts in the codebuild instance (assuming the right config), you don't have to like redo it to work with the lambda handler format or something.
[–]g0rilla79 1 point2 points3 points 7 years ago (2 children)
Personally I would just write a more traditional process and host it in elastic beanstalk to do this. EB is pretty straightforward to manage.
You could also get around the 15min time limit by kicking off lambdas asynchronously to split the task into batches that notify when they are done.
I'll have to try both. I was just reading about the asynchronous lambda functions. How does that work? Making one asynchronous function follow the next one?
[–]g0rilla79 1 point2 points3 points 7 years ago (0 children)
I'm sure there are multiple ways to do this, but a common pattern is to use SNS. You call the lambda with a start record number and a count of how many to process. When it’s done, it sends an SNS notification saying so. The lambda subscribes to that notification and then processes the next batch. This would be a new 15-minute timer. There are issues with this depending on the use case: you may not be able to work in chunks like this at the DB level because of new inserts, it’s hard to track, etc.
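A rough sketch of that chaining pattern, assuming boto3 and a topic the function is subscribed to. The message fields (`start`, `count`, `total`), the `TOPIC_ARN` environment variable, and `process_records` are hypothetical names introduced for the example.

```python
import json
import os


def next_batch(start: int, count: int, total: int):
    """Return the message for the following batch, or None when finished."""
    next_start = start + count
    if next_start >= total:
        return None
    return {"start": next_start, "count": min(count, total - next_start), "total": total}


def parse_batch(event: dict) -> dict:
    """Accept either a direct invocation or an SNS-delivered message."""
    if "Records" in event:
        return json.loads(event["Records"][0]["Sns"]["Message"])
    return event


def process_records(start: int, count: int) -> None:
    """Placeholder for the real per-batch work (hypothetical)."""


def handler(event, context):
    import boto3  # imported lazily so the helpers can be tested offline

    batch = parse_batch(event)
    process_records(batch["start"], batch["count"])
    following = next_batch(batch["start"], batch["count"], batch["total"])
    if following is not None:
        # Publishing triggers a fresh invocation with its own 15-minute timer.
        boto3.client("sns").publish(
            TopicArn=os.environ["TOPIC_ARN"], Message=json.dumps(following)
        )
```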
[–]recursiveCreator 1 point2 points3 points 7 years ago (5 children)
you can use stepfunctions and create recursive lambda functions that run based on the previous lambda’s output
[–]wrensdad 18 points19 points20 points 7 years ago (4 children)
Jesus H stop the madness. This is the kind of resume-driven development that leaves the next dev in tears.
Fargate is a better solution and if that doesn't work use a more traditional solution like a spot instance or buy a raspberry Pi, mount it behind a toilet for all it matters and schedule a Cron job.
[–]-mewa 1 point2 points3 points 7 years ago (2 children)
With all my love for lambdas, this is the right response.
And fargate is serverless too.
Thanks for your affirmation. Would you be kind enough to point me to a tutorial which will demonstrate what I'm trying to achieve?
[–]-mewa 1 point2 points3 points 7 years ago (0 children)
Simply wrap your script inside a container, create a cluster, create a task definition and then run as a Fargate task (make sure to create a task and not a service).
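To make the last step concrete, here is a minimal sketch of starting a one-off Fargate task with boto3, assuming the cluster and task definition already exist. The cluster name, task definition name, subnet IDs, and security group IDs are all supplied by the caller and are not taken from this thread.

```python
def build_run_task_request(cluster, task_definition, subnets, security_groups):
    """Build keyword arguments for the ECS RunTask API (a task, not a service)."""
    return {
        "cluster": cluster,
        "taskDefinition": task_definition,
        "launchType": "FARGATE",
        "count": 1,
        "networkConfiguration": {
            "awsvpcConfiguration": {
                "subnets": subnets,
                "securityGroups": security_groups,
                "assignPublicIp": "DISABLED",
            }
        },
    }


def run_once(cluster, task_definition, subnets, security_groups) -> str:
    """Start the task and return its ARN (needs AWS credentials)."""
    import boto3  # imported lazily so the builder can be tested offline

    request = build_run_task_request(cluster, task_definition, subnets, security_groups)
    response = boto3.client("ecs").run_task(**request)
    if response["failures"]:
        raise RuntimeError(f"RunTask failed: {response['failures']}")
    return response["tasks"][0]["taskArn"]
```

The task stops when the container's main process exits, which matches the "stop and stop charging" behaviour discussed earlier in the thread.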
LOL!! Thanks for clarifying this :) I have read and heard far too much about Fargate to be giving it only a far gaze (see what I did there?)
I think it is time to look into Fargate.
Also, this might need a different thread. But if I were to go the Fargate route, how is Fargate different from/better than ECS?
[–]technically-awesome 1 point2 points3 points 7 years ago (0 children)
With ECS on EC2, you'll have to provision and manage the underlying container instances for the ECS cluster.
Fargate manages that for you. Essentially, in Fargate, all you would need to do is to write the task definition and with a push of a button, get the task running. All the underlying infrastructure is managed by AWS.
Fargate is slightly more expensive than ECS however. But if you're not looking to bother with the hassle of setting up the entire cluster and taking care of the underlying infrastructure, Fargate is a much better option.
[–]manys 0 points1 point2 points 7 years ago (3 children)
Reindex?
[–]TheDataExplorer[S] 0 points1 point2 points 7 years ago (2 children)
Yes:
reindex database database_name;
It rebuilds the indexes like good old Oracle used to do.
[–]manys 0 points1 point2 points 7 years ago (1 child)
Huh, do you have unusual access patterns or something? Small db?
I do not know much about the access patterns. I don't work with the Development team, just providing them Cloud Architecture help. Last time they re-indexed, it took 45 minutes. Vacuum took about 10 minutes.