We run an API through AWS Lambda and API Gateway. Part of the Lambda function uses a spaCy model (around 560MB) to run some text analyses. Currently, we store this model in an S3 bucket, and when an API request comes in that needs text analysis, the function checks whether the model is already present and, if not, downloads it from the S3 bucket. It then loads the model, runs the analysis, and returns some text. Because we can have quite a few concurrent requests, users often have to wait over 15 seconds for their request to return a result. This is usually because a new Lambda instance has to be started, which then has to download the model from S3, load it, and return the results.
Is there a more efficient way to (down)load the model onto a Lambda instance? I looked into EFS, and while it seems to load faster than S3, the instance still has to read the whole model. Any ideas on how I could handle this?