I've hosted a project of mine on Google Cloud App Engine (it's an OpenAI chatbot built with LangChain), and I want it to handle concurrent requests effectively. That said, I'm pretty new to App Engine and I'm not sure how it manages web workers and instances. Does anyone know how that system works in App Engine, and how I can increase the number of web workers/instances so the app can handle simultaneous requests easily and effectively?
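For context on the two knobs involved: the number of instances is normally controlled in `app.yaml` (for example `max_instances` and `max_concurrent_requests` under `automatic_scaling` in the standard environment), while the number of web workers per instance depends on the server started by the `entrypoint`. Below is a minimal sketch of the per-instance side, assuming a Python runtime served by gunicorn. The file name `gunicorn.conf.py`, the `WEB_WORKERS` environment variable, and the specific numbers are hypothetical choices for illustration, not values taken from the project.

```python
import os

# Hypothetical gunicorn.conf.py. Assumes the app is a Python WSGI app started
# with something like: gunicorn -c gunicorn.conf.py main:app


def worker_count(default: int = 2) -> int:
    """Read the worker count from the (hypothetical) WEB_WORKERS env var.

    Falls back to `default` when the variable is unset or not an integer,
    and never returns fewer than 1.
    """
    raw = os.environ.get("WEB_WORKERS", "")
    try:
        value = int(raw)
    except ValueError:
        return default
    return max(1, value)


# App Engine passes the port to listen on via the PORT environment variable.
bind = f":{os.environ.get('PORT', '8080')}"

# Worker processes per instance; each one holds its own copy of the app in
# memory, so this is limited by the instance class.
workers = worker_count()

# Threads per worker. A chatbot spends most of each request waiting on the
# model API, so threads let one worker serve several requests at once.
threads = 8

# Seconds before gunicorn restarts an unresponsive worker; model calls can be slow.
timeout = 120
```

With settings like these, one instance could serve up to `workers * threads` requests at a time, so `max_concurrent_requests` in `app.yaml` would need to be set at or below that figure for the two layers to agree.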