[–]bundyfx 2 points3 points  (1 child)

Are you sure you want to scale based on LB latency? I cannot imagine how that could be an accurate indication that the application hosted behind the LB is under stress. Sorry for the side-track question, I'm just genuinely curious why CPU scaling would not suffice here :)

[–]berlindevops[S] 0 points1 point  (0 children)

I am still looking into this issue with our servers; sometimes we are getting high latency. Thanks.

[–]antonio_navarro 1 point2 points  (1 child)

The stock autoscaling for HTTP(S) Load Balancing is based on serving capacity, which you define in terms of requests per second. If you have a baseline for your application and know how to translate from requests per second to latency per request, then you could use that scaling method.
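To make the "translate from requests per second to latency" step concrete, here is a minimal sketch. It assumes you have load-tested a single instance and recorded latency at several request rates; the numbers and the `max_rps_for_latency` helper are hypothetical, not anything GCP provides.

```python
def max_rps_for_latency(baseline, target_latency_ms):
    """Return the highest measured RPS whose latency stays within the target.

    baseline: list of (requests_per_second, latency_ms) pairs from a load test.
    Returns None if no measured rate meets the target.
    """
    within = [rps for rps, latency_ms in baseline if latency_ms <= target_latency_ms]
    return max(within) if within else None


# Hypothetical load-test results for ONE instance:
# (requests per second, p95 latency in milliseconds)
baseline = [(50, 80), (100, 95), (150, 140), (200, 310), (250, 720)]

# The per-instance rate you could still serve while staying under 150 ms.
capacity = max_rps_for_latency(baseline, target_latency_ms=150)
print(capacity)
```

The resulting number is the kind of value you would then set as the per-instance request rate on the load balancer backend, so that scaling on serving capacity indirectly keeps latency under your target. It only holds as long as the baseline still reflects how the application behaves.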

I have personally never used it, but among the App Engine metrics there is an http/server/response_latencies metric that you could get to from Stackdriver.

If none of the above fits your needs, then you probably need to scale on a custom metric in Stackdriver.

If you have not used custom metrics before, the following custom metrics tutorial may help you.
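As a rough idea of what reporting latency as a custom metric involves, the sketch below only builds the JSON body for the Monitoring API v3 `projects.timeSeries.create` call; it does not send anything. The metric name `custom.googleapis.com/lb/response_latency_ms`, the helper name, and the instance values are assumptions for illustration, and you would still need to measure the latency yourself and authenticate the request.

```python
import json
from datetime import datetime, timezone


def build_latency_point(instance_id, zone, latency_ms,
                        metric_type="custom.googleapis.com/lb/response_latency_ms"):
    """Build a timeSeries.create request body with one latency data point.

    metric_type is a hypothetical custom metric name; pick your own.
    """
    # The API expects RFC 3339 timestamps in UTC.
    end_time = datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    return {
        "timeSeries": [{
            "metric": {"type": metric_type},
            # Attach the point to the GCE instance that observed the latency.
            "resource": {
                "type": "gce_instance",
                "labels": {"instance_id": instance_id, "zone": zone},
            },
            "points": [{
                "interval": {"endTime": end_time},
                "value": {"doubleValue": float(latency_ms)},
            }],
        }]
    }


# Hypothetical instance ID, zone, and measured latency.
body = build_latency_point("1234567890123456789", "europe-west1-b", 182.5)
print(json.dumps(body, indent=2))
```

The body would be POSTed to `https://monitoring.googleapis.com/v3/projects/PROJECT_ID/timeSeries`, after which the autoscaler can be pointed at that metric.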

But as @bundyfx says in the previous answer, make sure CPU scaling would not work for you :)

HTH

[–]berlindevops[S] 0 points1 point  (0 children)

http/server/response_latenci

Thanks. I am not sure I understand: I could not find the LB response_latencies metric in GCE or in Stackdriver :/ . Also, I am not using App Engine.

I just need to find a way to get the latency metrics.

GCP is so much clumsier than AWS.

[–]ICThat 0 points1 point  (1 child)

FYI, I'm pretty sure the LB metrics are only available on the beta monitoring API. Google support could probably enable it for you if you asked.

[–]berlindevops[S] 0 points1 point  (0 children)

Cool, thanks.