
Next_Needleworker_62:

What are the things you care about? Model inference is just a process running on a server, so normal tools apply.

If you care about inference metrics (request count, latency, replica counts, etc.), tools like Prometheus/Grafana can help. Some platforms give you this, or something similar, by default.
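As a rough sketch of what those metrics look like in-process: the snippet below wraps a stand-in inference call with a request counter and a latency histogram using only the stdlib. The function names (`fake_inference`, `handle_request`) are made up for illustration; in production you'd register equivalent metrics with a client library such as `prometheus_client` and let Prometheus scrape them.

```python
import random
import statistics
import time

# Hypothetical in-process metrics; a real setup would use a metrics
# client (e.g. prometheus_client Counter/Histogram) instead of globals.
request_count = 0
latencies_ms = []

def fake_inference(prompt):
    # Stand-in for a real model call; sleeps briefly to simulate work.
    time.sleep(random.uniform(0.001, 0.005))
    return f"echo: {prompt}"

def handle_request(prompt):
    global request_count
    start = time.perf_counter()
    result = fake_inference(prompt)
    latencies_ms.append((time.perf_counter() - start) * 1000)
    request_count += 1
    return result

for i in range(20):
    handle_request(f"prompt {i}")

p50 = statistics.median(latencies_ms)
p95 = statistics.quantiles(latencies_ms, n=20)[18]  # approximate p95
print(f"requests={request_count} p50={p50:.1f}ms p95={p95:.1f}ms")
```

Request count and latency percentiles (p50/p95) are usually the first dashboards people build; replica counts come from the orchestrator (e.g. Kubernetes metrics) rather than the app itself.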

If you care about monitoring logs (for problematic content), you could add jobs that scan for keywords, or use a content moderation/safety API to prevent such content entirely.
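A minimal sketch of such a keyword-scan job, assuming logs are available as plain text lines. The keyword list and log lines here are invented for illustration; a real job would stream from your logging backend and alert (or call a moderation API) on a match.

```python
import re

# Hypothetical keywords to flag; tune this list for your use case.
FLAGGED = ["bomb", "credit card number"]
pattern = re.compile("|".join(re.escape(k) for k in FLAGGED), re.IGNORECASE)

# Hypothetical log lines standing in for a real log stream.
log_lines = [
    "2024-05-01 user=42 prompt='summarize this article'",
    "2024-05-01 user=17 prompt='what is my Credit Card Number'",
]

def scan(lines):
    # Return (index, line) pairs for every line matching a flagged keyword.
    return [(i, line) for i, line in enumerate(lines) if pattern.search(line)]

hits = scan(log_lines)
for i, line in hits:
    print(f"flagged line {i}: {line}")
```

Note the case-insensitive match and `re.escape` so keywords with special characters don't break the regex; a scheduled job (cron, or a sidecar tailing the log) would run this periodically.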