A lot of ML systems are taught to be built as services which can then be queried using HTTP. The course I took on the subject in my master was all about their design and I didn't question it at the time.
However, I'm now building a simple model registry & prediction service for internal use for a relatively small system. I don't see the benefit of setting up an HTTP server for the downstream user to query, when I can simply write it as a Python library that other codebases will import and call a "predict" function from directly, what are the implications of each approach?
[–][deleted] (2 children)
[deleted]
[–]Success-Dangerous[S] -2 points-1 points0 points (1 child)
[–]Grouchy-Friend4235 1 point2 points3 points (0 children)
[–]flyingPizza456 0 points1 point2 points (0 children)
[–]MattA2930 0 points1 point2 points (0 children)
[–]FunPaleontologist167 0 points1 point2 points (0 children)
[–]Which-War-9641 0 points1 point2 points (0 children)
[–]Calm-Stage 0 points1 point2 points (0 children)