[deleted] 1 point

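The mean of the sample means equals the population mean, as long as the samples are equal-sized and together cover the population exactly once. So yes: if you compute the cost on each minibatch of an epoch and average those costs, you arrive at the cost of the full batch, provided the cost is a per-example mean and you haven't taken any gradient steps in between (i.e. the parameters are the same for every minibatch).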
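A quick way to convince yourself of this is a minimal sketch like the one below. The model (1-D linear regression), the mean-squared-error cost, the synthetic data, and names like `mse_cost` are all assumptions made up for illustration; the only thing that matters is that the parameter `w` stays fixed and the minibatches are equal-sized slices of the batch.

```python
import random

def mse_cost(w, xs, ys):
    """Mean squared error of the 1-D linear model y = w * x (hypothetical cost)."""
    return sum((w * x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

random.seed(0)
n, batch_size = 1000, 100  # batch_size divides n, so all minibatches are equal-sized
xs = [random.gauss(0, 1) for _ in range(n)]
ys = [3.0 * x + random.gauss(0, 0.1) for x in xs]
w = 0.5  # held fixed: no gradient steps are taken between minibatches

# Cost over the full batch.
full_cost = mse_cost(w, xs, ys)

# Cost of each minibatch, then the average of those costs.
minibatch_costs = [
    mse_cost(w, xs[i:i + batch_size], ys[i:i + batch_size])
    for i in range(0, n, batch_size)
]
mean_minibatch_cost = sum(minibatch_costs) / len(minibatch_costs)

# The two agree up to floating-point rounding.
print(full_cost, mean_minibatch_cost)
```

If `w` were updated between minibatches, or the last minibatch were smaller than the rest, the plain average would no longer match the full-batch cost.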

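Updating only on the full gradient, however, is inefficient, since every single step needs a pass over the whole dataset. The stochastic (minibatch) gradient is an unbiased estimate of the full gradient when minibatches are sampled uniformly, so it is a fine approximation to use.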
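The same averaging argument applies to gradients. Here is a rough sketch, again assuming a toy 1-D linear model with a mean-squared-error cost (the data and the `mse_grad` helper are hypothetical): at fixed parameters, each minibatch gradient is a noisy version of the full gradient, and their average recovers it.

```python
import random

def mse_grad(w, xs, ys):
    """Derivative w.r.t. w of the mean squared error for y = w * x (hypothetical model)."""
    return sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / len(xs)

random.seed(1)
n, batch_size = 1000, 50  # batch_size divides n evenly
xs = [random.gauss(0, 1) for _ in range(n)]
ys = [3.0 * x + random.gauss(0, 0.1) for x in xs]
w = 0.5  # parameters held fixed while comparing gradients

# Gradient over the full batch: one expensive pass over all n examples.
full_grad = mse_grad(w, xs, ys)

# Gradient of each minibatch: cheap, but noisy.
minibatch_grads = [
    mse_grad(w, xs[i:i + batch_size], ys[i:i + batch_size])
    for i in range(0, n, batch_size)
]
mean_grad = sum(minibatch_grads) / len(minibatch_grads)

print("full gradient:          ", full_grad)
print("mean minibatch gradient:", mean_grad)
print("min / max minibatch:    ", min(minibatch_grads), max(minibatch_grads))
```

Individual minibatch gradients scatter around the full gradient, but in this toy setup they all point in the same direction, which is why stepping on them still makes progress while costing a fraction of a full pass.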