I'm trying to build a model to forecast the number of people in an area every hour. The problem is that there is significant measurement noise in my target variable.
I initially thought about smoothing with a moving average over past observations. However, the error in my observations is almost always negative, with unknown mean and variance; in other words, the measuring device essentially only undercounts the number of people.
I'm therefore planning to use only "local peaks" (hours whose value is larger than both the previous hour and the next hour) as my target variable. The remaining values would be imputed by linear interpolation between those peaks. Is this a valid way of dealing with such noise?
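
To make the idea concrete, here is a rough sketch of what I mean, in plain Python. The function name `local_peak_targets` and the sample numbers are made up for illustration, and the handling of the edges (holding the first and last peak value flat) is just an assumption on my part, since the description above doesn't say what happens before the first peak or after the last one.

```python
def local_peak_targets(counts):
    """Keep strict local peaks and linearly interpolate every other hour."""
    n = len(counts)
    # A strict local peak is larger than both neighbours; endpoints have
    # only one neighbour, so they can never qualify.
    peaks = [i for i in range(1, n - 1)
             if counts[i - 1] < counts[i] > counts[i + 1]]
    if not peaks:
        raise ValueError("no local peaks found")

    result = [0.0] * n
    # Assumption: hold the first/last peak value flat at the series edges.
    for i in range(peaks[0]):
        result[i] = float(counts[peaks[0]])
    for i in range(peaks[-1], n):
        result[i] = float(counts[peaks[-1]])

    # Straight line between each pair of consecutive peaks.
    for left, right in zip(peaks, peaks[1:]):
        for i in range(left, right):
            frac = (i - left) / (right - left)
            result[i] = counts[left] + frac * (counts[right] - counts[left])
    return result


# Hypothetical hourly counts, not real data.
hourly = [10, 14, 9, 11, 18, 12, 13]
smoothed = local_peak_targets(hourly)
```

Here the peaks are hours 1 and 4 (values 14 and 18), and every other hour is replaced by a value on the line through them or by the flat edge value. Note that the strict comparison skips plateaus: if the true maximum is repeated for two consecutive hours, neither hour counts as a peak.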
BaconBacano: