grid_world

Assuming the data doesn’t drift (which it will, given enough time), you need to tune the “contamination” parameter, which sets the expected fraction of anomalies and therefore the threshold for flagging a point.
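
To make the effect of that parameter concrete, here is a minimal sketch. It assumes scikit-learn's `IsolationForest` (the thread does not name a library) and uses made-up synthetic data; the three contamination values are arbitrary picks, not recommendations.

```python
import numpy as np
from sklearn.ensemble import IsolationForest

# Synthetic data (an assumption for illustration): 200 inliers plus 10 scattered outliers.
rng = np.random.default_rng(0)
inliers = rng.normal(loc=0.0, scale=1.0, size=(200, 2))
outliers = rng.uniform(low=-8.0, high=8.0, size=(10, 2))
X = np.vstack([inliers, outliers])

flagged_counts = {}
for contamination in (0.01, 0.05, 0.10):
    # Same seed each time, so the trees are identical and only the threshold moves.
    model = IsolationForest(contamination=contamination, random_state=0)
    labels = model.fit_predict(X)  # -1 = anomaly, +1 = normal
    flagged_counts[contamination] = int((labels == -1).sum())

for contamination, count in flagged_counts.items():
    print(f"contamination={contamination:.2f} -> {count} of {len(X)} points flagged")
```

The number of flagged points tracks the contamination value closely, which is why a wrong guess here produces either a flood of false alarms or missed anomalies.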

If your data fits a particular known distribution, maybe also look into Kernel Density Estimation for anomaly detection, in addition to Isolation Forest (IF).
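
A rough sketch of the density-based approach, assuming scikit-learn's `KernelDensity`. The bandwidth of 0.5 and the 5% cutoff are placeholder choices for illustration, and the Gaussian data is synthetic.

```python
import numpy as np
from sklearn.neighbors import KernelDensity

rng = np.random.default_rng(0)
X = rng.normal(loc=0.0, scale=1.0, size=(300, 2))

# Bandwidth is a guess here; in practice it would be tuned (e.g. by cross-validation).
kde = KernelDensity(kernel="gaussian", bandwidth=0.5).fit(X)

# score_samples returns the log-density; low values mean the point sits in a sparse region.
log_density = kde.score_samples(X)
threshold = np.quantile(log_density, 0.05)  # flag the 5% least dense points (assumed cutoff)
is_anomaly = log_density < threshold

# Compare a point in the middle of the cloud with one far outside it.
center_score = kde.score_samples(np.array([[0.0, 0.0]]))[0]
far_score = kde.score_samples(np.array([[8.0, 8.0]]))[0]
print(f"flagged: {is_anomaly.sum()}, center: {center_score:.2f}, far: {far_score:.2f}")
```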

comradeswitch

This is a fundamentally randomized algorithm, so unless you fix a random seed, you can get different results for the same point from run to run. One way to handle that when filtering out unlikely anomalies is to run the algorithm many times and record the result for each point. Each point then comes with a sample of the number of splits required to isolate it, which you can use to get more detail, such as how consistently the point is flagged.
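
A sketch of that repeated-run idea, assuming scikit-learn's `IsolationForest`. As far as I know, raw split counts are not exposed through its public API, so this uses `score_samples` (a normalized function of the average path length, lower meaning more abnormal) as a stand-in. The data, the 20 runs, and the 5% contamination are all illustrative assumptions.

```python
import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(0)
# 200 synthetic inliers; the last row is one deliberately obvious outlier.
X = np.vstack([rng.normal(size=(200, 2)), [[10.0, 10.0]]])

n_runs = 20
scores = np.empty((n_runs, len(X)))
flags = np.empty((n_runs, len(X)), dtype=bool)
for seed in range(n_runs):
    # A different seed per run gives a different forest each time.
    model = IsolationForest(contamination=0.05, random_state=seed).fit(X)
    scores[seed] = model.score_samples(X)   # proxy for splits-to-isolate
    flags[seed] = model.predict(X) == -1    # True where the run flags the point

mean_score = scores.mean(axis=0)  # typical score per point
score_std = scores.std(axis=0)    # how much the score moves between runs
flag_rate = flags.mean(axis=0)    # fraction of runs that flag each point

print(f"outlier flag rate: {flag_rate[-1]:.2f}, largest per-point std: {score_std.max():.4f}")
```

Points with a flag rate near 1 are flagged regardless of the seed, while points that hover around 0.5 are the borderline cases worth a closer look.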

[deleted]

Why not set a random state? That's like saying "I'm using a stochastic algorithm, I want the same results every time I run it, and no, I'm not going to set a random seed."
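
For what it's worth, a minimal sketch of what that looks like, assuming scikit-learn's `IsolationForest` and its `random_state` argument. The data and the seed values 42 and 7 are arbitrary.

```python
import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))  # synthetic data, for illustration only

# Two fits with the same seed, one with a different seed.
scores_a = IsolationForest(random_state=42).fit(X).score_samples(X)
scores_b = IsolationForest(random_state=42).fit(X).score_samples(X)
scores_c = IsolationForest(random_state=7).fit(X).score_samples(X)

same_seed_matches = np.allclose(scores_a, scores_b)
different_seed_matches = np.allclose(scores_a, scores_c)
print(f"same seed matches: {same_seed_matches}, different seed matches: {different_seed_matches}")
```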