Sample Question #59 (statistics)
What are some ways to clean outliers from a dataset? Name at least three.
Tougher: If the dataset is huge, say over 100 million observations, how would you identify and filter out the outliers?
(Hint: what is an "outlier"?)
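One possible way to approach the question, as a minimal sketch and not a definitive answer: the block below shows three commonly used rules for one-dimensional numeric data (Tukey's IQR fences, a z-score cutoff, and a modified z-score based on the median absolute deviation), plus one option for the huge-dataset case, where the fences are estimated from a sample and then applied lazily to the full stream. All function names, the thresholds (1.5, 3.0, 3.5), and the toy data are assumptions chosen for illustration. Which rule is appropriate depends on how "outlier" is defined for the problem at hand.

```python
import statistics


def iqr_fences(values, k=1.5):
    """Return (low, high) Tukey fences: Q1 - k*IQR and Q3 + k*IQR."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def iqr_filter(values, k=1.5):
    """Keep only the values that fall inside the IQR fences."""
    low, high = iqr_fences(values, k)
    return [v for v in values if low <= v <= high]


def zscore_filter(values, threshold=3.0):
    """Keep values within `threshold` sample standard deviations of the mean."""
    mean = statistics.fmean(values)
    sd = statistics.stdev(values)
    if sd == 0:
        return list(values)  # all values identical, nothing to drop
    return [v for v in values if abs(v - mean) / sd <= threshold]


def mad_filter(values, threshold=3.5):
    """Keep values whose modified z-score (median/MAD based) is small."""
    med = statistics.median(values)
    mad = statistics.median([abs(v - med) for v in values])
    if mad == 0:
        return list(values)  # MAD of zero gives no usable scale
    # 0.6745 rescales the MAD so it is comparable to a standard deviation
    # for normally distributed data.
    return [v for v in values if 0.6745 * abs(v - med) / mad <= threshold]


def sampled_iqr_filter(stream, sample, k=1.5):
    """Lazily filter a large stream using fences estimated from a sample.

    `sample` is assumed to be a random subset small enough to fit in memory.
    """
    low, high = iqr_fences(sample, k)
    return (v for v in stream if low <= v <= high)


# Hypothetical toy data: ten ordinary readings and one extreme value.
data = [10, 12, 11, 13, 12, 11, 10, 12, 13, 11, 500]
cleaned = iqr_filter(data)
```

The mean and standard deviation are themselves pulled toward extreme values, so the z-score rule can miss outliers that the median-based rules catch. For very large data, estimating the fences once from a sample and filtering in a single streaming pass avoids sorting or holding all observations in memory.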