Documente Academic
Documente Profesional
Documente Cultură
Background
Instatistics,anoutlierisanobservationthatisnumericallydistantfromtherestofthedata.Inother words,anoutlierisonethatappearstodeviatemarkedlyfromothermembersofthesampleinwhichit occurs. Outliersariseduetochangesinsystembehavior,fraudulentbehavior,humanerror,instrumenterroror simplythroughnaturaldeviationsinpopulations(occurringbychanceinanydistribution,orwhenthe populationhasaheavytaileddistribution). Intimeseriesanalysis,weshouldexaminethepresenceofoutlier(s)onlyforastationaryprocess. Pleasenotethatoutliers,beingthemostextremeobservations,mayincludethesamplemaximumor sampleminimum,orboth,dependingonwhethertheyareextremelyhighorlow.However,thesample maximumandminimumarenotalwaysoutliersbecausetheymaynotbeunusuallyfarfromother observations.
DataPrepOutliers
SpiderFinancialCorp,2012
Ontheotherhand,weshouldalwayssearchforthecausesoftheidentifiedoutliers,payingspecial attentiontolevelchanges,variance,andthoseoutliersthatcantbeexplained.
Pleasebearinmindthatthemethodsabovedetectapotentialoutlier,butitisyourresponsibilityto verify,andtosomeextentexplain,theirvalues.
DataPrepOutliers
SpiderFinancialCorp,2012
Conclusion
Theoutliersprocessingisabigandcomplexsubject,andtheanswerwilldependonhowmucheffort youwanttoinvestinit,andhoweffectiveyourmeansofoutlierdetectionprovetobe.
Deletionofoutlierdataisacontroversialpractice.
DataPrepOutliers
SpiderFinancialCorp,2012