Outlier Detection Using the Multiobjective Genetic Algorithm

Keywords

outlier detection
genetic algorithm

How to Cite

Duraj, A., & Chomątek, Ł. (2017). Outlier Detection Using the Multiobjective Genetic Algorithm. Journal of Applied Computer Science, 25(2), 29-42. https://doi.org/10.34658/jacs.2017.2.29-42

Abstract

Because almost any dataset may contain anomalies that skew the interpretation of the data, outlier detection has become a crucial element of many data mining applications. Although several outlier detection methods have been proposed in the literature, more effective ones are still needed. This paper presents a new approach to outlier identification based on genetic algorithms. The study evaluates the performance and examines the features of several multiobjective genetic algorithms.
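
The abstract does not state which objectives the authors optimise, so the following is only a minimal sketch of Pareto dominance, the comparison on which multiobjective genetic algorithms generally rely when ranking candidate solutions. The two objectives, the candidate scores, and the function names `dominates` and `pareto_front` are hypothetical and are not taken from the paper.

```python
from typing import List, Sequence, Tuple


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """Return True if `a` is no worse than `b` in every objective and
    strictly better in at least one (all objectives are minimised)."""
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def pareto_front(scores: List[Tuple[float, ...]]) -> List[int]:
    """Return the indices of the score vectors that no other vector dominates."""
    return [
        i
        for i, s in enumerate(scores)
        if not any(dominates(t, s) for j, t in enumerate(scores) if j != i)
    ]


# Hypothetical candidates: each tuple is (objective_1, objective_2).
# In an outlier-detection setting, each candidate could stand for one
# proposed set of outliers scored on two competing criteria (assumption).
candidates = [(0.9, 0.1), (0.5, 0.5), (0.6, 0.6), (0.1, 0.9), (0.9, 0.9)]
front = pareto_front(candidates)
print(front)
```

A multiobjective genetic algorithm would typically apply a comparison of this kind in each generation to decide which individuals survive, rather than collapsing all objectives into a single fitness value.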

https://doi.org/10.34658/jacs.2017.2.29-42

