Samir Brahim Belhaouari

Associate Professor, Hamad Bin Khalifa University

sbelhaouari [AT] hbku.edu.qa

Distance Based Joint Probability Density Estimation for Unsupervised Outlier Detection

Advantages and Evaluation of the Proposed JPDE-DM Approach for Outlier Detection

Abstract. Outlier detection is a crucial preprocessing step in data mining and is highly significant for Machine Learning (ML) algorithms. If an ML model is trained on data from which outliers have not been removed, those outliers can degrade prediction accuracy and lead to misleading results. This paper therefore proposes an unsupervised outlier detection mechanism based on Joint Probability Density Estimation (JPDE) integrated with a Distance Measure (DM). The proposed approach identifies outliers from a one-dimensional distance vector, which keeps its computational complexity low and makes it suitable for high-dimensional datasets. The paper also presents three JPDE-DM-based methods and evaluates them on complex synthetic benchmark datasets.
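To make the general idea concrete, the following is a minimal sketch of density-based scoring on a one-dimensional distance vector. It is not the JPDE-DM method itself, which the abstract does not specify in detail. The choice of the centroid as the reference point, the Gaussian kernel, the rule-of-thumb bandwidth, and the names distance_vector, kde_1d, flag_outliers, and n_flag are all assumptions made for illustration.

```python
import math
import random

def distance_vector(points):
    """Collapse each d-dimensional point to a single number.
    Assumption: the reference is the data centroid and the metric is Euclidean."""
    dim = len(points[0])
    centroid = [sum(p[j] for p in points) / len(points) for j in range(dim)]
    return [math.dist(p, centroid) for p in points]

def kde_1d(values, bandwidth):
    """Gaussian kernel density estimate, evaluated at each input value."""
    norm = 1.0 / (len(values) * bandwidth * math.sqrt(2.0 * math.pi))
    return [
        norm * sum(math.exp(-0.5 * ((v - u) / bandwidth) ** 2) for u in values)
        for v in values
    ]

def flag_outliers(points, n_flag):
    """Return the sorted indices of the n_flag points whose distance
    falls in the lowest-density region of the 1-D distance distribution."""
    d = distance_vector(points)
    mean = sum(d) / len(d)
    std = math.sqrt(sum((x - mean) ** 2 for x in d) / (len(d) - 1))
    # Normal-reference rule of thumb for the bandwidth (an assumed choice).
    bandwidth = 1.06 * std * len(d) ** (-1.0 / 5.0)
    density = kde_1d(d, bandwidth)
    order = sorted(range(len(points)), key=lambda i: density[i])
    return sorted(order[:n_flag])

# Hypothetical synthetic data: 200 inliers around the origin in 10 dimensions,
# followed by 5 points shifted far away along every axis.
random.seed(0)
points = [[random.gauss(0.0, 1.0) for _ in range(10)] for _ in range(200)]
points += [[random.gauss(8.0, 1.0) for _ in range(10)] for _ in range(5)]

flagged = flag_outliers(points, n_flag=5)
print(flagged)
```

Because the density is estimated over a single distance value per point rather than over all original features, the estimation step does not grow with the number of dimensions; only the distance computation does. The pairwise kernel sum in this sketch is quadratic in the number of points, which is acceptable for an illustration but would likely need a faster estimator on large datasets.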

Illustration of the proposed experimental design.