Abstract.
This paper presents a Computer Aided Diagnosis (CAD) system for lung cancer. Features are extracted with Wavelet Transforms (WT) using the Haar, Daubechies (db1), and Symlet (sym3) functions, then reduced by a two-stage feature selection based on variance and energy metrics, defined as \(\text{Var}_{\text{mod}} = \frac{1}{n} \sum_{i=1}^{n} \left(\mu_i - \mu_T\right)^2\) and \(E = \sum_{i=1}^{N} x_i^2\). Classification uses a hybrid Cluster-K-Nearest Neighbor (C-KNN) algorithm that combines K-means and K-NN, with the clustering step minimizing intra-cluster variance and maximizing inter-cluster variance to improve accuracy. On the JSRT dataset, the system reaches a peak accuracy of 96.58% with db1 at decomposition level 3. The clustering and classification strategy reduces both false positives and false negatives, which supports its use for lung nodule detection. Future work will explore additional wavelet functions and Curvelet Transforms to further improve diagnostic precision.
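As an illustration of the two selection metrics, the following minimal sketch computes the energy E of wavelet coefficients and the modified variance from the formulas above. It is not the authors' implementation. It uses a hand-written, one-level, one-dimensional orthonormal Haar step in place of a full 2-D wavelet library. It also assumes that \(\mu_i\) denotes the mean of the i-th group of feature values and \(\mu_T\) the overall mean, since the abstract does not define these symbols. The function names `haar_step`, `energy`, and `var_mod` are hypothetical.

```python
import math


def haar_step(signal):
    """One level of the orthonormal Haar transform (assumes even length).

    Returns (approximation, detail) coefficient lists.
    """
    scale = 1.0 / math.sqrt(2.0)
    pairs = [(signal[i], signal[i + 1]) for i in range(0, len(signal) - 1, 2)]
    approx = [(a + b) * scale for a, b in pairs]
    detail = [(a - b) * scale for a, b in pairs]
    return approx, detail


def energy(coeffs):
    """Energy metric: E = sum_i x_i^2."""
    return sum(x * x for x in coeffs)


def var_mod(group_means, total_mean):
    """Modified variance: (1/n) * sum_i (mu_i - mu_T)^2.

    Assumption: group_means holds the mu_i and total_mean is mu_T.
    """
    n = len(group_means)
    return sum((m - total_mean) ** 2 for m in group_means) / n


# Toy example: a 4-sample signal standing in for one image row.
row = [4.0, 2.0, 6.0, 8.0]
approx, detail = haar_step(row)
sub_band_energy = {"approx": energy(approx), "detail": energy(detail)}
```

Because the Haar step is orthonormal, the sub-band energies sum to the energy of the input signal. That makes E a convenient per-sub-band feature score, while the variance score favors features whose group means lie far from the overall mean.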
Figure: Illustration of the proposed experimental design.