1 Department of Mathematics and Computer Science (IMADA), Faculty of Science, SDU
2 Computer Science, Department of Mathematics and Computer Science (IMADA), Faculty of Science, SDU
3 unknown
4 Department of Mathematics and Computer Science (IMADA), Faculty of Science, SDU
Exhaled air carries information on human health status. Ion mobility spectrometry combined with a multi-capillary column (MCC/IMS) is a well-known technology for detecting volatile organic compounds (VOCs) in human breath. The technique is relatively inexpensive, robust, and easy to use in everyday practice. However, its potential depends on the successful application of computational approaches for finding relevant VOCs and for classifying patients into disease-specific profile groups based on the detected VOCs. We developed an integrated system that uses statistical learning techniques for VOC-based feature selection and supervised classification into patient groups. We analyzed breath data from 84 volunteers, each suffering from either chronic obstructive pulmonary disease (COPD) or both COPD and bronchial carcinoma (COPD + BC), and from 35 healthy volunteers who formed a control group (CG). We standardized and integrated several statistical learning methods to provide a broad overview of their potential for distinguishing the patient groups. We found strong potential for separating MCC/IMS chromatograms of healthy controls from those of COPD patients (best accuracy, COPD vs CG: 94%). However, the impact of bronchial carcinoma on COPD/no-COPD classification performance requires further examination (best accuracy, CG vs COPD vs COPD + BC: 79%). We also extracted 20 high-scoring VOCs that allowed COPD patients to be differentiated from healthy controls. We conclude that these statistical learning methods generally achieve high accuracy when applied to well-structured medical MCC/IMS data.
Genetics and Molecular Research, 2012, Vol 11, Issue 3, p. 2733-2744