1 Department of Applied Mathematics and Computer Science, Technical University of Denmark
2 Cognitive Systems, Department of Applied Mathematics and Computer Science, Technical University of Denmark
3 Copenhagen Center for Health Technology, Center, Technical University of Denmark
This Ph.D. thesis, titled “Assessing Miniaturized Sensor Performance using Supervised Learning, with Application to Drug and Explosive Detection”, is part of the strategic research project “Miniaturized sensors for explosives detection in air”, funded by the Danish Agency for Science and Technology’s Program Commission on Nanoscience, Biotechnology and IT (NABIIT), case number 2106-07-0031. The project, named “Xsense”, was led by Professor Anja Boisen, DTU Nanotech. DTU Informatics participates in the project as the data analysis partner. This thesis presents advances in the detection of vapors emanating from explosives and drugs, in the manner of an electronic nose. Evaluating sensor responses requires a data processing and evaluation pipeline. The work presented here focuses on feature extraction, feature representation and sensor accuracy. The primary aim of this thesis is thus twofold: first, to present methods suitable for assessing sensor accuracy, and second, to improve sensor performance by enhancing the preprocessing and feature extraction. Five different miniaturized sensors are presented. Each sensor requires its own preprocessing and feature extraction techniques before its responses can be passed to supervised learning algorithms. The sensing technologies are calorimetry, cantilevers, chemoselective compounds, quartz crystal microbalance and surface-enhanced Raman scattering. Each sensor has its own strengths and weaknesses. Multiple sensors were used in order to investigate the feasibility of an integrated multisensor solution. A setup of multiple independent detectors can vastly enhance accuracy compared to what a single sensor can deliver. Because the compounds being detected are hazardous, the sensors must deliver not only decisions but also the certainty of those decisions.
This requirement is met by introducing classifiers that provide posterior probabilities rather than only decisions. The three probabilistic classification models used are Artificial Neural Networks, Logistic Regression and Gaussian Processes. Often, there is no tradition for using these methods in the communities working with these sensors. In those communities, overly complex methods are often undesired, so deciding when to use more sophisticated methods is a balance. For this reason, an array of methods that only discriminate between samples is used as a baseline. These baseline methods vary from sensor to sensor, as they provide the reference performance against which new methods are introduced. The most widely used baseline method in this thesis is the k-nearest-neighbor algorithm. This method is of particular interest for sensor applications because the sensors are designed to provide robust and reliable measurements; that is, repeated measurements are expected to form clusters. Sensor fusion is presented for the sensor based on chemoselective compounds. An array of color-changing compounds is treated jointly, and together they make up a colorimetric sensor array. In this setting it is valuable to determine which compounds in the colorimetric sensor array are important. That knowledge makes it possible either to reduce the size of the sensor or to replace less sensitive and unimportant compounds with more selective and responsive ones. A framework based on forward selection Gaussian Process classification is demonstrated to successfully identify a set of important compounds.
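The forward selection idea mentioned above can be illustrated with a minimal sketch. It is not the framework used in the thesis: to stay dependency-free, the Gaussian Process classifier is swapped for the k-nearest-neighbor baseline scored by leave-one-out accuracy, and the data are synthetic. The function names `knn_loo_accuracy` and `forward_selection`, the four "compounds" and the choice of `k=3` are all assumptions made for illustration.

```python
import random


def knn_loo_accuracy(X, y, features, k=3):
    """Leave-one-out accuracy of k-NN using only the given feature indices."""
    correct = 0
    for i in range(len(X)):
        # Squared Euclidean distance from sample i to every other sample.
        neighbours = sorted(
            (sum((X[i][f] - X[j][f]) ** 2 for f in features), y[j])
            for j in range(len(X))
            if j != i
        )
        votes = [label for _, label in neighbours[:k]]
        # Majority vote; ties resolve to the smallest label.
        predicted = max(sorted(set(votes)), key=votes.count)
        correct += predicted == y[i]
    return correct / len(X)


def forward_selection(X, y, score, max_features=None, tol=1e-9):
    """Greedily add the feature that most improves `score`; stop when none does."""
    selected, best = [], 0.0
    remaining = list(range(len(X[0])))
    while remaining and (max_features is None or len(selected) < max_features):
        scored = [(score(X, y, selected + [f]), f) for f in remaining]
        # Highest score wins; on ties prefer the lower feature index.
        top_score, top_f = max(scored, key=lambda t: (t[0], -t[1]))
        if top_score <= best + tol:
            break  # no candidate improves the current subset
        selected.append(top_f)
        remaining.remove(top_f)
        best = top_score
    return selected, best


# Synthetic stand-in for a colorimetric array (assumption, not thesis data):
# four compounds, of which only compound 2 responds to the analyte class.
random.seed(0)
y = [i % 2 for i in range(40)]
X = [
    [
        random.uniform(-1, 1),
        random.uniform(-1, 1),
        5 * label + random.uniform(-1, 1),  # the informative compound
        random.uniform(-1, 1),
    ]
    for label in y
]

selected, best = forward_selection(X, y, knn_loo_accuracy)
print("selected compounds:", selected, "leave-one-out accuracy:", best)
```

Because `forward_selection` only needs a `score(X, y, features)` callable, a probabilistic model such as a Gaussian Process classifier could be substituted by passing its cross-validated score instead, which would also allow ranking compounds by predictive likelihood rather than by accuracy alone.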