Over the last few decades, computers have come to play an essential role in our daily lives, and data is now being collected across many domains at a faster pace than ever before. This dissertation presents research advances in four machine learning fields that all relate to the challenges posed by the analysis of big data. In the field of kernel methods, we present an information-based denoising technique built on semi-supervised kernel Principal Component Analysis (PCA) that incorporates label information into the kernel PCA objective. Effectively, this guides the low-rank representation towards relevant components while exploiting the intrinsic manifold structure exposed by the data. In the same field, we also introduce a scalable randomized heuristic for optimizing kernel hyperparameters that is based on maximizing the Minimum Enclosing Ball (MEB) of the class means in the associated Reproducing Kernel Hilbert Space (RKHS). In the field of spectral methods, we introduce semi-supervised eigenvectors of a graph Laplacian. These inherit many of the properties that characterize the global eigenvectors but, by using side information in the form of a seed set, are better at modeling local heterogeneities. In the field of machine learning for neuroimaging, we introduce learning protocols for real-time functional Magnetic Resonance Imaging (fMRI) that allow for dynamic intervention in the human decision process. Specifically, the model exploits the structure of fMRI data by incorporating a temporal Gaussian Process (GP) smoothness prior, which reduces model degeneracy caused by mislabeled data samples. Finally, in the field of topic modeling, we introduce a Graphics Processing Unit (GPU) accelerated framework for co-clustering in large-scale sparse bipartite networks. By implementing the Infinite Relational Model (IRM) in this framework, we achieve speedups of two orders of magnitude compared to estimation on conventional processors.