Much recent progress has been made on Hilbert space embeddings of probability distributions. These embeddings enable the development of regularization methods for probabilistic graphical models. We consider a non-parametric hidden Markov model (HMM) that extends the classical HMM to non-Gaussian continuous distributions by embedding them into a Reproducing Kernel Hilbert Space (RKHS). Because the inverse problem arising at the learning stage is ill-posed, regularization is required. Well-known training algorithms use L_1, L_2, or truncated spectral regularization to invert the corresponding kernel matrix. In our research, we consider a more general discrete regularization method, namely Nyström-type subsampling. Moreover, combining Nyström-type subsampling with an improved optimization technique makes this approach suitable for online algorithms. In the present study, the regularization scheme is equipped with a strategy for choosing the regularization parameter, based on the idea of an ensemble of regularized solutions corresponding to different values of the regularization parameter; the coefficient of each component is estimated by means of the linear functional strategy. We investigate, both theoretically and empirically, the regularization and approximation bounds of the discrete regularization method. Finally, we present applications of the method to real-world problems and compare the approach to state-of-the-art algorithms.
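The core computational idea, replacing the inversion of the full kernel matrix with a solve built from a random subsample of its columns, can be sketched as follows. This is a minimal illustration of generic Nyström-type subsampling for a regularized kernel system, not the paper's actual training algorithm; the function name, the choice of uniform subsampling, and the regularization scaling `reg * n` are all assumptions made for the example.

```python
import numpy as np

def nystroem_regularized_apply(K, y, m, reg, seed=0):
    """Approximate K @ (K + reg*n*I)^{-1} @ y using only m subsampled
    columns of the n x n kernel matrix K (illustrative sketch).

    Instead of inverting the full n x n matrix, we solve an m x m
    system assembled from the subsampled blocks, which is the
    standard Nystroem shortcut for regularized kernel methods.
    """
    rng = np.random.default_rng(seed)
    n = K.shape[0]
    idx = rng.choice(n, size=m, replace=False)  # m landmark points
    K_nm = K[:, idx]                            # n x m cross-kernel block
    K_mm = K[np.ix_(idx, idx)]                  # m x m landmark block
    # m x m system replacing the n x n regularized inversion
    A = K_nm.T @ K_nm + reg * n * K_mm
    alpha = np.linalg.solve(A, K_nm.T @ y)      # m coefficients
    return K_nm @ alpha                         # length-n approximation
```

When m = n the subsample covers every column and the sketch reproduces the exact regularized solution; for m < n the cost of the solve drops from O(n^3) to O(n m^2) at the price of an approximation error, which is the trade-off the regularization and approximation bounds in the paper quantify.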