Information theoretic learning is a learning paradigm that uses the concepts of entropy and divergence from information theory. A variety of signal processing and machine learning methods fall within this framework, and the minimum error entropy principle is a typical example. In this talk, we study a kernel version of the minimum error entropy method that can be used to find nonlinear structures in data. We show that kernel minimum error entropy can be implemented by kernel-based gradient descent algorithms, with or without regularization, and we derive convergence rates for both algorithms.
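
As a rough illustration of what a kernel-based gradient descent for minimum error entropy can look like, the following is a minimal sketch under several assumptions that the abstract does not state: the error entropy is Rényi's quadratic entropy estimated with a Gaussian Parzen window of bandwidth `h`, the reproducing kernel is Gaussian with width `width`, and the step size `eta` is constant. Minimizing this entropy is equivalent to maximizing the empirical information potential, so the update is written as gradient ascent on that quantity. Setting `lam = 0` gives the unregularized variant and `lam > 0` the regularized one. All function names, parameter values, and the toy data are hypothetical and are not taken from the talk.

```python
import numpy as np

def gaussian_gram(a, b, width):
    """Gram matrix of a Gaussian kernel for 1-D inputs (assumed kernel)."""
    d2 = (a[:, None] - b[None, :]) ** 2
    return np.exp(-d2 / (2.0 * width ** 2))

def information_potential(err, h):
    """Empirical information potential V = mean_{i,j} G_h(e_i - e_j).
    Renyi's quadratic entropy is estimated by -log V (up to a constant)."""
    diff = err[:, None] - err[None, :]
    return np.mean(np.exp(-diff ** 2 / (2.0 * h ** 2)))

def kernel_mee_gd(x, y, width=1.0, h=1.0, eta=0.4, lam=0.0, steps=200):
    """Sketch of kernel gradient descent for minimum error entropy.
    The estimator is f = sum_k alpha[k] * K(x_k, .); lam = 0 means no
    regularization. Step size and iteration count are illustrative."""
    n = len(x)
    K = gaussian_gram(x, x, width)
    alpha = np.zeros(n)
    for _ in range(steps):
        err = y - K @ alpha
        diff = err[:, None] - err[None, :]
        w = np.exp(-diff ** 2 / (2.0 * h ** 2))
        # Coefficients of the functional gradient of V in the kernel space.
        grad = 2.0 * (w * diff).sum(axis=1) / (n ** 2 * h ** 2)
        # Ascent on V, with shrinkage of alpha when lam > 0.
        alpha = (1.0 - eta * lam) * alpha + eta * grad
    # The entropy of the error ignores constant shifts, so the offset is
    # fixed afterwards by centering the residuals.
    bias = np.mean(y - K @ alpha)
    return alpha, bias, K

# Toy data (hypothetical): a noisy sine curve.
rng = np.random.default_rng(0)
x = np.linspace(-3.0, 3.0, 60)
y = np.sin(x) + 0.1 * rng.standard_normal(60)

alpha, bias, K = kernel_mee_gd(x, y)            # without regularization
alpha_reg, bias_reg, _ = kernel_mee_gd(x, y, lam=1e-3)  # with regularization
fitted = K @ alpha + bias
```

The bias correction at the end reflects a general property of error-entropy criteria rather than anything specific to the talk: the entropy of the error is unchanged by adding a constant to the estimator, so the offset has to be chosen separately.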