In machine learning, there is renewed interest in kernel methods, e.g. for designing interpretable convolutional networks or in the context of Gaussian processes. More generally, a central question in kernel-based learning concerns large-scale approximations of the kernel matrix. A popular method for finding a low-rank approximation of a kernel matrix is the so-called Nyström method, which relies on sampling 'good' landmark points from a dataset. We will discuss an approach for selecting 'diverse' landmarks with some theoretical guarantees. Our work makes a connection between kernelized Christoffel functions, ridge leverage scores and determinantal point processes.
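
To make the Nyström construction concrete, here is a minimal sketch of a low-rank kernel approximation built from sampled landmarks. It is an illustration under stated assumptions, not the landmark selection procedure discussed in this work: the Gaussian kernel, the bandwidth, the regularization value `reg`, the synthetic data, and all function names are hypothetical choices. Landmarks are drawn independently with probability proportional to ridge leverage scores, which is a simpler stand-in for sampling from a determinantal point process.

```python
import numpy as np

def rbf_kernel(X, Y, bandwidth=1.0):
    # Gaussian kernel matrix between the rows of X and the rows of Y.
    sq = np.sum(X**2, axis=1)[:, None] + np.sum(Y**2, axis=1)[None, :] - 2.0 * X @ Y.T
    return np.exp(-np.maximum(sq, 0.0) / (2.0 * bandwidth**2))

def ridge_leverage_scores(K, reg):
    # Diagonal of K (K + n * reg * I)^{-1}; each score lies in (0, 1).
    n = K.shape[0]
    return np.diag(np.linalg.solve(K + n * reg * np.eye(n), K))

def nystrom(K, landmarks, rcond=1e-10):
    # Nystrom approximation K ~ C W^+ C^T, where C holds the landmark columns
    # and W is the landmark-by-landmark block. rcond truncates tiny singular
    # values of W for numerical stability (an assumed, illustrative value).
    C = K[:, landmarks]
    W = K[np.ix_(landmarks, landmarks)]
    return C @ np.linalg.pinv(W, rcond=rcond) @ C.T

# Synthetic data, assumed purely for illustration.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
K = rbf_kernel(X, X)

# Sample 30 distinct landmarks with probability proportional to the scores.
scores = ridge_leverage_scores(K, reg=1e-3)
probs = scores / scores.sum()
landmarks = rng.choice(len(X), size=30, replace=False, p=probs)

K_hat = nystrom(K, landmarks)
rel_err = np.linalg.norm(K - K_hat) / np.linalg.norm(K)
print(f"relative Frobenius error with 30 landmarks: {rel_err:.2e}")
```

Independent sampling of this kind can pick several nearly identical points. A determinantal point process instead favors subsets whose kernel submatrix has a large determinant, which is one way to formalize the 'diverse' landmarks mentioned above.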