Patent attributes
Implementations provide for use of spherical random features for polynomial kernels and large-scale learning. An example method includes receiving a polynomial kernel, approximating the polynomial kernel by generating a nonlinear randomized feature map, and storing the nonlinear feature map. Generating the nonlinear randomized feature map includes determining optimal coefficient values and standard deviation values for the polynomial kernel, determining an optimal probability distribution of vector values for the polynomial kernel based on a sum of Gaussian kernels that use the optimal coefficient values, selecting a sample of the vectors, and determining the nonlinear randomized feature map using the sampled vectors. Another example method includes normalizing a first feature vector for a data item, transforming the first feature vector into a second feature vector using a feature map that approximates a polynomial kernel with an explicit nonlinear feature map, and providing the second feature vector to a support vector machine.
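As a minimal sketch of the second example method, the code below normalizes a feature vector, maps it through a randomized nonlinear feature map, and compares the inner product of two mapped vectors with the kernel it approximates. It is a simplified illustration, not the claimed implementation: the coefficient and standard deviation values are hypothetical placeholders rather than optimized values, the coefficients are assumed non-negative so the sum of Gaussians can be sampled directly as a mixture, and the helper names (`normalize`, `sample_projections`, `feature_map`, `mixture_kernel`) are invented for this example. The optimization step that fits the sum of Gaussians to a given polynomial kernel is omitted.

```python
import math
import random


def normalize(x):
    """Scale a feature vector to unit L2 norm (places it on the unit sphere)."""
    norm = math.sqrt(sum(v * v for v in x))
    return [v / norm for v in x]


def sample_projections(dim, num_features, coeffs, sigmas, rng):
    """Sample projection vectors from a mixture of isotropic Gaussians.

    Component i is chosen with probability proportional to coeffs[i]
    (assumed non-negative here), then w ~ N(0, sigmas[i]^2 * I).
    """
    W = []
    for _ in range(num_features):
        sigma = rng.choices(sigmas, weights=coeffs, k=1)[0]
        W.append([rng.gauss(0.0, sigma) for _ in range(dim)])
    # Random phase offsets, uniform on [0, 2*pi).
    b = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(num_features)]
    return W, b


def feature_map(x, W, b):
    """Explicit nonlinear map z(x) = sqrt(2/D) * cos(W x + b)."""
    scale = math.sqrt(2.0 / len(W))
    return [
        scale * math.cos(sum(wi * xi for wi, xi in zip(w, x)) + bj)
        for w, bj in zip(W, b)
    ]


def mixture_kernel(x, y, coeffs, sigmas):
    """Kernel that the map approximates: a normalized sum of Gaussian kernels."""
    sq_dist = sum((a - c) ** 2 for a, c in zip(x, y))
    total = sum(coeffs)
    return sum(
        (c / total) * math.exp(-0.5 * s * s * sq_dist)
        for c, s in zip(coeffs, sigmas)
    )


# Hypothetical placeholder values, NOT optimized for any polynomial kernel.
coeffs = [0.6, 0.4]
sigmas = [1.0, 2.5]

rng = random.Random(0)
x = normalize([1.0, 2.0, 0.5, -1.0])
y = normalize([0.8, 1.5, 1.0, -0.5])

W, b = sample_projections(dim=4, num_features=4000,
                          coeffs=coeffs, sigmas=sigmas, rng=rng)
zx = feature_map(x, W, b)
zy = feature_map(y, W, b)

approx = sum(p * q for p, q in zip(zx, zy))   # inner product of mapped vectors
exact = mixture_kernel(x, y, coeffs, sigmas)  # kernel value being approximated
print(approx, exact)
```

In a pipeline like the one described, the mapped vector (`zx` here) would be the second feature vector handed to a linear support vector machine; the classifier itself is left out of this sketch.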