It has been verified experimentally that when nonlinear Principal Component Analysis (PCA) learning rules are used for the weights of a neural layer, the neurons acquire signal separation capabilities: the network performs Independent Component Analysis (ICA). The learning rule, proposed earlier by the author, is analyzed here mathematically to explain why and how the algorithm works in this application. It is shown that the weight matrix obtained as the asymptotic solution of the nonlinear PCA learning rule is in some cases a rotation of the input vector to statistically independent directions; this explains why the rule can be used for image and speech signal separation. Sufficient conditions are formulated, depending on the nonlinear neuron activation function and on the probability densities of the original signal components. It is shown that a sigmoidal nonlinearity as the activation function is feasible for flat, sub-Gaussian densities of the original signals, while polynomial activation functions are feasible for sharp, super-Gaussian densities.
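As an illustration of the separation behavior described above, the following is a minimal sketch (not the paper's own code) of the nonlinear PCA subspace rule W ← W + η (x − W g(y)) g(y)ᵀ with y = Wᵀx, using a sigmoidal g = tanh on two mixed uniform (flat, sub-Gaussian) sources. The mixing matrix is taken as a pure rotation so that the mixtures remain white, matching the setting in which the asymptotic solution is itself a rotation; all variable names, parameter values, and the uniform-source setup are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Two flat (sub-Gaussian) sources: uniform on [-sqrt(3), sqrt(3)], unit variance.
n = 5000
S = rng.uniform(-np.sqrt(3), np.sqrt(3), size=(n, 2))

# Mix with a rotation; whiteness of the data is preserved.
theta = np.deg2rad(30)
A = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])
X = S @ A.T

# Nonlinear PCA subspace rule with sigmoidal activation g = tanh,
# appropriate for sub-Gaussian source densities.
W = np.linalg.qr(rng.standard_normal((2, 2)))[0]  # random orthonormal start
eta = 0.005
for _ in range(5):                # a few passes over the data
    for x in X:
        y = W.T @ x               # layer outputs
        g = np.tanh(y)            # sigmoidal nonlinearity
        W += eta * np.outer(x - W @ g, g)

Y = X @ W                         # recovered signals
C = np.corrcoef(S.T, Y.T)[:2, 2:] # source-output correlations
print(np.round(np.abs(C), 2))     # close to a permutation matrix if separated
```

After convergence, each recovered output should correlate strongly with exactly one original source (up to sign and permutation), reflecting the rotation to statistically independent directions described in the abstract.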