On large scales, the higher order moments of the mass distribution, $S_J = \bar{\xi}_J/\bar{\xi}_2^{J-1}$, e.g., the skewness $S_3$ and kurtosis $S_4$, can be predicted using nonlinear perturbation theory. Comparison of these predictions with moments of the observed galaxy distribution probes the bias between galaxies and mass. Applying this method to models with initially Gaussian fluctuations and power spectra $P(k)$ similar to that of galaxies in the Automatic Plate Measuring (APM) survey, we find that the predicted higher order moments $S_J(R)$ are in good agreement with those directly inferred from the APM survey in the absence of bias. We use this result to place limits on the linear and nonlinear bias parameters. Models in which the extra power observed on large scales (with respect to the standard cold dark matter [CDM] model) is produced by scale-dependent bias match the APM higher order amplitudes only if nonlinear bias (rather than nonlinear gravity) generates the observed higher order moments. When normalized to COBE DMR, these models are significantly ruled out by the $S_3$ observations. The cold plus hot dark matter model normalized to COBE can reproduce the APM higher order correlations if one introduces nonlinear bias terms, while the low-density CDM model with a cosmological constant does not require any bias to fit the large-scale amplitudes.
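
As an illustration of the definition above, the following is a minimal sketch of how $S_3$ and $S_4$ could be estimated from a sample of smoothed density contrasts, taking $\bar{\xi}_J$ to be the connected moments (cumulants) of that sample. The function names `hierarchical_amplitudes` and `biased_s3` are hypothetical, the estimator ignores shot-noise and finite-volume corrections, and the bias relation used in `biased_s3` is the standard leading-order result for a local bias expansion with linear bias `b1` and `c2 = b2 / b1`; it is an assumption introduced here and is not spelled out in the text.

```python
from statistics import fmean


def hierarchical_amplitudes(delta):
    """Return (S_3, S_4) from a sample of smoothed density contrasts.

    Uses S_J = xi_J / xi_2**(J - 1), with xi_J taken as the J-th
    connected moment of the sample. No shot-noise correction is applied.
    """
    mean = fmean(delta)
    d = [x - mean for x in delta]           # enforce a zero-mean contrast
    m2 = fmean([x ** 2 for x in d])
    m3 = fmean([x ** 3 for x in d])
    m4 = fmean([x ** 4 for x in d])
    xi2 = m2                                # variance
    xi3 = m3                                # third cumulant
    xi4 = m4 - 3.0 * m2 ** 2                # fourth cumulant
    return xi3 / xi2 ** 2, xi4 / xi2 ** 3


def biased_s3(s3_mass, b1, c2):
    """Galaxy skewness for a local bias model, to leading order.

    Assumed relation (not taken from the text): S_3,gal = (S_3 + 3 c2) / b1.
    """
    return (s3_mass + 3.0 * c2) / b1


if __name__ == "__main__":
    # Toy symmetric sample: zero skewness, negative kurtosis.
    s3, s4 = hierarchical_amplitudes([-1.0, 0.0, 1.0])
    print(s3, s4)
    # Unbiased case (b1 = 1, c2 = 0) leaves the mass skewness unchanged.
    print(biased_s3(3.0, 1.0, 0.0))
```

Under this assumed relation, a linear bias `b1 > 1` with no nonlinear term lowers the galaxy skewness relative to the mass, which is why comparing predicted and observed $S_3$ constrains `b1` and `c2` together rather than separately.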