We construct a new orthonormal basis for $L^2(\mathbf{R}^2)$ whose elements are angularly integrated ridge functions, called orthonormal ridgelets. The basis elements are smooth and of rapid decay in the spatial domain, and in the frequency domain are localized near angular wedges which, at radius $r = 2^j$, have radial extent $\Delta r \approx 2^j$ and angular extent $\Delta\theta \approx 2\pi/2^j$. Orthonormal ridgelet expansions expose an interesting phenomenon in nonlinear approximation: they give very efficient approximations to objects such as $1_{\{x_1\cos\theta + x_2\sin\theta > a\}}\, e^{-x_1^2 - x_2^2}$, which are smooth away from a discontinuity along a line. The orthonormal ridgelet coefficients of such objects are sparse: they belong to every $\ell^p$, $p > 0$. This implies that simple thresholding in the ridgelet orthobasis is, in a certain sense, a near-ideal nonlinear approximation scheme for such objects. Orthonormal ridgelets may be viewed as $L^2$ substitutes for approximation by sums of ridge functions, and so can perform many of the same tasks as the ridgelet systems constructed by Candès [Ph.D. thesis, Department of Statistics, Stanford University, Stanford, CA, 1998; Appl. Comput. Harmon. Anal., 6 (1999), pp. 197-218]. Orthonormal ridgelets make available the machinery of orthogonal decompositions, which is not available for ridge functions, as they are not in $L^2(\mathbf{R}^2)$. The ridgelet orthobasis is constructed as the isometric image of a special wavelet basis for Radon space; as a consequence, ridgelet analysis is equivalent to a special wavelet analysis in the Radon domain. This means that questions of ridgelet analysis of linear singularities can be answered by wavelet analysis of point singularities. At the heart of our nonlinear approximation result is the study of a certain tempered distribution on $\mathbf{R}^2$ defined formally by $S(u,v) = |v|^{1/2}\,\sigma(u/|v|)$, with $\sigma$ a certain smooth bounded function; this is singular at $(u,v) = (0,0)$ and $C^\infty$ elsewhere.
The key point is that the analysis of this point singularity by tensor Meyer wavelets yields sparse coefficients at high frequencies; this is reflected in the sparsity of the ridgelet coefficients and the good nonlinear approximation properties of the ridgelet basis.
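
The role of thresholding in an orthobasis can be illustrated with a minimal one-dimensional sketch. It is not the ridgelet construction described above: as a stand-in it uses the discrete orthonormal Haar transform (an assumption made for brevity, in place of Meyer wavelets or ridgelets), applied to a sampled Gaussian truncated at a point $a$. The function names, the grid size, and the value of $a$ are all hypothetical choices. The sketch keeps the $m$ largest coefficients and checks that, by orthonormality, the $L^2$ error of the resulting $m$-term approximation equals the norm of the discarded coefficients.

```python
import math

SQRT_HALF = 1.0 / math.sqrt(2.0)

def haar_forward(x):
    """Orthonormal discrete Haar transform; len(x) must be a power of two."""
    c = list(x)
    n = len(c)
    while n > 1:
        half = n // 2
        avg = [(c[2 * i] + c[2 * i + 1]) * SQRT_HALF for i in range(half)]
        dif = [(c[2 * i] - c[2 * i + 1]) * SQRT_HALF for i in range(half)]
        c[:n] = avg + dif  # coarse averages first, then details at this scale
        n = half
    return c

def haar_inverse(c):
    """Inverse of haar_forward."""
    x = list(c)
    n = 1
    while n < len(x):
        merged = []
        for a, d in zip(x[:n], x[n:2 * n]):
            merged.append((a + d) * SQRT_HALF)
            merged.append((a - d) * SQRT_HALF)
        x[:2 * n] = merged
        n *= 2
    return x

def keep_largest(coeffs, m):
    """Zero all but the m largest-magnitude coefficients (simple thresholding)."""
    order = sorted(range(len(coeffs)), key=lambda i: abs(coeffs[i]), reverse=True)
    kept = [0.0] * len(coeffs)
    for i in order[:m]:
        kept[i] = coeffs[i]
    return kept

def l2_norm(v):
    return math.sqrt(sum(t * t for t in v))

def m_term_errors(signal, m):
    """Return (error measured on the signal, error predicted from coefficients)."""
    coeffs = haar_forward(signal)
    kept = keep_largest(coeffs, m)
    approx = haar_inverse(kept)
    direct = l2_norm([s - a for s, a in zip(signal, approx)])
    tail = l2_norm([c - k for c, k in zip(coeffs, kept)])
    return direct, tail

# Hypothetical test object: a Gaussian cut off at t = a, sampled on [-3, 3).
N, a = 256, 0.3
grid = [-3.0 + 6.0 * i / N for i in range(N)]
signal = [math.exp(-t * t) if t > a else 0.0 for t in grid]
coeffs = haar_forward(signal)

if __name__ == "__main__":
    for m in (4, 16, 64):
        direct, tail = m_term_errors(signal, m)
        print(m, direct, tail)
```

The two error values agree up to rounding because the transform is an isometry. How quickly the error falls as $m$ grows is governed by the decay of the sorted coefficient magnitudes, which is the sense in which sparsity of the coefficients translates into good nonlinear approximation.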