The scope of this work is the separation of N sources from M linear mixtures when the underlying system is underdetermined, that is, when M < N. If the input distribution is sparse the mixing matrix can be estimated either by external optimization or by clustering and, given the mixing matrix, a minimal l(1) norm representation of the sources can be obtained by solving a low-dimensional linear programming problem for each of the data points. Yet, when the signals per se do not satisfy this assumption, sparsity can still be achieved by realizing the separation in a sparser transformed domain. The approach is illustrated here for M = 2. In this case we estimate both the number of sources and the mixing matrix by the maxima of a potential function along the circle of unit length, and we obtain the minimal l(1) norm representation of each data point by a linear combination of the pair of basis vectors that enclose it. Several experiments with music and speech signals show that their time-domain representation is not sparse enough. Yet, excellent results were obtained using their short-time Fourier transform, including the separation of up to six sources from two mixtures. (C) 2001 Elsevier Science B.V. All rights reserved.