We study in a probabilistic framework some topics concerning the way words can overlap. Our probabilistic model assumes that a word is a sequence of i.i.d. symbols generated from a finite alphabet. This defines the so-called Bernoulli model. We investigate the length of a subword that can be recopied, that is, a subword that occurs at least twice in a given word. An occurence of such repeated substrings is easy to detect in a digital tree called a suffix tree. The length of a repeated substring corresponds to the typical depth in the associated suffix tree. Our main finding shows that the typical depth in a suffix tree is asymptotically distributed in the same manner as the typical depth in a digital tree that stores independent keys (i.e., independent tries). More precisely, we prove that the typical depth in a suffix tree built from the first n suffixes of a random word is normally distributed with the mean asymptotically becoming 1/h1 log n and the variance alpha . log n, where h1 is the entropy of the alphabet and alpha is a parameter of the underlying probabilistic model. We prove these results using a novel technique called the string-ruler approach. Our results provide new insights into several algorithms on words and data compression schemes. They find direct applications in many facets of science, most notably in molecular biology, coding theory, theory of languages, and design and analysis of algorithms. (C) 1994 Academic Press, Inc.