Accurate prediction of the complete structures of protein-coding genes, including 5'-untranslated regions, is crucial for full interpretation of genome sequence. A new report describes the statistical properties of the first exons of genes, and presents a novel statistical method for predicting their boundaries. First exons, which are partially or completely non-coding, have been relatively neglected by gene-finding algorithms. When integrated with other genome annotation tools, this new method will improve the technology of automated genome interpretation.