This study examined whether the quasi-distributed response function used in TOPMODEL provides superior performance for event simulation in small, temperate forested catchments, compared to lumped reservoir representations of runoff routing similar to those employed in many catchment hydrology models. The alternatives were a two-reservoir black-box model and a three-reservoir model structured to represent our perceptual model of runoff processes based on field observations. A second objective was to test the statistical significance of differences in model performance using a new approach that combines the Jackknife with analysis of variance (ANOVA). The models were tested against streamflow data from two small forested catchments using Klemeš' hierarchical scheme for operational validation. At levels 1 (split-sample test) and 2 (proxy-basin test), there were no statistically significant differences in model performance when expressed using the entire hydrographs. However, TOPMODEL did appear to provide superior fits to the peak flows. At level 3 (differential split-sample test), TOPMODEL performed statistically significantly better than both lumped models in both catchments. The statistical analysis for level 4 (proxy-basin differential split-sample test) indicates that TOPMODEL performed significantly better than the lumped models at one catchment but not at the other. The statistical design combining ANOVA with the Jackknife shows promise as a workable method for establishing statistical significance of differences in model performance indices based on the entire hydrograph. However, performance indices based only on the peak flows exhibited extreme skew that was not amenable to normalization by transformation. Further research should investigate alternative, robust methods for assessing statistical significance of model performance indices. © 1999 Elsevier Science B.V. All rights reserved.