Benthic invertebrate communities at 50 nearshore reference sites in the North American Great Lakes were evaluated by replicated (n = 5), quantitative sampling. Sediments collected at the same 50 sites were also used in eight replicated (n = 3) lethal and sublethal bioassays in the laboratory. We quantified the magnitude of variation, and the proportion of that variation among sites as opposed to among replicates within sites, for densities of major community members and for all bioassay endpoints. Taxa and bioassay endpoints with a large amount of variation, primarily among sites, best described the magnitude and nature of variation among unpolluted reference sites. Sponges (Porifera) and worms (Oligochaeta) were the most descriptive benthic taxa, with relatively high amounts of variation, mostly (>80%) among sites. Growth of Hexagenia limbata and tubificid growth and reproduction best described variation in bioassay endpoints among the reference sites, with a considerable amount of variation, mostly (>60%) among sites. In general, bioassay endpoints showed less variation than taxon abundances. A Mantel's test showed a strong (r = 0.20; p < 0.004) relationship between community structure, as reflected in the density of the fifteen major benthic taxa, and sediment toxicity, as reflected in the eight bioassays. Semi-strong hybrid multi-dimensional scaling of the community and bioassay matrices showed three correlated sets of sites: (i) depauperate sites with poor Hexagenia and tubificid performance in bioassays; (ii) high sponge sites with good Hexagenia and tubificid performance in bioassays; and (iii) high worm sites with poor Hexagenia and moderate tubificid performance in bioassays. This study illustrates both the magnitude and nature of variation in benthic communities and sediment toxicity among reference sites in the North American Great Lakes, as well as the covariation of community and bioassay measures of ecosystem structure and function.
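
The split of variation "among sites, as opposed to among replicates within sites" can be illustrated with a one-way sum-of-squares partition. The abstract does not say how the proportion was estimated, so the sketch below is only one plausible reading. The function name `among_site_fraction`, the site labels, and the density values are all hypothetical, and a sum-of-squares fraction is not the same quantity as a formal variance-component estimate.

```python
from statistics import mean


def among_site_fraction(site_replicates):
    """Return the share of the total sum of squares that lies among sites.

    site_replicates: dict mapping a site label to a list of replicate values
    (for example, densities of one taxon in each replicate sample).
    This is an illustrative assumption, not the method used in the study.
    """
    all_values = [v for reps in site_replicates.values() for v in reps]
    grand_mean = mean(all_values)

    # Total variation: every replicate measured against the grand mean.
    ss_total = sum((v - grand_mean) ** 2 for v in all_values)

    # Among-site variation: each site mean against the grand mean,
    # weighted by the number of replicates at that site.
    ss_among = sum(
        len(reps) * (mean(reps) - grand_mean) ** 2
        for reps in site_replicates.values()
    )

    if ss_total == 0:
        return 0.0  # no variation at all, so nothing to partition
    return ss_among / ss_total


# Made-up densities for three hypothetical sites, five replicates each.
example = {
    "site_A": [10, 12, 11, 9, 13],
    "site_B": [40, 42, 38, 41, 39],
    "site_C": [20, 22, 19, 21, 18],
}
fraction = among_site_fraction(example)
print(f"Fraction of variation among sites: {fraction:.3f}")
```

A value near 1 means the replicates within each site agree closely and most of the variation separates sites, which is the pattern the abstract reports for sponges and oligochaetes (>80% among sites).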