Comparison of model outputs with observations of the climate system forms an essential component of model assessment and is crucial for building our confidence in model predictions. Methods for undertaking this comparison are not always clearly justified and understood. Here we show that the popular approach of comparing the ensemble spread to a so-called "observationally-constrained pdf" can be highly misleading. Such a comparison will almost certainly result in disagreement, but in reality tells us little about the performance of the ensemble. We present an alternative approach, and show how it may lead to very different, and rather more encouraging, conclusions. We additionally present some necessary conditions for an ensemble (or more generally, a probabilistic prediction) to be challenged by an observation.

Citation: Annan, J. D., J. C. Hargreaves, and K. Tachiiri (2011), On the observational assessment of climate model performance, Geophys. Res. Lett., 38, L24702, doi:10.1029/2011GL049812.