We consider paradigms for the interpretation and analysis of the CMIP3 ensemble of climate model simulations. The dominant paradigm in climate science, that of an ensemble sampled from a distribution centred on the truth, is contrasted with the paradigm of a statistically indistinguishable ensemble, which has been more commonly adopted in other fields. This latter interpretation (which gives rise to a natural probabilistic interpretation of ensemble output) leads to new insights about the evaluation of ensemble performance. Using the well-known rank histogram method of analysis, we find that the CMIP3 ensemble generally provides a rather good sample under the statistically indistinguishable paradigm, although it appears marginally over-dispersive and exhibits some modest biases. These results contrast strongly with the incompatibility of the ensemble with the truth-centred paradigm. Thus, our analysis provides, for the first time, a sound theoretical foundation, with empirical support, for the probabilistic use of multi-model ensembles in climate research.

Citation: Annan, J. D., and J. C. Hargreaves (2010), Reliability of the CMIP3 ensemble, Geophys. Res. Lett., 37, L02703, doi:10.1029/2009GL041994.
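As an illustration of the rank histogram method referred to in the abstract, the following is a minimal sketch in Python. It is not the authors' code and does not use CMIP3 data: the function name `rank_histogram`, the synthetic Gaussian ensemble, and the sample sizes are all assumptions made for this example. For each point, the observation is ranked among the ensemble members. If the truth is statistically indistinguishable from the members, the ranks should be roughly uniform. If instead the members are scattered around the truth, the truth tends to fall in the middle ranks and the histogram is dome-shaped, which is the signature of an over-dispersive ensemble.

```python
import numpy as np


def rank_histogram(ensemble, obs):
    """Count how often the observation falls at each rank within the ensemble.

    ensemble: array of shape (n_members, n_points)
    obs:      array of shape (n_points,)
    Returns an integer array of length n_members + 1.
    Ties are ignored here, which is reasonable for continuous data.
    """
    ensemble = np.asarray(ensemble, dtype=float)
    obs = np.asarray(obs, dtype=float)
    n_members = ensemble.shape[0]
    # Rank = number of members strictly below the observation (0..n_members).
    ranks = np.sum(ensemble < obs, axis=0)
    return np.bincount(ranks, minlength=n_members + 1)


# Synthetic illustration (hypothetical data, not CMIP3 output).
rng = np.random.default_rng(0)
n_members, n_points = 9, 20000
members = rng.normal(size=(n_members, n_points))

# Statistically indistinguishable paradigm: the truth is drawn from the
# same distribution as the ensemble members.
truth_indistinguishable = rng.normal(size=n_points)
flat = rank_histogram(members, truth_indistinguishable)

# Truth-centred paradigm: the members are scattered around the truth,
# which here sits at the centre (zero) of their distribution.
truth_centred = np.zeros(n_points)
domed = rank_histogram(members, truth_centred)

print("indistinguishable:", flat)
print("truth-centred:    ", domed)
```

In this toy setting the first histogram is close to flat, while the second peaks strongly in the central bins. A real analysis would also have to deal with issues this sketch leaves out, such as observational uncertainty and the spatial correlation of climate fields.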