An overemphasis on creativity in evaluating research has led to a serious devaluation of replication studies. Yet estimating a causal effect to two digits requires a total sample size of N = 153,669, which is rarely attained in a single study. The only way to obtain an accurate estimate is to average across replications. Even if the average sample size were as high as N = 200, we would need over 700 replication studies. Scientific replications are more problematic than pure statistical replications, so even more replications are needed to achieve reasonable accuracy.
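
The replication count quoted above can be checked with simple division. The sketch below is a minimal illustration, not the author's own calculation: it assumes that pooling studies is equivalent to one study with the summed sample size, which ignores the between-study variation the last sentence alludes to. The function name `studies_needed` is hypothetical.

```python
import math

# Total sample size quoted in the text for two-digit accuracy.
TOTAL_N = 153_669


def studies_needed(avg_n: int, total_n: int = TOTAL_N) -> int:
    """Smallest number of studies whose pooled sample reaches total_n.

    Assumption (not from the source): pooled studies behave like one
    large study, so only the summed sample size matters.
    """
    return math.ceil(total_n / avg_n)


if __name__ == "__main__":
    # With an average of N = 200 per study: 153,669 / 200 = 768.345
    print(studies_needed(200))  # prints 769
```

At an average of N = 200 this gives 769 studies, consistent with "over 700". Under the same assumption, a smaller average sample size raises the count proportionally.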