In pixel-level image sequence fusion, a composite image sequence is built from several spatially registered input image sequences. A primary goal of image sequence fusion is the temporal stability and consistency of the fused sequence. To meet these requirements, we propose a novel approach based on a shift-invariant extension of the 2D discrete wavelet transform, which yields an overcomplete and thus shift-invariant multiresolution signal representation. Compared with other multiresolution fusion methods, the shift-invariant fusion method improves the temporal stability and consistency of the fused sequence. To evaluate temporal stability and consistency, we introduce a quality measure based on the mutual information between the inter-frame differences (IFDs) of the input sequences and of the fused image sequence. If the mutual information is high, the IFD of the fused sequence carries little information beyond that already present in the IFDs of the input sequences, indicating a stable and consistent fused image sequence. We evaluate the performance of several multiresolution fusion schemes on a real-world image sequence pair and show that the shift-invariant fusion method outperforms the other multiresolution fusion methods with respect to temporal stability and consistency.
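
The idea behind the quality measure can be illustrated with a minimal sketch. It is not the measure as defined in the paper: it assumes a simplified pairwise variant in which the mutual information is estimated separately between the IFD of each input sequence and the IFD of the fused sequence, using a plain joint-histogram estimator. The function names, the `bins` parameter, and the `(T, H, W)` array layout are hypothetical choices made for this illustration.

```python
import numpy as np


def inter_frame_differences(seq):
    """Return frame-to-frame differences of a sequence shaped (T, H, W)."""
    seq = np.asarray(seq, dtype=np.float64)
    return seq[1:] - seq[:-1]


def mutual_information(a, b, bins=32):
    """Joint-histogram estimate of I(A; B) in bits (assumed estimator)."""
    joint, _, _ = np.histogram2d(np.ravel(a), np.ravel(b), bins=bins)
    p_ab = joint / joint.sum()
    p_a = p_ab.sum(axis=1, keepdims=True)   # marginal of A, shape (bins, 1)
    p_b = p_ab.sum(axis=0, keepdims=True)   # marginal of B, shape (1, bins)
    nz = p_ab > 0                           # skip empty cells to avoid log(0)
    return float(np.sum(p_ab[nz] * np.log2(p_ab[nz] / (p_a @ p_b)[nz])))


def ifd_mutual_information(inputs, fused, bins=32):
    """Pairwise MI between each input IFD and the fused IFD (simplification)."""
    fused_ifd = inter_frame_differences(fused)
    return [
        mutual_information(inter_frame_differences(s), fused_ifd, bins=bins)
        for s in inputs
    ]
```

Under these assumptions, a fused sequence whose frame-to-frame changes are explained by the changes in the inputs yields higher values than one containing fusion-induced flicker. The histogram estimator is biased for small sample sizes, so the values are best compared between fusion schemes on the same data rather than read as absolute quantities.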