We propose a method for estimating the pose of a hand from a sequence of stereo images. This is a difficult problem because the hand is a complex object with a large number of degrees of freedom, and automatically segmenting it in the images is not easy. Our method aims to address both difficulties. Two video cameras supply image pairs to a stereo-correlation algorithm, which yields a 3D reconstruction of the scene. A 3D articulated model of the hand, made of truncated cones and spheres, is then fitted to this reconstruction to estimate the pose of the palm and fingers. The approach is model-based tracking of hand movement, in which we assume that the pose of the hand is known in the first images.
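
To make the fitting step concrete, the following minimal Python sketch shows one way a fitting error between reconstructed 3D points and a model built from spheres and truncated cones could be evaluated. It is an illustration under stated assumptions, not the method's actual formulation: the function names, the use of unsigned point-to-surface distances, and the sum-of-squares cost are all hypothetical choices, and the text above does not specify the error measure or the optimizer.

```python
import math

def _seg_dist_2d(p, a, b):
    """Distance from 2D point p to the segment from a to b."""
    abx, aby = b[0] - a[0], b[1] - a[1]
    apx, apy = p[0] - a[0], p[1] - a[1]
    denom = abx * abx + aby * aby
    # Clamp the projection parameter to [0, 1]; a degenerate segment is a point.
    s = 0.0 if denom == 0.0 else max(0.0, min(1.0, (apx * abx + apy * aby) / denom))
    return math.hypot(apx - s * abx, apy - s * aby)

def sphere_distance(p, center, radius):
    """Unsigned distance from 3D point p to the surface of a sphere."""
    return abs(math.dist(p, center) - radius)

def truncated_cone_distance(p, base, top, r_base, r_top):
    """Unsigned distance from 3D point p to the surface of a closed truncated cone.

    The cone is rotationally symmetric, so the problem reduces to 2D:
    t is the coordinate along the axis and r the distance from the axis.
    """
    axis = [top[i] - base[i] for i in range(3)]
    h = math.sqrt(sum(c * c for c in axis))
    if h == 0.0:
        raise ValueError("base and top must be distinct points")
    u = [c / h for c in axis]
    v = [p[i] - base[i] for i in range(3)]
    t = sum(v[i] * u[i] for i in range(3))
    r = math.sqrt(max(sum(c * c for c in v) - t * t, 0.0))
    q = (t, r)
    return min(
        _seg_dist_2d(q, (0.0, r_base), (h, r_top)),   # lateral surface
        _seg_dist_2d(q, (0.0, 0.0), (0.0, r_base)),   # base cap
        _seg_dist_2d(q, (h, 0.0), (h, r_top)),        # top cap
    )

def fitting_cost(points, primitives):
    """Sum of squared distances from each point to its nearest primitive.

    `primitives` is a list of callables mapping a 3D point to a distance
    (an assumed representation chosen for this sketch).
    """
    return sum(min(d(p) for d in primitives) ** 2 for p in points)
```

In such a sketch, a pose hypothesis would place each primitive in space, and a tracker would search for the pose that minimizes `fitting_cost` over the reconstructed points, starting from the pose found in the previous image.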