One goal of automated cartography is to generate an accurate 3-D model of man-made structures and natural terrain. Some of the most challenging problems in cartographic feature extraction occur in dense urban areas, where the level of detail and scene clutter greatly complicate traditional map compilation techniques. In this paper, we describe experiments toward a comprehensive stereo analysis system that recovers a 3-D description of an urban area from high-resolution aerial imagery. Given an area of interest specified by its geographic coverage, our system can automatically find the appropriate stereo pair using a spatial database, select control points to register the two images so that the epipolar geometry constraint is satisfied, and recover disparity information using two complementary matching techniques. We do not assume that the initial input images satisfy the epipolar geometry constraint, because this is rarely the case in unrectified aerial imagery. We therefore argue that stereo mapping research must explicitly address error and uncertainty in both scene registration and stereo matching, and that techniques are needed to evaluate such errors rigorously. We also argue that, to achieve robust behavior, multiple methods for scene feature extraction should be used and, where possible, their results should be integrated into a consistent framework. We describe techniques for scene registration using five different features that can be automatically extracted to provide control points for fine image registration. The stereo matching process uses two techniques, an area-based and a feature-based stereo matcher, to generate a disparity map for a scene. We also present preliminary results on a technique for merging the outputs of the two stereo matching algorithms to improve height estimates.
Finally, we describe techniques for generating rigorous performance analysis metrics that compare stereo matching algorithms against a manually derived 3-D ground truth segmentation. The analysis includes error estimation metrics for both height and delineation accuracy, based on measured deviations from the manual estimates. These error estimates are computed globally over the entire scene and locally on a structure-by-structure basis. The relative accuracy of the area-based, feature-based, and merged disparity estimates is reported for several different test scenes.
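
As an illustration of the global and structure-by-structure height evaluation described above, the following is a minimal sketch, not the procedure used in this work. It assumes that the estimated and ground truth heights are available as co-registered arrays, that a label image assigns each pixel to a structure (with 0 taken to mean background), and that RMS and mean absolute deviation serve as stand-in error measures. The function name `height_errors` and all parameter names are hypothetical.

```python
import numpy as np

def height_errors(estimated, truth, labels):
    """Compare estimated heights with manual ground truth.

    Returns a scene-wide RMS deviation and a dict mapping each
    structure id to its mean absolute deviation. Both measures are
    illustrative assumptions, not the metrics defined in the paper.
    """
    estimated = np.asarray(estimated, dtype=float)
    truth = np.asarray(truth, dtype=float)
    labels = np.asarray(labels)

    deviation = estimated - truth
    valid = ~np.isnan(deviation)  # skip pixels with no height estimate

    # Global measure: RMS deviation over every valid pixel in the scene.
    global_rms = float(np.sqrt(np.mean(deviation[valid] ** 2)))

    # Local measure: mean absolute deviation within each labeled structure.
    per_structure = {}
    for structure_id in np.unique(labels):
        if structure_id == 0:  # assumed background label
            continue
        mask = (labels == structure_id) & valid
        if mask.any():
            per_structure[int(structure_id)] = float(
                np.mean(np.abs(deviation[mask]))
            )
    return global_rms, per_structure
```

Running the same function on the area-based, feature-based, and merged estimates of one scene would give directly comparable numbers under these assumptions; delineation accuracy would need a separate comparison of structure boundaries and is not covered by this sketch.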