As we look to the future, and the prospect of a billion transistors on a chip, it seems inevitable that micro-processors witt exploit having multiple parallel threads. To achieve the full potential of these "single-chip mutliprocessors," however tee must find a way to parallelize non-numeric applications. Unfortunately, compilers have had little success in parallelizing non-numeric codes due to their complex access patterns. This paper explores the potential for using thread-level data speculation (TLDS) to overcome this limitations by allowing the compiler to view parallelization solely as a cost/benefit tradeoff rather than something which is likely to violate program correctness. Our experimental results demonstrate than with realistic compiler support. TLDS can offer significant program speedups. We also demonstrate that through modest hardware extensions, a generic single-chip multiprocessor could support TLDS by augmenting its cache coherence scheme to detect dependence violations, and by using the primary data caches to buffer speculative state.