A new theory of speech-sound discrimination, called phoneme perception theory, is proposed that combines elements of two existing theories, dual process theory (DPT) and trace context theory (TCT). It consists of a long-term phoneme memory, a context-coding memory, and a trace memory, each with its own time constants. The theory is tested by means of stop-consonant discrimination data in which interstimulus interval (ISI; values of 100, 300, and 2000 ms) is an important variable. It is shown that discrimination in which labeling plays an important part (2IFC and between-category AX) benefits from increased ISI, whereas discrimination in which only sensory traces are compared (within-category AX) decreases with increasing ISI. The theory is also tested on speech discrimination data from the literature in which ISI is a variable [Pisoni, J. Acoust. Soc. Am. 36, 277-282 (1964); Cowan and Morse, J. Acoust. Soc. Am. 79, 500-507 (1986)]. It is concluded that the number of parameters in trace context theory is not sufficient to account for most speech-sound discrimination data and that a few additional assumptions are needed, such as a form of sublabeling, in which subjects encode the quality of a stimulus as a member of a category, and which requires processing