Sequences of intracellular and extracellular soluble proteins were analyzed statistically in terms of amino acid composition and residue-pair frequencies. Residue-pair frequencies were calculated for sequential separations from (n, n + 1) to (n, n + 5) and converted into scoring parameters. For each test protein, the single-residue and residue-pair parameters were then applied to calculate a total score. By our definition, a positive score indicates an intracellular protein, whereas a negative score indicates an extracellular one. The parameter set was derived from 894 sequences representing different protein families in the PIR database, and was assessed by application to a test set of 379 proteins. The results showed that 88% of intracellular and 84% of extracellular proteins were correctly assigned. Discrimination power improved by about 8% compared with the previous study, which used composition data alone. Segregation of intra/extracellular proteins is also observed by other criteria, such as structural class (intracellular proteins prefer α and α/β types, and extracellular proteins prefer β and α+β types). Segregation by sequence was found to be a more reliable procedure for distinguishing intra/extracellular proteins than methods using structural class. Possible causes for this segregation by sequence are discussed. © 1994 Academic Press Limited.
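The scoring scheme described above can be illustrated with a minimal sketch. The abstract does not state how frequencies were converted into scoring parameters, so this sketch assumes a log-odds ratio of smoothed relative frequencies between the two classes. The function names, the pseudocount, and the treatment of a zero score as extracellular are all hypothetical choices made for illustration, not the authors' method.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
MAX_GAP = 5  # separations (n, n + 1) to (n, n + 5), as stated in the abstract


def count_features(sequences):
    """Count single residues and gapped residue pairs over a set of sequences."""
    singles = Counter()
    pairs = Counter()  # key: (gap, first_residue, second_residue)
    for seq in sequences:
        singles.update(seq)
        for gap in range(1, MAX_GAP + 1):
            for i in range(len(seq) - gap):
                pairs[(gap, seq[i], seq[i + gap])] += 1
    return singles, pairs


def log_odds(intra_counts, extra_counts, keys, pseudocount=1.0):
    """Assumed conversion: log ratio of smoothed relative frequencies."""
    keys = list(keys)
    intra_total = sum(intra_counts[k] for k in keys) + pseudocount * len(keys)
    extra_total = sum(extra_counts[k] for k in keys) + pseudocount * len(keys)
    params = {}
    for k in keys:
        p_intra = (intra_counts[k] + pseudocount) / intra_total
        p_extra = (extra_counts[k] + pseudocount) / extra_total
        params[k] = math.log(p_intra / p_extra)
    return params


def train(intra_seqs, extra_seqs):
    """Derive single-residue and residue-pair parameters from two training sets."""
    intra_s, intra_p = count_features(intra_seqs)
    extra_s, extra_p = count_features(extra_seqs)
    single_params = log_odds(intra_s, extra_s, AMINO_ACIDS)
    pair_params = {}
    for gap in range(1, MAX_GAP + 1):
        # Normalize pair frequencies separately for each separation.
        keys = [(gap, a, b) for a in AMINO_ACIDS for b in AMINO_ACIDS]
        pair_params.update(log_odds(intra_p, extra_p, keys))
    return single_params, pair_params


def score(seq, single_params, pair_params):
    """Total score: sum of single-residue and residue-pair parameters."""
    total = sum(single_params.get(res, 0.0) for res in seq)
    for gap in range(1, MAX_GAP + 1):
        for i in range(len(seq) - gap):
            total += pair_params.get((gap, seq[i], seq[i + gap]), 0.0)
    return total


def classify(seq, single_params, pair_params):
    """Positive score: intracellular; otherwise extracellular (zero is an assumption)."""
    if score(seq, single_params, pair_params) > 0:
        return "intracellular"
    return "extracellular"
```

With parameters from `train(intra_seqs, extra_seqs)`, a call such as `classify(seq, single_params, pair_params)` returns the predicted class for one sequence. Residues outside the 20 standard amino acids contribute nothing to the score in this sketch.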