In pipelined parallel computations, the inner loops are often implemented in a blocked fashion. For such programs, an important compiler optimization is to determine the grain size statically. This paper extends and experimentally validates the earlier results of Andonov and Rajopadhye on optimal grain size determination.