This paper presents a new approach for the optimal design of distributed wastewater treatment networks with multiple contaminants. It consists of a two-stage solution strategy. In the first stage, a decomposition method is employed that replaces the general non-linear program (NLP) by a succession of linear programs, one for each treatment unit. In the second stage, the resulting network is used as a starting point for the solution of the general NLP by a local optimization solver. The decomposition process considers a specific substructure, where it is assumed that the wastewater streams go through the treatment units in sequence. To consider all combinations, the two-stage solution strategy is applied as many times as the number of possible sequences. This allows considering multiple and structurally different starting points, thus increasing the probability of finding global optimal solutions. The results have shown that the proposed approach can find better solutions than other approach reported in the literature, however with a drawback of being more demanding computationally.