A simple model for cooperation between "selfish" agents is presented and studied. The agents play an extended version of the prisoner's dilemma game in which arbitrary payoffs are allowed. A continuous variable representing the probability of cooperation, $p_k(t) \in [0,1]$, is assigned to each agent $k$ at time $t$. At each time step $t$ a pair of agents, chosen at random, interact by playing the game. The players update their $p_k(t)$ using a criterion based on comparing their utilities with the simplest estimate for expected income. The agents have no memory and do not use strategies based on direct reciprocity or "tags." Depending on the payoff matrix, the system self-organizes, after a transient, into stationary states characterized by their average probability of cooperation $\bar{p}_{eq}$ and average equilibrium per-capita income $\bar{U}_{\infty}$. The model yields some counterintuitive results. In particular, some games that a priori seem to favor defection most may produce a relatively high degree of cooperation. Conversely, other games, which one would expect to lead to maximum cooperation, are in fact not optimal for producing cooperation.
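The dynamics described above can be illustrated with a minimal Python sketch. It is not the paper's implementation: the abstract does not specify the form of the income estimate or the update rule, so both are assumptions made here for illustration. The hypothetical `estimate_income` takes the expected payoff an agent would get against an opponent with the same cooperation probability, and the hypothetical update shifts $p_k$ by a fixed step `delta`, reinforcing the last action when the utility meets the estimate and reversing it otherwise. The payoff names `R`, `S`, `T`, `P` follow the usual prisoner's dilemma convention, and all default values are arbitrary.

```python
import random


def estimate_income(p, R, S, T, P):
    """Assumed estimate: expected payoff if the opponent also cooperates with probability p."""
    return R * p * p + (S + T) * p * (1 - p) + P * (1 - p) ** 2


def payoff(me_c, other_c, R, S, T, P):
    """Payoff to a player given both moves (True = cooperate)."""
    if me_c:
        return R if other_c else S
    return T if other_c else P


def simulate(n_agents=100, steps=20000, R=3.0, S=0.0, T=5.0, P=1.0,
             delta=0.01, seed=0):
    """Run random pairwise games and return the final cooperation probabilities."""
    rng = random.Random(seed)
    p = [rng.random() for _ in range(n_agents)]
    for _ in range(steps):
        # Pick two distinct agents at random; no memory, no tags.
        i, j = rng.sample(range(n_agents), 2)
        ci = rng.random() < p[i]
        cj = rng.random() < p[j]
        for k, mine, other in ((i, ci, cj), (j, cj, ci)):
            u = payoff(mine, other, R, S, T, P)
            satisfied = u >= estimate_income(p[k], R, S, T, P)
            # Assumed rule: reinforce the action just taken if satisfied,
            # otherwise move the probability the other way.
            direction = 1 if mine == satisfied else -1
            p[k] = min(1.0, max(0.0, p[k] + direction * delta))
    return p


if __name__ == "__main__":
    final_p = simulate()
    print("average probability of cooperation:", sum(final_p) / len(final_p))
```

Varying `R`, `S`, `T`, and `P` in this sketch is one way to explore how the stationary average of `p` depends on the payoff matrix, under the assumed update rule.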