Functional optimization problems can be solved analytically only if special assumptions are verified; otherwise, approximations are needed. The approximate method that we propose is based on two steps. First, the decision functions are constrained to take on the structure of linear combinations of basis functions containing free parameters to be optimized (hence, this step can be considered as an extension to the Ritz method, for which fixed basis functions are used). Then, the functional optimization problem can be approximated by nonlinear programming problems. Linear combinations of basis functions are called approximating networks when they benefit from suitable density properties. We term such networks nonlinear (linear) approximating networks if their basis functions contain (do not contain) free parameters. For certain classes of d-variable functions to be approximated, nonlinear approximating networks may require a number of parameters increasing moderately with d, whereas linear approximating networks may be ruled out by the curse of dimensionality. Since the cost functions of the resulting nonlinear programming problems include complex averaging operations, we minimize such functions by stochastic approximation algorithms. As important special cases, we consider stochastic optimal control and estimation problems. Numerical examples show the effectiveness of the method in solving optimization problems stated in high-dimensional settings, involving for instance several tens of state variables.