Many decision problems can be modeled using mathematical programs, which seek to maximize or minimize an objective function of the decisions. The possible decisions are constrained by limits on resources, minimum requirements, and so on. Decisions are represented by variables, which may be, for example, nonnegative or integer-valued. The objective and constraints are functions of the variables and of the problem data. Examples of problem data include unit costs, production rates, sales, and capacities.
Suppose the decisions are represented by the variables (x1, x2, ..., xn). For example, xi can represent the production level of the i-th of n products. The general form of a mathematical program is
    minimize    f(x1, x2, ..., xn)
    subject to  g1(x1, x2, ..., xn) <= 0
                ...
                gm(x1, x2, ..., xn) <= 0
                (x1, x2, ..., xn) in X

where X is a set that may be, for example, the set of all nonnegative real numbers. The constraints can be quite general, but linear constraints are sufficient in many cases to capture the essence of the model.
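To make this concrete, here is a minimal sketch in Python of a small linear program in the form above, solved with scipy.optimize.linprog; the objective and constraint data are made up purely for illustration.

```python
# A minimal linear program in the general form above:
#   minimize    f(x1, x2) = 2*x1 + 3*x2
#   subject to  g1(x1, x2) = -x1 - x2 + 4 <= 0   (i.e. x1 + x2 >= 4)
#               x1, x2 in X = nonnegative reals
from scipy.optimize import linprog

c = [2.0, 3.0]            # objective coefficients
A_ub = [[-1.0, -1.0]]     # linprog expects constraints as A_ub @ x <= b_ub
b_ub = [-4.0]
res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=[(0, None), (0, None)])
print(res.x, res.fun)     # optimal decisions (4, 0) and objective value 8
```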
Stochastic programs are mathematical programs where some of the data incorporated into the objective or constraints is uncertain. Uncertainty is usually characterized by a probability distribution on the parameters. Although the uncertainty is rigorously defined, in practice it can range in detail from a few scenarios (possible outcomes of the data) to specific and precise joint probability distributions. The outcomes are generally described in terms of elements w of a set W. W can be, for example, the set of possible demands over the next few months.
When some of the data is random, the optimal solutions and the optimal objective value of the optimization problem are themselves random. A distribution of optimal decisions is generally unimplementable (what do you tell your boss?). Ideally, we would like one decision and one optimal objective value. The most common way to achieve this is the recourse model: choose first-stage decisions x now, then, once an outcome w is observed, choose second-stage (recourse) decisions y(w) that correct for it. The problem is to
    minimize    f1(x) + E[ f2(y(w), w) ]
    subject to  g1(x) <= 0
                ...
                gm(x) <= 0
                h1(x, y(w)) <= 0  for all w in W
                ...
                hk(x, y(w)) <= 0  for all w in W
                x in X,  y(w) in Y

The constraints h1, ..., hk describe the links between the first-stage decisions x and the second-stage decisions y(w). Note that we require each constraint to hold with probability 1, that is, for every possible w in W. The functions f2 are quite frequently themselves the optimal values of mathematical programs: we don't want to make an arbitrary correction (recourse) to the first-stage decision, we want to make the best such correction.
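As an illustration, the following Python sketch evaluates the recourse objective for a toy two-stage problem with three equally likely demand scenarios. All costs and the scenario set are assumptions made for this example, and the best recourse y(w) happens to be available in closed form.

```python
# Toy two-stage recourse problem (all numbers are made up for illustration):
# first stage: produce x units now at unit cost 1;
# second stage: after demand w is revealed, buy y(w) extra units at unit cost 1.5.
# The best recourse is y(w) = max(w - x, 0), so
#   total expected cost = x + E[ 1.5 * max(w - x, 0) ].
scenarios = [1.0, 2.0, 3.0]          # possible demands w in W
prob = [1/3, 1/3, 1/3]               # their probabilities

def expected_cost(x):
    recourse = sum(p * 1.5 * max(w - x, 0.0) for p, w in zip(prob, scenarios))
    return x + recourse

# Scan first-stage candidates; each x gives ONE decision whose consequences
# are averaged over all scenarios -- this is the "one decision" we wanted.
best = min((expected_cost(x), x) for x in [0.0, 1.0, 2.0, 3.0])
print("best expected cost %.2f at x = %.1f" % best)
```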
Recourse models can be extended in a number of ways. One of the most common is to include more stages. With a multistage problem, we in effect make one decision now, wait for some uncertainty to be resolved (realized), and then make another decision based on what's happened. The objective is to minimize the expected costs of all decisions taken.
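The same idea extends to more stages by dynamic programming over a scenario tree: decide, observe, decide again. The sketch below illustrates this on a made-up three-stage ordering problem; the costs, demands, and coarse decision grid are all assumptions for illustration.

```python
# A tiny multistage example on a scenario tree (numbers are made up):
# at each stage we order units at an increasing unit cost, then observe one
# step of demand; any cumulative shortfall at the end is penalized.  Dynamic
# programming over the tree implements "decide, observe, decide, observe, ...".
STAGE_COST = [1.0, 1.5, 2.0]           # ordering cost per unit at each stage
DEMANDS = [1.0, 2.0]                   # each stage's demand is low or high
ORDERS = [0.0, 1.0, 2.0]               # allowed order quantities (a coarse grid)
PENALTY = 5.0                          # per unit of demand left unmet

def cost_to_go(stage, stock):
    if stage == len(STAGE_COST):
        return PENALTY * max(-stock, 0.0)        # pay for any final shortfall
    best = float("inf")
    for u in ORDERS:                             # decide now...
        expected = sum(cost_to_go(stage + 1, stock + u - w)   # ...then observe w
                       for w in DEMANDS) / len(DEMANDS)
        best = min(best, STAGE_COST[stage] * u + expected)
    return best

print(cost_to_go(0, 0.0))   # minimal expected cost of all decisions taken
```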
Solving a recourse problem is generally much more difficult than solving its deterministic version. The hardest part is evaluating the expected cost of each stage beyond the first. In effect, we are performing high-dimensional numerical integration over the optimal values of mathematical programs.
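One standard way around this integration, sketched below, is Monte Carlo sampling: replace the expectation with a sample average of second-stage optimal values. The uniform demand distribution is an assumption, and the second-stage cost mirrors the toy problem above; in realistic models each sample would require solving a mathematical program, which is exactly what makes the evaluation expensive.

```python
# Sample-average (Monte Carlo) estimate of the expected recourse cost
# E[ f2(y(w), w) ] when w is continuously distributed.  Here f2 is the
# optimal value of the toy second-stage problem from above, and the
# demand distribution (uniform on [0, 4]) is an assumption for illustration.
import random

def second_stage_cost(x, w):
    # optimal recourse: buy just enough to cover the shortfall
    return 1.5 * max(w - x, 0.0)

def estimate_expected_cost(x, n_samples=100_000, seed=0):
    rng = random.Random(seed)
    total = sum(second_stage_cost(x, rng.uniform(0.0, 4.0))
                for _ in range(n_samples))
    return x + total / n_samples

print(estimate_expected_cost(2.0))  # noisy estimate; accuracy ~ 1/sqrt(n)
```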
When the random data is discretely distributed, the problem can be written as a large deterministic problem: the expectations become finite sums, and each constraint is duplicated for each realization of the random data. The resulting deterministic equivalent problem can be solved using any general-purpose optimization package. (A program called STOCHTOMPS that forms linear deterministic equivalent problems is available via anonymous FTP.) Deterministic equivalent models have a great deal of structure, and several packages have been developed to exploit it and solve them quickly. A short FAQ list with references and pointers to software is available.
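For the toy recourse problem above, the deterministic equivalent can be formed and solved directly. The sketch below builds the extensive form, with one copy of the second-stage variable per scenario and one duplicated constraint per realization, and solves it with scipy.optimize.linprog.

```python
# Deterministic equivalent (extensive form) of the toy recourse problem:
# duplicate the second-stage variable y once per scenario and weight its
# cost by the scenario probability.  Variables are [x, y1, y2, y3].
from scipy.optimize import linprog

scenarios = [1.0, 2.0, 3.0]
prob = [1/3, 1/3, 1/3]

c = [1.0] + [1.5 * p for p in prob]          # x + sum_s p_s * 1.5 * y_s
# one duplicated constraint per scenario:  -x - y_s <= -w_s  (i.e. y_s >= w_s - x)
A_ub = [[-1.0] + [-1.0 if j == s else 0.0 for j in range(len(scenarios))]
        for s in range(len(scenarios))]
b_ub = [-w for w in scenarios]
bounds = [(0, None)] * (1 + len(scenarios))

res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=bounds)
print("x =", res.x[0], "objective =", res.fun)   # matches the scan above
```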
When the random data has a continuous distribution, the integrals are particularly difficult. However, one can compute upper and lower bounds on the expected recourse cost that reduce to discretely distributed stochastic programs. (Other approximations have been developed, but have not been tested extensively.) A short FAQ list with references and pointers to software is available.
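One classical pair of such bounds, sketched below for the toy problem, is the Jensen lower bound (all probability mass placed at the mean) and the Edmundson-Madansky upper bound (mass placed at the endpoints of the support); both replace the continuous distribution with a small discrete one. The uniform demand distribution is again an assumption for illustration.

```python
# Jensen and Edmundson-Madansky bounds on E[Q(x, w)] for the toy recourse
# cost Q(x, w) = 1.5 * max(w - x, 0), which is convex in w.  Both bounds
# replace the continuous distribution (here uniform on [a, b] = [0, 4])
# by a small discrete one, so each reduces to a discretely distributed
# stochastic program.
def Q(x, w):
    return 1.5 * max(w - x, 0.0)

a, b = 0.0, 4.0
mean_w = (a + b) / 2.0
x = 2.0

# Jensen (lower bound): put all mass at the mean.
lower = Q(x, mean_w)

# Edmundson-Madansky (upper bound): put mass at the endpoints,
# weighted so that the mean is preserved.
p_a = (b - mean_w) / (b - a)
p_b = (mean_w - a) / (b - a)
upper = p_a * Q(x, a) + p_b * Q(x, b)

print(lower, "<= E[Q] <=", upper)   # true value is 0.75 for this x
```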
Again, assume that x is a vector of decisions. In a probabilistically constrained (chance-constrained) model, we require the constraints to hold jointly with at least a prescribed probability alpha. The general form of this problem is to
    minimize    f(x1, x2, ..., xn)
    subject to  Pr[ g1(x1, x2, ..., xn) <= 0,
                    ...
                    gm(x1, x2, ..., xn) <= 0 ] >= alpha
                h1(x1, x2, ..., xn) <= 0
                ...
                hk(x1, x2, ..., xn) <= 0
                (x1, x2, ..., xn) in X
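Evaluating the probability in such a constraint exactly is usually hard. A common practical check, sketched below with an assumed constraint g(x, w) = w - x and an assumed uniform distribution for w, estimates the constraint's reliability by sampling.

```python
# Checking a chance constraint Pr[ g(x, w) <= 0 ] >= alpha by sampling.
# Here g(x, w) = w - x (the decision x must cover a random demand w),
# and w ~ uniform on [0, 4]; both choices are assumptions for illustration.
import random

def satisfies_chance_constraint(x, alpha, n_samples=100_000, seed=0):
    rng = random.Random(seed)
    hits = sum(1 for _ in range(n_samples) if rng.uniform(0.0, 4.0) - x <= 0.0)
    return hits / n_samples >= alpha

# Pr[w <= x] = x/4 here, so x must be at least 4*alpha, up to sampling noise:
print(satisfies_chance_constraint(3.0, alpha=0.9))  # Pr ~ 0.75 -> False
print(satisfies_chance_constraint(3.8, alpha=0.9))  # Pr ~ 0.95 -> True
```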