Subsections
18
Sequential
Statistical Decisions Model, "Bride Problem"
1 Introduction
Most of the decisions in a personal life and in organizations are
made sequentially. That means that, at any given moment, one
either makes the final decision, or postpones it hoping for
better times. Statistical part of sequential decisions
represents uncertainties of future events and a limited
reliability of our observations.
The Bride problem is a good example of sequential statistical
decisions [Wald, 1947,Wald, 1950]. The dynamic programming is a
conventional technique to optimize sequential decisions
[Bellman, 1957]. Applying the dynamic
programming to specific problems, one
develops specific algorithms, as usual. The algorithms, similar
to these of bride's decision making, can
be applied choosing the optimal time to buy durable goods, such
as cars, houses, computers, to start new business, e.t.c..
Typical industrial applications of sequential decisions are the
optimal stopping rules. For example, one can consider stopping of
some nuclear plant as a marriage. One can evaluate stopping cost
directly. Risks involved in running the plant, when one observes
some deviations from the normal routine, can be considered as
risks of waiting for the next proposal. That means that one hopes
for something better while risking a bad outcome.
2 Average Utility
The Bride problem is to maximize the average utility of marriage
by the optimal choice of a groom. Denote the actual goodness
of a groom
by
. Denote by
the bride's impression
about the groom
.
is a prior probability density
of goodness
.
is a probability
density of impression
. Assume that goodness of different
grooms are independent and identically distributed. This means that a prior probability density of
goodness is
Suppose that impressions about the goodness of
grooms are independent and identically distributed.
Then the probability density of an impression
, given the
goodness
, is
Assume the Gaussian prior probability density of goodness
Here
is a prior variance and
is a prior mean
of goodness. For example,
shows an optimistic bride.
shows that a bride is pessimistic. Suppose, that a
prior probability density of bride impressions is
Here
is a variance of impressions around the true
goodness
.
Assume that both the groom goodness and the bride
impression are random variables depending on many independent
factors. This explains the Gaussian distributions (18.3) and (18.4). One defines a posterior probability density of
goodness
, given the impression
, by the Bayesian formula
[Bayes, 1783]
Here
3 Single-Marriage Case
Denote by
the bride's decision about the groom
Suppose that
The last condition means that brides marry, and marry only once. Formally condition (18.8) defines the set of feasible decisions when the
-th groom proposes
Here
is the marriage index
The marriage index is zero, if the bride is marred. This prevents repeated marriages.
1 Bellman's Equations
The expected utility function is
. Here
is the
impression made by the successful groom
.
Denote by
the expected utility function, if the
impression of the last groom is
Comparing (18.11) and (18.12) one observes that
This follows from independence assumptions (18.1) and
(18.2).
Denote by
the expected utility,
if the impression of the
th groom is
and a bride is
making the optimal decision
Here
Following the same pattern, we define the expected utility, if the
impression of the
-th groom is
and the bride is making
the optimal decision
Here
Note, that the utility
of accepting a proposal in
expressions (18.17) and (18.18) is a function only of
impression
. It does not depend on the proposal number
. That follows from independence assumptions
(18.1) and (18.2).
Solving these recurrent equations one
defines the sequence of optimal decision functions
and the expected utilities
. This should be done for
all possible impressions
and for all
numbers
. One cannot do that in continuous case.
Therefore, one uses a discrete approximation
2 Discrete Approximation
From expressions
(18.11) and (18.12), replacing the integrals by sums one obtains that
From expression (18.16)
From expression (18.19)
Here
and
. That is a discrete approximation of the recurrent
equations. All possible impressions
and all numbers
are considered. One
sets the number of iterations
by the accuracy needed.
The results are sequences of optimal decision functions
and the expected utilities
. These sequences are stored in a set of arrays which define how the
optimal decisions
and the expected utilities
depend on
the possible impressions
. Doing this, one avoids
the repeated calculations. Large arrays is a disadvantage
of this procedure.
The waiting losses are important in many
real-life sequential decision problems. Denote by
the loss of
waiting for the next groom. Including this parameter into
Bellman equations one obtains
In a similar way one defines the expected utility if the
impression of the
-th groom is
and the bride is making
the optimal decision
The other expressions remains the same.
Expression (18.12) was defined assuming the linear
bride's utility function. It was supposed that bride's utility
is equal to the goodness of groom
. In the real life, utility functions are
nonlinear (see chapter 14) and non-convex, as usual.
Then expression (18.12) is replaced by this integral
4 Multi-Marriage Case
Condition (18.8) implies that
brides marry and marry only once. No divorce is permitted.
That is wrong, if the "marriage"
represents buying a Personal Computer (PC).
We illustrate this by a simple "Buy-a-PC"
example.
1 " Buy-a-PC" Example
Define PC parameters by a vector
. Here
is the speed of CPU in
,
is the volume of RAM in
, and
is the volume of HD in
.
Express a subjective utility of PC by the weighted sum
. Here
are
expressed in $ per unit.
Assume that a prior probability density of PC utilities is Gaussian
Here
is a prior variance and
is a prior mean
of PC utilities. Suppose that
and
that
. This means that
expected PC utility is increasing linearly. The expected diversity remains the same.
The PC price in $ is denoted by
.
Suppose that the price of PC depends linearly
on the weighted sum
Here parameters
are defined in $ per unit and reflect the market prices of CPU, RAM, and HD. Expressing the price
as a function
of utility
:
Here
Assume that one observes the utility
exactly.
This means that the impression
and that impression errors
in expression (18.2).
That simplifies the problem.
2 Bellman's Equations
The expected utility function is
. Here
is the utility of a new PC.
is the utility of the old PC, to be replaced by the new one.
Consider a "horizon" of
years. During a year one can change PC only once, if one wishes.
Denote by
the maximal expected utility in the year
There are two possible decisions
. The decision
means to buy a new PC. The utility of the new PC is
. The utility of the old one is
.
The utility of the decision
, to keep the old PC in the last year
, is
. Here
is the price of refusing to buy a new PC. This includes the waiting losses
minus the price
of new PC in
the year
defined by (18.30). It is assumed that we abandon the old PC as soon as we obtain the new one.
Therefore, one "wins" the price
of the new PC by using the old PC.
The optimal decision in the last year
Denote by
the maximal expected utility in the year
.
Here
is the maximal expected utility in the year
,
if the utility of the old PC is
.
is a prior probability density of
defined by expression (18.28) at a time
.
Here
is obtained from this equation
The optimal decision in the year
Denote by
the maximal expected utility in the year
. Then
Here
is the maximal expected utility in the year
,
if the utility of the old PC is
.
is a prior probability density of
defined by expression (18.28) at a time
.
is obtained from the equation
The optimal decision in the year
Here the decision functions
depend on two variables
defining the corresponding utilities of the new and the old computer.
That is difficult for both the calculations and grafical representations.
One applies the discrete approximation such as in
Section 3.2. The optimal decision functions depend on two variables
and
.
Therefore, one needs
times more calculations to obtain the same accuracy as in the single marriage case (see Section 3.2).
To solve real life problems
the "directly unobservable" factors should be included into the "Buy-a-PC" model. For example , one can estimate the expected life-span
of PC, given the impression
,
by the Bayesian formula
Here
is the life-span of PC and
is the user's impression about its reliability. The impression
depends on the warranty,
the reputation of the manufacturer and on some subjective factors, too.
To simplify the calculations, the "One-Step-Ahead"
approximation is suggested. Here one assumes that the next step is the last one and introduces the parameter
.
The one-step-optimal decision
Following the same pattern, one defines the maximal expected utility in
-th year
where
Here
The one-step-optimal decision
The one-step-optimal decision
is a function of difference
that depends on two parameters
and
.
Therefore, one defines it as a function of critical pairs
depending on time
.
Solving these recurrent equations one
defines the sequence of one-step-optimal decision functions
and the expected utilities
. This should be done for
all possible values of
and
.
The One-Step-Ahead approximation,
assuming that
, reduces the computing time to a level of single marriage case.
However, one needs to estimate the approximation error.
The important application a of sequential statistical decisions is optimal ordering of supplies.
Consider, for example, same computer shop
that orders
units of RAM (Random Access Memory) at fixed times
dividing the time into
stages of unit length each.
1 Uncertain Demand
One optimizes the supply
when the demand
is unknown. Denote by
a number of RAM units demanded during one stage. Denote the retail price of one RAM unit by
and the
wholesale price by
.
The utility function-the profit-at the
-th stage
Here
and
,
is the number of unsold units.
is the loss due to short supply including fines for breaking supply contracts and/or damages to reputation as an reliable retailer.
All these parameters correspond to the stage
,
meaning that
where
is the demand at the stage
,
is the supply belonging to a feasible set
at this stage, and
is the number of unsold RAM units, left from the previous stage
.
Here and later we omit the lower indices to make expressions shorter.
Denote by
the probability
of demand
at the stage
.
Usually the demand
during the time
is described by the Poisson distribution [Tijms, 1994]
Considering stages of unit time
and
where
is the expected demand at the stage
.
Then the expected utility at the stage
The maximal expected utility at the last stage
where
is a number of unsold RAM units available at the beginning of the
-th stage
,
is a number of RAM units ordered at the beginning of the
-th stage,
is the demand at the stage
,
is a set of feasible supplies at the stage
.
Denote by
the optimal decision function maximizing the expected utility (18.55)
at any fixed
.
Denote by
the maximal expected utility at the
-th stage
Note, that the meaning of parameter
depends on the
index of the function
, for example,
, if the index is
, and
,
if this index is
, etc.
The maximum of (18.56) is obtained by the optimal decision function
where
.
Extending (18.56) to
one obtains
Solving recurrent equations (18.55)-(18.57) one
defines the sequence of optimal decision functions
and the optimal expected utility functions
.
This is done for
all possible remaining RAM units
and for all
stages
.
In the software terms that means calculating recurrently the sequence
of arrays
that shows how the optimal supply
and the maximal profit
depend on the
unsold RAM units
, left from the previous stages.
This way one reduces computing time considerably, because
array access operations need much less time as compared to multi-dimensional optimization.
Denote by
a probability that the wholesale price
at the stage
is
. Denote by
the expected wholesale price
at the stage
.
Assuming the Gaussian distribution of the wholesale price logarithm, one obtains the lognormal density function [Tijms, 1994]
where
means
,
is the shape parameter, and
is the scale parameter.
The expected value
and the standard
of the lognormal distribution
The lognormal density is unimodal with maximum at
.
In the case of random wholesale prices
,
the profit at the
-th stage
Here
,
is the number of unsold units.
Then the expected utility at the stage
The maximal expected utility at the last stage
where
is a number of unsold RAM units left at the beginning of
-th stage,
is the wholesale price at the stage
.
Denote by
the optimal decision function maximizing the expected utility (18.63)
at any fixed
.
Denote by
the maximal expected utility at the
-th stage
That is achieved by the optimal decision function
where
.
Extending to
one obtains
Solving recurrent equations (18.63)-(18.65) one
defines the sequence of optimal decision functions
and the optimal expected utilities
.
This is done for
all possible numbers of RAM units
available at the beginning of stages
.
We call by Market Elasticity the relation
of expected demand
to
the retail price
, where
is a set of feasible retail prices. A simple approximation is the exponential function
where
is the saturation demand,
is the elasticity parameter,
is the retail price, and
is a set of feasible retail prices, all at the stage
.
In this case
the profit function at the
-th stage
Here
denotes the difference
of retail and wholesale prices at the stage
.
The expected utility at the stage
where
.
The maximal expected utility at the last stage
Denote by
, where
denotes
, the two-dimensional optimal decision function maximizing the expected utility (18.69)
at any fixed
.
Denote by
the maximal expected utility at the
-th stage
That is achieved by the two-dimensional optimal decision function
where
means
.
Extending (18.70) to
one obtains
Solving recurrent equations (18.69)-(18.71) one
defines the sequence of decision functions
and the optimal expected profits
.
This is done for
all possible numbers
of RAM units
available at the beginning of
stages
.
In the case of market elasticity, to represent maximal profits
and two-dimensional optimal decisions
as functions of remaining RAM units
one needs to calculate a sequence of
arrays defining the sequence of two-dimensional decision functions.
This is difficult, therefore, one starts from the simplest case of uncertain demand (see section 5.1),
as usual.
Note that in the actual calculations of expressions (18.63)-(18.71) the integrals are replaced by sums and the densities by probabilities
using expressions similar to those in the section 3.2.
In the demonstration version the "horizon"
means the estimated end of business.
In the expert version one uses the "moving horizon"
, repeating the optimization after each stage. This way one simplifies calculations and
updates data.
The recurrent equations (18.55)-(18.57)
include Bride problems as special cases with only two feasible decisions
and different utility functions
(18.51).
Several interactive versions of the Bride problem are implemented
as Java applets. These, and the corresponding
software are
on web-sites (see section 4).
For this example one selects the section
"Dicrete, Linear and Dynamic Programming" in some web-site and opens "Bride-1".
Figure
18.1 shows the input window of the Bride problem
[Kajokas, 1997]
Figure 18.2 shows simulated results
of the "optimal marriage".
Figure 18.3 shows the
optimal decision function defining how the critical "goodness" of
a groom depends on his number.
Figure 18.1:
The input window of the Bride problem.
 |
Figure 18.2:
The result of statistical simulation of the
Bride problem.
 |
Figure 18.3:
The optimal decision function of the Bride
problem.
 |
The "Buy-a-Car" version applies modified "Single-Marriage" software.
Figure 18.4 shows the help window of the
"Buy-a-Car" version of the Bride problem [Mikenas, 1998].
Figure
18.5 shows the input window of the "Buy-a-Car"
problem.
Figure 18.6 shows the output window of the
"Buy-a-Car" problem.
Figure 18.7 shows the optimal
decision function of the "Buy-a-Car" problem.
Figure 18.4:
The help window solving the "Buy-a-Car"
problem.
 |
Figure 18.5:
The input window solving the "Buy-a-Car"
problem.
 |
Figure 18.6:
The output window of the "Buy-a-Car" problem.
 |
Figure 18.7:
The optimal decision function of the
"Buy-a-Car" problem.
 |
The "Buy-a-Car" algorithm is just a first approximation.
It should be improved. For example, improving the By-a-PC model one should
- remove the "single marriage" condition (18.8),
- add "Change-a-Car" cost,
- express desirable car properties (see Figure 18.5),
using conditions of Pareto optimality [Mockus, 1989b],
- extend the property list by including properties that are not directly observable, such as the reliability of a car,
- keep the "No-Return" rule.
"No-Return" means that one cannot return to a situation which he or she rejected previously. The "No-Return" rule is essential in the dynamic programming. One considers "Return"
cases in the framework of discrete or non-linear programming. That requires more computing, as usual.
First one selects the problem "Bride-Multi" in the section "Discrete, Dynamic and Linear Programming ".
The program [Damasevicius, 2000] starts by clicking the button
at the end of the
page.
Data is entered using the operation page (see Figure 18.8).
Optimization starts by clicking the "Calculate" button.
Figure 18.8:
Input page.
 |
The decision function ( see Figure 18.9) is defined by Bellman equations (18.33)(18.50).
Figure 18.9:
Example of a decision function.
 |
The "State-of Nature" generated by Monte Carlo simulation of probability distributions (18.1) (18.2) is also shown in Figure 18.9.
The jumps of the decision function correspond to two marriages.
In addition, Figure 18.10 shows the best "State-of-Nature" .
Figure 18.10:
Monte Carlo simulation of the"State of Nature".
 |
jonas mockus
2004-03-20