Learning, Optimal Stopping, and the Odds Algorithm
Optimal stopping usually has a well-defined meaning: one thinks of a decision model and an objective which can be cast into a suitable objective function to be optimized, where the "variable" is a stopping time. In some cases, however, as for instance in sequential decision problems in business and finance, we would like to be more flexible, since a good model may not be obvious at the outset. But if we postpone the choice of model until after an initial observation period, the optimization part often becomes shaky. We will argue that the objective of stopping on a "last interesting event" is one of the positive exceptions and worth our attention. The so-called odds algorithm stands out for this type of problem in the case of independent events with known probabilities. How far can we go with it if we weaken the requirement of independence, or if the probabilities of the different events are a priori unknown? How good are plug-in odds algorithms and other "learning" forms of the odds algorithm? We will give several examples showing that, for decision processes with independent increments, many problems can be solved. This motivates us to look at a new and surprising result of F. Delbaen which decomposes martingales into (infinite) sums of martingales with independent increments.
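For concreteness, the odds algorithm for the last-success problem with independent events and known probabilities can be sketched as follows. This is a minimal illustration, not taken from the abstract; the function name, variable names, and the numerical example are our own.

```python
def odds_algorithm(probs):
    """Given success probabilities p_1..p_n of independent events,
    return the 1-based index s from which one should stop on the
    first success, together with the resulting win probability."""
    # Odds r_j = p_j / (1 - p_j) for each event (assumes p_j < 1).
    odds = [p / (1.0 - p) for p in probs]
    # Sum the odds backwards until the running sum first reaches 1;
    # if it never does, the optimal threshold is s = 1.
    total, s = 0.0, 1
    for j in range(len(probs) - 1, -1, -1):
        total += odds[j]
        if total >= 1.0:
            s = j + 1  # 1-based threshold index
            break
    # Win probability: (prod_{j>=s} (1 - p_j)) * (sum_{j>=s} r_j)
    q_prod = 1.0
    for p in probs[s - 1:]:
        q_prod *= 1.0 - p
    win = q_prod * sum(odds[s - 1:])
    return s, win

# Illustrative example with four independent events:
s, win = odds_algorithm([0.5, 0.4, 0.3, 0.2])
# -> stop on the first success from event s = 2 onward; win = 0.452
```

The plug-in and learning variants discussed in the abstract would replace the known probabilities `probs` by estimates updated from observations.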