|Résumé||(Exposé en Français)
We consider Markov decision processes with finitely many states and actions.
Roughly speaking, a Markov decision process is a Markov chain whose
execution is influenced by a player: at each step, depending on the current state
of the Markov chain, the player is allowed
to choose which probability distribution
will be used to make the transition to the next state.
His choices are restricted to a finite set of distributions,
also called *actions*.
The goal of the player is to maximize his expected payoff, which is computed
by a payoff function. Such a function associates to each infinite sequence of states and actions a real number.
For arbitrary payoff function, the player may have to play very complex
strategies to maximize his expected payoff.
We are interested in *positional* payoff functions that make the job
very simple for the player: whatever is the Markov decision process,
the player is able to play *optimally* with *no memory at all*.
More precisely, for each state, the player can choose again and again
the same action each time the process reaches this state.
We will present a theorem about positional payoff functions
which provides a unified proof of several results previously known,
as well as new examples of Markov decision processes
in which simple optimal strategies are guaranteed to exist.