The MDPtoolbox provides functions for solving discrete-time Markov Decision
Processes: backwards induction, value iteration, policy iteration, and
linear programming algorithms, with some variants.
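To give a flavour of what these solvers do, here is a minimal value-iteration sketch in plain Python for a hypothetical two-state, two-action MDP. This is only an illustration of the underlying algorithm, not MDPtoolbox code; the toolbox's own implementations (e.g. mdp_value_iteration) follow the same Bellman-backup idea, and the example MDP below is made up.

```python
def value_iteration(P, R, discount=0.9, epsilon=1e-6):
    """Solve a small MDP by value iteration.

    P[a][s][s2] is the probability of moving from state s to s2 under
    action a; R[a][s] is the expected immediate reward for taking
    action a in state s.
    """
    n_states = len(P[0])
    n_actions = len(P)
    V = [0.0] * n_states
    while True:
        # Bellman backup: Q(s,a) = R(s,a) + gamma * sum_s' P(s'|s,a) V(s')
        Q = [[R[a][s] + discount * sum(P[a][s][s2] * V[s2]
                                       for s2 in range(n_states))
              for a in range(n_actions)]
             for s in range(n_states)]
        V_new = [max(Q[s]) for s in range(n_states)]
        # Stop when the value function has (approximately) converged.
        if max(abs(V_new[s] - V[s]) for s in range(n_states)) < epsilon:
            policy = [max(range(n_actions), key=lambda a: Q[s][a])
                      for s in range(n_states)]
            return V_new, policy
        V = V_new

# Hypothetical 2-state MDP used purely for demonstration.
P = [[[0.9, 0.1], [0.2, 0.8]],   # action 0
     [[0.1, 0.9], [0.7, 0.3]]]   # action 1
R = [[1.0, 0.0],                 # rewards for action 0 in states 0, 1
     [0.0, 2.0]]                 # rewards for action 1 in states 0, 1
V, policy = value_iteration(P, R)
```

With a discount factor below 1, the backup is a contraction, so the loop is guaranteed to converge; the toolbox's solvers expose the same discount and stopping-tolerance parameters.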
The toolbox is under BSD license.
It is currently available in several environments: MATLAB, GNU Octave, Scilab, and R.
A Python port was also developed by S. Cordwell.
If you find the toolbox useful, please cite:
Chades I., Chapron G., Cros MJ., Garcia F., Sabbadin R. (2014). MDPtoolbox: a multi-platform toolbox to solve stochastic dynamic programming problems. Ecography 37:916-920.
The functions were first developed in
MATLAB (note that one of the functions requires the
MathWorks Optimization Toolbox and is not available in Scilab)
by the decision team of the
Applied Mathematics and Computer Science Unit.
Nowadays, the toolbox is a collaboration among people from various organizations.
Version 3.0 (September 2009) added several functions related to reinforcement learning and improved the handling of sparse matrices.
Version 4.0 (October 2012) is available on the Comprehensive R Archive Network (CRAN) and is compatible with GNU Octave (version 3.6); the output of several functions (mdp_relative_value_iteration, mdp_value_iteration, and mdp_eval_policy_iterative) was modified.
Version 4.0.1 (January 2014) mainly improves the documentation and provides a QuickStart guide for MATLAB. For more details, see the README file.
About Markov Decision Processes
Markov Decision Processes. Martin L. Puterman. John Wiley & Sons, New York, 1994.
Processus décisionnels de Markov en intelligence artificielle (Volumes 1 and 2), edited by O. Sigaud and O. Buffet, Lavoisier, 2008. (In French)
Reinforcement Learning. O. Sigaud and F. Garcia. In O. Sigaud and O. Buffet (eds), Markov Decision Processes in Artificial Intelligence, ISTE/Wiley, ch. 2:49-73, 2010.