Maximum likelihood approach for model-free inverse reinforcement learning
Preparing an intelligent system in advance to respond optimally in every possible situation is difficult. Machine learning approaches such as inverse reinforcement learning (IRL) can help learn behavior from a limited number of demonstrations. We present a model-free technique that applies maximum likelihood estimation to the IRL problem. To make the approach model-free, we model the environment with the canonical Markov decision process tuple but exclude the transition function. We define the reward function as a linear function of a known set of features, and estimate action values with a modified Q-learning technique called Q-Averaging. The direction of optimization is guided by the gradient of the likelihood function with respect to the current feature weights until the unknown reward function is identified. Experimental results on a grid-world problem support our model-free IRL formulation. We also extend our experiments to a real-world freeway-merging problem for autonomous cars, where the results are significant.
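The overall loop described above can be sketched in code. This is a minimal illustrative implementation, not the paper's method: it assumes a toy chain environment, one-hot state-action features, a Boltzmann demonstration policy, a simple running-average Q update standing in for the paper's Q-Averaging, and the myopic approximation dQ/dw ≈ φ(s, a) when computing the likelihood gradient. All names (`phi`, `q_average`, `mlirl`) are hypothetical.

```python
import numpy as np

# Toy 1-D chain: 5 states, actions 0=left, 1=right; the expert heads right.
N_STATES, N_ACTIONS, GAMMA, BETA = 5, 2, 0.9, 5.0

def phi(s, a):
    """One-hot state-action feature vector (assumed feature set)."""
    f = np.zeros(N_STATES * N_ACTIONS)
    f[s * N_ACTIONS + a] = 1.0
    return f

def step(s, a):
    """Sampled transition; the learner only queries this, never a model."""
    return min(s + 1, N_STATES - 1) if a == 1 else max(s - 1, 0)

def q_average(w, n_sweeps=100):
    """Model-free Q estimation: running average of sampled Bellman backups
    (a simplified stand-in for the paper's Q-Averaging)."""
    q = np.zeros((N_STATES, N_ACTIONS))
    counts = np.zeros_like(q)
    for _ in range(n_sweeps):
        for s in range(N_STATES):
            for a in range(N_ACTIONS):
                s2 = step(s, a)  # single sampled transition
                target = w @ phi(s, a) + GAMMA * q[s2].max()
                counts[s, a] += 1
                q[s, a] += (target - q[s, a]) / counts[s, a]  # incremental mean
    return q

def mlirl(demos, iters=50, lr=0.1):
    """Gradient ascent on demonstration log-likelihood under a Boltzmann
    policy over Q; dQ/dw is approximated by phi(s, a)."""
    w = np.zeros(N_STATES * N_ACTIONS)
    for _ in range(iters):
        q = q_average(w)
        grad = np.zeros_like(w)
        for s, a in demos:
            pi = np.exp(BETA * q[s])
            pi /= pi.sum()  # Boltzmann action probabilities at state s
            expected = sum(pi[b] * phi(s, b) for b in range(N_ACTIONS))
            grad += BETA * (phi(s, a) - expected)  # d log pi(a|s) / dw
        w += lr * grad / len(demos)
    return w

# Expert demonstrations: always move right toward the goal state.
demos = [(s, 1) for s in range(N_STATES - 1)] * 5
w = mlirl(demos)
q = q_average(w)
greedy = [int(np.argmax(q[s])) for s in range(N_STATES)]
print(greedy)  # the demonstrated states should prefer action 1 (right)
```

Under these assumptions, the recovered reward weights make the greedy policy match the expert's right-moving behavior on the demonstrated states; the transition function is only ever sampled, never enumerated.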