Nonlinear reinforcement learning of dynamic Nash equilibrium

The Digital Institutional Repository of IIM Bangalore

Please use this identifier to cite or link to this item: https://repository.iimb.ac.in/handle/2074/9777

Title:	Nonlinear reinforcement learning of dynamic Nash equilibrium
Authors:	Mohapatra, Prakash
Keywords:	Marketing management
Issue Date:	2013
Publisher:	Indian Institute of Management Bangalore
Series/Report no.:	EPGP_P13_09
Abstract:	In this paper, we make a three-fold contribution to reinforcement learning of equilibrium in the framework of nonzero-sum stochastic dynamic games. First, we extend the techniques of Q(λ)-learning to the multi-player setup. Second, we extend the idea of a polynomial learning rate to this domain for faster convergence. Most importantly, we propose a novel nonlinear learning algorithm which eliminates the learning starvation typical of linear learning algorithms such as Q(λ)-learning. Prior work in the reinforcement learning domain is mainly restricted to linear techniques, which lead to learning starvation. Our learning objective is the infinite-horizon discounted pay-off criterion, which is used to estimate the long-term market equilibria. We have applied this model to a real-life business case to analyze the competition between ARM and Intel in the smartphone microprocessor market. The model is restricted to a duopoly; however, the work can be easily extended to the more general case. We have estimated the market equilibrium payoffs for this set-up and proposed some business insights based on our findings.
URI:	http://repository.iimb.ac.in/handle/2074/9777
Appears in Collections:	2010-2015

File	Size	Format
EPGP_P13_1214046.pdf	433.47 kB	Adobe PDF
