Working Paper

Counter Intuitive Learning: An Exploratory Study

Nobuyuki Hanaki, Alan P. Kirman, Paul Pezanis-Christou
CESifo, Munich, 2016

CESifo Working Paper No. 6029

The literature on learning in unknown environments emphasises reinforcing actions that produce positive results. In some cases, however, success requires shifting from a currently successful action to others. We examine, experimentally and theoretically in a very simple framework, how individuals initially learn by exploiting information from the pay-offs of actions taken and also by exploring new actions. We analyse whether and how they learn that pay-offs are inter-temporally dependent. We then ran the same experiments with individuals able to observe the actions taken by others, the pay-offs obtained by others, or both. Such observations improved pay-offs if one of the pair had learned to obtain the maximum pay-off.

CESifo Category
Behavioural Economics
Empirical and Theoretical Methods

Keywords: multi-armed bandit, reinforcement learning, eureka moment, pay-off patterns, observational learning

JEL Classification: D810, D830