
Memory for Reward in Probabilistic Choice: Markovian and Non-Markovian Properties

In: Behaviour
Authors:
Derick G.S. Davis (Department of Psychology, Duke University, Durham, N. Carolina, U.S.A. 27706)

and
J.E.R. Staddon (Department of Psychology, Duke University, Durham, N. Carolina, U.S.A. 27706)



Abstract

In two experiments, pigeons were rewarded with food for pecking keys in several variants of the two-armed bandit situation over an extended series of daily sessions. The average daily preference (S = R/[R+L]) is very well fit by a Markovian linear model in which predicted preference today is a weighted average of predicted preference yesterday and reinforcement conditions today: s(N+1) = a·s(N) + (1-a)·A(N+1), where A(N+1) is set equal to 1 when all rewards are for the Right response and 0 when all are for the Left, and a is a long-term memory parameter. This linear model explains some apparent paradoxes in earlier reports of memory effects in two-armed bandit experiments. Nevertheless, closer examination of the details of preference changes within each experimental session showed several kinds of non-Markovian effects. The most important was a regression at the beginning of each experimental session towards a preference characteristic of earlier sessions (spontaneous recovery). This effect, but not a smaller, less reliable non-Markovian reminiscence effect, is consistent with a very simple rule, namely that the effect on preference of each individual reward for a Right or Left response is inversely related to how long ago the reward occurred. Thus, animals learn to prefer the rewarded side each day because these rewards are recent; but they regress to earlier preferences overnight because the most recent rewards become relatively less recent with lapse of time.
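The two rules described in the abstract can be illustrated with a short Python sketch. It is not the authors' code. The function names, the starting preference, the value a = 0.8, and the reward times are all hypothetical. The abstract says only that a reward's effect is "inversely related" to its age; the sketch assumes the simplest such form, a weight of 1/age, which the source does not confirm.

```python
def update_preference(s_prev, a, condition):
    """One step of the linear model s(N+1) = a*s(N) + (1-a)*A(N+1).

    s_prev:    predicted preference for Right on the previous day, s(N)
    a:         long-term memory parameter (0 <= a <= 1)
    condition: A(N+1), 1 if all rewards are for Right, 0 if all are for Left
    """
    return a * s_prev + (1.0 - a) * condition


def simulate_sessions(s0, a, conditions):
    """Apply the linear model across a sequence of daily conditions."""
    s = s0
    history = []
    for condition in conditions:
        s = update_preference(s, a, condition)
        history.append(s)
    return history


def recency_weighted_preference(rewards, now):
    """Preference for Right when each reward is weighted by 1/age.

    ASSUMPTION: the 1/age weight is one illustrative reading of
    "inversely related to how long ago the reward occurred".

    rewards: list of (time, side) pairs, side is "R" or "L"
    now:     current time, later than every reward time
    """
    right = sum(1.0 / (now - t) for t, side in rewards if side == "R")
    left = sum(1.0 / (now - t) for t, side in rewards if side == "L")
    return right / (right + left)


if __name__ == "__main__":
    # Hypothetical values: start indifferent, a = 0.8, three Right-only days.
    print(simulate_sessions(0.5, 0.8, [1, 1, 1]))

    # Hypothetical history: older Left rewards, then recent Right rewards.
    rewards = [(0, "L"), (1, "L"), (2, "L"), (10, "R"), (11, "R"), (12, "R")]
    print(recency_weighted_preference(rewards, now=13))   # end of session
    print(recency_weighted_preference(rewards, now=113))  # after a long gap
```

Under these assumed values, preference for Right is high at the end of the session and moves back toward the earlier preference after the gap, because the recent rewards' ages become proportionally closer to those of the old rewards. This mirrors the overnight regression the abstract describes, though the exact numbers depend entirely on the assumed weighting.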
