In behavioral psychology, reinforcement is a consequence applied that will strengthen an organism's future behavior whenever that behavior is preceded by a specific antecedent stimulus. This strengthening effect may be measured as a higher frequency of behavior (e.g., pulling a lever more frequently), longer duration (e.g., pulling a lever for longer periods of time), greater magnitude (e.g., pulling a lever with greater force), or shorter latency (e.g., pulling a lever more quickly following the antecedent stimulus).
The model of self-regulation has three main aspects of human behavior, which are self-awareness, self-reflection, and self-regulation. Reinforcements traditionally align with self-regulation. The behavior can be influenced by the consequence but behavior also needs antecedents. There are four types of reinforcement: positive reinforcement, negative reinforcement, extinction, and punishment. Positive reinforcement is the application of a positive reinforcer. Negative reinforcement is the practice of removing something negative from the space of the subject as a way to encourage the antecedent behavior from that subject.
Extinction involves a behavior that receives no contingent consequence. If something (good or bad) is not reinforced, it should in theory disappear. Lastly, punishment is the imposition of an aversive consequence upon undesired behavior. A common example is punishment by removal, that is, removing a benefit following poor performance. While reinforcement does not require an individual to consciously perceive an effect elicited by the stimulus, it still requires conscious effort to work towards a desired goal.
Rewarding stimuli, which are associated with "wanting" and "liking" (desire and pleasure, respectively) and appetitive behavior, function as positive reinforcers; the converse statement is also true: positive reinforcers provide a desirable stimulus.
Reinforcement does not require an individual to consciously perceive an effect elicited by the stimulus.
Thus, reinforcement occurs only if there is an observable strengthening in behavior. However, there is also negative reinforcement, which is characterized by taking away an undesirable stimulus. For someone who has back problems, a change of job (e.g., from a laborer's job to an office position) might serve as a negative reinforcer.
In most cases, the term "reinforcement" refers to an enhancement of behavior, but this term is also sometimes used to denote an enhancement of memory; for example, "post-training reinforcement" refers to the provision of a stimulus (such as food) after a learning session in an attempt to increase the retained breadth, detail, and duration of the individual memories or overall memory just formed.
The memory-enhancing stimulus can also be one whose effects are directly rather than only indirectly emotional, as with the phenomenon of "flashbulb memory," in which an emotionally highly intense stimulus can incentivize memory of a set of a situation's circumstances well beyond the subset of those circumstances that caused the emotionally significant stimulus, as when people of appropriate age are able to remember where they were and what they were doing when they learned of the assassination of John F. Kennedy or the September 11 terrorist attacks.
Reinforcement is an important part of operant or instrumental conditioning.
Terminology
In the behavioral sciences, the terms "positive" and "negative" refer when used in their strict technical sense to the nature of the action performed by the conditioner rather than to the responding operant's evaluation of that action and its consequence(s). "Positive" actions are those that add a factor, be it pleasant or unpleasant, to the environment, whereas "negative" actions are those that remove or withhold from the environment a factor of either type. In turn, the strict sense of "reinforcement" refers only to reward-based conditioning; the introduction of unpleasant factors and the removal or withholding of pleasant factors are instead referred to as "punishment", which when used in its strict sense thus stands in contradistinction to "reinforcement". Thus, "positive reinforcement" refers to the addition of a pleasant factor, "positive punishment" refers to the addition of an unpleasant factor, "negative reinforcement" refers to the removal or withholding of an unpleasant factor, and "negative punishment" refers to the removal or withholding of a pleasant factor.
This usage is at odds with some non-technical usages of the four term combinations, especially in the case of the term "negative reinforcement", which is often used to denote what technical parlance would describe as "positive punishment" in that the non-technical usage interprets "reinforcement" as subsuming both reward and punishment and "negative" as referring to the responding operant's evaluation of the factor being introduced. By contrast, technical parlance would use the term "negative reinforcement" to describe encouragement of a given behavior by creating a scenario in which an unpleasant factor is or will be present but engaging in the behavior results in either escaping from that factor or preventing its occurrence, as in Martin Seligman's experiments involving dogs' learning processes regarding the avoidance of electric shock.
Introduction
B.F. Skinner was a well-known and influential researcher who articulated many of the theoretical constructs of reinforcement and behaviorism. Skinner defined reinforcers according to the change in response strength (response rate) rather than to more subjective criteria, such as what is pleasurable or valuable to someone. Accordingly, activities, foods or items considered pleasant or enjoyable may not necessarily be reinforcing (because they produce no increase in the response preceding them). Stimuli, settings, and activities only fit the definition of reinforcers if the behavior that immediately precedes the potential reinforcer increases in similar situations in the future; for example, a child who receives a cookie when he or she asks for one. If the frequency of "cookie-requesting behavior" increases, the cookie can be seen as reinforcing "cookie-requesting behavior". If, however, "cookie-requesting behavior" does not increase, the cookie cannot be considered reinforcing.
The sole criterion that determines if a stimulus is reinforcing is the change in probability of a behavior after administration of that potential reinforcer. Other theories may focus on additional factors such as whether the person expected a behavior to produce a given outcome, but in the behavioral theory, reinforcement is defined by an increased probability of a response.
The study of reinforcement has produced an enormous body of reproducible experimental results. Reinforcement is the central concept and procedure in special education, applied behavior analysis, and the experimental analysis of behavior and is a core concept in some medical and psychopharmacology models, particularly addiction, dependence, and compulsion.
Brief history
Laboratory research on reinforcement is usually dated from the work of Edward Thorndike, known for his experiments with cats escaping from puzzle boxes. A number of others continued this research, notably B.F. Skinner, who published his seminal work on the topic in ''The Behavior of Organisms'' in 1938, and elaborated this research in many subsequent publications. Notably, Skinner argued that positive reinforcement is superior to punishment in shaping behavior.
Though punishment may seem just the opposite of reinforcement, Skinner claimed that they differ immensely, saying that positive reinforcement results in lasting behavioral modification (long-term) whereas punishment changes behavior only temporarily (short-term) and has many detrimental side-effects.
A great many researchers subsequently expanded our understanding of reinforcement and challenged some of Skinner's conclusions. For example, Azrin and Holz defined punishment as a “consequence of behavior that reduces the future probability of that behavior,” and some studies have shown that positive reinforcement and punishment are equally effective in modifying behavior. Research on the effects of positive reinforcement, negative reinforcement and punishment continues today, as those concepts are fundamental to learning theory and apply to many practical applications of that theory.
Operant conditioning
The term ''operant conditioning'' was introduced by B. F. Skinner to indicate that in his experimental paradigm, the organism is free to operate on the environment. In this paradigm, the experimenter cannot trigger the desirable response; the experimenter waits for the response to occur (to be emitted by the organism) and then a potential reinforcer is delivered. In the classical conditioning paradigm, the experimenter triggers (elicits) the desirable response by presenting a reflex eliciting stimulus, the ''Unconditional Stimulus'' (UCS), which he pairs (precedes) with a neutral stimulus, the ''Conditional Stimulus'' (CS).
''Reinforcement'' is a basic term in operant conditioning. For the punishment aspect of operant conditioning, see punishment (psychology).
Positive reinforcement
Positive reinforcement occurs when a desirable event or stimulus is presented as a consequence of a behavior and the chance that this behavior will manifest in similar environments increases.
* Example: Whenever a rat presses a button, it gets a treat. If the rat starts pressing the button more often, the treat serves to positively reinforce this behavior.
* Example: A father gives candy to his daughter when she tidies up her toys. If the frequency of picking up the toys increases, the candy is a positive reinforcer (to reinforce the behavior of cleaning up).
* Example: A company enacts a rewards program in which employees earn prizes dependent on the number of items sold. The prizes the employees receive are the positive reinforcement if they increase sales.
* Example: A teacher praises his student when he receives a good grade. The praise the student receives is the positive reinforcement if the student's grades improve.
* Example: A supervisor offers a monetary reward to the employee who exceeds expectations the most. The monetary reward is the positive reinforcement of the good behavior: exceeding expectations.
The High Probability Instruction (HPI) treatment is a behaviorist psychological treatment based on the idea of positive reinforcement.
Negative reinforcement
Negative reinforcement occurs when the rate of a behavior increases because an aversive event or stimulus is removed or prevented from happening.
* Example: A child cleans their room, and this behavior is followed by the parent stopping "nagging" or asking the child repeatedly to do so. Here, the nagging serves to negatively reinforce the behavior of cleaning because the child wants to remove that aversive stimulus of nagging.
* Example: A company has a policy that if an employee completes their assigned work by Friday, they can have Saturday off. Working Saturday is the aversive stimulus; the employees have incentive to increase productivity to avoid the aversive stimulus.
* Example: An individual leaves early for work to beat traffic and avoid arriving late. The behavior is leaving early for work, and the aversive stimulus the individual wishes to remove is being late to work.
Extinction
Extinction can be intentional or unintentional and happens when an undesired behavior is ignored.
* Example (Intended): A young child ignores bullies making fun of them. The bullies do not get a reaction from the child and lose interest in bullying them.
* Example (Unintended): A worker has not received any recognition for their above and beyond hard work. They then stop working as hard.
* Example (Intended): A cat kept meowing for food in the night. The owners would not feed the cat so the cat stopped meowing through the night.
Reinforcement versus punishment
Reinforcers serve to increase behaviors whereas punishers serve to decrease behaviors; thus, positive reinforcers are stimuli that the subject will work to attain, and negative reinforcers are stimuli that the subject will work to be rid of or to end. The table below illustrates the adding and subtracting of stimuli (pleasant or aversive) in relation to reinforcement vs. punishment.
For example, offering a child candy if he cleans his room is positive reinforcement. Spanking a child if he breaks a window is positive punishment. Taking away a child's toys for misbehaving is negative punishment. Giving a child a break from his chores if he performs well on a test is negative reinforcement. "Positive and negative" do not carry the meaning of "good and bad" in this usage.
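The four combinations described above can be sketched as a small lookup. This is only an illustrative sketch: the function name `classify` and the labels `"added"`, `"removed"`, `"increases"`, and `"decreases"` are hypothetical names chosen for this example, not established terminology. The first argument records whether a stimulus is added or removed, and the second records whether the behavior becomes more or less frequent afterwards.

```python
def classify(stimulus_change: str, behavior_change: str) -> str:
    """Name the procedure from what happens to the stimulus and to the behavior.

    stimulus_change: "added" or "removed" (hypothetical labels for this sketch)
    behavior_change: "increases" or "decreases"
    """
    # "Positive"/"negative" describe adding or removing a stimulus, not good or bad.
    sign = {"added": "positive", "removed": "negative"}[stimulus_change]
    # "Reinforcement"/"punishment" describe the effect on future behavior.
    kind = {"increases": "reinforcement", "decreases": "punishment"}[behavior_change]
    return f"{sign} {kind}"


# The four examples from the paragraph above.
examples = [
    ("candy offered for cleaning the room", "added", "increases"),
    ("spanking for breaking a window", "added", "decreases"),
    ("toys taken away for misbehaving", "removed", "decreases"),
    ("break from chores for a good test", "removed", "increases"),
]

for description, stimulus_change, behavior_change in examples:
    print(f"{description}: {classify(stimulus_change, behavior_change)}")
```

Note that the classification depends on the observed change in behavior, not on whether the stimulus seems pleasant or unpleasant to an observer.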
Further ideas and concepts
* Distinguishing between positive and negative can be difficult and may not always be necessary; focusing on ''what'' is being removed or added and ''how'' it is being removed or added will determine the nature of the reinforcement.
* Negative reinforcement is not punishment. The two, as explained above, differ in the increase (negative reinforcement) or decrease (punishment) of the future probability of a response. In negative reinforcement, the stimulus removed following a response is an aversive stimulus; if this stimulus were presented contingent on a response, it may also function as a positive punisher.
* The form of a stimulus is separate from its function in terms of whether it will reinforce or punish behavior. An event that may punish behavior for some may serve to reinforce behavior for others. Example: A child is repeatedly given detention for acting up in school, but the frequency of the bad behavior increases. Thus, the detention may be a reinforcer (could be positive or negative); perhaps the child now gets one-on-one attention from a teacher or perhaps they now avoid going home where they are often abused.
* Some reinforcement can be simultaneously positive and negative, such as a drug addict taking drugs for the added euphoria (a positive feeling) and eliminating withdrawal symptoms (which would be a negative feeling). Or, in a warm room, a current of external air serves as positive reinforcement because it is pleasantly cool and as negative reinforcement because it removes uncomfortable hot air.
* Reinforcement in the business world is essential in driving productivity. Employees are constantly motivated by the ability to receive a positive stimulus, such as a promotion or a bonus. Employees are also driven by negative reinforcement. This can be seen when employees are offered Saturdays off if they complete the weekly workload by Friday.
* Though negative reinforcement has a positive effect in the short term for a workplace (i.e. encourages a financially beneficial action), over-reliance on a negative reinforcement hinders the ability of workers to act in a creative, engaged way creating growth in the long term.
* Both positive and negative reinforcement ''increase'' behavior. Most people, especially children, will learn to follow instruction by a mix of positive and negative reinforcement.
* Limited resources can cause a person to not be able to provide constant reinforcement.
Primary and secondary reinforcers
A primary reinforcer, sometimes called an ''unconditioned reinforcer'', is a stimulus that does not require pairing with a different stimulus in order to function as a reinforcer and most likely has obtained this function through the evolution and its role in species' survival. Examples of primary reinforcers include food, water, and sex. Some primary reinforcers, such as certain drugs, may mimic the effects of other primary reinforcers. While these primary reinforcers are fairly stable through life and across individuals, the reinforcing value of different primary reinforcers varies due to multiple factors (e.g., genetics, experience). Thus, one person may prefer one type of food while another avoids it. Or one person may eat much food while another eats very little. So even though food is a primary reinforcer for both individuals, the value of food as a reinforcer differs between them.
A secondary reinforcer, sometimes called a conditioned reinforcer, is a stimulus or situation that has acquired its function as a reinforcer after pairing with a stimulus that functions as a reinforcer. This stimulus may be a primary reinforcer or another conditioned reinforcer (such as money). An example of a secondary reinforcer would be the sound from a clicker, as used in clicker training. The sound of the clicker has been associated with praise or treats, and subsequently, the sound of the clicker may function as a reinforcer. Another common example is the sound of people clapping – there is nothing inherently positive about hearing that sound, but we have learned that it is associated with praise and rewards.
When trying to distinguish primary and secondary reinforcers in human examples, use the "caveman test." If the stimulus is something that a caveman would naturally find desirable (e.g., candy) then it is a primary reinforcer. If, on the other hand, the caveman would not react to it (e.g., a dollar bill), it is a secondary reinforcer. As with primary reinforcers, an organism can experience satiation and deprivation with secondary reinforcers.
Other reinforcement terms
* A generalized reinforcer is a conditioned reinforcer that has obtained the reinforcing function by pairing with many other reinforcers and functions as a reinforcer under a wide variety of motivating operations. (One example of this is money because it is paired with many other reinforcers.) [Miltenberger, R. G. "Behavioral Modification: Principles and Procedures". Thomson/Wadsworth, 2008.]
* In reinforcer sampling, a potentially reinforcing but unfamiliar stimulus is presented to an organism without regard to any prior behavior.
* Socially-mediated reinforcement (direct reinforcement) involves the delivery of reinforcement that requires the behavior of another organism.
* The Premack principle is a special case of reinforcement elaborated by David Premack, which states that a highly preferred activity can be used effectively as a reinforcer for a less-preferred activity.
* Reinforcement hierarchy is a list of actions, rank-ordering the most desirable to least desirable consequences that may serve as a reinforcer. A reinforcement hierarchy can be used to determine the relative frequency and desirability of different activities, and is often employed when applying the Premack principle.
* Contingent outcomes are more likely to reinforce behavior than non-contingent responses. Contingent outcomes are those directly linked to a causal behavior, such as a light turning on being contingent on flipping a switch. Note that contingent outcomes are ''not'' necessary to demonstrate reinforcement, but perceived contingency may increase learning.
* Contiguous stimuli are stimuli closely associated by time and space with specific behaviors. They reduce the amount of time needed to learn a behavior while increasing its resistance to extinction. Giving a dog a piece of food immediately after it sits is more contiguous with (and therefore more likely to reinforce) the behavior than a several-minute delay in food delivery following the behavior.
* Noncontingent reinforcement refers to response-independent delivery of stimuli identified as reinforcers for some behaviors of that organism. However, this typically entails time-based delivery of stimuli identified as maintaining aberrant behavior, which decreases the rate of the target behavior. As no measured behavior is identified as being strengthened, there is controversy surrounding the use of the term noncontingent "reinforcement".
Natural and artificial
In his 1967 paper, ''Arbitrary and Natural Reinforcement'', Charles Ferster proposed classifying reinforcement into events that increase frequency of an operant as a natural consequence of the behavior itself, and events that are presumed to affect frequency by their requirement of human mediation, such as in a token economy where subjects are "rewarded" for certain behavior with an arbitrary token of a negotiable value.
In 1970, Baer and Wolf created a name for the use of natural reinforcers called "behavior traps". A behavior trap requires only a simple response to enter the trap, yet once entered, the trap cannot be resisted in creating general behavior change. It is the use of a behavioral trap that increases a person's repertoire, by exposing them to the naturally occurring reinforcement of that behavior. Behavior traps have four characteristics:
* They are "baited" with virtually irresistible reinforcers that "lure" the student to the trap
* Only a low-effort response already in the repertoire is necessary to enter the trap
* Interrelated contingencies of reinforcement inside the trap motivate the person to acquire, extend, and maintain targeted academic/social skills
* They can remain effective for long periods of time because the person shows few, if any, satiation effects
As can be seen from the above, artificial reinforcement is in fact created to build or develop skills, and to generalize, it is important that a behavior trap is introduced to "capture" the skill and utilize naturally occurring reinforcement to maintain or increase it. This behavior trap may simply be a social situation that will generally result from a specific behavior once it has met a certain criterion (e.g., if you use edible reinforcers to train a person to say hello and smile at people when they meet them, after that skill has been built up, the natural reinforcer of other people smiling, and having more friendly interactions will naturally reinforce the skill and the edibles can be faded).
Intermittent reinforcement schedules
Much behavior is not reinforced every time it is emitted, and the pattern of intermittent reinforcement strongly affects how fast an operant response is learned, what its rate is at any given time, and how long it continues when reinforcement ceases. The simplest rules controlling reinforcement are continuous reinforcement, where every response is reinforced, and extinction, where no response is reinforced. Between these extremes, more complex "schedules of reinforcement" specify the rules that determine how and when a response will be followed by a reinforcer.
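One way to picture a schedule as a rule is as a function that takes the running count of responses and says whether that response is followed by a reinforcer. The sketch below is only an illustration under that assumption: the names `continuous`, `extinction`, `every_nth`, and `run` are hypothetical, and `every_nth` stands in for one possible intermittent rule between the two extremes.

```python
from typing import Callable, List

# A schedule is modeled here as a rule: given the 1-based count of the
# current response, decide whether a reinforcer follows it.
Schedule = Callable[[int], bool]


def continuous(response_number: int) -> bool:
    """Continuous reinforcement: every response is reinforced."""
    return True


def extinction(response_number: int) -> bool:
    """Extinction: no response is reinforced."""
    return False


def every_nth(n: int) -> Schedule:
    """A hypothetical intermittent rule: reinforce every n-th response."""
    def rule(response_number: int) -> bool:
        return response_number % n == 0
    return rule


def run(schedule: Schedule, responses: int) -> List[bool]:
    """Apply a schedule to a sequence of responses numbered 1..responses."""
    return [schedule(i) for i in range(1, responses + 1)]


print(run(continuous, 6))    # all six responses reinforced
print(run(extinction, 6))    # none reinforced
print(run(every_nth(3), 6))  # only the 3rd and 6th responses reinforced
```

Real schedules may also depend on elapsed time or on randomness, which this count-based sketch deliberately leaves out.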
Specific schedules of reinforcement reliably induce specific patterns of response, irrespective of the species being investigated (including humans in some conditions). However, the quantitative properties of behavior under a given schedule depend on the parameters of the schedule, and sometimes on other, non-schedule factors. The orderliness and predictability of behavior under schedules of reinforcement was evidence for B.F. Skinner's claim that by using operant conditioning he could obtain "control over behavior", in a way that rendered the theoretical disputes of contemporary comparative psychology obsolete. The reliability of schedule control supported the idea that a radical behaviorist experimental analysis of behavior could be the foundation for a psychology that did not refer to mental or cognitive processes. The reliability of schedules also led to the development of applied behavior analysis as a means of controlling or altering behavior.
Many of the simpler possibilities, and some of the more complex ones, were investigated at great length by Skinner using pigeons, but new schedules continue to be defined and investigated.
Simple schedules
Simple schedules have a single rule to determine when a single type of reinforcer is delivered for a specific response.
* Ratio schedule – the reinforcement depends only on the number of responses the organism has performed.
* Continuous reinforcement (CRF) – a schedule of reinforcement in which every occurrence of the instrumental response (desired response) is followed by the reinforcer.
** Lab example: each time a rat presses a bar it gets a pellet of food.
** Real-world example: each time a dog defecates outside, its owner gives it a treat; each time a person puts $1 in a candy machine and presses the buttons, they receive a candy bar.
* Fixed ratio (FR) – schedules deliver reinforcement after every ''n''th response. An FR 1 schedule is synonymous with a CRF schedule.
** Example: FR 2 = every second desired response the subject makes is reinforced.
** Lab example: FR 5 = rat's bar-pressing behavior is reinforced with food after every 5 bar-presses in a Skinner box.
** Real-world example: FR 10 = Used car dealer gets a $1000 bonus for each 10 cars sold on the lot.
* Variable ratio schedule (VR) – reinforced on average every ''n''th response, but not always on the ''n''th response.
** Lab example: VR 4 = first pellet delivered on 2 bar presses, second pellet delivered on 6 bar presses, third pellet 4 bar presses (2 + 6 + 4 = 12; 12 / 3 = 4 bar presses to receive pellet).
** Real-world example: slot machines (because, though the probability of hitting the jackpot is constant, the number of lever presses needed to hit the jackpot is variable).
* Fixed interval (FI) – reinforced after ''n'' amount of time.
** Example: FI 1-s = reinforcement provided for the first response after 1 second.
** Lab example: FI 15-s = rat's bar-pressing behavior is reinforced for the first bar press after 15 seconds passes since the last reinforcement.
** Real-world example: FI 30-min = a 30-minute washing machine cycle.
* Variable interval (VI) – reinforced on an average of ''n'' amount of time, but not always exactly ''n'' amount of time.
** Example: VI 4-min = first pellet delivered after 2 minutes, second delivered after 6 minutes, third is delivered after 4 minutes (2 + 6 + 4 = 12; 12 / 3 = 4). Reinforcement is delivered on the average after 4 minutes.
** Lab example: VI 10-s = a rat's bar-pressing behavior is reinforced for the first bar press after an average of 10 seconds passes since the last reinforcement.
** Real-world example: VI 30-min = Going fishing—you might catch a fish after 10 minutes, then have to wait an hour, then have to wait 20 minutes.
* Fixed time (FT) – Provides a reinforcing stimulus at a fixed time since the last reinforcement delivery, regardless of whether the subject has responded or not. In other words, it is a non-contingent schedule.
** Lab example: FT 5-s = rat gets food every 5 seconds regardless of the behavior.
** Real-world example: FT 30-d = a person gets an annuity check every month regardless of behavior between checks.
* Variable time (VT) – Provides reinforcement at an average variable time since last reinforcement, regardless of whether the subject has responded or not.
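The ratio and interval rules listed above can be sketched in code. The following is a minimal Python illustration rather than a standard implementation: the class names `RatioSchedule` and `IntervalSchedule` and the `respond` method are hypothetical, and a variable schedule is approximated by cycling through a fixed list of requirements instead of drawing them at random. Fixed time and variable time schedules are left out because they deliver reinforcement without requiring any response.

```python
from itertools import cycle


class RatioSchedule:
    """Reinforce the response that completes the current response requirement."""

    def __init__(self, requirements):
        # Hypothetical encoding: FR 5 -> [5]; VR 4 -> [2, 6, 4] (mean of 4).
        self._requirements = cycle(requirements)
        self._needed = next(self._requirements)
        self._count = 0

    def respond(self, t):
        """Record a response at time t (seconds); return True if it is reinforced."""
        self._count += 1
        if self._count < self._needed:
            return False
        self._count = 0
        self._needed = next(self._requirements)
        return True


class IntervalSchedule:
    """Reinforce the first response made once the current interval has elapsed."""

    def __init__(self, intervals):
        # Hypothetical encoding: FI 15-s -> [15]; VI 4-min -> [120, 360, 240].
        self._intervals = cycle(intervals)
        self._wait = next(self._intervals)
        self._last_reinforcer = 0.0  # assumption: timing starts at session onset

    def respond(self, t):
        """Record a response at time t (seconds); return True if it is reinforced."""
        if t - self._last_reinforcer < self._wait:
            return False
        self._last_reinforcer = t
        self._wait = next(self._intervals)
        return True


# VR 4 from the lab example: pellets after 2, then 6, then 4 bar presses.
vr4 = RatioSchedule([2, 6, 4])
pellets = [press for press in range(1, 13) if vr4.respond(t=press)]
print(pellets)  # presses (counted from the start) that earned a pellet
```

Under these assumptions an FR schedule is simply a ratio schedule with a one-element list, and an FI schedule is an interval schedule with a one-element list, which mirrors the way the fixed schedules are special cases of the variable ones.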
Simple schedules are utilized in many differential reinforcement procedures:
* Differential reinforcement of alternative behavior (DRA) – A conditioning procedure in which an undesired response is decreased by placing it on extinction or, less commonly, providing contingent punishment, while simultaneously providing reinforcement contingent on a desirable response. An example would be a teacher attending to a student only when they raise their hand, while ignoring the student when they call out.
* Differential reinforcement of other behavior (DRO) – Also known as omission training procedures, an instrumental conditioning procedure in which a positive reinforcer is periodically delivered only if the participant does something other than the target response. An example would be reinforcing any hand action other than nose picking.
* Differential reinforcement of incompatible behavior (DRI) – Used to reduce a frequent behavior without punishing it by reinforcing an incompatible response. An example would be reinforcing clapping to reduce nose picking.
* Differential reinforcement of low response rate (DRL) – Used to encourage low rates of responding. It is like an interval schedule, except that premature responses reset the time required between behavior.
** Lab example: DRL 10-s = a rat is reinforced for the first response after 10 seconds, but if the rat responds earlier than 10 seconds there is no reinforcement and the rat has to wait 10 seconds from that premature response without another response before bar pressing will lead to reinforcement.
** Real-world example: "If you ask me for a potato chip no more than once every 10 minutes, I will give it to you. If you ask more often, I will give you none."
* Differential reinforcement of high rate (DRH) – Used to encourage high rates of responding. It is like an interval schedule, except that a minimum number of responses is required in the interval in order to receive reinforcement.
** Lab example: DRH 10-s/FR 15 = a rat must press a bar 15 times within a 10-second increment to get reinforced.
** Real-world example: "If Lance Armstrong is going to win the Tour de France he has to pedal ''x'' number of times during the ''y''-hour race."
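The DRL and DRH rules described above differ only in the timing condition they check, which a short sketch can make concrete. This Python illustration uses hypothetical class names (`DRL`, `DRH`) and makes two assumptions that the text does not settle: the DRL timer starts at session onset, and the DRH "increment" is treated as a sliding window that is cleared after each reinforcer.

```python
from collections import deque


class DRL:
    """DRL n-s: reinforce a response only if n seconds passed since the previous response."""

    def __init__(self, n):
        self.n = n
        self.last_response = 0.0  # assumption: the wait is timed from session onset

    def respond(self, t):
        reinforced = (t - self.last_response) >= self.n
        # Every response, premature or not, restarts the required wait.
        self.last_response = t
        return reinforced


class DRH:
    """DRH: reinforce once `required` responses fall inside a `window`-second span."""

    def __init__(self, window, required):
        self.window = window
        self.required = required
        self.times = deque()

    def respond(self, t):
        self.times.append(t)
        # Forget responses that are older than the window (sliding-window assumption).
        while t - self.times[0] > self.window:
            self.times.popleft()
        if len(self.times) >= self.required:
            self.times.clear()
            return True
        return False


drl = DRL(10)
print([drl.respond(t) for t in (12, 15, 26, 30, 40)])
```

With `DRL(10)`, the response at 15 s comes only 3 s after the previous one, so it is not reinforced and the 10-second wait starts again from that premature response, as in the lab example.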
Effects of different types of simple schedules
* Fixed ratio: activity slows after reinforcer is delivered, then response rates increase until the next reinforcer delivery (post-reinforcement pause).
* Variable ratio: rapid, steady rate of responding; most resistant to extinction.
* Fixed interval: responding increases towards the end of the interval; poor resistance to extinction.
* Variable interval: steady activity results, good resistance to extinction.
* Ratio schedules produce higher rates of responding than interval schedules, when the rates of reinforcement are otherwise similar.
* Variable schedules produce higher rates and greater resistance to extinction than most fixed schedules. This is also known as the Partial Reinforcement Extinction Effect (PREE).
* The variable ratio schedule produces both the highest rate of responding and the greatest resistance to extinction (for example, the behavior of gamblers at slot machines).
* Fixed schedules produce "post-reinforcement pauses" (PRP), where responses will briefly cease immediately following reinforcement, though the pause is a function of the upcoming response requirement rather than the prior reinforcement.
** The PRP of a fixed interval schedule is frequently followed by a "scallop-shaped" accelerating rate of response, while fixed ratio schedules produce a more "angular" response.
*** Fixed interval scallop: the pattern of responding that develops under a fixed interval reinforcement schedule; performance on a fixed interval reflects the subject's accuracy in telling time.
* Organisms whose schedules of reinforcement are "thinned" (that is, requiring more responses or a greater wait before reinforcement) may experience "ratio strain" if thinned too quickly. This produces behavior similar to that seen during extinction.
** Ratio strain: the disruption of responding that occurs when a fixed ratio response requirement is increased too rapidly.
** Ratio run: a high and steady rate of responding that completes each ratio requirement. A higher ratio requirement usually produces longer post-reinforcement pauses.
* Partial reinforcement schedules are more resistant to extinction than continuous reinforcement schedules.
** Ratio schedules are more resistant than interval schedules and variable schedules more resistant than fixed ones.
** Momentary changes in reinforcement value lead to dynamic changes in behavior.
Compound schedules
Compound schedules combine two or more different simple schedules in some way using the same reinforcer for the same behavior. There are many possibilities; among those most often used are:
* Alternative schedules – A type of compound schedule where two or more simple schedules are in effect and whichever schedule is completed first results in reinforcement.
* Conjunctive schedules – A complex schedule of reinforcement where two or more simple schedules are in effect independently of each other, and requirements on all of the simple schedules must be met for reinforcement.
* Multiple schedules – Two or more schedules alternate over time, with a stimulus indicating which is in force. Reinforcement is delivered if the response requirement is met while a schedule is in effect.
** Example: FR4 when given a whistle and FI6 when given a bell ring.
* Mixed schedules – Either of two, or more, schedules may occur with no stimulus indicating which is in force. Reinforcement is delivered if the response requirement is met while a schedule is in effect.
** Example: FI6 and then VR3 without any stimulus warning of the change in schedule.
* Concurrent schedules – A complex reinforcement procedure in which the participant can choose any one of two or more simple reinforcement schedules that are available simultaneously. Organisms are free to change back and forth between the response alternatives at any time.
** Real-world example: changing channels on a television.
* Concurrent-chain schedule of reinforcement – A complex reinforcement procedure in which the participant is permitted to choose during the first link which of several simple reinforcement schedules will be in effect in the second link. Once a choice has been made, the rejected alternatives become unavailable until the start of the next trial.
* Interlocking schedules – A single schedule with two components where progress in one component affects progress in the other component. In an interlocking FR 60 FI 120-s schedule, for example, each response subtracts time from the interval component such that each response is "equal" to removing two seconds from the FI schedule.
* Chained schedules – Reinforcement occurs after two or more successive schedules have been completed, with a stimulus indicating when one schedule has been completed and the next has started.
** Example: On an FR 10 schedule in the presence of a red light, a pigeon pecks a green disc 10 times; then, a yellow light indicates an FR 3 schedule is active; after the pigeon pecks a yellow disc 3 times, a green light indicates a VI 6-s schedule is in effect; if this were the final schedule in the chain, the pigeon would be reinforced for pecking a green disc on a VI 6-s schedule; however, all schedule requirements in the chain must be met before a reinforcer is provided.
* Tandem schedules – Reinforcement occurs when two or more successive schedule requirements have been completed, with no stimulus indicating when a schedule has been completed and the next has started.
** Example: VR 10, after it is completed the schedule is changed without warning to FR 10, after that it is changed without warning to FR 16, etc. At the end of the series of schedules, a reinforcer is finally given.
* Higher-order schedules – completion of one schedule is reinforced according to a second schedule; e.g. in FR2 (FI10 secs), two successive fixed interval schedules require completion before a response is reinforced.
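The interlocking FR 60 FI 120-s example in the list above involves a small piece of arithmetic that is easy to check in code. In this minimal Python sketch, the function names and parameters are hypothetical; the only rule taken from the text is that each response counts as two seconds (120 s / 60 responses) toward the interval.

```python
import math


def interlocking_reinforced(responses, elapsed, ratio=60, interval=120.0):
    """Return True if the response just made meets the interlocking requirement.

    responses: responses since the last reinforcer, including the current one
    elapsed:   seconds since the last reinforcer
    """
    seconds_per_response = interval / ratio  # 2 s per response for FR 60 FI 120-s
    return elapsed + responses * seconds_per_response >= interval


def responses_required(elapsed, ratio=60, interval=120.0):
    """Smallest number of responses that would earn the reinforcer at this moment."""
    seconds_per_response = interval / ratio
    remaining = max(interval - elapsed, 0.0)
    # At least one response is always needed, as on an ordinary FI schedule.
    return max(1, math.ceil(remaining / seconds_per_response))


for seconds in (0, 60, 120):
    print(seconds, responses_required(seconds))
```

Under this reading the requirement falls linearly with time: 60 responses if the animal responds immediately, 30 after one minute, and a single response once the full 120 seconds have passed.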
Superimposed schedules
The psychology term ''superimposed schedules of reinforcement'' refers to a structure of rewards where two or more simple schedules of reinforcement operate simultaneously. Reinforcers can be positive, negative, or both. An example is a person who comes home after a long day at work. The behavior of opening the front door is rewarded by a big kiss on the lips by the person's spouse and a rip in the pants from the family dog jumping enthusiastically. Another example of superimposed schedules of reinforcement is a pigeon in an experimental cage pecking at a button. The pecks deliver a hopper of grain every 20th peck, and access to water after every 200 pecks.
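The pigeon example can be written as two fixed ratio rules applied to the same response. This is only an illustrative Python sketch; the function `consequences` and the dictionary layout are hypothetical, and only the two ratios (20 and 200) come from the example.

```python
def consequences(peck_number, schedules):
    """Return every reinforcer earned by this peck under superimposed FR schedules."""
    return [name for name, ratio in schedules.items() if peck_number % ratio == 0]


# Grain on every 20th peck, water on every 200th peck, both tied to the same response.
schedules = {"grain": 20, "water": 200}

grain_count = sum("grain" in consequences(n, schedules) for n in range(1, 201))
water_count = sum("water" in consequences(n, schedules) for n in range(1, 201))
print(grain_count, water_count)
```

Because both rules listen to the same stream of pecks, a single peck (the 200th) can produce more than one consequence at once, which is the defining feature of a superimposed arrangement.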
Superimposed schedules of reinforcement are a type of compound schedule that evolved from the initial work on simple schedules of reinforcement by B.F. Skinner and his colleagues (Skinner and Ferster, 1957). They demonstrated that reinforcers could be delivered on schedules, and further that organisms behaved differently under different schedules. Rather than a reinforcer, such as food or water, being delivered every time as a consequence of some behavior, a reinforcer could be delivered after more than one instance of the behavior. For example, a pigeon may be required to peck a button switch ten times before food appears. This is a "ratio schedule". Also, a reinforcer could be delivered after an interval of time passed following a target behavior. An example is a rat that is given a food pellet immediately following the first response that occurs after two minutes has elapsed since the last lever press. This is called an "interval schedule".
In addition, ratio schedules can deliver reinforcement following a fixed or variable number of behaviors by the individual organism. Likewise, interval schedules can deliver reinforcement following fixed or variable intervals of time following a single response by the organism. Individual behaviors tend to generate response rates that differ based upon how the reinforcement schedule is created. Much subsequent research in many labs examined the effects on behaviors of scheduling reinforcers.
If an organism is offered the opportunity to choose between or among two or more simple schedules of reinforcement at the same time, the reinforcement structure is called a "concurrent schedule of reinforcement". Brechner (1974, 1977) introduced the concept of superimposed schedules of reinforcement in an attempt to create a laboratory analogy of social traps, such as when humans overharvest their fisheries or tear down their rainforests. Brechner created a situation where simple reinforcement schedules were superimposed upon each other. In other words, a single response or group of responses by an organism led to multiple consequences. Concurrent schedules of reinforcement can be thought of as "or" schedules, and superimposed schedules of reinforcement can be thought of as "and" schedules. Brechner and Linder (1981) and Brechner (1987) expanded the concept to describe how superimposed schedules and the social trap analogy could be used to analyze the way energy flows through systems.
Superimposed schedules of reinforcement have many real-world applications in addition to generating social traps. Many different human individual and social situations can be created by superimposing simple reinforcement schedules. For example, a human being could have simultaneous tobacco and alcohol addictions. Even more complex situations can be created or simulated by superimposing two or more concurrent schedules. For example, a high school senior could have a choice between going to Stanford University or UCLA, and at the same time have the choice of going into the Army or the Air Force, and simultaneously the choice of taking a job with an internet company or a job with a software company. That is a reinforcement structure of three superimposed concurrent schedules of reinforcement.
Superimposed schedules of reinforcement can create the three classic conflict situations (approach–approach conflict, approach–avoidance conflict, and avoidance–avoidance conflict) described by Kurt Lewin (1935) and can operationalize other Lewinian situations analyzed by his force field analysis. Other examples of the use of superimposed schedules of reinforcement as an analytical tool are its application to the contingencies of rent control (Brechner, 2003) and the problem of toxic waste dumping in the Los Angeles County storm drain system (Brechner, 2010).
Concurrent schedules
In operant conditioning, concurrent schedules of reinforcement are schedules of reinforcement that are simultaneously available to an animal subject or human participant, so that the subject or participant can respond on either schedule. For example, in a two-alternative forced choice task, a pigeon in a Skinner box is faced with two pecking keys; pecking responses can be made on either, and food reinforcement might follow a peck on either. The schedules of reinforcement arranged for pecks on the two keys can be different. They may be independent, or they may be linked so that behavior on one key affects the likelihood of reinforcement on the other.
It is not necessary for responses on the two schedules to be physically distinct. In an alternate way of arranging concurrent schedules, introduced by Findley in 1958, both schedules are arranged on a single key or other response device, and the subject can respond on a second key to change between the schedules. In such a "Findley concurrent" procedure, a stimulus (e.g., the color of the main key) signals which schedule is in effect.
Concurrent schedules often induce rapid alternation between the keys. To prevent this, a "changeover delay" is commonly introduced: each schedule is inactivated for a brief period after the subject switches to it.
When both the concurrent schedules are variable intervals, a quantitative relationship known as the matching law is found between relative response rates in the two schedules and the relative reinforcement rates they deliver; this was first observed by R.J. Herrnstein in 1961. The matching law is a rule for instrumental behavior which states that the relative rate of responding on a particular response alternative equals the relative rate of reinforcement for that response (relative rate of behavior = relative rate of reinforcement). Animals and humans have a tendency to prefer choice in schedules.
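In symbols, the verbal statement of the matching law for two alternatives is commonly written as below. The notation is assumed here for illustration: $B_1$ and $B_2$ are the response rates on the two alternatives, and $R_1$ and $R_2$ are the reinforcement rates those alternatives deliver.

```latex
\frac{B_1}{B_1 + B_2} = \frac{R_1}{R_1 + R_2}
```

As a worked example with made-up numbers, if two variable interval schedules deliver 40 and 20 reinforcers per hour, strict matching predicts that 40 / (40 + 20) = 2/3 of all responses are made on the first alternative.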
Shaping
Shaping is reinforcement of successive approximations to a desired instrumental response. In training a rat to press a lever, for example, simply turning toward the lever is reinforced at first. Then, only turning and stepping toward it is reinforced. The outcomes of one set of behaviours start the shaping process for the next set of behaviours, and the outcomes of that set prepare the shaping process for the next set, and so on. As training progresses, the response reinforced becomes progressively more like the desired behavior; each subsequent behaviour becomes a closer approximation of the final behaviour.
Shaping is used as an intervention for various desired behaviors for individuals with autism as well as other developmental disabilities. When shaping is combined with other evidence-based practices such as complex functional communication training (FCT), it can yield positive outcomes for the individual. When shaping is paired efficiently with a schedule of reinforcement, the target behavior increases.
Shaping is also used for food refusal. Food refusal is when an individual has a partial or total aversion to food items. It can range from picky eating to severe refusal that affects the individual's health. Shaping has been used to achieve a higher success rate for food acceptance.
Chaining
Chaining involves linking discrete behaviors together in a series, such that the result of each behavior is both the reinforcement (or consequence) for the previous behavior and the stimulus (or antecedent) for the next behavior. There are many ways to teach chaining, such as forward chaining (starting from the first behavior in the chain), backwards chaining (starting from the last behavior) and total task chaining (in which the entire behavior is taught from beginning to end, rather than as a series of steps). An example is opening a locked door. First the key is inserted, then turned, then the door opened.
Forward chaining would teach the subject first to insert the key. Once that task is mastered, they are told to insert the key, and taught to turn it. Once that task is mastered, they are told to perform the first two, then taught to open the door. Backwards chaining would involve the teacher first inserting and turning the key, and the subject then being taught to open the door. Once that is learned, the teacher inserts the key, and the subject is taught to turn it, then opens the door as the next step. Finally, the subject is taught to insert the key, and they turn and open the door. Once the first step is mastered, the entire task has been taught. Total task chaining would involve teaching the entire task as a single series, prompting through all steps. Prompts are faded (reduced) at each step as they are mastered.
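The order of training stages in forward and backwards chaining can be laid out explicitly. The following Python sketch is illustrative only; the function names and the tuple layout are hypothetical, and the three steps are those of the locked-door example.

```python
def forward_chaining(steps):
    """Yield (steps already mastered, step being taught) for each training stage."""
    for i, step in enumerate(steps):
        yield steps[:i], step


def backward_chaining(steps):
    """Yield (steps the teacher performs, step being taught, steps the learner then completes)."""
    for i in range(len(steps) - 1, -1, -1):
        yield steps[:i], steps[i], steps[i + 1:]


door = ["insert key", "turn key", "open door"]

for teacher_does, taught, learner_finishes in backward_chaining(door):
    print(teacher_does, "->", taught, "->", learner_finishes)
```

In the backwards sequence the teacher's share shrinks from two steps to none while the learner always finishes the chain, so every training stage ends with the door opening.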
Challenging behaviors seen in individuals with autism and other related disabilities have been successfully managed and maintained in previous studies using chained schedules of reinforcement. Functional communication training is an intervention that often uses chained schedules of reinforcement to promote the appropriate and desired functional communication response. The purpose of pairing chaining procedures with functional communication training is to replace challenging or inappropriate behaviors with functional or more appropriate ways for the individual to express themselves.
Persuasive communication and the reinforcement theory
;Persuasive communication
:Persuasion influences the way a person thinks, acts and feels. Persuasive skill depends on understanding the concerns, positions and needs of other people. Persuasion can be classified into informal persuasion and formal persuasion.
;Informal persuasion
:This describes the way in which a person interacts with colleagues and customers. Informal persuasion can be used in teams, memos and e-mails.
:Example: "I noticed that you helped out Joe while your equipment was being serviced by the maintenance crew." OR
:"I overheard your explanation to that last customer about how to obtain, use, and the advantages of having a credit card. I think we may be adding her to our business."
;Formal persuasion
:This type of persuasion is used in writing customer letters and proposals, and in formal presentations to customers or colleagues.
;Process of persuasion
:Persuasion concerns how you influence people with your skills, experience, knowledge, leadership, qualities and team capabilities. Persuasion is an interactive process of getting work done through others. Examples of situations in which persuasion skills can be used: Interview: to demonstrate your talents, skills and expertise. Clients: to guide your clients toward achieving their goals or targets. Memos: to express your ideas and views to coworkers for the improvement of operations. Identifying resistance and maintaining a positive attitude play vital roles in persuasion.
Persuasion is a form of human interaction. It takes place when one individual expects some particular response from one or more other individuals and deliberately sets out to secure the response through the use of communication. The communicator must realize that different groups have different values.
In instrumental learning situations, which involve operant behavior, the persuasive communicator will present his message and then wait for the receiver to make a correct response. As soon as the receiver makes the response, the communicator will attempt to fix the response by some appropriate reward or reinforcement.
In conditional learning situations, where there is respondent behavior, the communicator presents his message so as to elicit the response he wants from the receiver, and the stimulus that originally served to elicit the response then becomes the reinforcing or rewarding element in conditioning.
Mathematical models
A lot of work has been done in building a mathematical model of reinforcement. This model is known as MPR, short for mathematical principles of reinforcement. Peter Killeen has made key discoveries in the field with his research on pigeons.
Criticisms
The standard definition of behavioral reinforcement has been criticized as circular, since it appears to argue that response strength is increased by reinforcement, and defines reinforcement as something that increases response strength (i.e., response strength is increased by things that increase response strength). However, the correct usage of reinforcement is that something is a reinforcer ''because'' of its effect on behavior, and not the other way around. It becomes circular if one says that a particular stimulus strengthens behavior because it is a reinforcer, and does not explain why a stimulus is producing that effect on the behavior. Other definitions have been proposed, such as F.D. Sheffield's "consummatory behavior contingent on a response", but these are not broadly used in psychology.
Increasingly, understanding of the role reinforcers play is moving away from a "strengthening" effect to a "signalling" effect. On this view, reinforcers increase responding because they signal the behaviours that are likely to result in reinforcement. While in most practical applications the effect of any given reinforcer will be the same regardless of whether the reinforcer is signalling or strengthening, this approach helps to explain a number of behavioural phenomena, including patterns of responding on intermittent reinforcement schedules (fixed interval scallops) and the differential outcomes effect.
History of the terms
In the 1920s, Russian physiologist Ivan Pavlov may have been the first to use the word ''reinforcement'' with respect to behavior, but (according to Dinsmoor) he used its approximate Russian cognate sparingly, and even then it referred to strengthening an already-learned but weakening response. He did not use it, as it is today, for selecting and strengthening new behaviors. Pavlov's introduction of the word ''extinction'' (in Russian) approximates today's psychological use.
In popular use, ''positive reinforcement'' is often used as a synonym for ''reward'', with people (not behavior) thus being "reinforced", but this is contrary to the term's consistent technical usage, as it is a dimension of behavior, and not the person, which is strengthened. ''Negative reinforcement'' is often used by laypeople and even social scientists outside psychology as a synonym for ''punishment''. This is contrary to modern technical use, but it was B.F. Skinner who first used it this way in his 1938 book. By 1953, however, he followed others in thus employing the word ''punishment'', and he re-cast ''negative reinforcement'' for the removal of aversive stimuli.
There are some within the field of behavior analysis who have suggested that the terms "positive" and "negative" constitute an unnecessary distinction in discussing reinforcement as it is often unclear whether stimuli are being removed or presented. For example, Iwata poses the question: "... is a change in temperature more accurately characterized by the presentation of cold (heat) or the removal of heat (cold)?" Thus, reinforcement could be conceptualized as a pre-change condition replaced by a post-change condition that reinforces the behavior that followed the change in stimulus conditions.
Applications
Reinforcement and punishment are ubiquitous in human social interactions, and a great many applications of operant principles have been suggested and implemented. Following are a few examples.
Addiction and dependence
Positive and negative reinforcement play central roles in the development and maintenance of addiction and drug dependence. An addictive drug is intrinsically rewarding; that is, it functions as a primary positive reinforcer of drug use. The brain's reward system assigns it incentive salience (i.e., it is "wanted" or "desired"), so as an addiction develops, deprivation of the drug leads to craving. In addition, stimuli associated with drug use (e.g., the sight of a syringe and the location of use) become associated with the intense reinforcement induced by the drug. These previously neutral stimuli acquire several properties: their appearance can induce craving, and they can become conditioned positive reinforcers of continued use. Thus, if an addicted individual encounters one of these drug cues, a craving for the associated drug may reappear. For example, anti-drug agencies previously used posters with images of drug paraphernalia as an attempt to show the dangers of drug use. However, such posters are no longer used because of the effects of incentive salience in causing relapse upon sight of the stimuli illustrated in the posters.
In drug-dependent individuals, negative reinforcement occurs when a drug is self-administered in order to alleviate or "escape" the symptoms of physical dependence (e.g., tremors and sweating) and/or psychological dependence (e.g., anhedonia, restlessness, irritability, and anxiety) that arise during the state of drug withdrawal.
Animal training
Animal trainers and pet owners were applying the principles and practices of operant conditioning long before these ideas were named and studied, and animal training still provides one of the clearest and most convincing examples of operant control. Of the concepts and procedures described in this article, a few of the most salient are: availability of immediate reinforcement (e.g. the ever-present bag of dog yummies); contingency, assuring that reinforcement follows the desired behavior and not something else; the use of secondary reinforcement, as in sounding a clicker immediately after a desired response; shaping, as in gradually getting a dog to jump higher and higher; intermittent reinforcement, reducing the frequency of those yummies to induce persistent behavior without satiation; chaining, where a complex behavior is gradually put together.
Child behaviour – parent management training
Providing positive reinforcement for appropriate child behaviors is a major focus of parent management training. Typically, parents learn to reward appropriate behavior through social rewards (such as praise, smiles, and hugs) as well as concrete rewards (such as stickers or points towards a larger reward as part of an incentive system created collaboratively with the child).[Kazdin AE (2010). Problem-solving skills training and parent management training for oppositional defiant disorder and conduct disorder. ''Evidence-based psychotherapies for children and adolescents (2nd ed.)'' 211–226. New York: Guilford Press.] In addition, parents learn to select simple behaviors as an initial focus and reward each of the small steps that their child achieves towards reaching a larger goal (this concept is called "successive approximations").[Forgatch MS, Patterson GR (2010). Parent management training, Oregon model: An intervention for antisocial behavior in children and adolescents. ''Evidence-based psychotherapies for children and adolescents (2nd ed.)'' 159–78. New York: Guilford Press.] They may also use indirect rewards, such as progress charts.
Providing positive reinforcement in the classroom can be beneficial to student success. When applying positive reinforcement to students, it is crucial to individualize it to each student's needs. This way, the student understands why they are receiving the praise, can accept it, and eventually learns to continue the action that earned the positive reinforcement. For example, rewards or extra recess time might work better for some students, whereas others might respond better to stickers or check marks indicating praise.
Economics
Both psychologists and economists have become interested in applying operant concepts and findings to the behavior of humans in the marketplace. An example is the analysis of consumer demand, as indexed by the amount of a commodity that is purchased. In economics, the degree to which price influences consumption is called "the price elasticity of demand." Certain commodities are more elastic than others; for example, a change in price of certain foods may have a large effect on the amount bought, while gasoline and other essentials may be less affected by price changes. In terms of operant analysis, such effects may be interpreted in terms of motivations of consumers and the relative value of the commodities as reinforcers.
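As a rough illustration of elastic versus inelastic demand, the sketch below computes a price elasticity with the midpoint (arc) formula, which is one common convention among several. The function name and all quantities and prices are hypothetical, chosen only to contrast a price-sensitive food item with a less price-sensitive essential; they are not data from any study.

```python
def arc_price_elasticity(q_old, q_new, p_old, p_new):
    """Midpoint (arc) price elasticity of demand.

    Percentage change in quantity divided by percentage change in
    price, with each percentage taken relative to the midpoint.
    """
    pct_change_quantity = (q_new - q_old) / ((q_new + q_old) / 2)
    pct_change_price = (p_new - p_old) / ((p_new + p_old) / 2)
    return pct_change_quantity / pct_change_price


# Hypothetical example: the same price rise (2.00 -> 3.00) for two goods.
snack = arc_price_elasticity(q_old=100, q_new=60, p_old=2.00, p_new=3.00)
fuel = arc_price_elasticity(q_old=100, q_new=95, p_old=2.00, p_new=3.00)

print(f"snack elasticity: {snack:.2f}")  # magnitude above 1: elastic
print(f"fuel elasticity:  {fuel:.2f}")   # magnitude below 1: inelastic
```

With these made-up numbers the snack comes out at about -1.25 and the fuel at about -0.13. In operant terms, the fuel would be read as the commodity whose value as a reinforcer is better defended against rising cost.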
Gambling – variable ratio scheduling
As stated earlier in this article, a variable ratio schedule yields reinforcement after the emission of an unpredictable number of responses. This schedule typically generates rapid, persistent responding. Slot machines pay off on a variable ratio schedule, and they produce just this sort of persistent lever-pulling behavior in gamblers. Because the machines are programmed to pay out less money than they take in, the persistent slot-machine user invariably loses in the long run. Slot machines, and thus variable ratio reinforcement, have often been blamed as a factor underlying gambling addiction.
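The schedule described above can be sketched as a toy simulation. This is an idealized model, not a description of how any real slot machine is programmed: the function names, the uniform distribution of required responses, and the cost and payout values are all assumptions made for illustration.

```python
import random


def variable_ratio_schedule(mean_ratio, n_responses, seed=0):
    """Return the response numbers at which reinforcement is delivered.

    The number of responses required for each reinforcer is drawn
    uniformly from 1 .. 2*mean_ratio - 1, so it averages mean_ratio
    but is unpredictable on any given run of responses.
    """
    rng = random.Random(seed)
    reinforced_at = []
    required = rng.randint(1, 2 * mean_ratio - 1)
    since_last = 0
    for response in range(1, n_responses + 1):
        since_last += 1
        if since_last >= required:
            reinforced_at.append(response)
            since_last = 0
            required = rng.randint(1, 2 * mean_ratio - 1)
    return reinforced_at


def net_outcome(n_reinforcers, n_responses, cost_per_pull=1.0, payout=8.0):
    """Money won minus money spent (hypothetical cost and payout)."""
    return n_reinforcers * payout - n_responses * cost_per_pull


pulls = 10_000
wins = variable_ratio_schedule(mean_ratio=10, n_responses=pulls)
gaps = [b - a for a, b in zip([0] + wins, wins)]

print("first few gaps between wins:", gaps[:8])
print("average gap:", sum(gaps) / len(gaps))
print("net outcome:", net_outcome(len(wins), pulls))
```

Because the assumed payout per win (8) is smaller than the average cost of the pulls needed to obtain it (about 10), the net outcome over a long session is negative, even though wins keep arriving at unpredictable intervals.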
Managing behavior in organizations
An alternative to traditional pay-for-performance incentive schemes that is rooted in reinforcement theory, known as the O.B. Mod. approach, has been proposed as a practical way of managing the performance-related behaviors of an organization's members. O.B. Mod. and its "reinforce-for-performance" basis have been shown empirically to yield performance improvements in both manufacturing and service organizations, though improvements varied by type of reinforcer in both contexts.
Nudge theory
Nudge theory (or nudge) is a concept in behavioural science, political theory and economics which argues that positive reinforcement and indirect suggestions aimed at achieving non-forced compliance can influence the motives, incentives and decision making of groups and individuals at least as effectively as, if not more effectively than, direct instruction, legislation, or enforcement.
Praise
The concept of praise as a means of behavioral reinforcement in humans is rooted in B.F. Skinner's model of operant conditioning. Through this lens, praise has been viewed as a means of positive reinforcement, wherein an observed behavior is made more likely to occur by contingently praising said behavior. Hundreds of studies have demonstrated the effectiveness of praise in promoting positive behaviors, notably in the study of teacher and parent use of praise with children to promote improved behavior and academic performance, but also in the study of work performance. Praise has also been demonstrated to reinforce positive behaviors in non-praised adjacent individuals (such as a classmate of the praise recipient) through vicarious reinforcement. Praise may be more or less effective in changing behavior depending on its form, content and delivery. In order for praise to effect positive behavior change, it must be contingent on the positive behavior (i.e., only administered after the targeted behavior is enacted), must specify the particulars of the behavior that is to be reinforced, and must be delivered sincerely and credibly.
Acknowledging the effect of praise as a positive reinforcement strategy, numerous behavioral and cognitive behavioral interventions have incorporated the use of praise in their protocols. The strategic use of praise is recognized as an evidence-based practice in both classroom management and parenting training interventions, though praise is often subsumed in intervention research into a larger category of positive reinforcement, which includes strategies such as strategic attention and behavioral rewards.
Manipulation
Braiker identified the following ways that manipulators control their victims:
* Positive reinforcement: includes praise, superficial charm, superficial sympathy (crocodile tears), excessive apologizing, money, approval, gifts, attention, facial expressions such as a forced laugh or smile, and public recognition.
* Negative reinforcement: may involve removing one from a negative situation
* Intermittent or partial reinforcement: Partial or intermittent negative reinforcement can create an effective climate of fear and doubt. Partial or intermittent positive reinforcement can encourage the victim to persist; for example, in most forms of gambling, the gambler is likely to win now and again but still lose money overall.
* Punishment: includes nagging, yelling, the silent treatment, intimidation, threats, swearing, emotional blackmail, the guilt trip, sulking, crying, and playing the victim.
* Traumatic one-trial learning: using verbal abuse, explosive anger, or other intimidating behavior to establish dominance or superiority; even one incident of such behavior can condition or train victims to avoid upsetting, confronting or contradicting the manipulator.
Traumatic bonding
Traumatic bonding occurs as the result of ongoing cycles of abuse in which the intermittent reinforcement of reward and punishment creates powerful emotional bonds that are resistant to change.[Chrissie Sanderson. ''Counselling Survivors of Domestic Abuse''. Jessica Kingsley Publishers; 15 June 2008. p. 84.]
Another source states:
'The necessary conditions for traumatic bonding are that one person must dominate the other and that the level of abuse chronically spikes and then subsides. The relationship is characterized by periods of permissive, compassionate, and even affectionate behavior from the dominant person, punctuated by intermittent episodes of intense abuse. To maintain the upper hand, the victimizer manipulates the behavior of the victim and limits the victim's options so as to perpetuate the power imbalance. Any threat to the balance of dominance and submission may be met with an escalating cycle of punishment ranging from seething intimidation to intensely violent outbursts. The victimizer also isolates the victim from other sources of support, which reduces the likelihood of detection and intervention, impairs the victim's ability to receive countervailing self-referent feedback, and strengthens the sense of unilateral dependency ... The traumatic effects of these abusive relationships may include the impairment of the victim's capacity for accurate self-appraisal, leading to a sense of personal inadequacy and a subordinate sense of dependence upon the dominating person. Victims also may encounter a variety of unpleasant social and legal consequences of their emotional and behavioral affiliation with someone who perpetrated aggressive acts, even if they themselves were the recipients of the aggression.'
Video games
Most video games are designed around some type of compulsion loop, adding a type of positive reinforcement through a variable rate schedule to keep the player playing the game, though this can also lead to video game addiction.
As part of a trend in the monetization of video games in the 2010s, some games offered "loot boxes", either as rewards or purchasable with real-world funds, that contained a random selection of in-game items distributed by rarity. The practice has been tied to the same methods by which slot machines and other gambling devices dole out rewards, as it follows a variable rate schedule. While loot boxes are generally perceived as a form of gambling, the practice is classified as such in only a few countries and is otherwise legal. However, methods of using those items as virtual currency for online gambling or trading them for real-world money have created a skin gambling market that is under legal evaluation.
Workplace culture of fear
Ashforth discussed potentially destructive sides of leadership and identified what he referred to as petty tyrants: leaders who exercise a tyrannical style of management, resulting in a climate of fear in the workplace. Partial or intermittent negative reinforcement can create an effective climate of fear and doubt. When employees get the sense that bullies are tolerated, a climate of fear may be the result.
Individual differences in sensitivity to reward, punishment, and motivation have been studied under the premises of reinforcement sensitivity theory and have also been applied to workplace performance.
External links
An On-Line Positive Reinforcement Tutorial
Scholarpedia Reinforcement
scienceofbehavior.com