How we really learn… explained simply!
Posted in: Clicker Training
Ever wondered how we and our horses, dogs and cats really learn new things?
Some of the behaviours our animals offer us are great, especially when they are helpful and make our lives quicker and easier. But what about the truly irritating ones that drive us to distraction, such as our horses pawing at the ground or kicking stable doors, or our dogs barking constantly, chasing after rabbits or jumping up at us?
The good news is you don’t need a PhD in Psychology to understand how learning works!
“Learning Theory” is a discipline of psychology that attempts to explain how an organism learns. It consists of many different aspects of learning including instincts, social facilitation, observation, formal teaching, memory, mimicry and classical and operant conditioning. It is these last two that are of significant interest to animal trainers.
Horses, like most other organisms, learn effortlessly through classical conditioning (or Pavlovian conditioning, for those who have heard of Pavlov’s salivating dogs). This occurs when an initially unimportant stimulus is regularly paired with a stimulus that already triggers some type of response. As a result, a new association forms in which the animal gives the response upon presentation of the original, inconsequential stimulus alone. In classical conditioning, the animal’s behaviour does not affect what happens next, as it does in operant conditioning (see below).
Horse trainers use classical conditioning regularly when they put a word to a behaviour, for example by saying an initially meaningless word such as “trot” immediately before the horse changes gait during an upward transition. Done consistently, it does not take long before the horse responds with the appropriate action when given only the verbal cue.
Another example of classical conditioning is pairing a ‘click!’ with food. Food is a big motivator for animals due to their survival instincts, so receiving a positive consequence motivates the horse to respond eagerly. When trainers first introduce clicker training to the horse, the ‘click!’ is followed immediately by a food reward. Initially, the ‘click!’ means nothing to the horse, but after a few repetitions the trainer will see the animal visibly acknowledge the ‘click!’ and look towards the trainer in expectation of the reward.
Classical conditioning is very important to animal trainers, because it is difficult to supply an animal with one of the things it naturally likes (or dislikes) in time for it to be an important consequence of the behaviour. In other words, it’s hard to feed a treat to a horse while it’s on the end of a lunge line or in the middle of a jump. So trainers will associate something that’s easier to “deliver” with something the animal wants through classical conditioning. Some trainers call this a ‘bridge’ (because it bridges the time between when the animal performs a desired behaviour and when it gets its reward).
On a less positive note, stabled horses readily learn the sounds associated with their feeding times, such as the feed room door opening or their dinner being put into a bucket. They also recognize visual cues associated with this rewarding activity, such as the arrival of the person who regularly feeds them. Within a short time, and to the immense frustration of owners, they display anticipatory behaviours such as vocalizing, pawing or kicking stable doors, box walking and fence walking which, when reinforced by the horse being fed, quickly become classically conditioned behaviours.
Unlike classical conditioning, operant conditioning deals with the modification of voluntary behaviour. The animal acts on its environment, and the consequence of that action is either reinforcement (which causes a behaviour to occur with greater frequency) or punishment (which causes a behaviour to occur with less frequency). This gives four basic consequences of a voluntary behaviour, plus a fifth, known as extinction, in which a response is no longer followed by any reinforcing consequence. In this context, “positive” means that something is added after the response and “negative” means that something is taken away:
- Positive reinforcement (reinforcer) – occurs when a behaviour (response) is followed by a stimulus that is rewarding, thereby increasing the frequency of that behaviour; for example, a treat is given
- Negative reinforcement (release) – occurs when a behaviour (response) is followed by the removal of an aversive stimulus, thereby increasing the frequency of that behaviour; for example, leg pressure is removed as soon as the horse goes forward
- Positive punishment (punisher) – occurs when a behaviour (response) is followed by an aversive stimulus, thereby decreasing the frequency of that behaviour; for example, the horse is hit with a whip
- Negative punishment (penalty) – occurs when a behaviour (response) is followed by the removal of a pleasant stimulus, thereby decreasing the frequency of that behaviour; for example, the horse is not given its dinner because it kicked its door
- Extinction (inconsequential) – when a previously reinforced behaviour is no longer reinforced either positively or negatively, a decline in the frequency of the response will be seen
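For readers who find it easier to see the pattern laid out as logic, the four named consequences in the list above come down to two questions: was something added or taken away after the behaviour, and does the behaviour then become more or less frequent? The short Python sketch below is only an illustration of that two-by-two grid. The names `Change` and `classify_consequence` are made up for this example, and extinction is deliberately left out because it describes the absence of a reinforcing consequence rather than one of the four combinations.

```python
from enum import Enum


class Change(Enum):
    """What happens to a stimulus straight after the behaviour (hypothetical helper)."""
    ADDED = "added"
    REMOVED = "removed"


def classify_consequence(change: Change, behaviour_increases: bool) -> str:
    """Name the operant conditioning quadrant for one consequence.

    "Positive"/"negative" describe whether a stimulus is added or removed;
    "reinforcement"/"punishment" describe whether the behaviour becomes
    more or less frequent afterwards.
    """
    sign = "Positive" if change is Change.ADDED else "Negative"
    kind = "reinforcement" if behaviour_increases else "punishment"
    return f"{sign} {kind}"


# The examples from the list above:
treat_given = classify_consequence(Change.ADDED, True)
leg_pressure_released = classify_consequence(Change.REMOVED, True)
hit_with_whip = classify_consequence(Change.ADDED, False)
dinner_withheld = classify_consequence(Change.REMOVED, False)
```

Note that the label depends on what the behaviour actually does afterwards, not on what the trainer intended, which is why the second argument describes the change in frequency.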
During operant conditioning, when a horse begins to learn the meaning of a new stimulus, it will respond randomly, through trial and error, until it hits on the desired response. Reinforcement (positive or negative) of the response at the correct moment makes the animal more likely to repeat the behaviour, albeit imperfectly at first. Horses excel at this type of learning, especially when positive reinforcement is available. For example, when presented with a nutball (which dispenses feed as it rolls around on the ground), horses will usually approach and investigate. They push the nutball around with their noses or legs, causing it to roll and drop the food. Most horses rapidly learn to manipulate the nutball in this manner, thereby receiving reinforcement and gaining control over this element of their environment.
Such manipulation and control is thought to be important for animal welfare, and more should be done to allow stabled horses some degree of instrumental control over their environments.
A good understanding of positive reinforcement is very useful in working with horses. For instance, on hearing their feed being made up, a horse may inadvertently kick the stall door, perhaps out of impatience. If a person then hurriedly feeds the horse (in the misguided hope of quieting it down) the kicking behaviour will have been positively reinforced. So it won’t take long before the horse becomes an avid door-kicker capable of training humans very effectively!
Horse Training Standard
Operant conditioning is a horse training standard and negative reinforcement has historically been the primary means of shaping behaviours. Horses typically are trained to perform actions in order to avoid something aversive such as pressure (negative reinforcement).
For example, under saddle, they move forward when leg pressure from the rider is applied to both sides; on the ground, they yield their hindquarters when pressure is applied to a flank; they back up when pressure is applied to their nose; and they enter a trailer to avoid pressure on the halter or even from a whip.
Negative Emotional Fallouts
Whilst negative reinforcement can produce good results if used consistently and well, behaviourists are now finding through their research that there can be very large negative emotional fallouts from the use of pure negative reinforcement, especially if the aversive stimulus is used too strongly and/or is not removed the instant the desired behaviour occurs. When that happens, the consequence to the horse becomes positive punishment, which reduces the occurrence of a behaviour rather than increasing it.
If the aversive stimulus is used too strongly, this can also produce large negative emotional fallouts, since the outcome (or consequence) of the response is then unpredictable (it could be a reinforcer or it could be a punisher). This unpredictability can lead to the development of Poisoned Cues.
I won’t go into Poisoned Cues in this post, as it is such a huge subject and requires more than a few sentences to explain fully! What I will say for now is that if, in the past, a trigger (cue) has sometimes been followed by a reinforcer and at other times by a punisher, it can very easily result in our horses showing avoidance behaviours and displacement activity when presented with that trigger (cue).
Once we begin to truly understand this concept and link it with the use of negative reinforcement (especially pressure), it begins to provide so many answers as to why our horses behave in certain ways!
Many trainers are thus finding the ultimate solution in clicker training (positive reinforcement), because it enables faster learning as well as emotionally happy horses. Not only is it much easier and less time-critical to reward a freely occurring behaviour than it is to be absolutely perfect with your removal of an aversive stimulus, but it also removes the possibility of applying positive punishment by accident.
Ideally, trainers and handlers should incorporate intelligent use of both positive and negative reinforcement into a well-balanced system, as this will produce happier, more relaxed, less confused horses.