Maintaining Behaviours (Schedules of Reinforcement)
Posted in: Clicker Training, Stories & Thoughts
Must we click a specific behaviour forever?
We’ve read up on Clicker Training and we like what we are hearing about the communication and relationship it can build between two species. So we have decided to introduce it to our horses with the help of Alexandra Kurland’s self-study materials.
We have taught our horses how to touch a target stick to introduce the clicker to them and to pair the ‘click!’ with the treat. They pick this up instantly. We then teach them ‘The Grown Ups Are Talking’ exercise so they learn to keep their heads out of our space and not to pester us to become their vending machine…super stuff! We are working our way very nicely through Alexandra’s foundation lessons.
We go along for a while, obviously extremely proud of both ourselves and our horses for learning these new behaviours so quickly and so well. As we teach more behaviours, our horses begin offering them to us and their repertoire begins to grow nicely!
When we begin teaching these foundation lessons (or indeed anything new using positive reinforcement training) we start out on a Continuous Reinforcement Schedule. This ensures the rate of reinforcement for the horse remains high whilst it is learning and while we are shaping the behaviour. A Continuous Schedule means that every single time the horse offers the behaviour, we always click and then reinforce it. So for every single occurrence of the behaviour, the horse gets a click and treat.
We might start using the clicker to help with picking out feet or lifting legs too. Our horse picks up his foot when we touch him on the back of the knee, we click and give him a treat. Perfect! We are creating beautifully trained horses and having a lot of fun at the same time.
We are also slowly realising the power of the clicker – we can teach anything we want to with this training tool. :0)
As time goes on and our clicker repertoire grows, we might begin wondering whether we have to click each and every occurrence of these behaviours for the rest of time. Using the example of teaching the horse to pick his foot up: each time we touch him on the back of the knee, do we need to click and treat for each leg in turn forever more, even after he has learnt the behaviour well? This is a very valid question and one which I am asked very frequently.
The answer is no, we don’t need to click each and every time – forever!
A Continuous Reinforcement Schedule is only needed during the initial learning process! Once the behaviour is learnt, and we have finished shaping it, we can switch onto a Variable Reinforcement Schedule to maintain it.
In order to maintain an already-learnt behaviour reliably, it is absolutely not necessary to reinforce it every time. In fact, it is quite important that we do not reinforce it so regularly at this point, but instead move on to reinforcing on a more unpredictable basis. This is termed a Variable Schedule of Reinforcement.
NB. We should stay with a Continuous Reinforcement Schedule for a little bit beyond getting the final version of the behaviour, just so the animal learns that we always want that particular version. This way they understand that we are not actually still shaping them when we move onto the Variable Reinforcement Schedule.
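For anyone who likes to see an idea written out, here is a minimal Python sketch of the difference between the two schedules. It is only an illustration: the function names, the `mean_ratio` parameter and the choice of roughly one click per three correct responses are all assumptions made for the example, not figures from Alexandra Kurland’s materials or a recommendation for any particular horse.

```python
import random


def continuous_schedule(n_responses):
    """Continuous schedule: every correct response earns a click and treat."""
    return [True] * n_responses


def variable_schedule(n_responses, mean_ratio=3, seed=None):
    """Variable schedule (hypothetical sketch): clicks arrive unpredictably.

    The number of correct responses before each click is drawn at random
    from 1 .. 2 * mean_ratio - 1, so it averages out at `mean_ratio`.
    Both the range and the default of 3 are assumptions for illustration.
    """
    rng = random.Random(seed)
    longest_wait = 2 * mean_ratio - 1
    countdown = rng.randint(1, longest_wait)
    clicks = []
    for _ in range(n_responses):
        countdown -= 1
        if countdown == 0:
            clicks.append(True)  # click and treat this response
            countdown = rng.randint(1, longest_wait)
        else:
            clicks.append(False)  # correct response, but no click this time
    return clicks


if __name__ == "__main__":
    print("continuous:", continuous_schedule(10))
    print("variable:  ", variable_schedule(10, mean_ratio=3, seed=1))
```

On the continuous schedule every entry is a click. On the variable one the horse cannot predict which response will earn the click, yet it never has to wait longer than the chosen upper limit.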
Let’s use an example to explain these words in a more practical fashion…
When we begin ‘The Stand On The Mat’ lesson of Alexandra Kurland’s foundation lessons, we reinforce for any interest in the mat at all. This can often begin with just a sniff! Then we begin our shaping process in small approximations towards the end behaviour. Very quickly the horse realises it is the mat which is bringing them the positive reinforcement and so they will become more intrigued and interested in it. They may, as a result, take a step towards it, or even step onto it by accident. Bingo! All of these small chunks/approximations towards stepping onto the mat are reinforced, one after the other on a Continuous Reinforcement Schedule, every step of the way towards the end, finished behaviour.
There is some nice video footage showing this process of rewarding tiny approximations towards the end behaviour on a Continuous Schedule as I introduce Dancer to our new pedestal.
But what is our end, finished behaviour that we have in our mind? Ideally, we should already have a shaping plan in place, knowing exactly where we are heading from the outset. Perhaps it is that we really want both feet on the mat, standing nice and square as our end, finished behaviour.
We keep the horse on a Continuous Reinforcement Schedule all the way to that end, finished behaviour.
We may have got them stepping one foot onto the mat, then two; clicking and reinforcing every single try.
Yet, still the behaviour isn’t finished according to our shaping plan because their feet might be being plonked on there any old which way. So we continue to shape the behaviour using a Continuous Reinforcement Schedule until the horse is consistently standing squarely on the mat.
At this point, when the horse is consistently stepping onto the mat squarely, we could say we have reached the finished (end) behaviour we set out to achieve. It’s at this point that we could switch to a Variable Reinforcement Schedule to maintain that end behaviour reliably.
This is not to say that later down the line, if we decided that we wanted to refine some other element of standing on the mat, we couldn’t. We would just pop back to a Continuous Reinforcement Schedule to shape it to where we want it, and then switch onto a Variable Schedule once more to maintain the new finished behaviour.
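The decision of when to switch can be sketched as a simple rule of thumb in Python. This is a hypothetical illustration only: the function name `choose_schedule`, the `refining` flag and the streak of ten square stands in a row are assumptions made for the example, since what counts as "consistently" is a judgement for each trainer and horse.

```python
def choose_schedule(recent_attempts, required_streak=10, refining=False):
    """Suggest a schedule from recent attempts (hypothetical rule of thumb).

    recent_attempts: list of booleans, most recent last; True means the
        attempt matched the finished behaviour in our shaping plan
        (for example, both front feet square on the mat).
    required_streak: how many matching attempts in a row we treat as
        "consistent". The default of 10 is an assumption, not a rule.
    refining: True when we have gone back to shape some new element.
    """
    if refining:
        # Any new shaping goes back onto a continuous schedule.
        return "continuous"
    if len(recent_attempts) < required_streak:
        # Not enough evidence yet that the behaviour is finished.
        return "continuous"
    if all(recent_attempts[-required_streak:]):
        # Consistently the finished version: maintain it unpredictably.
        return "variable"
    return "continuous"


if __name__ == "__main__":
    print(choose_schedule([True] * 10))                 # variable
    print(choose_schedule([True] * 9 + [False]))        # continuous
    print(choose_schedule([True] * 10, refining=True))  # continuous
```

Choosing a longer streak builds in the little extra time on the continuous schedule that helps the horse understand which version we always want.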
One-Time Behaviours
But what about other behaviours which are more ‘one-time’ behaviours, those we only do once in a session so there is no opportunity to be selective and shape the behaviour even further once it is learnt? Haltering is a great example: once the behaviour has been shaped on a Continuous Reinforcement Schedule, it tends to happen only once each session.
What can happen, and regularly does even to experienced clicker trainers, is that we find ourselves clicking and treating each and every time we put the halter on, even when the horse has become really good at it. Over time this pattern leads the horse to expect the click as a given, and so we can end up in a vicious circle, feeling as though we must click forever more for behaviours such as these.
If we were to simply try to stop clicking at this point, our horses would be pretty confused and upset because they have firmly learnt the pattern that this particular behaviour gets them a click! To suddenly remove the click is highly likely to cause them huge levels of frustration whilst they try to work out what they did wrong, or what else we might want, given that we haven’t clicked. This frustration gives away that they are experiencing the effects of negative punishment (one of the 4 quadrants of learning theory), which is not something most clicker trainers wish to do to their horses at all!
Not only that, but if we want to maintain a behaviour long term, there has to be some kind of reinforcement there, otherwise the behaviour will be subject to extinction.
Ultimately, it is for these reasons that it is so important to move a behaviour onto a Variable Reinforcement Schedule as soon as we can and not stay on a Continuous Reinforcement Schedule for very long!
That’s all well and good, but when we find ourselves stuck in a pattern like this, as with haltering, and our horses begin showing us some frustration as we try to change things, how exactly do we deal with it?
The next post will share some brilliant thoughts from Katie Bartlett, a very experienced US clicker trainer, on the different options to try if we find ourselves stuck in a Continuous Reinforcement pattern!