Ross Jacobs: Positive Training Can Be Stressful

Ross Jacobs

Ross Jacobs is an Australian horse trainer who writes frequently about controversial topics in the horse world. This post on the cons of clicker training is an example of his willingness to step into brambly issues with his opinions and experiences. Kudos to Ross for taking on tough topics. Visit his Facebook page here and his website here.

Read Tim McGaffic’s article on clicker training and combined reinforcement.

Ross Jacobs writes:

Let me shout from the start, CLICKER TRAINING IS NOT A NON-STRESSFUL APPROACH TO HORSE TRAINING.

I know this is a controversial topic, but I believe it is worth discussing. I will try to be as clear and plain-speaking as I can and hope I do the topic justice and make myself clear.

Clicker Training

It is a myth that because clicker training involves a reward, or the addition of something a horse likes, it is therefore a kinder and less stressful form of training.

I’ll try to break down why I believe this:

We all know that when it comes to training horses both positive and negative reinforcement methods work. Whether you remove a pressure (-r) or give a treat (+r), both approaches can be effective in teaching a horse to do what we want.

On the surface, +r appears to be the polar opposite approach to –r (one removes a stressor and one adds a reward). So how can both work?

A horse’s behavior is triggered by its need for comfort. Horses are comfort seekers and it is this need to seek comfort that motivates them to make the choices they do. I think this is a universally accepted principle.

So a horse chooses a response or behavior based on its perception of which response will lead to the most comfort (or least discomfort). It doesn’t matter whether we use positive or negative reinforcement training methods: the motivation to search for the correct response is the same, namely the desire for comfort.

The horse’s decision is based on which option is the most comfortable. Therefore, from the horse’s point of view, both +r and -r are working through the very same mechanism. They both impose an uncomfortable situation for a horse and they both finish with a more comfortable situation for a horse. This is what most clicker trainers don’t get.

So how does this work?

Most of us get that when we apply a pressure in –r training (e.g., reins, rider’s legs, etc.), it causes a physical and/or emotional discomfort to a horse. Then it searches for a way to relieve the pressure and when it does we release the pressure, and comfort is restored. In this way, we train horses to give the correct response.

Now with positive reinforcement, it is much the same. Let’s just stick to the idea of using a food treat as the positive reward and the clicker as the marker, since that is what is familiar to most people.

In clicker training, we first target train a horse to teach them the association of the marker (click sound) with the reward (food treat). When they do something correctly, they hear the clicker and this is followed by a food reward. So the food reward creates the comfort – just as in negative reinforcement methods when we remove the pressure it creates comfort.

As I said earlier, horses are comfort seekers and experiencing discomfort is what drives their responses to seek comfort. So in clicker training, it is the promise of a food reward that motivates a horse to search for the correct response. We have already trained them to understand that if they do the right thing they are given a food treat when we first target trained them. So then we exploit this concept of comfort through food treats by withholding the food treat. This withholding creates discomfort – just like when we apply an uncomfortable pressure with the reins and release that pressure when the horse does the right thing.

Have you ever noticed how anxious and excited some horses or dogs or people get at meal times? The food treat is a source of comfort and withholding the food treat is a discomfort. If you don’t believe me, try using a food treat that a horse doesn’t like eating and see how successful the training is. You’ll have as much success as you would training a child to clean their room with the promise of raw asparagus.

Now that we have programmed a horse to seek our food treat (comfort food), we use the horse’s need for comfort to bribe him to search for a response that will result in the food treat.

Here is the part that many people don’t appreciate:

The stress created to stimulate a horse to search for a way of getting the food reward is exactly the same degree of stress created to stimulate a horse to search for removal of the pressure when we use negative reinforcement methods. Both methods have to create enough discomfort to motivate a horse to search for a response to find comfort. If this weren’t true, a horse would not search for a new response no matter if you used –r or +r methods.

In positive reinforcement strategies, withholding the food treat creates stress and in negative reinforcement, the stress is created by the addition of pressure.

In positive reinforcement, the brainwashing between the sound of the clicker and the food reward can become so strong that eventually treats can be done away with and the horse treats the click itself as the reward.

I have tried to be as logical as I can in stating that positive reinforcement is not the stress-free form of training that many think it is. Even though I don’t use clicker training in my work very much, I have found it useful on the rare occasions when horses have been irreparably damaged by improper use of negative reinforcement. Positive reinforcement has its place in the horse world, albeit a limited one. If a person is using it because they think it is kinder and less stressful than more traditional negative reinforcement methods, they have their work cut out proving that to me.

14 Comments

Caryl Richardson

May 30, 2019 at 3:37 am

This topic may be controversial, but I think it’s a topic we horse people need to discuss. I believe the main ingredient in kind and gentle training is the trainer. It’s not about the tools as much as it’s about a knowledgeable, skilled and observant trainer. I love positive reinforcement training. As you say, Ross, it can be very useful for working with traumatized and fearful horses. I find it is also useful in other circumstances. It’s a great tool to have in my toolbox. That said, I agree with you that it’s not necessarily less stressful or kinder than other methods. It’s a powerful tool that can be used or misused, like any other.
Jennifer Hamilton

May 30, 2019 at 8:06 am

Unfortunately, Ross states his opinions without citing any scientific evidence to support them. While nearly all activities create some level of stress, there are significant chemical differences in the brain that affect learning when comparing positive and negative reinforcement. If I keep poking you with a stick to get you to move, your brain will release cortisol and adrenaline to motivate you to move. If I instead dangle a candy bar 5 feet away from you and you move over to attain and eat the candy bar, your brain will release dopamine and serotonin. Those positive chemicals help to solidify the learning that just happened and make you want to learn more. Although you found comfort when I stopped poking you with a stick, you may not want to sit next to me any more.

If you are going to write articles on the laws of learning as an expert, please make sure your expert advice is based on scientific principles and cites its sources.
- Ross Jacobs
  
  March 20, 2020 at 7:00 pm
  
  I will say, Jennifer, that you don’t need to have scientific evidence to confirm my hypothesis because horses show their stress so clearly in their body language that measuring circulating stress hormones or CNS activity is redundant. When you see a horse stressed by positive reinforcement, an aware person could not miss it.
  
  And just as an aside, I am unsure your knowledge of the science is sufficient to equip you to judge the quality of the scientific evidence. I say that because you describe cortisol and adrenaline as being released by the brain. However, cortisol and adrenaline are both released by the adrenal gland, not the brain. So I am not sure what value scientific evidence would be to you if you don’t already know high school biology.
  - Bernelle Verster
    
    September 12, 2020 at 5:16 am
    
    I love clicker training, but have found with at least one of my animals that the desire to get the treat becomes very strong to the point where it interferes with the calmness needed to figure out what she needs to do to get the treat. The resulting frustration leads to her acting out or giving up and walking off. I still use clicker training at times but I mix it up with other methods, and I keep the sessions short.
    
    (P.S. I came here to find out who is Alice, found very useful info and more videos, thank you!)
  - Marjolein
    
    November 18, 2021 at 4:53 am
    
    While I’m always interested in hearing another opinion, I can say in all honesty this comes off as someone ranting about a topic they haven’t researched all that much. Especially seeing the comment made in response to Jennifer.
    
    There have been multiple scientific studies done about positive and negative reinforcement, where positive always comes out on top. Simply because a (food) reward for a task or behavior is a bigger motivator than the release of pressure by making an animal uncomfortable.
    
    You can of course have an opinion about this topic, but I find more and more that a lot of people that work with animals don’t believe in scientific research or believe that they know better, since they have worked with animals for a long time and forget that confirmation bias is a thing.
    
    The response to Jennifer is what really showed how little you seem to want to learn about this topic and how standoffish you are about the topic. It’s clear Jennifer only tried explaining it in an easy way that wouldn’t take three pages to explain, but your response to this was to ridicule. Not the greatest way of dealing with someone disagreeing with something you said.
    
    I was interested in reading someone else’s view, but came out quite disappointed.
Valerie

May 30, 2019 at 8:14 am

“The stress created to stimulate a horse to search for a way of getting the food reward is exactly the same degree of stress created to stimulate a horse to search for removal of the pressure when we use negative reinforcement methods. Both methods have to create enough discomfort to motivate a horse to search for a response to find comfort. If this weren’t true, a horse would not search for a new response no matter if you used –r or +r methods.”

I would like to see support for your statement above, perhaps by describing a horse’s body language under both circumstances. Negative reinforcement implies physical discomfort that is applied to the horse until they figure out how to remove it – whereas there is no physical discomfort applied to the horse in positive reinforcement, they “earn” a reward by offering a desired behavior instead (you chose to use food, yet it could be a good scratch, a run around the arena, eating grass for a few minutes: it could be anything that would be rewarding to the horse). I have to say that it’s a little bit of a stretch to try and force the concept that trainers are creating the “same degree of stress” in motivating a horse to want to get a reward as when they create physical pressure from which any animal will automatically try to escape.

I agree with Caryl above that good training depends most on the skills, knowledge and experience of the trainer rather than only on the tools that they may be using.

You wrote: “we use the horse’s need for comfort to bribe him to search for a response that will result in the food treat.” – Your use of the word “bribe” is a misconception of how the concept actually is supposed to work – Though I will add that too many trainers do use food as a “bribe” because they haven’t learned how to use a reward any other way, and as you pointed out that method will fail more often than not. It’s a *concept*: the horse learns that he is part of the activity, that he can control what is happening to him and how the experience goes rather than trying to escape uncomfortable physical pressure. And as Caryl said above this concept can work exceptionally and wonderfully for horses that have been traumatized by fear, or to “shape” them to help them understand that it’s okay to be around scary noises or let the farrier pick up their feet.

I guess I see no need to denigrate forms of training that do have a place with some horse in some situation – why do it? Why not discuss how different forms of training can work, which method might be best in a given situation, and why? Why not describe how negative reinforcement might work better than positive reinforcement in a certain situation?
- Ross Jacobs
  
  March 20, 2020 at 7:28 pm
  
  I agree that the term “bribe” is not totally accurate according to the ethology handbook. However, it does give a clear idea of the picture of what motivates a horse to search when we use +r. Furthermore, it is no less accurate than the term “aversion training” that is becoming popular to describe -r methods.
  
  Secondly, I don’t see the essay as a denigration of +r, but as a clarification of some hard truths that are ignored by the +r advocates. Positive reinforcement is often sold as being a zero or low stress form of training and it is this idea that attracts many people to its ranks. But it is a myth. You only have to watch the zillions of videos on YouTube to see that. Positive reinforcement is no less or no more stressful than negative reinforcement because the minimum level of stress needed to kickstart a horse’s search for an answer is the same in both. Stress is required to get a horse to search to respond in a different way to the way it normally does things (we call it training). If there is no stress there is no motive to change its response. The stress is the same whether it is the stress of what to do to get a treat or the stress of what to do to remove a feel/pressure. An animal does not see one form of stress as kinder or gentler than the other. If the stress is enough to motivate a horse to search, it is all the same.
  
  Lastly, the advantage -r has over +r is that a trainer can moderate the level of stress moment to moment as the horse improves its “try”. But with +r, the trainer cannot adjust the stress the horse experiences while searching for a reward. In some horses withholding a reward to motivate a horse to search creates desperation and in others it results in disinterest and often the result is something in between. The trainer has no control over that. The stress created by +r cannot be moderated or adjusted moment to moment to meet the needs of the horse. This is a big problem in +r.
  
  So I don’t see these facts as a denigration. Facts are facts – there is no judgement in them.
Kerry

May 31, 2019 at 5:37 pm

Wow. Another opinion from someone who does not understand the full potential of R+ training and hasn’t looked beyond the most mundane aspects.

First point, removal of pressure is not -R. Removal of pressure is N-. You are removing the negative stimulus. The author apparently is unfamiliar with the training quadrant. If you don’t understand that, you really don’t have a leg to stand on. There is R+, adding a positive reinforcement, R-, removing the positive reinforcer, as in, “No, that was not the correct response.”

Pressure is N+, removal of that pressure is N-.

What is missing is that done properly, the treat can be faded, especially if the animal is allowed autonomy in its participation. Of course you set it up to help the animal to choose to participate, but once the animal participates in the training behavior and “gets the game,” you can, if you choose, fade the food reward because just the attention, interaction and being told, “Yes, that was correct,” if done properly can become intrinsically rewarding for the animal so that it participates simply for the neurochemical rewards it is receiving by participating and finding correct answers.

Thousands of search and rescue dogs, police dogs, service dogs are trained with positive reinforcement and no food rewards. So I’m unclear. Is the author suggesting positive reinforcement is stressful? Food rewards are stressful? It’s stressful if you don’t reward the incorrect behavior? I’m stumped.

There’s tons of hard science on this. Is positive reinforcement (R+) misused, used badly, used by people who don’t understand it? Yep. All the time. It’s an incredible tool, and it’s not hard to find good solid training, references, science and coaching for it. It’s not about the clicker. It’s not about the food. Food is easy because it’s a primary reinforcer for all mammals. Required? No.
- Kate
  
  August 30, 2019 at 2:21 pm
  
  Kerry said “First point, removal of pressure is not -R. Removal of pressure is N-. You are removing the negative stimulus. The author apparently is unfamiliar with the training quadrant. ”
  
  Ummm, I’m confused by your reply. I’m currently studying operant conditioning and I’ve not seen N-???
  
  -R, +R, -P, +P.
  
  The – means you remove X
  The + means you add X
  
  Reinforcement increases a behavior.
  Punishment decreases a behavior.
  
  Kerry also said:
  “First point, removal of pressure is not -R. Removal of pressure is N-. You are removing the negative stimulus. The author apparently is unfamiliar with the training quadrant. If you don’t understand that, you really don’t have a leg to stand on. There is R+, adding a positive reinforcement, R-, removing the positive reinforcer, as in, “No, that was not the correct response.” ”
  
  Uh… R- you said was removing the POSITIVE reinforcer? That makes no sense. There is no addition of something in -R. Are you confusing POSITIVE with a definition of “something pleasant” when in the operant conditioning the + symbol means the “addition of” something?
  
  And -R, the removal of something to produce desired behavior. I think with horses we assume that pressure and release of an aid is -R, or the removal of “something” (the aid) to get a desired behavior. But why do you assume this is BAD? Isn’t the horse the operator, and really isn’t it up to the horse to decide if the application of a physical or mental pressure is perceived as painful / frightening / confusing / uncomfortable / too much?
  
  Kerry said: “Pressure is N+, removal of that pressure is N-.”
  
  This statement makes no sense to me as I cannot find a mention of N in regard to the operant conditioning quadrant on a quick Google search. You did mention +R and -R, so do you mean +P and -P?
  
  The punishment, or P, side of the quadrant refers to stopping behavior.
  
  -P would be the removal of X that would stop a behavior. For a child this could be taking away their video game for mouthing off. No dessert for mouthing off. The removal of a stimulus to stop behavior.
  
  +P for a child could be adding a spanking to stop the behavior of mouthing off, but it could also be giving them a writing assignment such as “I will not mouth off” for mouthing off. The addition of a stimulus to stop a behavior.
Kerry

May 31, 2019 at 5:43 pm

One last comment… can horses (or any animal) be stressed by badly employed R+? Absolutely. Which is one reason I video my training sessions, so that if something goes south, I can see what it is that I am doing to create that. It’s usually something I’m unconsciously doing. Once I correct myself, the problem generally subsides.
Kerry

June 2, 2019 at 9:28 am

If anyone is curious about the finer points of the learning quadrant, Mustang Maddy explains it very well here, starting at about 13:30. https://www.youtube.com/watch?v=GS7uxFWwF5M&t=2591s&fbclid=IwAR3ErIH7FPzLZv-MxCBQim-ehdBxDeernC2jAcPNfOe3IsAILFwp1TBSHyg
Jim Groesbeck

January 17, 2021 at 9:54 am

I think you, Ross, are talking about the stress of -P, which in this case is basically delaying gratification. A horse can get anxious, which is stress.
On the other hand, once the behavior is established, then it is on autopilot, and only needs intermittent reinforcement.

I do think there is a very fine line between supporting the searching/curiosity mechanisms and using the delay of gratification to establish a response.
Jim Groesbeck

January 17, 2021 at 10:06 am

What Ross is referring to, I think, is that the withholding of a reward in order to use positive reinforcement can create anxiety in the horse, which is counterproductive to the aim of clicker training.
But if a person were to help, support, keep a horse in searching for something, and not tip it over into stress, the whole experience would be positive. It doesn’t mean we are poking him with a stick to do something else. It means we use his natural inclination to avoid trouble, to search for something different to do. Then we allow it.
Blocking a wrong turn need not be construed as negative.
In other words, our negative is not very negative.
I try to operate within that realm when possible.
Ann

June 8, 2022 at 9:38 am

Withholding of a reward is akin to failure to release pressure. Both are trainer mechanics issues, not method issues. In R+ situations that can be because of fumbling unfamiliarity with where the food is, how to get it, and how to deliver it to the horse. Another possible cause is “lumping” rather than “splitting” the requested behavior down into understandable slivers. I think the same can be said of R-. If a trainer doesn’t have the mechanics and feel for the try, the horse can easily get frustrated and confused. In both cases, it’s not the horse. It’s the trainer.
