Part one of this series described the development of a whole horse ethogram for ridden horses, comprising 24 behaviours¹, and its use to differentiate between lame and non-lame horses based on video recordings².
The demonstration of ≥8 behaviours was likely to indicate the presence of musculoskeletal pain. The improvements in behaviour, observed after lameness was abolished using diagnostic analgesia, indicated a causal relationship between pain and behaviour.
Part two looks at the ability of non-trained assessors to apply the ridden horse ethogram to video recordings, compares real-time application of the ethogram by a trained analyst with video assessment, and investigates the ability of veterinarians to apply the ethogram in real time to lame and non-lame horses.
Phase one
The first phase aimed to determine whether the ethogram could be applied by people from a variety of professional backgrounds without prior training in recognising pain-related behaviour³.
Anonymised video recordings of 21 lame horses before and after diagnostic analgesia (42 recordings) were assigned random numbers and assessed in numerical order. The recordings were reviewed by a trained assessor and 10 untrained assessors:
- two equine interns (recent veterinary graduates)
- a junior clinician (a graduate of five years)
- five equine technicians
- two equine nurses
The videos were assessed in real time, but could be stopped and replayed.
For inclusion in the study, the horses had to be ridden by the same person (a professional rider) before and after diagnostic analgesia; footage had to include both working trot rising, and working canter on the left and right reins.
The duration of the recordings ranged from three to five minutes among different horses; however, for each individual horse, the recordings before and after diagnostic analgesia were time matched. The lameness grades ranged from 1 to 4 out of 8⁴, with 2 the most frequent grade.
For the trained assessor, the number of behaviours exhibited by lame horses before diagnostic analgesia ranged from 3 to 12 out of 24 (median 10; mean 8.9). After lameness and overall performance had been substantially improved using diagnostic analgesia, the number of behaviours ranged from 0 to 6 out of 24 (median 3; mean 3). The difference in behaviour scores between before and after diagnostic analgesia ranged from 2 to 12 out of 24 (median 6; mean 6).
For untrained assessors, the number of behaviours exhibited by the 21 lame horses prior to diagnostic analgesia ranged from 3.6 to 11.6 out of 24 (median 9; mean 9.1). After lameness and overall performance had been substantially improved using diagnostic analgesia, the number of behaviours observed ranged from 1.6 to 8.5 out of 24 (median 4.5; mean 4.2). The difference in behaviour scores between before and after diagnostic analgesia ranged from 1.6 to 8.5 (median 4.5; mean 3.6).
For all the assessors (trained and untrained), the decrease in behaviour scores after diagnostic analgesia was highly significant (Wilcoxon signed-rank, p<0.0001). Additionally, significant decreases existed in the mean behaviour scores for the trained analyst and the non-trained observers assessed independently.
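For readers unfamiliar with the Wilcoxon signed-rank test, the following is a minimal sketch of how a paired before-and-after comparison of behaviour scores can be computed. The function name and the five pairs of scores are hypothetical and are not the study data; the sketch drops zero differences, gives tied differences their average rank and derives an exact two-sided p value by enumerating the possible sign patterns, which may differ from the software settings used in the study.

```python
def wilcoxon_signed_rank(before, after):
    """Return (W_plus, W_minus, exact two-sided p) for paired scores."""
    # Paired differences; zero differences are discarded.
    diffs = [b - a for b, a in zip(before, after) if b != a]
    order = sorted(range(len(diffs)), key=lambda i: abs(diffs[i]))

    # Work with doubled ranks so that averaged ranks of ties stay integers.
    doubled_ranks = [0] * len(diffs)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        # Positions i..j hold ranks i+1..j+1; twice their average is i + j + 2.
        for k in range(i, j + 1):
            doubled_ranks[order[k]] = i + j + 2
        i = j + 1

    w_plus2 = sum(r for r, d in zip(doubled_ranks, diffs) if d > 0)
    w_minus2 = sum(doubled_ranks) - w_plus2

    # Exact null distribution: each rank is positive or negative with equal chance.
    ways = {0: 1}
    for r in doubled_ranks:
        updated = {}
        for total, count in ways.items():
            updated[total] = updated.get(total, 0) + count
            updated[total + r] = updated.get(total + r, 0) + count
        ways = updated

    smaller = min(w_plus2, w_minus2)
    tail = sum(count for total, count in ways.items() if total <= smaller)
    p_value = min(1.0, 2 * tail / 2 ** len(diffs))
    return w_plus2 / 2, w_minus2 / 2, p_value


# Hypothetical behaviour scores (out of 24) for five horses.
before_scores = [10, 9, 12, 8, 11]
after_scores = [3, 4, 2, 5, 6]
w_plus, w_minus, p = wilcoxon_signed_rank(before_scores, after_scores)
```

With only five pairs, the smallest attainable two-sided p value is 0.0625, even when every horse improves, which is why a sample of the size used in phase one is needed to reach a highly significant result.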
Using a threshold of ≥8 behaviours to indicate the presence of musculoskeletal pain, agreement between the untrained and trained assessors was moderate before diagnostic analgesia (Fleiss’ kappa 0.49) and slight to poor after diagnostic analgesia (Fleiss’ kappa 0).
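As an illustration of how such an agreement statistic is obtained, the sketch below converts each assessor's score into a yes or no answer at the threshold of 8 and then computes Fleiss' kappa from the resulting counts. The function names and the example scores are hypothetical and are not taken from the study; the formula is the standard one for a fixed number of raters per subject.

```python
PAIN_THRESHOLD = 8  # behaviours out of 24, the threshold used in the text


def to_category_counts(scores_per_recording, threshold=PAIN_THRESHOLD):
    """Convert each recording's assessor scores into [below, at_or_above] counts."""
    counts = []
    for scores in scores_per_recording:
        at_or_above = sum(1 for s in scores if s >= threshold)
        counts.append([len(scores) - at_or_above, at_or_above])
    return counts


def fleiss_kappa(counts):
    """counts[i][j] is the number of raters placing subject i in category j."""
    n_subjects = len(counts)
    n_raters = sum(counts[0])
    if n_raters < 2 or any(sum(row) != n_raters for row in counts):
        raise ValueError("every subject needs the same number (>= 2) of ratings")
    n_categories = len(counts[0])

    # Share of all ratings that fall in each category.
    category_share = [
        sum(row[j] for row in counts) / (n_subjects * n_raters)
        for j in range(n_categories)
    ]
    # Observed agreement per subject, averaged over subjects.
    per_subject = [
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ]
    observed = sum(per_subject) / n_subjects
    expected = sum(share * share for share in category_share)

    if expected == 1.0:
        # All ratings fall in one category, so kappa is undefined.
        return float("nan")
    return (observed - expected) / (1 - expected)


# Hypothetical scores: four recordings, each scored by three assessors.
example_scores = [[9, 10, 8], [3, 4, 2], [8, 7, 9], [5, 8, 6]]
example_counts = to_category_counts(example_scores)
example_kappa = fleiss_kappa(example_counts)
```

A general property of kappa is worth bearing in mind when reading the values above: because it corrects for chance agreement, it can be close to 0 even when assessors give the same answer for most recordings, if almost all answers fall in a single category.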
Untrained observers tended to overinterpret the ethogram, ignoring terms such as “repeated” (toe drag). The behaviours that the assessors found most difficult to assess, such as an intense stare, repeated toe drag and timed behaviours (such as a rushed or slow gait), were identified so that future training can target them.
Each horse acted as its own control (repeated measures design); the only variable was removal of pain. Reduction in behaviour scores after resolution of musculoskeletal pain provided further evidence that these behavioural markers were a likely reflection of pain.
It was not possible to hide the presence of lameness, which could have biased the assessors; however, the lameness was generally low-grade and not easy for untrained individuals to detect.
Recognition of these behaviours may be easier than identifying low-grade lameness, therefore facilitating identification of musculoskeletal pain.
Training has previously been demonstrated to improve the accuracy of the application of a facial expression ethogram for ridden horses⁵, and it is likely that training would also improve use of the ridden horse ethogram.
Phase two
The objectives of the second phase of the study⁶ were:
- to compare the real-time assessment of non-lame and lame horses with analysis of video recordings of the horses by a trained analyst
- to determine whether, after preliminary training, equine veterinarians could consistently apply the ridden horse ethogram in real time with live ridden horses (lame and non-lame) and in agreement with a trained analyst.
Ten equine veterinarians and two reserve veterinarians of variable experience were recruited. After undergoing online training on how to apply the ridden horse ethogram, each volunteer blindly assessed six sets of video recordings of lame or non-lame horses. Extensive feedback was provided to each volunteer to try to improve accuracy.
A total of 20 horse-rider combinations were recruited. The horses were a convenience sample, in regular work, and capable of working “on the bit”.
The 10 equine veterinarians and a trained assessor applied the ridden horse ethogram in real time to 20 horse-rider combinations performing a purpose-designed dressage test of 8.5 minutes’ duration. The test was of British Dressage preliminary standard (medium walk, working trot and canter), but, additionally, included consecutive 10m diameter circles in trot to the left and right, because this exercise is useful for lameness identification.
Video recordings of the test were acquired from the same position relative to the arena as the assessors. The video recordings were analysed retrospectively by the trained assessor. An independent experienced lameness clinician determined whether horses were lame and recorded the presence of gait abnormalities in canter (a stiff and stilted canter, close spatial and/or temporal placement of the hindlimbs, or a canter that lacked a suspension phase).
Prior to the test, each horse was assessed by a veterinary physiotherapist. The presence of thoracolumbosacral region muscle tension and/or pain likely to influence ridden performance was graded with a binary method as yes or no.
Saddle fit was assessed by a Society of Master Saddle Fitters-qualified saddle fitter, who determined whether the tack was likely to induce pain that may compromise performance (yes or no).
Rider skill level was assessed retrospectively from the video recordings by a British Horse Society Instructor and graded from 1 to 10, using the Fédération Equestre Internationale scale.
A total of 16 horses were lame and 4 were non-lame; 11 had an ill-fitting saddle; 14 had epaxial muscle tension/pain.
Rider skill level ranged from 3 to 8 out of 10 (mean 5.1; median 5). The expert determined total scores of between 3 and 6 out of 24 for the non-lame horses; two lame horses scored 3 and 6; 14 lame horses scored between 8 and 16 (Figures 1 to 4).
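The counts reported above can be turned into simple proportions at the threshold of ≥8 behaviours. The sketch below is only a worked piece of arithmetic on the trained assessor's totals as stated in the text (14 lame horses at or above 8, two lame horses below, and all four non-lame horses below); the function name is hypothetical, and describing the results as sensitivity and specificity is an illustrative framing rather than an analysis reported by the study.

```python
def screen_summary(lame_at_or_above, lame_below, nonlame_at_or_above, nonlame_below):
    """Return (sensitivity, specificity) for a score threshold, from four counts."""
    sensitivity = lame_at_or_above / (lame_at_or_above + lame_below)
    specificity = nonlame_below / (nonlame_at_or_above + nonlame_below)
    return sensitivity, specificity


# Counts taken from the text: 16 lame horses (14 scored 8 or more, two scored
# 3 and 6) and four non-lame horses (all scored between 3 and 6).
sensitivity, specificity = screen_summary(14, 2, 0, 4)
```

On these figures, 14 of 16 lame horses (87.5%) reached the threshold and none of the four non-lame horses did. With a convenience sample of 20 horses, such proportions are indicative only.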




No significant difference existed between real-time scores and video-based scores for the trained assessor.
Good agreement existed between the expert’s scores and the mean test observers’ scores. Excellent consistency existed in overall agreement among raters (intraclass correlation 0.97; p<0.001).
Behaviour scores differed significantly according to lameness status for real-time (p=0.017) and video (p=0.013) observations by the trained assessor, and for the test observers’ mean real-time scores (p=0.03). Abnormalities of canter had no significant effect on behaviour scores.
Epaxial muscle tension or pain, saddle fit and rider skill had no effect on equine behaviour scores.
It was not possible to conceal the presence of lameness, although, in the majority of horses, this was mild (grade ≤2 out of 8). The assessors had to concentrate hard to evaluate the presence of 24 behavioural markers repeatedly throughout the test; however, the presence of subtle lameness was, nonetheless, a potential cause of bias.
It was concluded that veterinarians, after preliminary training, applied the ethogram in real time with adequate consistency and differentiated between non-lame and most lame horses.
After appropriate training in its application, the ethogram may provide a useful tool for determining the presence of musculoskeletal pain in horses performing poorly and can be used to monitor response to treatment.
The volunteer veterinarians were unanimous in recognising the potential usefulness of the ethogram, facilitating the identification of likely musculoskeletal pain even if lameness was not obvious.
Substantial evidence
A substantial body of evidence now exists to indicate the ridden horse ethogram is a useful tool for the recognition of musculoskeletal pain in ridden horses. However, not all lame horses will score ≥8.
In phase two, no significant effect of saddle fit on equine behaviour was present; however, we have previously observed changes in horses’ behaviour after changing from a saddle with excessively tight tree points to a better fitting saddle.
Phase two saw no correlation between rider skill level and equine behaviour. Rider skill level may influence which behaviours are demonstrated by a horse, but, to date, a major change in the number of behaviours exhibited has not been observed, although further investigation is merited.
However, in a study comparing the effects of four riders of similar skill, whose bodyweight ranged from 10% to more than 20% of horse bodyweight, riding six horses in a crossover design, a significant correlation was seen between behaviour scores and rider bodyweight⁷.
This tool is used daily for the evaluation of all horses presented for lameness or poor performance evaluation, or for pre-purchase examinations. While some genuinely naughty horses exist that may exhibit only one behaviour – for example, rearing – the author believes horse owners and trainers need education to recognise that the demonstration of multiple behaviours is usually a reflection of an underlying pain-related condition.
Horses that are unwilling, unresponsive to cues from the rider, spooky or tense – or that have an unsteady head carriage – often have musculoskeletal pain.
Phase two of this work was supported by World Horse Welfare.