I’ve had a number of enquiries recently about how to calculate the standard error of measurement (SEM) for a range of different repeatability studies. This has struck me as odd because in my mind the SEM is a simple and clearly defined measure and given this it seems quite obvious to me how to calculate it.

On looking at a range of textbooks, though, I think I can see what the problem is. As I’ve pointed out in a previous post, the SEM is almost always presented as a derivative of the intra-class correlation coefficient (ICC). Portney and Watkins, for example, introduce it through the formula SEM = SD√(1-ICC). For those not used to maths this looks bad enough on its own. When they probe a little further, however, they will find that the ICC itself is an esoteric output from a specifically structured ANOVA. No wonder so many give up and assume that the SEM is the rather abstract product of some largely incomprehensible calculations.

But nothing could be further from the truth. The SEM is simply the standard deviation of a number of measurements made on the same person. Bland and Altman actually recommend that it should be referred to as the within-subject standard deviation to make this clear (although I think SEM is so well established now that this is a battle not worth fighting). If you understand what a standard deviation is and how it represents variability in measurements from different people (and everyone with the most basic interest in clinical measurement really should) then you should also understand what the SEM is and how it represents variability within measurements taken on the same person. In a very real sense it is the SEM that is the primary measure of repeatability and the ICC should be seen as a derivative of it rather than vice versa.

Most importantly, if you know how to calculate a standard deviation (either with a pencil and paper, calculator, or spreadsheet) then you already know how to calculate the SEM. You just use the same equation to calculate the SD of a number of measurements made on the same person rather than those made on a number of different people. If the measurements have been made by a number of different assessors working in a particular gait lab then the SEM can be taken as representative of the lab as a whole. If they have all been made by the same assessor then the SEM is only really valid when that individual is making the measurements.

If you make measurements on more than one person (and you should in any well designed repeatability study) then you can calculate the within-subject standard deviation for each person and you will find that this varies a little from person to person. This is where the only mildly complicated step in the calculations comes in: the overall SEM is the *root mean square* average of these within-subject standard deviations (rather than the simple arithmetic mean). In other words, you square each person’s standard deviation, take the mean of those squares, and then take the square root of the result.
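For anyone who would rather see this as a few lines of code than as a spreadsheet, here is a minimal Python sketch of the two steps described above. The subject labels and measurement values are invented purely for illustration, and the function name `sem_from_repeats` is my own. The sketch assumes the usual sample standard deviation (the n-1 version) and that every person has the same number of repeat measurements; if the numbers of repeats differ, a simple root mean square of the standard deviations may not be the most appropriate way to combine them.

```python
import math
import statistics


def sem_from_repeats(repeats_by_subject):
    """Return the SEM as the root mean square of within-subject SDs.

    repeats_by_subject maps a subject label to that person's repeated
    measurements. Assumes equal numbers of repeats per subject.
    """
    # Step 1: ordinary sample SD (n-1 denominator) for each person.
    within_sds = [statistics.stdev(values) for values in repeats_by_subject.values()]
    # Step 2: root mean square of those SDs (not the arithmetic mean).
    mean_of_squares = sum(sd ** 2 for sd in within_sds) / len(within_sds)
    return math.sqrt(mean_of_squares)


# Hypothetical data: three repeat measurements (in degrees) on three people.
example = {
    "subject_A": [10.0, 12.0, 14.0],  # within-subject SD = 2
    "subject_B": [20.0, 21.0, 22.0],  # within-subject SD = 1
    "subject_C": [5.0, 8.0, 11.0],    # within-subject SD = 3
}

sem = sem_from_repeats(example)
print(f"SEM = {sem:.2f} degrees")
```

With these made-up numbers the within-subject standard deviations are 2, 1 and 3 degrees, so the arithmetic mean would be 2 degrees whereas the root mean square is √(14/3), a little under 2.2 degrees. The difference is small here but it grows as the standard deviations become more unequal.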

Just to show how straightforward the calculations are I’ve prepared a document outlining how to do the sums which you can download at this link. All the data, figures and calculations for the examples are also available in these two Excel spreadsheets (here and here). If you want to listen to a more general talk about repeatability studies then there is one on my YouTube channel which uses the same examples. This is a recording of an open virtual classroom giving publicity to our MSc in Clinical Gait Analysis by distance learning so you’ll have to listen to a couple of minutes sales pitch before you get to the interesting bit!

PS Apologies to some of my recent students who probably wish they had had access to these resources a long time ago!

Thanks Richard

I think some confusion comes about because SEM is commonly referred to as the Standard Error of the MEAN – that is the SD of sample means taken from the same population?

I’d agree entirely Matt. When I talk about this (in the YouTube lecture linked to in the post for example) I generally stop to remind people that the *standard error of measurement* and the *standard error of the mean* are quite different concepts even if they have the same abbreviation. I’ve forgotten to do this in this blog post – thanks for the reminder 🙂. It is another good reason for adopting the term *within-subject standard deviation* as Bland and Altman suggest, but unfortunately I think the window of opportunity for adopting this terminology has passed and we’ll just have to learn to cope with the terminology we’ve got.

Thanks Richard, both for the lecture and the material here on the blog – good stuff! The links to the example Excel files seem to have gone bad though – could you relink them maybe?

Think I’ve sorted this. The spreadsheets just seemed to have disappeared from my personal storage space at WordPress. Let me know if this happens again.