Many of us rely on our power data to set training zones and measure training stress, but can you trust what your power meter is telling you?
Since becoming commercially available in 1989, bike power meters have become a crucial component of most cyclists’ and triathletes’ training armoury. Nowadays, many of us have power meters built into our turbo trainers thanks to the advent of ‘smart’ trainers, which are clever enough to ensure our indoor sessions mimic real-world conditions like hills and rough surfaces.
The calculation of power output (PO = torque x angular velocity) is the same for most power meters. Torque, measured through strain gauges, is multiplied by crank angular velocity (i.e. cadence) to give PO. However, accuracy and reliability will vary between power meters…
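As a rough sketch of that calculation (the function name and the example torque and cadence values are illustrative assumptions, not taken from any manufacturer's firmware), cadence in revolutions per minute is converted to radians per second before being multiplied by torque:

```python
import math


def power_watts(torque_nm: float, cadence_rpm: float) -> float:
    """Power (W) = torque (N·m) x crank angular velocity (rad/s)."""
    # One revolution is 2*pi radians; divide by 60 to go from per-minute to per-second.
    angular_velocity = cadence_rpm * 2 * math.pi / 60
    return torque_nm * angular_velocity


# Hypothetical example: 30 N·m of torque at 90 rpm.
print(round(power_watts(30.0, 90.0), 1))  # 282.7
```

Two power meters applying this same formula can still disagree if their torque readings differ.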
For example, I ran both my TT bike power meter (Quarq) and my smart trainer power meter (Tacx Neo 2T) during a Zwift Time Trial. My Quarq recorded via my Garmin Edge 810 and my Tacx via Zwift.
As you can see from the images I extracted from TrainingPeaks, there's a 37-watt difference when comparing normalised power! This is the difference between riding at threshold or riding above VO2 - a whole zone.
The cadence readings are a very close match, which suggests both power meters are working correctly. Both had up-to-date software and were operated in the same environmental conditions. The Quarq zero offset was also checked and was within SRAM's acceptable range of +/-1000.
Is your power meter reliable and valid?
To test a power meter, you’re looking for two key factors:
- Validity (how accurate the power meter is at measuring watts)
- Reliability (how consistent the power meter is at measuring watts)
When compared with a gold standard measure (usually a very expensive and manually calibrated Lode exercise bike) inside a laboratory, power meters look valid (within 5% is good, within 2% is excellent).
However, once they've been put into a box, thrown in a van, crossed the Atlantic in a plane at altitude and in freezing temperatures, and then heated up in the back of a delivery truck on their way to the buyer, they likely become less valid or reliable.
To measure reliability, companies will test their power meters in the lab at the frequency they expect consumers to use them (i.e. multiple times a week). However, a study comparing Garmin Vector pedals, a Stages power crank and an SRM power meter showed variations of up to -16.5% when testing in real-world conditions.
It also matters where the power meter is placed and the size of the power meter. Small units are more susceptible to vibration and temperature change. Units placed closest to the point of wattage transfer (bottom of shoe to pedal) can be argued to be more valid as this is the true wattage put into the bike, but some argue it's the wattage going into the drive train that's key.
Pedal-based and crank arm-based power meters are the closest you can get to the point of transfer. Crank-based and bottom bracket-based power meters are known as drive train power meters.
You then get the argument that it’s the watts that make it down the drive train into the rear wheel that really count, and so we get hub-based power meters or power meters that predict your power based on wheel speed, weight and air resistance.
Because chains and cassettes are susceptible to wear and tear, and foot/cleat position is prone to variance, the crank is seen as the best place to measure watts.
I’m using a crank-based Quarq power meter and a hub-based Tacx power meter. It seems unlikely that the 37W difference I experienced is being lost through my chain. The reading difference between the two power meters is over 10%, which is nowhere near an accepted variance.
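The percentage comparison above can be sketched as follows. The thresholds come from the figures quoted earlier (within 2% is excellent, within 5% is good); the function names and the 300 W baseline are hypothetical, chosen only to illustrate a 37W gap, and are not my actual readings:

```python
def percent_difference(measured_w: float, baseline_w: float) -> float:
    """Signed difference of a reading from the baseline, as a percentage."""
    return (measured_w - baseline_w) / baseline_w * 100


def rate_agreement(measured_w: float, baseline_w: float) -> str:
    """Rate agreement using the 2% / 5% limits quoted in this article."""
    diff = abs(percent_difference(measured_w, baseline_w))
    if diff <= 2:
        return "excellent"
    if diff <= 5:
        return "good"
    return "unacceptable"


# Hypothetical readings: baseline at 300 W, other meter 37 W higher.
print(round(percent_difference(337, 300), 1))  # 12.3
print(rate_agreement(337, 300))                # unacceptable
```

Which unit you treat as the baseline changes the percentage slightly, so it's worth stating which one you divided by.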
Ultimately, a power meter needs to be valid, but what it really needs to be is reliable, because we use power to set training zones and measure training stress.
It doesn’t matter so much if it isn’t in the excellent category of valid, but it must be in the excellent category of reliability. This is because training zones and training stress are set based on your individualised power number.
Which power meter do you trust?
I know I’ll race and do important training sessions on my TT bike, so the Quarq is the power meter I use.
But I use Zwift a lot and it will be important that I can track numbers there too, so I will have to keep the wattage difference in mind. Remember, the difference between them is a whole training zone!
I’m fortunate to have another bike with a Quarq power meter on as well, so I decided to do a little testing. I used a protocol that involved a 5-minute sub-maximal stage followed by a 5-minute recovery period, and then 2-minute and 10-second maximal power stages with 5-minute and 2-minute recoveries respectively.
On two separate days I tested both the Quarq on my TT bike (Quarq TT) and the Quarq on my road bike (Quarq RB). I used the Tacx trainer as the 'baseline'…
The data above shows that the Quarq TT is consistently outside of an acceptable limit compared to the Tacx, and the gap gets worse as the power goes higher. This could explain the large watt difference I saw when doing my earlier Zwift Time Trial. The Quarq RB is almost the same as the Tacx up to the maximal 10-second sprints, where a good (4%) and then an unacceptable (8%) difference is found.
I can't say the Tacx is correct, but the data set strongly suggests it is, thanks to the Quarq RB being very similar. However, does it change which power meter I use? Well... no, not really.
I will still race and Zwift on my TT bike and use my road bike for social riding, so technically the Quarq TT data is still the right data to use. I just can't boast truthfully about my power readings! I guess not all power data is equal after all...