


BMJ 1996;312:1153 (4 May)

Education and debate


Statistics Notes: The use of transformation when comparing two means
J Martin Bland, professor of medical statistics (a), Douglas G Altman, head (b)
(a) Department of Public Health Sciences, St George's Hospital Medical School, London SW17 0RE
(b) ICRF Medical Statistics Group, Centre for Statistics in Medicine, Institute of Health Sciences, PO Box 777, Oxford OX3 7LF
Correspondence to: Professor Bland.
The usual statistical technique used to compare the means of two groups is a confidence interval or significance test based on
the t distribution. For this we must assume that the data are samples from normal distributions with the same variance. Table 1
shows the biceps skinfold measurements for 20 patients with Crohn's disease and nine patients with coeliac disease.

Table 1--Biceps skinfold thickness (mm) in two groups of patients
---------------------------------------------------------
Crohn's disease                   Coeliac disease
---------------------------------------------------------
1.8    2.8    4.2     6.2         1.8    3.8
2.2    3.2    4.4     6.6         2.0    4.2
2.4    3.6    4.8     7.0         2.0    5.4
2.5    3.8    5.6    10.0         2.0    7.6
2.8    4.0    6.0    10.4         3.0
---------------------------------------------------------
Mean=4.72                         Mean=3.53
SD=2.42                           SD=1.96

The data have been put into order of magnitude, and it is fairly obvious that the distribution is skewed and far from normal.
When, as here, the assumption of normality is wrong we can often transform the data to another scale where the assumption
of normality is reasonable. The transformation which achieves a normal distribution should also give us similar variances.[1]
Table 2 shows the results of analyses using the square root, logarithmic, and reciprocal transformations. The log
transformation gives the most similar variances and so gives the most valid test of significance. It also gives a reasonable
approximation to a normal distribution.

http://www.bmj.com/cgi/content/full/312/7039/1153


Table 2--Biceps skinfold thickness compared for two groups of patients, using different transformations
--------------------------------------------------------------------------------
                  Two sample        95% Confidence             Variance
                  t test, 27 df     interval for difference    ratio,
Transformation    t        P        on transformed scale       larger/smaller
--------------------------------------------------------------------------------
None, raw data    1.28     0.21     -0.71 mm to 3.07 mm        1.52
Square root       1.38     0.18     -0.140 to 0.714            1.16
Logarithm         1.48     0.15     -0.114 to 0.706            1.10
Reciprocal       -1.65     0.11     -0.203 to 0.022            1.63
--------------------------------------------------------------------------------
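
The t statistics, confidence intervals, and variance ratios in Table 2 can be recomputed from the Table 1 data. The following is a minimal Python sketch, not the authors' own code. It assumes the usual pooled-variance two-sample t method, natural logarithms, and a critical value of 2.052 for 27 degrees of freedom read from standard t tables. The names `compare`, `transforms`, and `T_CRIT_27` are illustrative. P values are left out because the Python standard library has no t distribution function.

```python
import math
from statistics import mean, variance

# Biceps skinfold thickness (mm), copied from Table 1.
crohn = [1.8, 2.2, 2.4, 2.5, 2.8, 2.8, 3.2, 3.6, 3.8, 4.0,
         4.2, 4.4, 4.8, 5.6, 6.0, 6.2, 6.6, 7.0, 10.0, 10.4]
coeliac = [1.8, 2.0, 2.0, 2.0, 3.0, 3.8, 4.2, 5.4, 7.6]

# Two-sided 5% point of the t distribution with 27 df, taken from
# standard t tables (an assumption of this sketch, not computed here).
T_CRIT_27 = 2.052


def compare(x, y, t_crit=T_CRIT_27):
    """Pooled-variance two-sample t statistic, 95% CI, and variance ratio."""
    nx, ny = len(x), len(y)
    vx, vy = variance(x), variance(y)  # sample variances (divisor n - 1)
    pooled = ((nx - 1) * vx + (ny - 1) * vy) / (nx + ny - 2)
    se = math.sqrt(pooled * (1 / nx + 1 / ny))
    diff = mean(x) - mean(y)
    return {
        "t": diff / se,
        "ci": (diff - t_crit * se, diff + t_crit * se),
        "var_ratio": max(vx, vy) / min(vx, vy),
    }


transforms = {
    "raw": lambda v: v,
    "sqrt": math.sqrt,
    "log": math.log,  # natural logarithm
    "reciprocal": lambda v: 1 / v,
}

results = {
    name: compare([f(v) for v in crohn], [f(v) for v in coeliac])
    for name, f in transforms.items()
}

for name, r in results.items():
    lo, hi = r["ci"]
    print(f"{name:10s} t={r['t']:6.2f}  CI=({lo:.3f}, {hi:.3f})  "
          f"variance ratio={r['var_ratio']:.2f}")
```

The variance ratio closest to 1 identifies the scale on which the equal-variance assumption of the t method is most nearly met.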

Confidence intervals for transformed data are more difficult to interpret, however. Unlike the case of a single sample,[2] the
confidence limits for the difference between means cannot be transformed back to the original scale. If we try to do this the
square root and reciprocal limits give ludicrous results. The lower limit for the square root transformation is negative. If we
square this we get a positive lower limit and the confidence interval does not contain zero, even though the difference is not
significant. If the observed difference were exactly zero the confidence limits would be equal in magnitude but opposite in sign.
Transforming back by squaring would make them equal. For the reciprocal transformation the upper limit is very small (0.022)
and transforming back by taking the reciprocal again gives 45.5. There is no way that the difference between mean skinfold in
these two groups could be 45.5 mm. Thus the confidence interval for a difference cannot be interpreted on the untransformed
scale for these transformations.
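
The failure of naive back-transformation is easy to check numerically. This short Python sketch applies the inverse transformation directly to the confidence limits quoted in Table 2. The variable names are illustrative, and the limits are the published rounded values rather than recomputed ones.

```python
# 95% confidence limits on the transformed scale, copied from Table 2.
sqrt_ci = (-0.140, 0.714)
reciprocal_ci = (-0.203, 0.022)

# Squaring discards the sign: the negative lower limit becomes positive,
# so the "interval" excludes zero although the difference is not significant.
sqrt_back = tuple(limit ** 2 for limit in sqrt_ci)

# Taking reciprocals turns a limit close to zero into a very large value.
reciprocal_back = tuple(1 / limit for limit in reciprocal_ci)

print("squared limits:   ", [round(v, 3) for v in sqrt_back])
print("reciprocal limits:", [round(v, 1) for v in reciprocal_back])
```

Both back-transformed lower and upper limits of the square root interval come out positive, and the reciprocal of the upper limit 0.022 rounds to 45.5, the implausible value discussed above.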
Only the log transformation gives interpretable (and thus useful) results after we transform back. Using the antilog
transformation, we get a confidence interval of 0.89 to 2.03, but these are not limits for the difference in millimetres. How could
they be, for they do not contain zero, yet the difference is not significant? They are in fact the 95% confidence limits for the
ratio of the geometric mean[2] for patients with Crohn's disease to the geometric mean for patients with coeliac disease. If there
were no difference the expected value of this ratio would be 1, not 0, and so lie within the limits. This procedure works because
when we take the difference between the logarithms of the two geometric means we get the logarithm of their ratio, not of their
difference.[3] We thus have the logarithm of a pure number and we antilog this to give the dimensionless ratio of the two
geometric means. The logarithmic transformation is strongly preferable to other transformations for this reason. Fortunately, for
medical measurements it often achieves the desired effect.
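
The identity behind this result can be illustrated with the Table 1 data. The Python sketch below assumes natural logarithms and uses the standard library's `statistics.geometric_mean` (Python 3.8 or later). It checks that the antilog of the difference between the mean logs equals the ratio of the two geometric means, then antilogs the published log-scale limits from Table 2. Variable names are illustrative.

```python
import math
from statistics import mean, geometric_mean

# Biceps skinfold thickness (mm), copied from Table 1.
crohn = [1.8, 2.2, 2.4, 2.5, 2.8, 2.8, 3.2, 3.6, 3.8, 4.0,
         4.2, 4.4, 4.8, 5.6, 6.0, 6.2, 6.6, 7.0, 10.0, 10.4]
coeliac = [1.8, 2.0, 2.0, 2.0, 3.0, 3.8, 4.2, 5.4, 7.6]

# Difference between the means of the natural logs of the two groups.
diff_of_mean_logs = (mean([math.log(v) for v in crohn])
                     - mean([math.log(v) for v in coeliac]))

# Ratio of geometric means, Crohn's disease to coeliac disease.
gm_ratio = geometric_mean(crohn) / geometric_mean(coeliac)

# log(a) - log(b) = log(a / b), so the antilog of the difference is the ratio.
assert math.isclose(math.exp(diff_of_mean_logs), gm_ratio)

# Antilog of the 95% limits on the log scale, copied from Table 2.
log_ci = (-0.114, 0.706)
ratio_ci = tuple(math.exp(limit) for limit in log_ci)

print(f"ratio of geometric means: {gm_ratio:.2f}")
print(f"95% CI for the ratio: {ratio_ci[0]:.2f} to {ratio_ci[1]:.2f}")
```

The back-transformed limits reproduce the interval of 0.89 to 2.03 quoted in the text, and they bracket 1, the value expected when the two geometric means are equal.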

1. Bland JM, Altman DG. Transforming data. BMJ 1996;312:770.
2. Bland JM, Altman DG. Transformations, means, and confidence intervals. BMJ 1996;312:1079.
3. Bland JM, Altman DG. Logarithms. BMJ 1996;312:700.


