
Inferred HLA Haplotype Information for Donors From Hematopoietic Stem Cells Donor Registries

Pierre-Antoine Gourraud, Phillipe Lamiraux, Nabil El-Kadhi, Colette Raffoux, and Anne Cambon-Thomsen
From the Faculty of Medicine, INSERM, Toulouse, France (P.A.G., A.C.-T.), Laboratoire Epitech de Recherche en Informatique Appliquée, Le Kremlin-Bicêtre, France (P.L., N.E.-K.), and FGM France Greffe de Moelle, French Registry of Hematopoietic cells Donors, Paris, France (C.R.).
Address reprint requests to: Pierre-Antoine Gourraud, INSERM Unit 558, Faculty of Medicine, 37 allées Jules Guesde, F-31000, Toulouse, France; E-mail: gourraud@cict.fr.
Supported by EFG (Etablissement Français des Greffes) Grant 2003, EU contract MADO No: QLG7-CT-2001-00065.
Received December 10, 2004; revised January 3, 2005; accepted January 11, 2005.
Human Immunology 66, 563-570 (2005)
© American Society for Histocompatibility and Immunogenetics, 2005
Published by Elsevier Inc.
0198-8859/05/$-see front matter
doi:10.1016/j.humimm.2005.01.011

ABSTRACT: Human leukocyte antigen (HLA) matching remains a key issue in the outcome of transplantation. In hematopoietic stem cell transplantation with unrelated donors, the matching for compatible donors is based on the HLA phenotype information. In familial transplantation, the matching is achieved at the haplotype level because donor and recipient share the block-transmitted major histocompatibility complex region. We present a statistical method based on HLA haplotype inference to refine the HLA information available in an unrelated situation. We implement a systematic statistical inference of the haplotype combinations at the individual level. It computes the most likely haplotype pair given the phenotype, together with its probability. The method is validated on 301 phase-known phenotypes from CEPH families (Centre d'Etude du Polymorphisme Humain). The method is further applied to 85,933 HLA-A, -B, -DR typed unrelated donors from the French Registry of hematopoietic stem cell donors (France Greffe de Moelle). The average value of prediction probability is 0.761 (SD 0.199), ranging from 0.26 to 1. Correlations between phenotype characteristics and predictions are also given. Homozygosity (OR = 2.08 [2.02-2.14], $p < 10^{-3}$) and linkage disequilibrium ($p < 10^{-3}$) are the major factors influencing the quality of prediction. Limits and relevance of the method are related to limits of haplotype estimation. Relevance of the method is discussed in the context of HLA matching refinement. Human Immunology 66, 563-570 (2005). © American Society for Histocompatibility and Immunogenetics, 2005. Published by Elsevier Inc.
KEYWORDS: Donor registry; HLA haplotypes; population immunogenetics; statistical application; transplantation

ABBREVIATIONS
BMD bone marrow donor
CEPH Centre d'Etude du Polymorphisme Humain
HLA human leukocyte antigen
HSC hematopoietic stem cell
OR odds ratio

INTRODUCTION
Allogeneic hematopoietic stem cell (HSC) transplantation is now a well-established curative therapy for an increasing number of hematologic diseases [1-3]. The role of human leukocyte antigen (HLA) matching between donor and recipient has been studied by many groups over the past years, but its optimal level remains unclear [4, 5]. The development of molecular typing techniques allows refined matching and thus helps reduce the risk of immunologic graft failure from host-versus-graft and graft-versus-host allorecognition.
The best donor remains an HLA-matched relative, but such a donor is not always available. In 70% of cases, a search for an unrelated HLA-matched donor is performed among the 9.1 million bone marrow donors (BMDs) gathered in 54 stem cell donor registries from 40 countries and 37 cord blood registries from 21 countries, from Bone Marrow Donors Worldwide (http://www.bmdw.org) and the World Marrow Donor Association (http://www.worldmarrow.org/).
Nevertheless, the amount of HLA information taken into account differs between the related and unrelated settings. Indeed, through typing the patient's relatives, the actual level of HLA information used in familial HSC transplantation is the HLA haplotype: matching is thus for two haplotypes (genoidentical situation) or only one (semi haplo-identical situation) segregating in the family. In contrast, in unrelated situations, the haplotype information is known in the patient but not in the donor: that is, there is an asymmetry in the information available. This is usually solved by taking into account only the minimal shared information, namely, the phenotypic one.
The large size of the BMD registries enables the estimation of HLA frequencies in a given population [6-10]. HLA population genetics has long been a relevant field for applying maximum likelihood estimation of haplotype frequencies [11-13]. Because of the structure of the major histocompatibility complex region, such methods have successfully overcome the lack of phase information at the individual level to produce haplotype frequencies in populations. Beyond their interest from the population point of view, we investigate here their possible use for the selection of unrelated donors from BMD registries in individual cases. The aim is to study how far population frequency information can be used to upgrade the donor information to the haplotype level for the individual decision, rather than downgrading the patient information to the phenotype level. Knowing the genetic background of donors throughout the registries, we implemented a systematic statistical inference of haplotype pairs at the individual level. It computes the most likely haplotype pair given the phenotype, using haplotype frequency information in the donor population as additional information. Incomplete phenotypes and the use of HLA nomenclature codes are allowed.
Genetic properties influencing the accuracy of the prediction are discussed and may be of interest in genetic epidemiology as an example of an individual haplotype inference procedure.

POPULATION AND METHODS


As recalled below, a diploid phenotype at three contiguous loci can result in a maximum of four distinct phase configurations on the chromosomes.

P.A. Gourraud et al.

HLA phenotype (Locus A-Locus B-Locus DR): (1, 2)-(8, 44)-(4, 3)

Possible pairs of haplotypes:
1-8-3, 2-44-4
1-44-3, 2-8-4
1-8-4, 2-44-3
1-44-4, 2-8-3

For a K-ploid phenotype of R contiguous loci, the number n of possible pairs of haplotypes is $n = K^{H_R - 1}$, where $H_R$ is the number of heterozygous loci. There is only one possible pair if only one locus is heterozygous. The proposed algorithm deals with this issue.
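As an illustration only, the enumeration of phase configurations for the diploid case (K = 2) can be sketched in Python. The function name `haplotype_pairs` and the encoding of a phenotype as one allele pair per locus are assumptions of this sketch; they are not taken from the software described in this article.

```python
from itertools import product


def haplotype_pairs(phenotype):
    """Enumerate the distinct unordered haplotype pairs compatible with a
    diploid phenotype, given as one (allele, allele) tuple per locus."""
    pairs = set()
    # At each locus, choose which of the two alleles sits on the first haplotype.
    for choice in product((0, 1), repeat=len(phenotype)):
        hap1 = tuple(locus[c] for locus, c in zip(phenotype, choice))
        hap2 = tuple(locus[1 - c] for locus, c in zip(phenotype, choice))
        # Sorting the two haplotypes makes (hap1, hap2) and (hap2, hap1) identical.
        pairs.add(tuple(sorted((hap1, hap2))))
    return sorted(pairs)


# Phenotype of the example: (1, 2)-(8, 44)-(4, 3), three heterozygous loci.
example = [(1, 2), (8, 44), (4, 3)]
for hap1, hap2 in haplotype_pairs(example):
    print("-".join(map(str, hap1)), ",", "-".join(map(str, hap2)))
```

With three heterozygous loci the sketch yields four pairs, in agreement with $2^{3-1}$; with a single heterozygous locus it yields one.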
Algorithm
Given haplotype frequencies, the algorithm computes
the likelihood for each possible phase. Then, it selects the
one with the maximum value:
HLA phenotype (locus 1-locus 2-locus 3): (A, a)-(B, b)-(C, c)

Haplotype pair      Likelihood
A-B-C, a-b-c        $L_1 = 2 f_{ABC} f_{abc}$
A-B-c, a-b-C        $L_2 = 2 f_{ABc} f_{abC}$
A-b-C, a-B-c        $L_3 = 2 f_{AbC} f_{aBc}$
A-b-c, a-B-C        $L_4 = 2 f_{Abc} f_{aBC}$

If the obtained pair of haplotypes is homozygous, the likelihood of such an (unambiguous) pair is the squared value of the haplotype frequency estimate.
The probability P of the most likely pair of haplotypes is:

$$P = \frac{\max\{L_i : i \in \mathbb{N}^*,\ i \le I\}}{\sum_{i=1}^{I} L_i} \qquad (1)$$

where P is the prediction probability of the most likely haplotype pair; i is a natural integer used to enumerate the different haplotype pairs; $L_i$ is the likelihood of haplotype pair i as defined previously, given haplotype frequencies and Hardy-Weinberg equilibrium; and I is the overall number of possible haplotype pairs indexed by i.
The ability of the method to find the most likely haplotype pair is summarized by the mean and median (measures of central tendency) and by percentiles (a value on a scale of 100 that indicates the percentage of the distribution of the phase prediction value that is equal to or below it) of the distribution of the probability P defined in Equation 1 over the considered sample.
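A minimal sketch of Equation 1 in Python follows, for illustration. The function names and the dictionary of haplotype frequencies are assumptions of the sketch, and the frequency values are invented for the example; they are not estimates from any registry.

```python
def pair_likelihood(hap1, hap2, freq):
    """Likelihood of a haplotype pair under Hardy-Weinberg equilibrium:
    f^2 for a homozygous pair, 2 * f1 * f2 otherwise."""
    f1, f2 = freq.get(hap1, 0.0), freq.get(hap2, 0.0)
    return f1 * f1 if hap1 == hap2 else 2.0 * f1 * f2


def predict(pairs, freq):
    """Return the most likely pair and its prediction probability P,
    or (None, None) when no candidate pair has a nonzero likelihood."""
    likelihoods = {pair: pair_likelihood(pair[0], pair[1], freq) for pair in pairs}
    total = sum(likelihoods.values())
    if total == 0.0:
        return None, None
    best = max(likelihoods, key=likelihoods.get)
    return best, likelihoods[best] / total


# Hypothetical haplotype frequencies (illustrative values only).
freq = {
    (1, 8, 3): 0.02, (2, 44, 4): 0.02,
    (1, 44, 3): 0.001, (2, 8, 4): 0.001,
    (1, 8, 4): 0.002, (2, 44, 3): 0.001,
    (1, 44, 4): 0.003, (2, 8, 3): 0.0005,
}
# The four candidate pairs for the phenotype (1, 2)-(8, 44)-(4, 3).
pairs = [
    ((1, 8, 3), (2, 44, 4)),
    ((1, 44, 3), (2, 8, 4)),
    ((1, 8, 4), (2, 44, 3)),
    ((1, 44, 4), (2, 8, 3)),
]
best, p = predict(pairs, freq)
print(best, round(p, 4))
```

Returning `(None, None)` when every likelihood is zero is a design choice of the sketch for the case in which none of the required haplotypes has an estimated frequency.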
Several alternative estimations can be provided:

Inferred Haplotype Information for Donor Selection

1. Phenotypes sometimes include ambiguous codes. If those are specified, their handling is implemented in the algorithm. For example, if A9 must be resolved considering A23 and A24 as possible alleles, the algorithm can produce the corresponding possible pairs and compute the corresponding likelihoods. An example is given below for the case in which DR3 must be resolved considering DR17 and DR18 as possible alleles:
HLA phenotype (Locus A-Locus B-Locus DR): (1, 2)-(8, 44)-(4, 3)
HLA-DR 3 = HLA-DR 17 or 18

Possible pairs of haplotypes:
1-8-17, 2-44-4
1-44-17, 2-8-4
1-8-4, 2-44-17
1-44-4, 2-8-17
1-8-18, 2-44-4
1-44-18, 2-8-4
1-8-4, 2-44-18
1-44-4, 2-8-18

Haplotype prediction software performs the same computations over the set of possible phases deduced from the implementation of nomenclature codes.
2. Phenotypes are sometimes incomplete. To predict the
possible haplotypes in such cases, the algorithm produces all possible haplotype pairs corresponding to
the incomplete phenotype and computes the corresponding likelihood.
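The handling of ambiguous codes in item 1 can be sketched as follows, for illustration. The mapping `SPLITS` contains only the two cases named in the text (A9 and DR3); the function names and data layout are assumptions of the sketch, and incomplete phenotypes are not covered.

```python
from itertools import product

LOCI = ("A", "B", "DR")
# Broad code -> possible alleles; only the two examples given in the text.
SPLITS = {("A", 9): (23, 24), ("DR", 3): (17, 18)}


def resolve_phenotypes(phenotype):
    """Expand each ambiguous code into its possible alleles and return
    every fully resolved phenotype."""
    per_locus = []
    for locus, alleles in zip(LOCI, phenotype):
        choices = [SPLITS.get((locus, a), (a,)) for a in alleles]
        per_locus.append(list(product(*choices)))
    return [list(combo) for combo in product(*per_locus)]


def haplotype_pairs(phenotype):
    """Distinct unordered haplotype pairs compatible with a resolved phenotype."""
    pairs = set()
    for choice in product((0, 1), repeat=len(phenotype)):
        hap1 = tuple(locus[c] for locus, c in zip(phenotype, choice))
        hap2 = tuple(locus[1 - c] for locus, c in zip(phenotype, choice))
        pairs.add(tuple(sorted((hap1, hap2))))
    return pairs


def all_candidate_pairs(phenotype):
    """Union of the haplotype pairs over all resolutions of the phenotype."""
    pairs = set()
    for resolved in resolve_phenotypes(phenotype):
        pairs.update(haplotype_pairs(resolved))
    return sorted(pairs)


candidates = all_candidate_pairs([(1, 2), (8, 44), (4, 3)])
for hap1, hap2 in candidates:
    print("-".join(map(str, hap1)), ",", "-".join(map(str, hap2)))
```

For the phenotype (1, 2)-(8, 44)-(4, 3), the sketch lists four pairs with DR17 and four with DR18; the likelihood of each pair is then computed as for an unambiguous phenotype.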
During the phase prediction, a set of options manages the implementation of the nomenclature and the replacement of missing values in the phenotype. These features are implemented in a program named haplopred (available on request from the authors). Computations are easily performed on a personal computer. The software is written in C and was developed with corresponding libraries so that it can be integrated flexibly into BMD registry computer management systems.
The algorithm presented requires a set of haplotype
frequencies. It has been applied to two sets of HLA
data. The first one consists of individuals with known
haplotypes from family segregation to validate the estimation of haplotype pairs predicted by statistics. The
second one consists of unrelated phase-unknown individuals from the French BMD Registry to describe the
outcome of the method.


Application on Phase-Known Data


Centre d'Etude du Polymorphisme Humain (CEPH) families have been used to apply the algorithm on phase-known data (data available on request). HLA-A, -B, -DR haplotypes were deduced from the familial study of HLA segregation [14]: 301 different pairs of HLA-A, -B, -DR haplotypes were obtained from 39 families. The prediction algorithm is applied to each phenotype. The outcome is compared with the actual phase as defined by the study of segregation. The χ² test assesses the statistical significance of the predicted accuracy of the method.
Application on Phase-Unknown Population Data
Potential donors from the French BMD Registry typed for HLA-A, -B, and -DRB1 were used (N = 85,933). The haplotype estimation is based on likelihood methods implemented within an expectation maximization algorithm according to previously implemented procedures [6, 7]. As an approximation, all individuals with only one allele at a given locus were analyzed as homozygous at this locus. The description of the population can be found in the annual report of the French Registry (http://www.fgm.fr). The use of BMD Registry data to infer HLA haplotype frequencies has been widely discussed, as have the potential biases (such as selection on HLA-DR typing) [6-10].
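For readers unfamiliar with the approach, a textbook expectation maximization scheme for haplotype frequencies from unphased diploid phenotypes can be sketched as below. This is a generic illustration under Hardy-Weinberg equilibrium; it is not the procedure of references [6, 7], it ignores missing values, and all names and the toy data are assumptions of the sketch.

```python
from collections import Counter
from itertools import product


def compatible_pairs(phenotype):
    """Distinct unordered haplotype pairs compatible with a phenotype."""
    pairs = set()
    for choice in product((0, 1), repeat=len(phenotype)):
        h1 = tuple(locus[c] for locus, c in zip(phenotype, choice))
        h2 = tuple(locus[1 - c] for locus, c in zip(phenotype, choice))
        pairs.add(tuple(sorted((h1, h2))))
    return sorted(pairs)


def em_haplotype_frequencies(phenotypes, n_iter=50):
    """Estimate haplotype frequencies by expectation maximization."""
    expansions = [compatible_pairs(p) for p in phenotypes]
    haplotypes = sorted({h for pairs in expansions for pair in pairs for h in pair})
    # Start from uniform frequencies over all haplotypes that could occur.
    freq = {h: 1.0 / len(haplotypes) for h in haplotypes}
    for _ in range(n_iter):
        counts = Counter()
        for pairs in expansions:
            # E step: weight each compatible pair by its current likelihood.
            weights = [freq[h1] ** 2 if h1 == h2 else 2.0 * freq[h1] * freq[h2]
                       for h1, h2 in pairs]
            total = sum(weights)
            for (h1, h2), w in zip(pairs, weights):
                counts[h1] += w / total
                counts[h2] += w / total
        # M step: expected haplotype counts divided by the number of chromosomes.
        freq = {h: counts[h] / (2.0 * len(phenotypes)) for h in haplotypes}
    return freq


# Toy two-locus data set (hypothetical phenotypes, for illustration only).
toy = [[(1, 1), (8, 8)]] * 3 + [[(2, 2), (44, 44)]] * 3 + [[(1, 2), (8, 44)]] * 2
estimates = em_haplotype_frequencies(toy)
for hap, f in sorted(estimates.items()):
    print(hap, round(f, 4))
```

In the toy data, the homozygous individuals anchor the haplotypes 1-8 and 2-44, so the double heterozygotes are progressively attributed to that pair rather than to 1-44 and 2-8.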
The distribution and properties of the prediction statistics are presented for the results obtained from the BMD dataset. A prediction probability is given to each most likely haplotype pair assigned in the context of the search for an unrelated hematopoietic stem cell donor.
The likelihood of each haplotype pair is based on the phenotype of the individual and on the population haplotype frequencies. A priori, each haplotype pair is equally likely to occur, thus defining a minimal prediction value. For example, in a phenotype with three heterozygous loci, four pairs of haplotypes are possible. In this case, the minimal prediction value is 25%. This minimal value would be the one obtained in the absence of gametic disequilibrium.
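The 25% floor can be checked with a short worked derivation, added here as an illustration. It assumes complete linkage equilibrium, so that every three-locus haplotype frequency factorizes into allele frequencies, $f_{XYZ} = f_X f_Y f_Z$, and uses the likelihoods $L_1$ to $L_4$ of the Algorithm section.

```latex
% Assumption: linkage equilibrium, f_{XYZ} = f_X f_Y f_Z.
% Each of the four candidate pairs then has the same likelihood:
L_i = 2\,(f_A f_B f_C)(f_a f_b f_c), \qquad i = 1, \dots, 4,
% so Equation 1 reduces to
P = \frac{\max_i L_i}{\sum_{i=1}^{4} L_i} = \frac{L_1}{4\,L_1} = \frac{1}{4}.
```

Any departure from equilibrium makes the four likelihoods unequal and raises the largest share above one quarter.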
The detailed description of the outcome of this prediction is given for the set of HLA-ABDR phenotypes of the French BMD Registry. The influence of several factors has been evaluated. Key factors that are correlated with the quality of the prediction outcome were quantified by odds ratio (OR) and tested using the χ² test.
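As an illustration of this kind of analysis, an odds ratio with a Woolf (log-based) 95% confidence interval and a Pearson χ² test for a 2 x 2 table can be computed as sketched below. The article does not state how the confidence intervals were obtained or how prediction values were dichotomized, so the interval method, the function name, and the counts are assumptions of the sketch.

```python
import math


def odds_ratio_2x2(a, b, c, d, z=1.96):
    """Odds ratio, Woolf confidence interval, Pearson chi-square (1 df)
    and its p value for the 2 x 2 table
        factor present: a (high prediction), b (low prediction)
        factor absent:  c (high prediction), d (low prediction)."""
    odds_ratio = (a * d) / (b * c)
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log)
    upper = math.exp(math.log(odds_ratio) + z * se_log)
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # Survival function of the chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return odds_ratio, (lower, upper), chi2, p_value


# Hypothetical counts, for illustration only (not data from this study).
or_value, ci, chi2, p_value = odds_ratio_2x2(30, 10, 20, 40)
print(f"OR = {or_value:.2f} [{ci[0]:.2f}-{ci[1]:.2f}], chi2 = {chi2:.2f}, p = {p_value:.2g}")
```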
RESULTS
Of 301 phase-known phenotypes from CEPH families, the observed proportion of correct predictions is 69.4% (n = 209/301). According to the prediction probabilities, the average ability of the algorithm to correctly predict an individual haplotype pair is expected to be 76.64% (standard deviation 20.4%). The observed number of correct predictions is not significantly different from the expected number (χ² test, p = 0.17).
Among the 85,933 phenotypes typed for HLA-A, -B, -DR, the average value of the HLA-ABDR haplotype pair prediction probability is 0.761 (standard deviation 0.199), ranging from 0.26 to 1. Table 1 shows the distribution of the prediction probability according to its 10th, 25th, 50th, 75th, and 90th percentiles for HLA-ABDR haplotype pair and HLA-AB haplotype pair prediction. The distribution of the prediction probability in HLA-A, -B, -DR phenotypes and HLA-A, -B phenotypes is given in Figure 1.

TABLE 1 The 10th, 25th, 50th, 75th, and 90th percentiles of haplotype inference probability for 85,933 French unrelated bone marrow donors: HLA-A, -B phenotypes (top) and HLA-A, -B, -DR phenotypes (bottom)

Phenotype A, B
Percentile (%)    Value    CI (95%)
10                0.592    0.590-0.594
25                0.733    0.730-0.737
50 (Median)       0.937    0.937-0.937
75                0.997    0.996-0.997
90                1        X

Phenotype A, B, DR
Percentile (%)    Value    CI (95%)
10                0.481    0.479-0.484
25                0.588    0.586-0.59
50 (Median)       0.794    0.792-0.797
75                0.956    0.955-0.957
90                0.994    0.993-0.994

Abbreviation: HLA = human leukocyte antigen.
Last column reports 95% confidence intervals (CIs) of the estimated percentiles.

FIGURE 1 Distribution of haplotype prediction given phenotypes on 85,933 French unrelated bone marrow donors. The distribution of predictions obtained on human leukocyte antigen (HLA)-A, HLA-B phenotypes is given in white. The distribution of predictions obtained on HLA-A, HLA-B, HLA-DR phenotypes is given in black.
Large differences exist between phenotypes, suggesting the interplay of various parameters: allele and haplotype frequency, homozygosity, and linkage disequilibrium.
To clarify the different factors influencing the outcome of the prediction, examples are given in Table 2. This table is divided into four parts according to the prediction reliability. Individuals with at least two loci considered as homozygous represent about three quarters (4760/6223) of the predicted haplotype pairs given as nonambiguous (P = 1) (Table 2, Part 1). The remaining ones (1463/6223) correspond to different situations, as shown in Table 2, Part 2. They include the association of a frequent haplotype with a rarer one, or of two quite rare haplotypes in strong linkage disequilibrium. This also applies to predictions close to 1, including some phenotypes made of two frequent haplotypes (Table 2, Part 3).
As expected, the prediction probability is low when phenotypes include several frequent alleles in low linkage disequilibrium. Examples are given in Table 2, Part 4.
A few factors seem to be the major ones influencing the likelihood of the prediction: the degree of homozygosity, the frequency of the alleles, the frequency of the haplotypes predicted, and the linkage disequilibrium (the nonrandom association of alleles at two physically linked loci). Their consequences have been quantified by OR and underscore the following facts:


TABLE 2 Examples of haplotype pair prediction on 85,933 French unrelated bone marrow donors

Part 1: Phenotype with at least two homozygous loci
Haplotype 1    Frequency 1    Haplotype 2    Frequency 2    Prediction value
3-7-15         0.0257         3-7-15         0.0257         1
3-7-1          0.0032         3-40-1         0.0001         1
2-35-11        0.0032         2-35-13        0.0021         1
2-44-1         0.0056         2-27-1         0.0035         1
2-62-4         0.0091         2-62-13        0.0046         1
1-8-15         0.0035         1-8-17         0.0227         1

Part 2: Nonambiguous haplotype pair prediction
Haplotype 1    Frequency 1    Haplotype 2    Frequency 2    Prediction value
2-60-11        0.0012         29-44-11       0.0026         1
1-8-7          0.0023         31-44-7        0.0007         1
1-8-17         0.0227         3-53-17        1.4e-5         1
3-64-13        0.0002         3-7-8          0.0011         1
1-37-7         0.0003         25-44-7        0.0001         1
30-18-17       0.0040         30-39-1        4.0e-5         1
2-39-14        0.0004         25-18-14       0.0003         1
2-7-17         0.0005         25-8-17        0.0002         1
26-38-13       0.0032         66-41-13       0.0003         1
2-41-13        0.0008         11-35-103      0.0007         1

Part 3: Haplotype pair prediction > 95%
Haplotype 1    Frequency 1    Haplotype 2    Frequency 2    Prediction value
1-8-3          0.0227         3-35-1         0.0125         0.9921
1-8-17         0.0227         2-62-4         0.0091         0.9855
3-7-15         0.0257         30-13-7        0.0044         0.9938
2-13-7         0.0035         29-44-7        0.0273         0.9844
2-12-4         0.0015         29-12-07       0.0016         0.9511
3-14-7         0.0006         03-35-11       0.0034         0.9961
1-8-17         0.0227         1-44-16        8.4e-5         0.9985
2-44-16        0.0022         68-65-13       0.00085        0.9782
1-8-13         0.0030         3-18-13        0.0005         0.9875
1-8-3          0.0227         11-5-15        3.5e-5         0.9697
2-60-13        0.0046         23-44-7        0.0082         0.9649
1-17-7         0.0018         3-35-1         0.0125         0.9868
2-57-7         0.0045         30-18-3        0.0039         0.9789
1-8-3          0.0227         11-56-1        0.0004         0.9964
1-8-3          0.0227         2-62-2         0.0002         0.9566
24-57-7        0.0011         24-62-4        0.0016         0.9795

Part 4: Haplotype pair prediction < 35%
Haplotype 1    Frequency 1    Haplotype 2    Frequency 2    Prediction value
2-51-4         0.0039         24-62-13       0.0031         0.3468
1-51-4         0.0008         2-44-7         0.0104         0.3448
2-44-4         0.0207         24-51-1        0.0003         0.3166
2-44-4         0.0207         24-60-13       0.0003         0.3273
2-51-13        0.0058         28-44-11       0.0010         0.3339
2-44-11        0.0074         24-27-1        0.0007         0.3150
2-60-13        0.0046         3-51-4         0.0010         0.3488
3-51-11        0.0013         11-35-13       0.0022         0.3335
24-62-11       0.0018         32-35-1        0.0007         0.3253
2-18-1         0.0010         32-39-4        0.0001         0.3096
3-56-11        0.0001         24-51-13       0.0009         0.2915
11-18-13       0.0002         32-44-1        0.0003         0.3235
2-51-14        0.0017         3-18-10        2.5e-5         0.3295
2-21-13        0.0002         3-40-4         0.0002         0.3314
2-62-13        0.0046         3-27-4         0.0007         0.3215
2-51-7         0.0028         24-49-13       0.0005         0.3473

Part 1 presents the prediction of the most likely haplotype pair of human leukocyte antigen (HLA)-A B DR phenotypes that have at least two homozygous loci. Part 2 presents the prediction of the most likely haplotype pair of HLA-A B DR phenotypes for which the result is nonambiguous although fewer than two loci are homozygous. Part 3 presents the prediction of the most likely haplotype pair of HLA-A B DR phenotypes for which the prediction value is between 95% and 1. Finally, Part 4 presents the prediction of the most likely haplotype pair of HLA-A B DR phenotypes for which the prediction value is below 35%.

1. The presence of at least one homozygous locus is associated with an increased prediction probability (OR = 2.08 [2.02-2.14], $p < 10^{-3}$).
2. The presence of the most frequent HLA alleles in the phenotype is not correlated with a high prediction value (p > 0.05). In the absence of significant linkage disequilibrium, the allele frequency does not play a key role in the prediction outcome (data not shown).
3. Linkage disequilibrium also increases the prediction probability. Having a standardized pairwise linkage disequilibrium $|D'_{XY}|$ of at least 0.5 between loci X and Y is positively correlated with higher prediction probability ($|D'_{HLA*A,HLA*B}|$: OR = 1.74 [1.69-1.79], $p < 10^{-3}$; $|D'_{HLA*B,HLA*DR}|$: OR = 1.12 [1.09-1.16], $p < 10^{-3}$). Interestingly, further analysis (not shown) indicates that positive linkage disequilibrium measures (association) are correlated with increased phase prediction probability ($p < 10^{-3}$), whereas negative linkage disequilibrium measures (repulsion) are correlated with decreased phase prediction probability ($p < 10^{-3}$). Moreover, positive linkage disequilibrium between three loci is strongly associated with higher prediction probability ($D'_{HLA*A,HLA*B,HLA*DR} > 0.01$: OR = 6.7 [6.46-6.95], $p < 10^{-3}$).
Homozygosity and positive linkage disequilibrium are the major factors influencing the prediction quality.
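For reference, the standardized pairwise measure D' used in item 3 can be sketched with Lewontin's usual definition, which is assumed here because the article does not restate it. The function name and the frequency values are hypothetical.

```python
def d_prime(f_ab, f_a, f_b):
    """Lewontin's standardized linkage disequilibrium between allele a at one
    locus and allele b at another, from the two-locus haplotype frequency
    f_ab and the allele frequencies f_a and f_b."""
    d = f_ab - f_a * f_b
    if d >= 0:
        d_max = min(f_a * (1 - f_b), (1 - f_a) * f_b)
    else:
        d_max = min(f_a * f_b, (1 - f_a) * (1 - f_b))
    # D' keeps the sign of D: positive for association, negative for repulsion.
    return 0.0 if d_max == 0 else d / d_max


# Hypothetical frequencies, for illustration only.
print(round(d_prime(0.08, 0.10, 0.10), 3))  # alleles found together more often than expected
print(round(d_prime(0.00, 0.20, 0.30), 3))  # alleles never found on the same haplotype
```

A value of |D'| close to 1 means that one of the four two-locus haplotypes is nearly absent, which is the situation in which phase is easiest to infer.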
DISCUSSION
In this article, we addressed the possibility of predicting haplotype phase from individual HLA phenotypes. We described the outcome of the proposed method on 301 phase-known individuals of the CEPH panel and on 85,933 HLA-ABDR phenotypes of the French BMD Registry. Key factors correlated with the quality of the prediction outcome were homozygosity and linkage disequilibrium, which clearly increase the prediction probability. This study on individual phase inference is dedicated to HLA phenotypes and shows that part of the information available in familial situations can be reached by a statistical method in an unrelated situation.
The haplotype prediction method also helps show that individual haplotype prediction could be of practical use. The method is a way to incorporate the current knowledge of HLA region linkage disequilibrium, gained through the registries, into the interpretation of their phenotypes. Other studies focusing on haplotype inference in the HLA region deal with association studies rather than transplantation genetics [15, 16]. In other genetic regions, several studies have demonstrated the power of haplotype prediction. Examples are available in Xu et al. [17] for five single nucleotide polymorphisms (SNPs) in the N-acetyl transferase 2 gene (NAT2, 8p22) and for five SNPs on the X chromosome (Xp11.4), or in Orzack et al. [18] for nine apolipoprotein E sites (APOE 19q13.22). The relevance of such an approach is always discussed regarding the phase predicted rather than the prediction value. Thus the relevance of haplotype prediction methods must be assessed depending on the kind of markers used and on the genomic region considered. Validation of the predicted haplotype phase probability in itself is difficult in practice. For each phenotype, it would require a significant sample of phase-known HLA data from unrelated individuals.
The haplotype approach is powerful because it reduces the number of theoretically possible phenotypes analyzed.
The method assumes that the haplotype frequency estimates and the phenotypes under analysis are drawn from the same population. The individual HLA data from CEPH families and the potential BMD phenotype data were recruited within the French population. The French BMD Registry provides haplotype frequency estimates. The use of registry haplotype frequencies as approximations of those of the source population [6-10] has been previously discussed [19]. In the French Registry, the DR-typing bias has been reduced in recent years. Nevertheless, at the individual level, haplotype phase inference is limited by the origins of the donors. The phase prediction for individuals in non-Caucasian CEPH families (Amish and Venezuelan, for example) provides evidence that applications should be restricted to the population used for the haplotype frequency estimation. Thus different population haplotype frequency estimates should be used in cases of different genetic background. In the context of HLA and transplantation, haplotype analysis has been mainly used at the population level, especially to model the likelihood of finding a donor [20, 21]. However, the results presented here show that an application at the individual level may also help to assess the degree of haplotype matching for unrelated transplantation.
Whatever the sample size and the number of haplotype frequency estimates used to compute the phase prediction, the method is limited: no prediction probability can be lower than 0.25 (threshold for random attribution of phase in phenotypes with three heterozygous loci). It is possible that none of the haplotypes required to explain a phenotype were estimated in the reference sample. In this case, it is not possible to compute a prediction probability. Even if confidence intervals may be computed by bootstrap methods of haplotype frequency estimation, sampling errors on very rare haplotypes remain the main source of variability of the estimation [22].
Haplotype pair predictions may be routinely given to clinicians by the prediction tools described here. They have been implemented by France Greffe de Moelle. On request, haplotype phase prediction is performed using the latest estimation of HLA-ABDR haplotype frequencies in the source population. While results are sorted to indicate the predicted pair of haplotypes, Registry data remain unchanged. In some cases, the user may explore the likelihood of all possible haplotype pairs.
Through the implementation of simple statistics, the method presented provides more HLA information to help decide which donors to solicit. Such a tool implemented in the BMD Registries may be of practical interest in several respects. It would evaluate the chances for a phenotype to match a given haplotype. The results presented here suggest that most (about 76%) of the HLA-A, -B, -DR phenotype-matched individuals are matched at the haplotype level in the French BMD Registry. Some of them are compatible at the phenotype level only. This confirms, from the statistical point of view, previous findings showing that ancestral haplotypes (strong linkage or positive linkage disequilibrium) increased survival in unrelated transplantation [23, 24].
A similar study using HLA haplotype frequencies at a higher resolution level could be of interest to investigate to what extent a higher level of HLA typing influences matching at the haplotype level in unrelated situations. Handling other HLA loci or genetic markers in the HLA region may also help to define compatibility at the haplotype level.
The findings presented here can be applied to assess the degree of HLA matching in any kind of transplantation or when typing relatives is not possible. For example, the phase prediction probabilities assessing the HLA-ABDR haplotype matching may indicate further HLA-typing requirements. They can also characterize the identity for one haplotype in unrelated situations when only partially incompatible donors are available.
We demonstrated here that taking advantage of the genetic structure of HLA data gives access to more information than expected. It has a general relevance as a decision element used in the assessment of compatibility in transplantation. Thoughtful statistical considerations on immunogenetics and populations may allow the development of practical tools of clinical relevance.
ACKNOWLEDGMENT

The authors wish to gratefully acknowledge the help of the France Greffe de Moelle staff and the bioinformatics platform of Genopole Toulouse Midi-Pyrénées.

REFERENCES
1. Mughal TI, Goldman JM: Chronic myeloid leukaemia:
current status and controversies. Oncology (Huntingt)
18:837, 2004.
2. Petersdorf EW, Anasetti C, Martin PJ, Gooley T, Radich J,
Malkki M, Woolfrey A, Smith A, Mickelson E, Hansen JA:
Limits of HLA mismatching in unrelated hematopoietic cell
transplantation. Blood 104:2976, 2004.
3. Petersdorf EW, Anasetti C, Martin PJ, Hansen JA: Tissue
typing in support of unrelated hematopoietic cell transplantation. Tissue Antigens 61:1, 2003.
4. Hansen JA, Yamamoto K, Petersdorf E, Sasazuki T: The
role of HLA matching in hematopoietic cell transplantation. Rev Immunogenet 1:359, 1999.
5. Petersdorf EW, Mickelson EM, Anasetti C, Martin PJ,
Woolfrey AE, Hansen JA: Effect of HLA mismatches on
the outcome of hematopoietic transplants. Curr Opin
Immunol 11:521, 1999.
6. Gourraud PA, Genin E, Cambon-Thomsen A: Handling
missing values in population data: consequences for maximum likelihood estimation of haplotype frequencies. Eur
J Hum Genet 12:805, 2004.
7. Lonjou C, Clayton J, Cambon-Thomsen A, Raffoux C:
HLA -A, -B, -DR haplotype frequencies in France
implications for recruitment of potential bone marrow
donors. Transplantation 60:375, 1995.
8. Martinetti M, Degioanni A, D'Aronzo AM, Benazzi E,
Carpanelli R, Castellani L, Cenzuales S, De Biase U, De
Filippo C, De Giuli A, Gerosa A, Fare M, Ferrioli G,
Galvani G, Lombardo C, Malagoli A, Marchesi S,
Mascaretti L, Motta F, Sioli V, Rinaldini C, Rizzolo L,
Pascutto C, Bernardinelli L, Salvaneschi L: An
immunogenetic map of Lombardy (Northern Italy). Ann
Hum Genet 66:37, 2002.
9. Muller CR, Ehninger G, Goldmann SF: Gene and haplotype frequencies for the loci HLA-A, HLA-B, and
HLA-DR based on over 13,000 German blood donors.
Hum Immunol 64:137, 2003.
10. Rendine S, Borelli I, Barbanti M, Sacchi N, Roggero S,
Curtoni ES: HLA polymorphisms in Italian bone marrow donors: a regional analysis. Tissue Antigens 52:135,
1998.
11. Piazza A: Haplotypes and linkage disequilibrium from
three-locus phenotypes. Histocompatibility Testing,
Munksgaard: Kissmeyer-Nielsen, eds, Copenhagen,
923, 1975.
12. Yasuda N: Estimation of haplotype frequency and linkage
disequilibrium parameter in the HLA system. Tissue Antigens 12:315, 1978.
13. Morton NE, Simpson SP, Lew R, Yee S: Estimation of
haplotype frequencies. Tissue Antigens 22:257, 1983.
14. Bugawan TL, Klitz W, Blair A, Erlich HA: High-resolution HLA class I typing in the CEPH families: analysis of
linkage disequilibrium among HLA loci. Tissue Antigens
56:392, 2000.
15. Cordell HJ, Clayton DG: A unified stepwise regression
procedure for evaluating the relative effects of polymorphisms within a gene using case/control or family data:
application to HLA in type 1 diabetes. Am J Hum Genet
70:124, 2002.
16. Schaid DJ, Rowland CM, Tines DE, Jacobson RM, Poland
GA: Score tests for association between traits and haplotypes when linkage phase is ambiguous. Am J Hum
Genet 70:425, 2002.

17. Xu CF, Lewis K, Cantone KL, Khan P, Donnelly C,
White N, Crocker N, Boyd PR, Zaykin DV, Purvis IJ:
Effectiveness of computational methods in haplotype prediction. Hum Genet 110:148, 2002.
18. Orzack SH, Gusfield D, Olson J, Nesbitt S, Subrahmanyan
L, Stanton VP Jr: Analysis and exploration of the use of
rule-based algorithms and consensus methods for the inferral of haplotypes. Genetics 165:915, 2003.
19. Schipper RF, Oudshoorn M, D'Amaro J, van der Zanden
HG, de Lange P, Bakker JT, Bakker J, van Rood JJ:
Validation of large data sets, an essential prerequisite for
data analysis: an analytical survey of the Bone Marrow
Donors Worldwide. Tissue Antigens 47:169, 1996.
20. Mori M, Graves M, Milford EL, Beatty PG: Computer
program to predict likelihood of finding an HLA-matched donor: methodology, validation, and application.
Biol Blood Marrow Transplant 2:134, 1996.
21. Kollman C, Abella E, Baitty RL, Beatty PG, Chakraborty
R, Christiansen CL, Hartzman RJ, Hurley CK, Milford E,
Nyman JA, Smith TJ, Switzer GE, Wada RK, Setterholm
M: Assessment of optimal size and composition of the
U.S. National Registry of hematopoietic stem cell donors.
Transplantation 78:89, 2004.
22. Fallin D, Schork NJ: Accuracy of haplotype frequency estimation for biallelic loci, via the expectation-maximization
algorithm for unphased diploid genotype data. Am J Hum
Genet 67:947, 2000.
23. Tay GK, Witt CS, Christiansen FT, Charron D, Baker D,
Herrmann R, Smith LK, Diepeveen D, Mallal S, McCluskey
J, et al: Matching for MHC haplotypes results in improved
survival following unrelated bone marrow transplantation.
Bone Marrow Transplant 15:381, 1995.
24. Christiansen FT, Witt CS, Dawkins RL: Questions in
marrow matching: the implications of ancestral haplotypes for routine practice. Bone Marrow Transplant 8:83,
1991.