All content following this page was uploaded by Frank van der Meulen on 22 February 2019.
Quality Engineering
Publication details, including instructions for authors and subscription information:
http://www.informaworld.com/smpp/title~content=t713597292
To cite this Article: van der Meulen, Frank, Vermaat, Thijs and Willems, Pieter (2011) 'Case Study: An Application of
Logistic Regression in a Six Sigma Project in Health Care', Quality Engineering, 23: 2, 113–124
To link to this Article: DOI: 10.1080/08982112.2011.553761
URL: http://dx.doi.org/10.1080/08982112.2011.553761
This article may be used for research, teaching and private study purposes. Any substantial or
systematic reproduction, re-distribution, re-selling, loan or sub-licensing, systematic supply or
distribution in any form to anyone is expressly forbidden.
The publisher does not give any warranty express or implied or make any representation that the contents
will be complete or accurate or up to date. The accuracy of any instructions, formulae and drug doses
should be independently verified with primary sources. The publisher shall not be liable for any loss,
actions, claims, proceedings, demand or costs or damages whatsoever or howsoever caused arising directly
or indirectly in connection with or arising out of the use of this material.
Quality Engineering, 23:113–124, 2011
Copyright © Taylor & Francis Group, LLC
ISSN: 0898-2112 print/1532-4222 online
DOI: 10.1080/08982112.2011.553761
Downloaded By: [Meulen, Frank van der] At: 09:41 9 March 2011
Address correspondence to Dr. Frank van der Meulen, Delft Institute of Applied Mathematics, Delft University of Technology, Mekelweg 4, 2628 CD, Delft, The Netherlands. E-mail: f.h.vandermeulen@tudelft.nl
PROCESS DESCRIPTION
All over the world health care is facing serious issues. Costs are increasing and the quality of care consistently fails to meet expectations (cf. Institute of Medicine 2001). Quality improvement is therefore a major strategic issue in health care organizations, and improvements have to be implemented to reduce costs and increase quality. The Six Sigma program is an effective management methodology, developed in industry and also adopted in health care; see Barry (2002) and Bisgaard (2009). Six Sigma improvement projects are executed by a fixed step-by-step approach, the DMAIC roadmap. It encompasses five phases: the define, measure, analyze, improve, and control phases. Projects are executed by project leaders. The DMAIC roadmap assists them in organizing their findings in a structured manner. For a description of these phases, see De Mast et al. (2006) and Breyfogle (2003).
In 2005 the Virga Jesse Hospital in Hasselt, Belgium, decided to use the Six Sigma method to improve their processes. In this article we will explore a project on the retention of heart rehabilitation patients. Its aim was to attract more patients to the rehabilitation program or reduce the fraction of patients who drop out during the program. A successful project will increase the hospital’s revenues and will be beneficial to patients’ health as well. We will discuss the project, focusing on the analyze and improve phases.
After cardiac surgery, patients with heart disease are treated in the cardiology nursing department. When a patient’s condition is stable, he or she is discharged from the department and goes home. For safety reasons, patients are advised to join the hospital’s cardiac rehabilitation program. In addition to psychological support and advice on a healthier diet and a less stressful lifestyle, patients in this program participate in physical training under full supervision of a physical therapist. Patients visit the rehabilitation center two or three times a week for a 2-h session, with a maximum of 45 sessions.
Many patients treated at the nursing department do not enroll in the rehabilitation program after discharge. Moreover, many patients who do start the program leave halfway through, before being physically fit. Such a patient is called a dropout patient. In both cases the hospital loses revenues, because every visit is charged individually.
DATA COLLECTION
The measure phase starts with the definition of the internal critical to quality characteristics (CTQs). In this project the strategic focal point is the increase of revenue, which links directly to the following CTQs:
• CTQ1: the number of patients who participate in the rehabilitation program every month
• CTQ2: the number of sessions per participant
To measure the number of participants and sessions each month, one simply looks at the number of invoices. To assess whether this measurement procedure is valid, a comparison between a sample of invoices and the corresponding list of participating patients from the department was made. These matched perfectly, validating the chosen measurement procedure.
A large data set with 516 treated patients was available. Of these patients, 49% participated in the rehabilitation program. For each patient we have the following data available:
• distance from the patient’s home to the hospital in kilometers (x1, numerical)
• age (x2, numerical)
• mobility; whether or not the patient has a car (x3, categorical)
• gender (x4, categorical)
• place of residence
• participation; whether or not the patient participates in the rehabilitation program (Y, binary; Y = yes if the patient shows up at least once, else Y = no).
CTQ1 is directly related to participation. In fact, the value of CTQ1 in a month is the number of patients with Y = yes in that particular month. In this sense, Y is a more informative measurement than CTQ1, because we can relate Y to patient characteristics. The influence of the place of residence is captured by variable x1.
ANALYSIS AND INTERPRETATION
Over the year 2005 the first CTQ (the number of participating patients) was on average 33 patients each month, with a standard deviation of 4.9 patients each month. Based on the process capability and process knowledge, the objective of the project was to increase the average number of participants to 36. This number had been attained a number of times in the past and both cardiologists and physicians claimed that such an increase was feasible.
The second CTQ (the number of sessions) was on average 29 sessions for the patients participating in 2005. Note that the maximum number of sessions per patient in a program was 45. The objective for the second CTQ was to increase the average number of sessions to 32 for each patient. The average
activity (even though the center was open late),
• 16% of the patients could not join the program due to other obligations (vacations, social obligations),
• 12% of the patients dropped out for a medical reason provided by the doctor,
• 8% of the patients had their own rehabilitation facilities.
These factors were the cause of 78% of the dropout. However, none of these causes can be influenced easily. Therefore, focus shifted to CTQ1. Based on brainstorming sessions with cardiologists, physical therapists, patients, and other interested parties, the following influence factors for CTQ1 were raised:
• Patients should be informed of the rehabilitation program at a much earlier stage.
• Information on the rehabilitation program should be much more precise and attractive.
• Cardiologists should stimulate patients to participate in and finish the rehabilitation program.
• Patients should train with a heart rate monitor (polar watch) to improve their feelings of safety.
• Patients desire a smaller exercise room and are more comfortable when not with other patients.
• Patients are not likely to show up during summer holiday.
Factors that seemed to be most important can be summarized as patient attention factors. These factors
We now give a detailed analysis of the statistics used in the improve phase. The project supervisor convinced the project leader to complete the improve phase before proceeding with the above-mentioned actions.
Analyzing Each Factor Separately
Our first step consists of studying the relation between participation (Y) and each influence factor (denoted by xi) separately. It is useful to screen the data in this way before using more advanced techniques.
1. The first studied factor is distance. Whether the distance in kilometers affects a patient’s decision to join the program is normally analyzed by means of logistic regression. A first simple approach consists of making boxplots for distance vs. participation. Looking at these plots, we immediately noticed two patients with very long distances (200 km) to the hospital compared to the other patients. These patients were closely related to one of the physicians and therefore had chosen the hospital considered here. For this reason, these patients were excluded from all further statistical analysis. The left-hand panel of Figure 1 contains boxplots of the data from which these two outliers were removed. This figure suggests that patients with a short distance to the hospital tend to participate more often in the program.
In the right-hand panel of Figure 1 a more informative plot is made. We divided the range of distance into eight approximately equally sized groups. Within each group we computed the relative frequency of patients participating. Because there are ties in the distance values, not all groups were exactly the same size. The diameter of the circle for a group is proportional to the size of that particular group. To visualize a pattern among the points, we added a smoother through these points. A smoother is a nonparametric regression fit, which can be constructed by many methods. Here, we chose Friedman’s "super smoother," which is implemented in the statistical software package R (function "supsmu"). Details about the construction of this smoother are of minor importance at this stage, but the interested reader may consult Friedman (1984). The R code for constructing this figure can be found on Howard Seltman’s Website, http://www.stat.cmu.edu/~hseltman/files/LREDA.R. From the constructed plot we clearly see that the further a patient lives from the hospital, the lower the probability that a patient will join the rehabilitation program.
2. The factor age can be analyzed in a similar way; see Figure 2. This figure suggests that the probability of joining the rehabilitation program decreases with age. Moreover, at approximately age 65 there seems to be a change point in the decrease of the fraction of participating patients.
3. The bar chart for mobility (left-hand picture in Figure 3) clearly indicates that the probability of joining the program is influenced by whether the patient has access to a car. Table 1 summarizes these data. The data suggest that having a car at one’s disposal increases the probability
of joining the program. There are missing values in the data set: for 71 patients, mobility was not registered.
4. The factor gender can be analyzed in a similar way as mobility. There were 13 missing values for gender in the data set. The bar chart (right-hand picture in Figure 3) indicates that this factor has a minor influence on participation. Table 2 summarizes these data.
The analysis suggests that the accessibility of the hospital has to be improved, especially for those people living far away from the hospital. Hiring a taxi service would definitely improve accessibility, though it is obvious that the costs for this service exceed the revenues of one additional session. It is of major interest to find out how much money can be invested to improve accessibility of the hospital while still ensuring increased revenues. This maximal amount can be considered a break-even point. To calculate this break-even point, we need a relation between the probability that a patient will join the program and the various influence factors as an ensemble. In the next section we will show how a logistic regression model can be used to accomplish this. An introduction to logistic regression can be found in many textbooks; see, for example, McCullagh and Nelder (1989) and Myers et al. (2002).
Logistic Regression Model for Modeling the Probability That a Patient Will Join the Program
In this section, we model the relation between Y and all influence factors simultaneously. In a logistic regression model, we assume that all Yi (the response for the ith patient) are independent and identically distributed, with
TABLE 2 Influence of Gender on Participation
Gender    Number of patients    Percentage joining the program
Female    377                   52
Male      124                   45
Here xi = (1, xi1, xi2, xi3, xi4) is the vector of predictors (including an intercept) for patient i and f is a function that has yet to be estimated. We use dummy variables in that xi3 = 1 if mobility = "no car" and zero otherwise. Similarly, xi4 = 1 if gender = "male" and zero otherwise. A generalized linear
Coefficients:
(in the last two columns we added 95% confidence intervals). We conclude that
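\[
\log\frac{\hat{p}_i}{1-\hat{p}_i} =
\begin{cases}
5.777 - 0.0675\,\mathrm{distance}_i - 0.0599\,\mathrm{age}_i & \text{if patient } i \text{ has a car,} \\
3.840 - 0.0675\,\mathrm{distance}_i - 0.0599\,\mathrm{age}_i & \text{if patient } i \text{ does not have a car.}
\end{cases}
\]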
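To make the fitted model concrete, the following minimal Python sketch converts the fitted log-odds into a participation probability. It uses the intercepts and slopes quoted above and takes the distance and age coefficients as negative, in line with the exploratory finding that participation falls with distance and age. The function name and the example patient (10 km from the hospital, 65 years old) are hypothetical and serve only as illustration.

```python
import math

# Fitted coefficients on the log-odds scale, as quoted in the text.
# Assumption: the slopes for distance and age are negative.
INTERCEPT_CAR = 5.777
INTERCEPT_NO_CAR = 3.840
B_DISTANCE = -0.0675  # change in log-odds per kilometer
B_AGE = -0.0599       # change in log-odds per year of age


def participation_probability(distance_km, age, has_car):
    """Return the fitted probability that a patient joins the program."""
    intercept = INTERCEPT_CAR if has_car else INTERCEPT_NO_CAR
    log_odds = intercept + B_DISTANCE * distance_km + B_AGE * age
    # Inverse logit: maps log-odds to a probability in (0, 1).
    return 1.0 / (1.0 + math.exp(-log_odds))


# Hypothetical patient, with and without access to a car.
p_car = participation_probability(10, 65, has_car=True)
p_no_car = participation_probability(10, 65, has_car=False)
print(f"with car: {p_car:.2f}, without car: {p_no_car:.2f}")
```

Under these assumptions, the only difference between the two calls is the intercept, so the gap between the two probabilities isolates the fitted effect of mobility for this patient.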
regression, the deviance residuals are defined by
\[
r_i = \operatorname{sign}(Y_i - \hat{p}_i)\,\sqrt{-2\left[\,Y_i \log \hat{p}_i + (1 - Y_i)\log(1 - \hat{p}_i)\,\right]}
\]
However, we can still consider leverage values. A high leverage value indicates that a point is an outlier
covariate patterns and replicated measurements for each covariate pattern, goodness of fit can be assessed by methods for categorical data. A typical example of such a method is Pearson’s chi-square test. This approach cannot be pursued here, because both distance and age are continuous and hence for these covariates replicated measurements are not available. As a solution, grouping of the data has been advocated. The Hosmer-Lemeshow test is a well-known example of this approach; see, for example, chapter 5 of Hosmer and Lemeshow (2000). For this test, the user has to specify a number of groups G. A default choice is 10. Groups are formed by computing the 0, 1/G, 2/G, ..., 1 quantiles of the vector of predicted probabilities (if G = 10, these are simply the deciles). Let Oi,0
quence, may yield different results for the same problem (see Pigeon and Heyse 1999). This illustrates the sensitivity of the Hosmer-Lemeshow test to the grouping method. Furthermore, it has been reported that the power of the Hosmer-Lemeshow test is low compared to certain competitors (see Hosmer et al. [1997], where a comprehensive comparison of goodness-of-fit tests is given). More recent work on this topic was performed by Xie et al. (2008), in which groups were obtained by cluster analysis in the covariate space. From the work by Hosmer et al. (1997) it follows that the le Cessie-van Houwelingen-Copas-Hosmer (CHCH) unweighted sum of squares test for global goodness of fit performs quite well in simulations. Because this statistic is readily explained and also implemented in R in
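The grouping step described for the Hosmer-Lemeshow test can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the implementation used in the study: observations are split by rank of predicted probability into G groups of nearly equal size (which may treat ties differently from quantile cut points), the statistic is written in its common textbook form because the definition is cut off in the text, and the toy data are made up. The function names are hypothetical.

```python
def hosmer_lemeshow_groups(y, p_hat, n_groups=10):
    """Group observations by rank of predicted probability.

    Returns one (group size, observed successes, expected successes)
    tuple per non-empty group.
    """
    order = sorted(range(len(p_hat)), key=lambda i: p_hat[i])
    n = len(order)
    groups = []
    for g in range(n_groups):
        idx = order[g * n // n_groups:(g + 1) * n // n_groups]
        if not idx:
            continue
        observed = sum(y[i] for i in idx)
        expected = sum(p_hat[i] for i in idx)
        groups.append((len(idx), observed, expected))
    return groups


def hosmer_lemeshow_statistic(groups):
    """Sum over groups of (O - E)^2 / (n * pbar * (1 - pbar)), with pbar = E / n."""
    total = 0.0
    for size, observed, expected in groups:
        mean_p = expected / size
        total += (observed - expected) ** 2 / (size * mean_p * (1.0 - mean_p))
    return total


# Toy data, invented for illustration only (not the hospital data).
p_hat = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8]
y = [0, 0, 1, 0, 1, 1, 0, 1]
groups = hosmer_lemeshow_groups(y, p_hat, n_groups=4)
stat = hosmer_lemeshow_statistic(groups)
```

Changing `n_groups` or the tie-handling rule changes the groups and therefore the statistic, which is the sensitivity to the grouping method noted in the text.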
patient equals its estimated expected response, which is therefore given by $\hat{p}_i := p_i(\hat{\beta})$. The CHCH test is a studentized version of
\[
T = \sum_{i=1}^{n} (Y_i - \hat{p}_i)^2 .
\]
For large data sets critical values can be obtained from the standard normal distribution. Because the data considered here contain over 400 patients, the work by Hosmer et al. (1997) suggested that the test should have about 90% power to detect moderate departures from linearity.
Applying the test to our data and model gives
the hospital decided not to discriminate between patient characteristics. Therefore, we calculated the average value of d for all patients without a car in the hospital using [2] and the fitted model. This average probability equals 0.35. Hence, the maximum amount that can be invested to ensure transport for each patient equals 0.35 times 29 sessions on average times 22.83 euros per session = 232 euros. Because there are approximately 15 patients a month without a
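The break-even arithmetic quoted in the text (0.35 times 29 sessions times 22.83 euros per session) can be checked with a short Python sketch. The three inputs are the figures given in the text; the function and variable names are hypothetical.

```python
AVG_D = 0.35                     # average probability d for patients without a car
AVG_SESSIONS_PER_PATIENT = 29    # average number of sessions per participant in 2005
REVENUE_PER_SESSION_EUR = 22.83  # revenue per session in euros


def break_even_investment(probability, sessions, revenue_per_session):
    """Expected revenue per patient, i.e., the maximum justifiable spend per patient."""
    return probability * sessions * revenue_per_session


max_investment = break_even_investment(
    AVG_D, AVG_SESSIONS_PER_PATIENT, REVENUE_PER_SESSION_EUR
)
print(f"Maximum investment per patient: about {max_investment:.0f} euros")
```

Keeping the three inputs as separate named values makes it easy to see how the break-even amount would shift if, for instance, the average number of sessions rose toward the project target of 32.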