ROC Curves Studentslev3

Receiver Operating Characteristic (ROC) Curves
Assessing the predictive properties of a test statistic
Decision Theory
2009, KAISER PERMANENTE CENTER FOR HEALTH RESEARCH
Binary Prediction Problem

Conceptual Framework
Suppose we have a test statistic for predicting the presence or absence of disease.
                        True Disease Status
                        Pos      Neg
Test Criterion   Pos    TP       FP
                 Neg    FN       TN
                 Total  P        N        P+N

TP = True Positive
FP = False Positive
FN = False Negative
TN = True Negative


Test Properties
Accuracy = Probability that the test yields a correct result = (TP+TN) / (P+N)

Sensitivity (Se) = Probability that a true case will test positive = TP / P
Also referred to as True Positive Rate (TPR) or True Positive Fraction (TPF).

Specificity (Sp) = Probability that a true negative will test negative = TN / N
Also referred to as True Negative Rate (TNR) or True Negative Fraction (TNF).

1 - Specificity = Probability that a true negative will test positive = FP / N
Also referred to as False Positive Rate (FPR) or False Positive Fraction (FPF).

Positive Predictive Value (PPV) = Probability that a person with a positive test truly has disease = TP / (TP+FP)

Negative Predictive Value (NPV) = Probability that a person with a negative test is truly disease free = TN / (TN+FN)

Example
                        True Disease Status
                        Pos      Neg      Total
Test Criterion   Pos     27      173       200
                 Neg     73      727       800
                 Total  100      900      1000

Se = 27/100 = .27
Sp = 727/900 = .81
FPF = 1 - Sp = .19
Acc = (27+727)/1000 = .75
PPV = 27/200 = .14
NPV = 727/800 = .91
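The arithmetic in this example can be checked with a few lines of Python. This is only a sketch: the function name two_by_two_properties and its argument names are illustrative choices, not part of the original slides.

```python
def two_by_two_properties(tp, fp, fn, tn):
    """Compute the test properties defined above from one 2x2 table."""
    p = tp + fn  # everyone who truly has disease
    n = fp + tn  # everyone who is truly disease free
    return {
        "Se": tp / p,
        "Sp": tn / n,
        "FPF": fp / n,
        "Acc": (tp + tn) / (p + n),
        "PPV": tp / (tp + fp),
        "NPV": tn / (tn + fn),
    }


# Counts taken from the example table (threshold of 4 or more risk factors).
props = two_by_two_properties(tp=27, fp=173, fn=73, tn=727)
for name, value in props.items():
    print(f"{name} = {value:.3f}")
```

Note that FPF is computed directly as FP / N, which is the same quantity as 1 - Sp.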

Test Properties
Of these properties, only Se and Sp (and hence FPR) are considered invariant test characteristics. Accuracy, PPV, and NPV will vary according to the underlying prevalence of disease. Se and Sp are thus fundamental test properties and hence are the most useful measures for comparing different test criteria, even though PPV and NPV are probably the most clinically relevant properties.
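To see why PPV and NPV depend on prevalence while Se and Sp do not, one can hold Se and Sp fixed and recompute the predictive values with Bayes' rule at several prevalences. The sketch below uses the Se and Sp from the example; the prevalence values other than 0.10 (the example's 100/1000) are hypothetical, and the function name is an illustrative choice.

```python
def predictive_values(se, sp, prevalence):
    """Return (PPV, NPV) for a test with fixed Se and Sp at a given prevalence."""
    true_pos = se * prevalence
    false_pos = (1 - sp) * (1 - prevalence)
    true_neg = sp * (1 - prevalence)
    false_neg = (1 - se) * prevalence
    return true_pos / (true_pos + false_pos), true_neg / (true_neg + false_neg)


se, sp = 27 / 100, 727 / 900  # from the example table
# 0.10 is the prevalence in the example; the others are hypothetical.
for prevalence in (0.01, 0.10, 0.30, 0.50):
    ppv, npv = predictive_values(se, sp, prevalence)
    print(f"prevalence={prevalence:.2f}  PPV={ppv:.3f}  NPV={npv:.3f}")
```

At a prevalence of 0.10 this reproduces the PPV and NPV of the example table; as prevalence rises, PPV rises and NPV falls even though the test itself has not changed.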
ROC Curves
Now assume that our test statistic is no longer binary, but takes on a series of values (for instance, how many of five distinct risk factors a person exhibits). Clinically, we make a rule that says the test is positive if the number of risk factors meets or exceeds some threshold (#RF ≥ x). Suppose our previous table resulted from using x = 4. Let's see what happens as we vary x.
ROC Curves
Impact of using a threshold of 3 or more RFs
                        True Disease Status
                        Pos      Neg      Total
Test Criterion   Pos     45      200       245
                 Neg     55      700       755
                 Total  100      900      1000

Se = 45/100 = .45 (was .27)
Sp = 700/900 = .78 (was .81)
FPF = 1 - Sp = .22 (was .19)
Acc = (45+700)/1000 = .75 (was .75)
PPV = 45/245 = .18 (was .14)
NPV = 700/755 = .93 (was .91)

Se increases, Sp decreases, and interestingly both PPV and NPV increase.
ROC Curves
Summary of all possible options
Threshold 6 5 4 3 2 1 0
TPR 0.00 0.10 0.27 0.45 0.73 0.98 1.00
FPR 0.00 0.11 0.19 0.22 0.27 0.80 1.00
As we relax our threshold for defining disease, our true positive rate (sensitivity) increases, but so does the false positive rate (FPR). The ROC curve is a way to visually display this information.
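The threshold sweep can be sketched in Python as follows. The per-risk-factor counts below are not given in the slides: they are illustrative values chosen so that the resulting rates round to the TPR and FPR rows of the table (and match the two 2x2 tables for x = 4 and x = 3). The function name roc_points is likewise an assumption.

```python
# Illustrative counts of people by number of risk factors (list index = #RF, 0..5).
# These are assumed values that reproduce the table after rounding,
# not data taken from the original study.
cases = [2, 25, 28, 18, 17, 10]        # 100 people with disease
controls = [180, 477, 43, 27, 74, 99]  # 900 people without disease


def roc_points(cases, controls):
    """Return (threshold, TPR, FPR) for the rule 'positive if #RF >= threshold'."""
    p, n = sum(cases), sum(controls)
    points = []
    for x in range(len(cases), -1, -1):  # thresholds 6, 5, ..., 0
        tpr = sum(cases[x:]) / p         # cases at or above the threshold
        fpr = sum(controls[x:]) / n      # controls at or above the threshold
        points.append((x, tpr, fpr))
    return points


for x, tpr, fpr in roc_points(cases, controls):
    print(f"threshold={x}  TPR={tpr:.2f}  FPR={fpr:.2f}")
```

Each threshold yields one (FPR, TPR) pair, and lowering the threshold can only move both rates up, which is why the points trace a curve from (0, 0) to (1, 1).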
ROC Curves
[Figure: ROC curve plotting TPR against FPR for thresholds 0 through 6, with the points for x=2, x=4, and x=5 labeled and a diagonal reference line]
The diagonal line shows what we would expect from simple guessing (i.e., pure chance).
What might an even better ROC curve look like?

ROC Curves
Summary of a more optimal curve
Threshold 6 5 4 3 2 1 0
TPR 0.00 0.10 0.77 0.90 0.95 0.99 1.00
FPR 0.00 0.01 0.02 0.03 0.04 0.40 1.00
Note the immediate sharp rise in sensitivity. Perfect accuracy is represented by the upper left corner.
ROC Curves
Use and interpretation
The ROC curve allows us to see, in a simple visual display, how sensitivity and specificity vary as our threshold varies. The shape of the curve also gives us some visual clues about the overall strength of association between the underlying test statistic (in this case #RFs that are present) and disease status.
ROC Curves
The ROC methodology easily generalizes to test statistics that are continuous (such as lung function or a blood gas). We simply fit a smoothed ROC curve through all observed data points.
ROC Curves
See demo from www.anaesthetist.com/mnm/stats/roc/index.htm
ROC Curves
Area under the curve (AUC)
The total area of the grid represented by an ROC curve is 1, since both TPR and FPR range from 0 to 1. The portion of this total area that falls below the ROC curve is known as the area under the curve, or AUC.
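As a rough numerical illustration, the AUC of the two tabulated curves can be approximated with the trapezoidal rule, that is, by joining neighboring ROC points with straight lines. This is an assumption made for simplicity (the slides do not say how the area was computed), and the function name trapezoid_auc is illustrative.

```python
def trapezoid_auc(fpr, tpr):
    """Area under the ROC points, joining neighboring points with straight lines."""
    pairs = sorted(zip(fpr, tpr))  # order the points from left to right
    area = 0.0
    for (x0, y0), (x1, y1) in zip(pairs, pairs[1:]):
        area += (x1 - x0) * (y0 + y1) / 2  # area of one trapezoid
    return area


# TPR and FPR rows from the two summary tables (thresholds 6 down to 0).
first_tpr = [0.00, 0.10, 0.27, 0.45, 0.73, 0.98, 1.00]
first_fpr = [0.00, 0.11, 0.19, 0.22, 0.27, 0.80, 1.00]
better_tpr = [0.00, 0.10, 0.77, 0.90, 0.95, 0.99, 1.00]
better_fpr = [0.00, 0.01, 0.02, 0.03, 0.04, 0.40, 1.00]

print(f"first curve:  AUC = {trapezoid_auc(first_fpr, first_tpr):.3f}")
print(f"better curve: AUC = {trapezoid_auc(better_fpr, better_tpr):.3f}")
```

Under this straight-line assumption the first curve has an AUC of about 0.71 and the more optimal curve about 0.97.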
Area Under the Curve (AUC)

Interpretation
The AUC serves as a quantitative summary of the strength of association between the underlying test statistic and disease status. An AUC of 1.0 would mean that the test statistic could be used to perfectly discriminate between cases and controls. An AUC of 0.5 (reflected by the 45° diagonal line) is equivalent to simply guessing.

Interpretation
The AUC can be shown to equal the Mann-Whitney U statistic (scaled by the product of the numbers of cases and controls), or equivalently the Wilcoxon rank-sum statistic, for testing whether the test measure differs for individuals with and without disease. It also equals the probability that the value of our test measure would be higher for a randomly chosen case than for a randomly chosen control (with ties counted as one half).
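The probabilistic interpretation can be sketched directly: compare every case with every control and count how often the case has the higher value. The scores below are hypothetical, the function name pairwise_auc is illustrative, and counting ties as one half is the usual convention for discrete scores.

```python
def pairwise_auc(case_scores, control_scores):
    """Fraction of (case, control) pairs in which the case scores higher.

    Ties count as one half. This equals the Mann-Whitney U statistic
    divided by (number of cases * number of controls).
    """
    wins = 0.0
    for case in case_scores:
        for control in control_scores:
            if case > control:
                wins += 1.0
            elif case == control:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))


# Hypothetical numbers of risk factors for a few cases and controls.
case_scores = [3, 4, 2, 5, 3, 1]
control_scores = [1, 0, 2, 1, 3, 0, 1, 2]
print(f"AUC = {pairwise_auc(case_scores, control_scores):.3f}")
```

The double loop is quadratic in the sample size, which is fine for illustration; rank-based formulas give the same value more efficiently on large samples.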

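Interpretation
[Figure: distributions of the test measure for controls and cases, with the resulting ROC curve (TPR vs. FPR); AUC ~ 0.54]

Interpretation
[Figure: distributions of the test measure for controls and cases, with the resulting ROC curve (TPR vs. FPR); AUC ~ .95]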

Interpretation
What defines a good AUC? Opinions vary. It is probably context specific.

What may be a good AUC for predicting COPD may be very different from what is a good AUC for predicting prostate cancer.

Interpretation
http://gim.unmc.edu/dxtests/roc3.htm
.90-1.0 = excellent
.80-.90 = good
.70-.80 = fair
.60-.70 = poor
.50-.60 = fail
Remember that <.50 is worse than guessing!

Interpretation
www.childrens-mercy.org/stats/ask/roc.asp
.97-1.0 = excellent
.92-.97 = very good
.75-.92 = good
.50-.75 = fair
ROC Curves
Comparing multiple ROC curves
Suppose we have two candidate test statistics to use to create a binary decision rule. Can we use ROC curves to choose an optimal one?
ROC Curves
Adapted from curves at: http://gim.unmc.edu/dxtests/roc3.htm

ROC Curves
http://en.wikipedia.org/wiki/Receiver_operating_characteristic
ROC Curves
We can formally compare AUCs for two competing test statistics, but does this answer our question? AUC speaks to which measure, as a continuous variable, best discriminates between cases and controls. It does not tell us which specific cutpoint to use, or even which test statistic will ultimately provide the best cutpoint.
ROC Curves
Choosing an optimal cutpoint
The choice of a particular Se and Sp should reflect the relative costs of FP and FN results.
What if a positive test triggers an invasive procedure?
What if the disease is life threatening and I have an inexpensive and effective treatment?
How do you balance these and other competing factors?
See excellent discussion of these issues at www.anaesthetist.com/mnm/stats/roc/index.htm
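One simple way to make this trade-off explicit is to assign a cost to each FN and each FP and pick the threshold with the lowest expected cost per person screened. The sketch below applies that idea to the first ROC table. The cost values, the prevalence, and the function name best_threshold are all hypothetical, and a real decision would involve considerations that a single cost ratio cannot capture.

```python
# TPR and FPR from the first summary table (thresholds 6 down to 0).
thresholds = [6, 5, 4, 3, 2, 1, 0]
tpr = [0.00, 0.10, 0.27, 0.45, 0.73, 0.98, 1.00]
fpr = [0.00, 0.11, 0.19, 0.22, 0.27, 0.80, 1.00]


def best_threshold(prevalence, cost_fn, cost_fp):
    """Return (threshold, expected cost per person) minimizing misclassification cost."""
    def expected_cost(i):
        missed_cases = prevalence * (1 - tpr[i])  # probability of a false negative
        false_alarms = (1 - prevalence) * fpr[i]  # probability of a false positive
        return missed_cases * cost_fn + false_alarms * cost_fp

    best = min(range(len(thresholds)), key=expected_cost)
    return thresholds[best], expected_cost(best)


# Hypothetical scenario: a missed case is 20 times as costly as a false alarm,
# and 10% of the screened population has the disease.
threshold, cost = best_threshold(prevalence=0.10, cost_fn=20, cost_fp=1)
print(f"lowest expected cost at threshold {threshold} ({cost:.3f} per person)")
```

When a false negative is made far more costly than a false positive, the rule moves toward calling everyone positive (threshold 0), and the reverse pushes it toward calling no one positive (threshold 6).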
ROC Curves
Generalizations
These techniques can be applied to any binary outcome. It doesn't have to be disease status.
In fact, the use of ROC curves was first introduced during WWII in response to the challenge of how to accurately identify enemy planes on radar screens.
ROC Curves
Final cautionary notes
We assume throughout the existence of a gold standard for measuring disease, when in practice no such gold standard may exist.
Examples: COPD, asthma, even cancer (can we truly rule out cancer in a given patient?)
As a result, even Se and Sp may not be inherently stable test characteristics, but may vary depending on how we define disease and the clinical context in which it is measured.
Are we evaluating the test in the general population or only among patients referred to a specialty clinic? The extent to which P and N are misspecified will differ between these two settings.
