Thomas Brandenburger
South Dakota State University
Alfred Furth
CAPITAL Services
Chet Wiermanski
Aether Analytics LLC
February 6, 2014
Abstract
Data is central to improvements in any predictive model, and credit reporting agencies have
recently enhanced the way their data is stored and reported. This paper discusses how to leverage
the new data by creating credit trend characteristics to account for movements in credit behaviors
that are indicative of future credit performance. A case study was conducted to evaluate the
benefits of the new data coupled with a trend-calculation technique in predicting whether a
customer will charge off their credit card account within the first year of acquisition. Test and
control scorecards were created to evaluate the incremental benefits of adding credit trend
characteristics to a standard credit risk scorecard. Comparing these scorecards, fit statistics
indicated that lift is gained by using trend characteristics. This new technique is one example of
how time-varying data will help credit issuers better predict risk. Consequently, consumers will
benefit from receiving more appropriate offers based on their enhanced risk profile.
Keywords: predictive model; scorecards; logistic regression; data mining; credit cards
Page 2 of 13
1. Introduction
Credit reporting agencies consistently seek to improve the quality of their data, which is
manifested in the reliability of their credit models. A major enhancement to the way
data is stored and reported is emerging. Until recently, credit balances, payments, past due
amounts, and credit limits were available only for the most recent point in time for each account.
The new enhancement adds a time-series dimension to the dollar amounts reported for credit
balances, payments, past due amounts, and credit limits. One way for credit card issuers to
leverage this new data is to create credit trend characteristics that account for movements in
credit behaviors.
In order to assess the potential of this new data and accompanying trend characteristic
technique, a case study was performed on a credit card portfolio. The goal was to ascertain the
benefits of the new data in predicting whether a customer will charge off their credit card
account within the first year of acquisition by using trend characteristics obtained from time-
varying data compared to using standard credit risk characteristics. A standard credit risk
scorecard using static point-in-time characteristics was created and used as the control. Credit
trend characteristics were calculated by estimating the linear trend of the time-varying
characteristics. Combining the trend characteristics with the traditional credit characteristics, a
second scorecard was created and used as the test. Fit statistics were applied to compare the two
models, and the results indicated that significant lift was gained by using trend characteristics. This
new technique is one of many likely to be implemented by modelers to leverage the new data.
2. Background
Credit scoring has evolved into a critical tool for assessing future credit performance
across a variety of important consumer lending dimensions. Lenders invest heavily in the
development of custom credit scoring systems, seeking new data sources and techniques to
improve the performance of their models. Enhanced data is often relied upon to develop more
robust models. Custom credit scores typically rely heavily upon data from credit reporting
agencies. Credit characteristics derived from a consumer credit file typically evaluate a
consumer's lending history at individual points in time and are therefore a static representation of
the consumer's behavior. The few trend elements that could be derived from a traditional
consumer credit report have typically been ignored. The recent availability of monthly time
series dollar amounts populated within credit balances, payments, past due amounts and credit
limit fields may offer significant incremental information over static credit characteristics [1].
Many lenders obtain credit scores and summarized credit characteristics of their credit
portfolios on a monthly basis; however, consumer credit report characteristics evaluated over
multiple points of time have not been relied upon by lenders nor made available from consumer
credit reporting agencies for account acquisition, credit underwriting, or account management
purposes. The reason for not using credit score and/or credit characteristic trends to manage a
credit portfolio stems from the Fair Credit Reporting Act (1996), which requires furnishers and
users of consumer credit report information to ensure that the consumer credit information is
accurate and up to date [2]. For example, if a characteristic was disputed on a consumer's
account with a separate lender, the update would not be reflected in the original lender's
time-varying data. This is why the validated enhanced credit bureau data is a necessity for lenders
using time-varying data. To ensure data accuracy, the U.S. credit reporting system relies heavily
upon consumers to dispute inaccurate information. This requires that consumers be able to
review the data on their credit file and to file a dispute when they believe content on their
credit report is inaccurate. Until recently, the process for consumers to view and dispute the
accuracy of time-varying data did not exist. With this process in place, time-varying data can
now be used for pre-approved credit solicitation, account review and credit underwriting for
credit applications.
Credit bureau data is the foundation of nearly every underwriting tool used to evaluate
the current risk level of consumers. Due to the data's static nature, there has been a lack of focus
on characteristics that measure the direction or stability of risk. The standard set of credit
characteristics available from credit bureaus and custom credit characteristics derived from
consumer credit report information are usually in the form of the number of current credit
behaviors, time since the occurrence of certain events, and combinations of these two
dimensions.
Credit trend characteristics extract and summarize the trend of information associated
with traditional credit scoring characteristics. A traditional credit scoring data set used to create
a custom credit score has rows of distinct customer information. Columns are populated by credit
In the new enhanced credit data, each account will now have rows representing different
points in time for many credit characteristics. To capture the trend found within each of the
credit characteristics, a credit trend characteristic for each credit characteristic per account can be
calculated [3]. This trend is the rate of change of the credit characteristic with respect to time.
Graphically this is equivalent to the slope of the line of best fit through the data points of that
Figure 1 shows the timeline of data used to fit traditional and enhanced models. Traditional
credit characteristics are usually captured in the current month to estimate the performance target
captured in future months. The credit trend characteristics measure the rate of change of
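characteristics over the past several months. The formula for estimating the slope in a simple
linear regression model is given in equation (1), where $n$ is the number of observations, and
$x$ and $y$ are the independent and dependent variables.

$$\hat{\beta} = \frac{n \sum xy - \sum x \sum y}{n \sum x^2 - \left(\sum x\right)^2} \qquad (1)$$

Applying equation (1), a credit trend characteristic representing the rate of change with respect to
time can be calculated. Equation (2) shows the calculation for the credit trend characteristic, $T_{i,j}$,
where the subscripts index the account and the credit characteristic, $n_{i,j}$ is the number of
observations, $t$ is the time of the observation, and $x_{i,j,t}$ is the value of the observation at time $t$.

$$T_{i,j} = \frac{n_{i,j} \sum_{t=1}^{n_{i,j}} t \, x_{i,j,t} - \sum_{t=1}^{n_{i,j}} t \sum_{t=1}^{n_{i,j}} x_{i,j,t}}{n_{i,j} \sum_{t=1}^{n_{i,j}} t^2 - \left(\sum_{t=1}^{n_{i,j}} t\right)^2} \qquad (2)$$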
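The slope calculation behind the credit trend characteristic can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `trend_slope`, the month indexing (0 for the current month, negative values for past months), and the balance figures are all hypothetical.

```python
from typing import Sequence


def trend_slope(times: Sequence[float], values: Sequence[float]) -> float:
    """Least-squares slope of values against times (closed-form formula)."""
    n = len(times)
    if n != len(values):
        raise ValueError("times and values must have the same length")
    if n < 2:
        raise ValueError("at least two observations are needed")
    sum_t = sum(times)
    sum_x = sum(values)
    sum_tx = sum(t * x for t, x in zip(times, values))
    sum_tt = sum(t * t for t in times)
    denominator = n * sum_tt - sum_t ** 2
    if denominator == 0:
        raise ValueError("all observation times are identical")
    return (n * sum_tx - sum_t * sum_x) / denominator


# Hypothetical monthly balances for one account, months -5..0 (0 = current).
months = [-5, -4, -3, -2, -1, 0]
balances = [1000.0, 1100.0, 1200.0, 1300.0, 1400.0, 1500.0]

# Trend characteristic in dollars per month for this account.
balance_trend = trend_slope(months, balances)
```

Because the slope depends only on differences between time points, shifting the time index (for example, 1 to 6 instead of -5 to 0) leaves the result unchanged.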
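A general logistic regression model utilizing trend and standard characteristics is shown in
equation (3), where $p$ is the estimated probability of a binary event, $\beta_0, \dots, \beta_{2m}$ are the
estimated parameters, and $x_k$ and $T_k$ are the $k$-th standard characteristic and its credit trend
characteristic.

$$\log\left(\frac{p}{1-p}\right) = \beta_0 + \beta_1 x_1 + \beta_2 T_1 + \beta_3 x_2 + \beta_4 T_2 + \cdots + \beta_{2m-1} x_m + \beta_{2m} T_m \qquad (3)$$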
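To make the role of the trend term in such a model concrete, the sketch below inverts the logit link to turn a linear combination of characteristics into a probability. It is illustrative only: the function name `charge_off_probability`, the coefficient values, and the use of raw (rather than coarse-classed) characteristic values are assumptions and are not taken from the case study.

```python
import math
from typing import Sequence


def charge_off_probability(
    intercept: float,
    coefficients: Sequence[float],
    characteristics: Sequence[float],
) -> float:
    """Invert the logit link: p = 1 / (1 + exp(-(b0 + sum(b_k * z_k))))."""
    if len(coefficients) != len(characteristics):
        raise ValueError("one coefficient is needed per characteristic")
    log_odds = intercept + sum(
        b * z for b, z in zip(coefficients, characteristics)
    )
    return 1.0 / (1.0 + math.exp(-log_odds))


# Hypothetical parameters, ordered as (utilization, utilization trend).
intercept = -3.0
coefficients = [2.0, 10.0]

# Two applicants with the same current utilization but opposite trends.
p_falling = charge_off_probability(intercept, coefficients, [0.5, -0.05])
p_rising = charge_off_probability(intercept, coefficients, [0.5, 0.05])
```

With a positive coefficient on the trend term, the applicant whose utilization is rising receives the higher estimated probability even though both applicants share the same static characteristic.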
Using credit trend characteristics, customer behavior can be defined with more precision. As an
example, two customers with the same current past due balance can be ranked by their
creditworthiness according to their past due balance credit trend characteristic. The customer whose
current past due balance has been growing will exhibit an upward trend, which corresponds to
a positive trend characteristic. A customer whose delinquent balance has been shrinking over
time will exhibit a downward trend. The second customer may be considered less risky.
Figure 2 illustrates another example where trend characteristics improve the predictive
power of models. ID1 and ID2 represent separate customers. Traditionally the information about
utilization for these potential customers only consists of the most recent month. With the new
data, prior month utilizations are also available. In Figure 2 and Table 1, time zero represents the
current month and the negative times represent past months. The goal is to predict future
Both ID1 and ID2 have the same utilization of 0.5 at time 0. Using a traditional model, they
would be scored the same based on their utilization even though they have different previous
behaviors. ID1 has a decreasing utilization, and ID2 has an increasing utilization. Accounts with
increasing utilization are typically riskier than accounts with decreasing utilization even though
they may have the same current utilization. Using a model that encompasses the past utilization
through trend characteristics, ID1 would appropriately be scored higher than ID2 because ID1
has a negative utilization trend characteristic and ID2 has a positive utilization trend
characteristic.
Since the original characteristics included in the model are also indirectly accounted for in
the credit trend characteristic calculation, the two sets of characteristics are inherently
correlated with each other. When measured empirically, this correlation was not a factor in the
data for the case study. Further, a simulation analysis showed that if the variance of the credit
characteristics is reasonably constant over time, then correlation will not be a major issue.
4. Case Study
A case study was conducted to measure the potential benefits of adding credit trend
characteristics to a standard logistic regression model. Two models were created: a control, or
Champion, model and a test, or Challenger, model. Using credit card solicitation data, the models
were built to predict whether customers will charge off their credit cards within 12 months
after acquisition. To simplify the analysis, the credit characteristics used in the models were
restricted to a subset derived from a credit reporting agency's enhanced data, including monthly
credit balances, payments, past due amounts, credit limits, and calculated utilization credit
characteristics.
The candidate credit characteristics of the Champion model were traditional static credit
characteristics from the most recent month. The candidate characteristics for the Challenger
model were the same candidate characteristics as in the Champion model along with their
corresponding credit trend characteristics.
The data was partitioned into training and validation data sets. Sixty percent of the
sample, making up 3,779 accounts, was allocated to the training data set, and forty percent, making
up 2,520 accounts, was allocated to the validation data set. As a standard practice, each of the
potential credit characteristics was coarse classified by its weights of evidence before the
After coarse classifying the characteristics, the same variable selection process, similar to the
method outlined in [5], was conducted for the Champion and Challenger models. Using
the selected credit characteristics, each model was fit using logistic regression. The credit
characteristics selected and their Wald Chi-square statistics and p-values in each model are shown in
Tables 2 and 3.
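The weight-of-evidence calculation used in the coarse classification step can be sketched as follows. The sign convention (log of the share of good accounts over the share of bad accounts in each class), the function name `weights_of_evidence`, and the bin boundaries and counts are assumptions made for illustration; the paper does not state which convention or classes were used.

```python
import math
from typing import Dict, Tuple


def weights_of_evidence(counts: Dict[str, Tuple[int, int]]) -> Dict[str, float]:
    """Map each bin to ln(share of goods in bin / share of bads in bin).

    counts maps a bin label to (goods, bads). Every bin must contain at
    least one good and one bad account, otherwise the log is undefined.
    """
    total_goods = sum(goods for goods, _ in counts.values())
    total_bads = sum(bads for _, bads in counts.values())
    woe = {}
    for label, (goods, bads) in counts.items():
        if goods == 0 or bads == 0:
            raise ValueError(f"bin {label!r} has an empty class; merge it first")
        woe[label] = math.log((goods / total_goods) / (bads / total_bads))
    return woe


# Hypothetical coarse classes of utilization with (good, bad) account counts.
utilization_bins = {
    "0-30%": (500, 10),
    "30-70%": (300, 20),
    "70-100%": (200, 20),
}
woe_by_bin = weights_of_evidence(utilization_bins)
```

Under this convention, a positive weight marks a class with relatively fewer charge-offs than the portfolio as a whole, and a negative weight marks a riskier class.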
According to the Wald Chi-square statistic, each parameter in both models is significant at the
$\alpha = 0.1$ level of significance. Tables 2 and 3 show that adding the credit trend characteristics
changed which original characteristics were significant, since the trend characteristics contain
some of the same information about the customer's behavior. As an example, the Credit Line
characteristic is no longer significant after adding the trend characteristics because the trend
Table 4 shows the fit statistics of the models, including the Kolmogorov-Smirnov (KS) statistic
and the area under the receiver operating characteristic curve (AUC). The magnitude of the fit
statistics is relatively small because the goal was not to build the best model, but to simply
compare the effect of adding the trend characteristics. The Challenger model has higher fit
statistics than the Champion model in both the training and validation datasets.
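Both fit statistics can be derived from the same ROC points, as the following sketch shows. It is a minimal illustration, not the software used in the case study: the function names, the scores, and the outcomes are hypothetical, with a label of 1 denoting a charge-off and a higher score denoting a higher predicted charge-off probability.

```python
from typing import List, Sequence, Tuple

RocPoints = List[Tuple[float, float]]


def roc_points(scores: Sequence[float], labels: Sequence[int]) -> RocPoints:
    """Return (false positive rate, true positive rate) pairs.

    Accounts are ranked from highest to lowest score; tied scores are
    grouped so that each distinct score contributes a single point.
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("both outcome classes must be present")
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    points = [(0.0, 0.0)]
    true_pos = false_pos = 0
    for index, (score, label) in enumerate(ranked):
        if label == 1:
            true_pos += 1
        else:
            false_pos += 1
        last_of_tie = index + 1 == len(ranked) or ranked[index + 1][0] != score
        if last_of_tie:
            points.append((false_pos / negatives, true_pos / positives))
    return points


def ks_statistic(points: RocPoints) -> float:
    """Largest gap between the cumulative event and non-event rates."""
    return max(abs(tpr - fpr) for fpr, tpr in points)


def area_under_curve(points: RocPoints) -> float:
    """Trapezoidal area under the ROC curve."""
    return sum(
        (fpr1 - fpr0) * (tpr0 + tpr1) / 2.0
        for (fpr0, tpr0), (fpr1, tpr1) in zip(points, points[1:])
    )


# Hypothetical predicted charge-off probabilities and observed outcomes.
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2, 0.1]
labels = [1, 1, 0, 1, 0, 0, 1, 0]

points = roc_points(scores, labels)
ks = ks_statistic(points)
auc = area_under_curve(points)
```

Computing both statistics for the Champion and Challenger scores on the same validation accounts gives a like-for-like comparison of how well each model separates charge-offs from non-charge-offs.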
Figure 3 shows the receiver operating characteristic (ROC) curves for the Champion and
Challenger models.
On the validation data set, at a false positive rate of 0.2, the Challenger model
attains a true positive rate of 0.36, while the Champion model attains a true positive
rate of only 0.26. A company could target the customers least likely to charge off with credit card offers and
5. Conclusion
Traditionally, data provided by credit reporting agencies has been very reliable at
associated with the credit characteristics studied enhanced the ability to predict creditworthiness
over traditional credit bureau information. Credit reporting agencies are on the brink of providing
new time-varying monthly information which will provide modelers the trend perspective to
To leverage this new enhanced data, users of consumer credit report information will
need to utilize new modeling techniques and invest in credit bureau aggregation software that
will allow them to create custom credit trend characteristics tailored to specific aspects of the
portfolios and target. By adding credit trend characteristics into their credit scoring and credit
decision support platforms, lenders and users of consumer credit report information should
expect to derive significant improvement across all aspects of the consumer credit life cycle.
References
3. Morrison J, Marrying Credit Scoring and Time-Series Data. The RMA Journal:
5. Thomas L, Edelman D, Crook N. Credit Scoring and Its Applications. Society for
Figures