Optimization Heuristics for Determining Internal Rating Grading Scales

Keywords

Credit Risk, Probability of Default, Clustering, Threshold Accepting, Differential Evolution.

Review Status

General review by COMISEF Wiki Admin, 10/11/2008

Abstract

Basel II imposes minimum capital requirements on banks related to the default risk of their credit portfolio. Banks using an internal rating approach compute the minimum capital requirements from pooled probabilities of default. These pooled probabilities can be calculated by clustering credit borrowers into different buckets. The clustering problem can become very complex when the Basel II regulations and real-world constraints are taken into account. Search heuristics have already shown remarkable performance in tackling this problem despite its complexity. A Threshold Accepting algorithm is proposed, which exploits the inherent discrete nature of the clustering problem. It is demonstrated that it can be a valuable alternative to methodologies already proposed in the literature, such as standard k-means and Differential Evolution. Besides considering several clustering objectives for a given number of buckets, we extend the analysis further by introducing new methods to determine the optimal number of buckets in which to cluster the bank clients.

1 Basel II and Clustering of Credit Risk

1.1 Minimum Capital Requirements (MCR)

Basel II places its emphasis on the adequacy of shareholders' capital for a given risk profile. The bank's Value at Risk (VaR) for borrower i is equal to this debtor's exposure at default (EADi) times the fraction (loss given default, LGDi) of EADi that may not be recovered.

A bank may account for the expected part of VaR (i.e., VaR times borrower i's probability of default PDi) by provisioning. However, under negative economic conditions the conditional probability of default (PDc,i) is likely to exceed the expected PDi and thus may cause losses in excess of provisions. In order to ensure the stability of the banking system, banks are required by Basel II to hold regulatory capital (RC) that is related to these unexpected losses.
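As a small worked illustration of the quantities above, the following Python sketch computes VaR, the provisioned expected part, and the loss in excess of provisions under stress for a single borrower. The function names and the numbers are hypothetical, and the stress shortfall shown here is only the simple difference described in the text, not the full Basel II formula for RC.

```python
def value_at_risk(ead, lgd):
    """Loss if borrower i defaults: exposure at default times loss given default."""
    return ead * lgd


def expected_loss(ead, lgd, pd):
    """Expected part of VaR, which the bank may cover by provisioning."""
    return value_at_risk(ead, lgd) * pd


def stress_shortfall(ead, lgd, pd, pd_conditional):
    """Loss in excess of provisions if the conditional PD materialises."""
    return value_at_risk(ead, lgd) * (pd_conditional - pd)


# Hypothetical borrower: EAD = 100 000, LGD = 45%, PD = 2%, conditional PD = 10%.
var_i = value_at_risk(100_000, 0.45)
el_i = expected_loss(100_000, 0.45, 0.02)
gap_i = stress_shortfall(100_000, 0.45, 0.02, 0.10)
```

With these assumed inputs the provisions cover only a small part of the loss that arises under stress, which is the gap that regulatory capital is meant to address.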

1.2 Credit Risk Rating

In order to compute the minimum regulatory capital most accurately, banks assess the clients' "riskiness" by evaluating their probability of default over the subsequent 12 months. Afterwards, clients are pooled together in buckets, and all clients in a bucket are assigned the same "pooled" PD. RC can then be computed by treating the mean PD of all borrowers in bucket b as a proxy for an individual borrower's PD.
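The pooling step can be sketched in a few lines of Python. This is only an illustration: the function name `pooled_pds`, the threshold values, and the sample PDs are hypothetical, and buckets are assumed to be defined by sorted upper PD thresholds.

```python
from bisect import bisect_right
from statistics import mean


def pooled_pds(pds, thresholds):
    """Return the mean PD of each bucket.

    `thresholds` are sorted PD values separating adjacent buckets, so
    len(thresholds) + 1 buckets are formed. Empty buckets yield None.
    """
    buckets = [[] for _ in range(len(thresholds) + 1)]
    for pd in pds:
        # bisect_right gives the index of the bucket the PD falls into.
        buckets[bisect_right(thresholds, pd)].append(pd)
    return [mean(b) if b else None for b in buckets]


# Six hypothetical borrowers pooled into three buckets.
sample_pds = [0.001, 0.003, 0.01, 0.02, 0.08, 0.12]
pooled = pooled_pds(sample_pds, thresholds=[0.005, 0.05])
```

Each borrower in a bucket would then be treated as if its PD were the bucket mean, which is the source of the approximation error discussed next.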

Computing RC from pooled PDs results in an approximation error. Therefore, Basel II requires banks to assign borrowers to buckets meaningfully. We propose a methodology not only to tackle the problem of determining the width of the PD buckets (clustering the clients into a given number of PD buckets), but also to determine the optimal number of buckets in which to partition the bank's clients and to validate the classification system ex post.

This problem is complex to tackle since there is a trade-off between having a small number of large buckets and a large number of small buckets. Clients belonging to the same bucket are assigned the same pooled PD. Hence, we would like to have a large number of buckets in order to minimize the loss of precision. However, in such a case it would be difficult to validate the consistency of the rating scheme ex post, since the number of defaults in each bucket would probably be too low for statistical validation. Conversely, if the number of buckets into which the clients are partitioned is small, buckets tend to be too wide, which might lead to an overstatement of the capital charge.

2 Two Optimization Heuristics for Credit Risk Bucketing

To determine the optimal PD bucket structure we introduce two heuristic optimization techniques: the Threshold Accepting (TA) and the Differential Evolution (DE) algorithm. We treat the PD bucketing problem as a clustering problem, i.e., we want to determine the optimal partition of N bank clients into B buckets with respect to a given objective function and subject to some constraints.

Given the inherent discrete nature of the problem, TA is a better alternative than DE. While both heuristics find reliable solutions and need little or no parameter tuning, TA is faster (lower computational load). This is because TA alters only one bucket threshold per iteration, so only the fitness of the two affected buckets has to be recomputed before updating the objective function value of the current solution. As a consequence, we observe a tremendous increase in search speed. In DE, by contrast, all buckets are altered in every iteration, so the fitness of all buckets has to be evaluated each time.
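The local update described above can be illustrated with a minimal Threshold Accepting sketch in Python. It is not the implementation used for the results reported here. The objective (within-bucket sum of squared deviations from the bucket mean), the linearly decreasing threshold sequence, the starting threshold `tau0`, and all names are assumptions made for illustration, and none of the Basel II constraints are enforced.

```python
import random
from itertools import accumulate


def threshold_accepting(pds, n_buckets, n_rounds=10, steps_per_round=2000, seed=0):
    """Partition sorted PDs into contiguous buckets with a TA sketch.

    Returns (cuts, cost); bucket b holds sorted(pds)[cuts[b]:cuts[b + 1]].
    """
    rng = random.Random(seed)
    x = sorted(pds)
    n = len(x)
    if not 1 <= n_buckets <= n:
        raise ValueError("need 1 <= n_buckets <= number of borrowers")

    # Prefix sums make the cost of any bucket x[i:j] an O(1) lookup.
    s1 = [0.0] + list(accumulate(x))
    s2 = [0.0] + list(accumulate(v * v for v in x))

    def bucket_cost(i, j):
        # Sum of squared deviations from the bucket mean (assumed objective).
        total = s1[j] - s1[i]
        return (s2[j] - s2[i]) - total * total / (j - i)

    # Start from equally populated buckets; the end points 0 and n stay fixed.
    cuts = [k * n // n_buckets for k in range(n_buckets + 1)]
    current = sum(bucket_cost(cuts[b], cuts[b + 1]) for b in range(n_buckets))
    best_cost, best_cuts = current, cuts[:]
    if n_buckets == 1:
        return best_cuts, best_cost

    tau0 = 0.05 * current  # hypothetical starting threshold
    for r in range(n_rounds):
        tau = tau0 * (n_rounds - 1 - r) / n_rounds  # decreases to zero
        for _ in range(steps_per_round):
            k = rng.randrange(1, n_buckets)  # pick one interior threshold
            left, right = cuts[k - 1], cuts[k + 1]
            new_cut = rng.randint(left + 1, right - 1)
            if new_cut == cuts[k]:
                continue
            # Only the two buckets adjacent to threshold k change.
            old = bucket_cost(left, cuts[k]) + bucket_cost(cuts[k], right)
            new = bucket_cost(left, new_cut) + bucket_cost(new_cut, right)
            delta = new - old
            if delta <= tau:  # accept improvements and small deteriorations
                cuts[k] = new_cut
                current += delta
                if current < best_cost:
                    best_cost, best_cuts = current, cuts[:]
    return best_cuts, best_cost


# Hypothetical PDs with three visible groups.
pds = [0.001, 0.002, 0.05, 0.051, 0.052, 0.053, 0.2, 0.21, 0.22, 0.23]
cuts, cost = threshold_accepting(pds, n_buckets=3)
```

Because a move shifts a single threshold, the change in the objective is obtained from two bucket evaluations regardless of the total number of buckets. A DE-style update that perturbs every threshold would instead have to re-evaluate all buckets for each candidate.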

3 Endogenous Determination of Number of Buckets

Validation of Actual Number of Defaults

In order for defaults to be predicted correctly, the actual number of defaults in any bucket b should lie inside the interval between the minimum and maximum predicted number of defaults. Then, we can state with confidence $1-\alpha$ that the credit risk rating system is suitable for predicting defaults in that bucket.

A crucial factor driving the precision of any ex post evaluation is the number of borrowers per bucket. Accordingly, the bounds on the number of defaults, $D_{b,min}$ and $D_{b,max}$, are based on the mean predicted $\overline P \overline D_{b} \pm \varepsilon$ and the number of borrowers $N_{b}$.

Given that the actual default for a loan is a binary variable, the number of actual defaults within a bucket can be modeled by the binomial distribution. Consequently, a $1-\alpha$ confidence interval for $D_{b}^{a}$ is defined by:

(1)
\begin{eqnarray} P_{int} & = & P_{b} \left( D_{b,min} \leq D^a_b \leq D_{b,max} \right) \nonumber \\ {} & = & \sum_{k=D_{b,min}}^{D_{b,max}} {N_b \choose k} \cdot \overline P\overline D_b^k \cdot \left(1- \overline P\overline D_b \right)^{N_b-k} \nonumber \\ {} & \geq & 1-\alpha. \end{eqnarray}

The choice variable is $N_{b}$. For a given bucket b of size $N_{b}$ we just have to check whether the constraint $P_{int}\geq 1-\alpha$ is met.
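The check of the constraint can be sketched in Python as follows. The function names are hypothetical, the default bounds are passed in explicitly as integers because the rounding of $N_{b}\cdot(\overline P \overline D_{b} \pm \varepsilon)$ is not specified here, and the mean PD is assumed to lie strictly between 0 and 1. The binomial terms are evaluated in log space so that large buckets do not overflow.

```python
from math import exp, lgamma, log, log1p


def interval_probability(n_b, pd_b, d_min, d_max):
    """P(d_min <= D <= d_max) for D ~ Binomial(n_b, pd_b), with 0 < pd_b < 1."""
    d_min = max(d_min, 0)
    d_max = min(d_max, n_b)
    total = 0.0
    for k in range(d_min, d_max + 1):
        # Log of the binomial coefficient via the log-gamma function.
        log_choose = lgamma(n_b + 1) - lgamma(k + 1) - lgamma(n_b - k + 1)
        total += exp(log_choose + k * log(pd_b) + (n_b - k) * log1p(-pd_b))
    return total


def meets_constraint(n_b, pd_b, d_min, d_max, alpha):
    """True if the interval probability is at least 1 - alpha."""
    return interval_probability(n_b, pd_b, d_min, d_max) >= 1 - alpha


# Hypothetical bucket: mean PD of 2% and bounds corresponding to +/- 1 percentage point.
large_ok = meets_constraint(1000, 0.02, 10, 30, alpha=0.05)
small_ok = meets_constraint(100, 0.02, 1, 3, alpha=0.05)
```

In this assumed example the bucket with 1000 borrowers satisfies the constraint while the bucket with 100 borrowers does not, which illustrates why the number of borrowers per bucket drives the feasibility of the ex post validation.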

Using this concept, we define a credit classification system as meaningful if it allows for an ex post validation at a given level of precision as described by the two parameters $\alpha$ and $\varepsilon$. If we find the actual number of defaults in any bucket b to lie outside the interval $[D_{b,min};D_{b,max}]$ we can state with confidence $1-\alpha$ that the credit risk rating system is not suitable for predicting defaults in that bucket.

However, not all combinations of $\alpha$ and $\varepsilon$ will be feasible for a given total number of loans once the other constraints imposed by the Basel II framework are taken into account. In fact, a rough back-of-the-envelope calculation shows that values of $\alpha = 0.95$ and $\varepsilon= 1\%$ would require more than 60 000 observations (borrowers), while our sample has only 11 995.

Validating Unexpected Losses

An alternative to an ex post validation of predicted default rates is the validation of the correct statement of unexpected losses. Banks should set aside equity capital $RC = 1.06\cdot UL$ to cover unexpected losses. However, they do not want to hold more than the supervisory authority's minimum requirements. This objective can be modeled by stating that in no bucket b shall the actual unexpected losses in a stress situation, $UL_{b}^{a}$, deviate from the predicted unexpected losses $UL_{b}$ by more than some fraction $\varepsilon$ of bucket b's stake in total unexpected losses, as measured by the share of its borrowers $N_{b}$ in the number of all borrowers N.

(2)
\begin{eqnarray} UL_b - \varepsilon \cdot \left( UL \cdot \frac{N_b}{N} \right) \leq UL_{b}^{a} \leq UL_{b} + \varepsilon \cdot \left( UL \cdot \frac{N_{b}}{N} \right) \, . \end{eqnarray}

Since we do not know the distribution of unexpected losses, equation (2) is not operational. Because we do know the distribution of defaults, we transform equation (2). We approximate $UL_{b}^{a}$ by $N_{b}\cdot \overline U \overline L_{b}$, then divide the equation by $\overline U \overline L_b$ and multiply it by $\overline P \overline D_{b}$. For the equation to hold, the actual number of defaults in bucket b must lie within an interval $[D_{b,min};D_{b,max}]$.

4 Results

First results indicate that the best number of buckets is between 10 and 12 for different objective functions. For a seven-bucket setting, an idealized solution vector of the buckets' mean PDs looks as follows,

$g_{s}=\left( 0.25\% ;0.55\% ;1.5\% ;4\% ;8\% ;14\% ;21\% \right)$.

The UL-constraint shapes the solution in such a way that we must not reject the validity of the credit risk rating system if we find ex post actual mean PDs $\overline P \overline D_{b}$ that deviate from the predicted mean PDs by less than the allowed deviations. These allowed deviations (in percentage points) are given by,

$d = (0.2\%; 0.25\%; 0.4\%; 1\%; 1.8\%; 3.5\%; 6.5\%)$.

The constraints for the different buckets are reasonable. The constraint on the first bucket is quite generous, as this bucket contains good borrowers that are unlikely to default. It is restrictive for mid-range borrowers (allowing actual mean PDs to deviate from predicted mean PDs by only roughly 1/4). This is reasonable since it is highly uncertain whether these borrowers will default, and they therefore cause a high unexpected risk for the bank. The UL-constraint becomes more generous again for the last bucket (allowing actual mean PDs to deviate from predicted mean PDs by roughly 1/3). These borrowers' default is quite likely, so that high provisions have already been recognized. Hence, a smaller portion of their default risk must be backed by capital requirements.
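The relative deviations quoted above can be recomputed directly from the two vectors $g_{s}$ and $d$. The short Python sketch below does this; the function name is hypothetical and the values are copied from the vectors given in this section (in percent).

```python
def relative_deviations(mean_pds, allowed):
    """Allowed deviation of each bucket as a fraction of its mean PD."""
    return [d_b / g_b for d_b, g_b in zip(allowed, mean_pds)]


g_s = [0.25, 0.55, 1.5, 4, 8, 14, 21]    # idealized mean PDs per bucket, in %
d = [0.2, 0.25, 0.4, 1, 1.8, 3.5, 6.5]   # allowed deviations, in percentage points
ratios = relative_deviations(g_s, d)
```

The ratios are approximately 0.80, 0.45, 0.27, 0.25, 0.225, 0.25 and 0.31, which matches the description: generous for the first bucket, close to 1/4 for the mid-range buckets, and close to 1/3 for the last bucket.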

5 Conclusions/Further Work

• Basel II requires banks to group loans according to their creditworthiness and to set aside equity in order to self-insure against unexpected losses from borrowers' defaults under negative economic conditions.
• This task can be tackled as a clustering problem in which real-world constraints, imposed by the Basel capital accord, increase the complexity of the optimization problem. We suggest the application of TA. Further, we propose a way to determine the optimal number of buckets based on ex post validation.
• Further research on this topic is required using larger datasets. Moreover, it is of special interest which confidence and precision levels may be used for different sample sizes. Finally, although the constraint imposed on unexpected losses has strong theoretical support, one might also consider alternative formulations (resulting in a lower computational complexity) for calculating the constraints. The efficiency of the algorithm could thereby be improved even further.