Advances in Social Sciences Research Journal – Vol. 11, No. 3

Publication Date: March 25, 2024

DOI:10.14738/assrj.113.16544.

Bakker, D. K., & Ong’eta, O. J. (2024). An Application of Multinomial Mis-Classification Cost Matrix For A P- To – P Lending Credit

Score. Advances in Social Sciences Research Journal, 11(3). 76-89.

Services for Science and Education – United Kingdom

An Application of Multinomial Mis-Classification Cost Matrix For

A P- To – P Lending Credit Score

Bakker, Daniel K.

University of Eastern Africa, Baraton

Ong’eta, Oyaro Jackson

University of Eastern Africa, Baraton

ABSTRACT

An emerging form of online lending, in which lenders provide loans to borrowers directly

rather than through traditional sources of finance such as banks and building societies, is

termed peer-to-peer (P2) lending. Many of these loans are unsecured personal loans, so credit

scoring of loans is vital to control default risk and improve profit for lenders and

platforms. Standard two-fold (binary) classifiers may not be appropriate for this kind of lending, since there

are multiple credit classes and misclassification costs vary widely across classes on the

lending platforms. Cost-sensitive classifiers have been studied extensively for this type of

lending, but no study has analyzed the issue from the perspective of multinomial

classification or measured the misclassification costs, $C_{ab}^{(1)}$ and $C_{ab}^{(2)}$ for grades $a$ and $b$, of different

credit grades using actual losses and opportunity costs. This research models

credit scoring in P2 lending as a cost-sensitive multinomial classification problem. A

misclassification cost matrix is proposed for credit scoring with a set of equations and

models to estimate the costs. A replication study using publicly available data is

conducted to evaluate the performance and validate the usefulness of the proposed

misclassification cost matrix with the help of an R statistical package developed to aid the

application of the model. The results show that cost-sensitive multinomial

classifiers can significantly decrease the total cost, which is vital for the survival and

profitability of P2 lending.

Keywords: Misclassification Cost Matrix, Default Risk, P2, Two-fold Classifier, R Package.

INTRODUCTION

In recent years, online peer-to-peer (P2) loaning has developed in the credit market as a

popular form of personal loan. It moves the traditional face-to-face personal loan onto

online services [1]. It is cost-effective, pervasive, convenient, and efficient, and it operates without the involvement

of traditional financial institutions [11].

P2 loaning, in comparison with the traditional banking system, has the following features.

• It facilitates transactions by linking borrowers and creditors directly. There are

electronic loan application forms filled in by the borrowers, including amounts, terms,

purposes, and personal information (such as age, job, address, and credit card among

others). Platforms provide available financial conditions and credit histories of

borrowers to creditors, who decide whether to grant a loan, for what amount, and at what

interest rate. Platforms use various approaches to help lenders set interest rates. Some

platforms hold an auction in which a borrower sets her/his maximum interest rate

and creditors submit their bids [9]. Another approach is to assign interest rates

automatically using borrowers’ credit grades, which are calculated based on borrowers’

characteristics [6].

• P2 loaning platforms charge service fees for transactions [13], instead of charging

borrowers interest rates above the cost of money as traditional financial

institutions do. The P2 loaning process benefits both borrowers and lenders: borrowers

can obtain money at lower cost than from traditional financial institutions, while creditors can

earn more than they would by depositing their money in banks. This benefit comes with the risk

of borrowers defaulting on their loans, because many P2 loans are unsecured personal

loans and most creditors may have little knowledge of credit risk management [21].

To control default rates and risks, P2 loaning platforms have developed classification

models that evaluate the credit risk of loans and borrowers and suggest appropriate interest rates

for loan applications. The quality of these models is vital to the credit risk management and

sustainability of P2 lending platforms. Drawing on the experience of financial institutions, P2 loaning

platforms adopt and develop classification algorithms to categorize borrowers into different

credit grades based on their characteristics and credit history, and to identify potential

borrowers who are likely to default [8,14].

Although it is common practice in traditional credit rating to use standard cost-insensitive

binary classification algorithms [15,16], such as logistic regression, neural networks, and

decision trees [4], these are not appropriate for P2 loaning, for the following reasons:

• There are more than two credit grades in P2 lending, and each credit grade

implies a certain level of risk, so multinomial classification should be

considered in P2 credit grading.

• P2 credit data are imbalanced, meaning that the number of samples varies across credit

grades. For example, the number of ideal borrowers in the best grade, or of high-risk

borrowers in the worst grade, is much smaller than the number in the other grades.

• Misclassification costs are not uniform across classes in P2 loaning. In general, the cost

of committing a type II error (classifying a loan with bad credit as a good one) is

greater than that of a type I error (classifying a good one as bad) [5].

In a multinomial credit-grading scenario, classifying a sample of grade C into grade A, for

example, is more costly than classifying B into A. Therefore, standard cost-insensitive

multinomial classification, in which all errors have the same cost, is not suitable for credit rating

in P2 loaning. Cost-sensitive multinomial classifiers are a better fit for credit rating in P2 loaning.

Cost-sensitive classifiers were developed for imbalanced data classification [15]. Various

cost-sensitive classifiers have been proposed for credit rating [2], with the aim of minimizing

the total cost measured by a misclassification cost matrix [10]. Such a cost matrix is therefore

essential for cost-sensitive classification problems.
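
To make the role of the cost matrix concrete, the following minimal Python sketch computes the total misclassification cost of a set of predictions and compares it with accuracy. The three grades, the cost values, and the names `COST`, `total_cost`, and `accuracy` are illustrative assumptions, not figures or code from this study. The only property taken from the text is that classifying grade C as A costs more than classifying B as A.

```python
# Illustrative sketch only: grades, costs and names are assumptions,
# not values or code from the paper.

# COST[true][predicted]: cost of assigning `predicted` to a loan whose real
# grade is `true`. Correct classifications cost 0, and C -> A is costlier
# than B -> A, as argued in the text.
COST = {
    "A": {"A": 0.0, "B": 1.0, "C": 2.0},
    "B": {"A": 3.0, "B": 0.0, "C": 1.0},
    "C": {"A": 6.0, "B": 3.0, "C": 0.0},
}


def total_cost(true_grades, predicted_grades, cost=COST):
    """Sum the cost-matrix entries over all (true, predicted) pairs."""
    if len(true_grades) != len(predicted_grades):
        raise ValueError("inputs must have the same length")
    return sum(cost[t][p] for t, p in zip(true_grades, predicted_grades))


def accuracy(true_grades, predicted_grades):
    """Fraction of loans assigned to their true grade."""
    hits = sum(t == p for t, p in zip(true_grades, predicted_grades))
    return hits / len(true_grades)


truth = ["A", "B", "B", "C", "C"]
pred_1 = ["A", "A", "B", "C", "C"]  # one error: B classified as A
pred_2 = ["A", "B", "B", "A", "C"]  # one error: C classified as A

# Both prediction sets contain a single error, so their accuracy is equal,
# but the cost matrix penalizes the C -> A error more heavily.
print(accuracy(truth, pred_1), total_cost(truth, pred_1))
print(accuracy(truth, pred_2), total_cost(truth, pred_2))
```

In this toy setting an accuracy-based comparison cannot separate the two prediction sets, whereas the total cost can, which is the motivation for evaluating P2 credit grading with a cost matrix.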

According to the literature on P2 lending [21], few studies of cost-sensitive classifiers in P2 loaning

have been done, and none of them has analyzed this issue from the perspective of

multinomial classification and measured the misclassification costs of different credit grades in

P2 loaning.

Misclassification costs are the losses in creditors’ earnings caused by misclassifying the credit grades of

loans. Such a cost equals the difference between the return on a loan when it is correctly classified and

the return on the loan when it is misclassified into another credit grade. This difference arises in one of

the following situations:

i. If a loan is classified into a better credit grade with a lower interest rate, the default risk

of the loan is underestimated and its interest rate is set lower than it should

be, which means that the interest may be insufficient to cover the risk that the lender bears.

The creditor loses potential returns that could have been earned, including the unpaid

compensation for risk that the borrower should pay on the higher-risk loan.

ii. If a loan is classified into a worse credit grade with a higher interest rate, the borrower might

be scared away or become more likely to default, which causes opportunity

costs and financial losses to lenders.
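
Situation (i) can be illustrated numerically. The Python sketch below is a minimal example under simplifying assumptions that are not taken from this study: a one-period loan, full loss of the principal on default, and hypothetical interest rates and default probability. The function names `expected_return` and `underpricing_cost` are likewise hypothetical, and the opportunity costs of situation (ii) are not modelled.

```python
# Minimal sketch under stated assumptions (not the paper's equations):
# one-period loan, full loss of principal on default, hypothetical rates.


def expected_return(principal, rate, default_prob):
    """Expected profit: the loan is repaid with interest with probability
    (1 - default_prob); otherwise the whole principal is lost."""
    return (1.0 - default_prob) * principal * (1.0 + rate) - principal


def underpricing_cost(principal, true_rate, assigned_rate, default_prob):
    """Situation (i): expected return at the correct grade's rate minus the
    expected return at the lower rate of the wrongly assigned grade."""
    correct = expected_return(principal, true_rate, default_prob)
    misclassified = expected_return(principal, assigned_rate, default_prob)
    return correct - misclassified


# A loan that deserves a 15% rate is placed in a better grade priced at 8%.
cost = underpricing_cost(
    principal=1000.0, true_rate=0.15, assigned_rate=0.08, default_prob=0.05
)
print(round(cost, 2))
```

Under these assumptions the cost reduces to principal × (1 − default probability) × (true rate − assigned rate), so it grows with the gap between the two grades' interest rates.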

Objective of the Study

This paper proposes a multinomial cost matrix that measures the misclassification costs of P2 credit

grading by considering the real losses and opportunity costs associated with P2 loaning. A set of

equations and models is developed to calculate these misclassification costs. The

parameters in the proposed equations and models are designed to calculate the cost matrix and

support P2 loaning platforms’ operations. A case study using data from one of the approved

lending institutions in the sub-Saharan region of Africa is conducted to demonstrate the

performance of the proposed cost matrix with some well-known cost-sensitive classifiers.

The results show that the proposed cost matrix not only reveals the sources of losses caused

by misclassifications but also reduces the total costs for P2 platforms compared with

cost-insensitive classification algorithms.

LITERATURE REVIEW

Most classifiers aim to maximize accuracy and minimize misclassifications. Various

classification methods have been proposed for credit rating and risk management [17].

Standard classifiers treat all misclassification costs as equal, which is not true in real

credit risk management. Many studies support the use of cost-sensitive classifiers in credit

rating. Sahin et al. (2013) proposed a cost-sensitive decision tree approach with varying

misclassification costs, which was successfully used in credit card fraud detection to decrease financial

losses. Alejo, Garcia, Marques, Sanchez, and Antonio-Velazquez (2013) improved the Multilayer

Perceptron neural network using three misclassification cost functions, which can be used to

improve prediction in credit rating. The authors of [2] suggested example-dependent cost-sensitive methods and proposed logistic regression and decision trees for credit scoring.
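
One generic way to make a probabilistic classifier cost-sensitive is to predict the class with the lowest expected cost rather than the class with the highest probability. The Python sketch below illustrates this rule. It is offered as an illustration and not as the method of any study cited above, and the probabilities, costs, and function names are hypothetical.

```python
# Generic minimum-expected-cost decision rule; all numbers are hypothetical.


def min_expected_cost_class(probs, cost):
    """probs[i]: estimated probability that the true class is i.
    cost[i][j]: cost of predicting class j when the true class is i.
    Returns the index j minimizing sum_i probs[i] * cost[i][j]."""
    n = len(probs)
    expected = [sum(probs[i] * cost[i][j] for i in range(n)) for j in range(n)]
    return min(range(n), key=expected.__getitem__)


def most_probable_class(probs):
    """Cost-insensitive rule: pick the class with the highest probability."""
    return max(range(len(probs)), key=probs.__getitem__)


# Classes 0, 1, 2 stand for credit grades A, B, C.
cost = [
    [0.0, 1.0, 2.0],
    [3.0, 0.0, 1.0],
    [6.0, 3.0, 0.0],
]
probs = [0.5, 0.3, 0.2]

print(most_probable_class(probs))            # cost-insensitive choice
print(min_expected_cost_class(probs, cost))  # cost-sensitive choice
```

With these numbers the cost-insensitive rule selects grade A (index 0), while the cost-sensitive rule selects grade B (index 1), because wrongly granting grade A to a B or C borrower is expensive under the assumed costs.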

Cost Matrices

Misclassification costs can be described by a cost matrix $C = (c_{ij})_{m \times m}$, where $c_{ij}$ indicates the cost

due to misclassifying an instance of class $i$ as class $j$, and $m$ is the number of classes. In

credit rating, the measurement of misclassification costs in C is not only a basic component of

cost-sensitive classification, but also vital for high-quality credit rating. Real financial

indicators, like profit-based or financial loss-related measures, are well aligned with the