Advances in Social Sciences Research Journal – Vol. 11, No. 3
Publication Date: March 25, 2024
DOI:10.14738/assrj.113.16544.
Bakker, D. K., & Ong’eta, O. J. (2024). An Application of Multinomial Mis-Classification Cost Matrix For A P- To – P Lending Credit
Score. Advances in Social Sciences Research Journal, 11(3). 76-89.
Services for Science and Education – United Kingdom
An Application of Multinomial Mis-Classification Cost Matrix For
A P- To – P Lending Credit Score
Bakker, Daniel K.
University of Eastern Africa, Baraton
Ong’eta, Oyaro Jackson
University of Eastern Africa, Baraton
ABSTRACT
An emerging form of online credit, distinct from traditional sources of finance such as
banks and building societies, in which lenders provide loans to borrowers directly, is termed
peer-to-peer (P2) lending. Many of these credits are unsecured personal loans, so credit
scoring of loans is vital to control default risk and improve profit for lenders and
platforms. Standard two-fold (binary) classifiers may not be appropriate in this lending since
there are multiple credit classes and misclassification costs vary widely across classes in the
lending platforms. Cost-sensitive classifiers have been studied extensively in this type of
lending, but none of the studies has analyzed the issue from the perspective of multinomial
classification and measured the misclassification costs, $C^{(1)}_{ab}$ and $C^{(2)}_{ab}$ ($a \neq b$), of different
credit grades using actual losses and opportunity costs. This research models credit scoring
in P2 lending as a cost-sensitive multinomial classification problem. A misclassification cost
matrix is proposed for credit scoring, with a set of equations and models to estimate the
costs. A replication study using publicly available data is conducted to evaluate the
performance and validate the usefulness of the proposed misclassification cost matrix, with
the help of an R statistical package developed to aid the
application of the model. The results show that cost-sensitive multinomial classifiers can
significantly decrease the total cost, which is vital for P2 survival and profitability.
Keywords: Misclassification Cost Matrix, Default Risk, P2, Two-fold Classifier, R Package.
INTRODUCTION
In recent years, online peer-to-peer (P2) loaning has developed in the credit market as a
popular form of personal loan. It moves the traditional face-to-face personal loan to online
services [1]. It is cost-effective, pervasive, convenient, and efficient, and it works without the
involvement of traditional financial institutions [11].
P2 loaning, in comparison with the traditional banking system, has the following features:
• It facilitates transactions by linking borrowers and creditors directly. There are
electronic loan application forms filled in by the borrowers, including amounts, terms,
purposes, and personal information (such as age, job, address, and credit card among
others). Platforms provide available financial conditions and credit histories of
borrowers to creditors, who will decide whether to grant a loan, the amount and at what
interest rate. Platforms use various approaches to help lenders set interest rates. Some
platforms carry out an auction in which a borrower sets her/his maximum interest rate
and creditors submit their bids [9]. Another approach is to assign interest rates
automatically using borrowers’ credit grades, which are calculated based on borrowers’
characteristics [6].
• P2 loaning platforms charge service fees for transactions [13], instead of charging
borrowers interest rates above the cost of money, as traditional financial institutions
do. The P2 loaning process benefits both borrowers and lenders: borrowers can obtain
money at lower cost than from traditional financial institutions, while creditors can
earn more than they would by keeping their money in banks. This benefit comes with the
risk of borrowers defaulting on their loans, because many P2 loans are unsecured personal
loans and most creditors may have little knowledge of credit risk management [21].
To control default rates and risks, P2 loaning platforms have developed classification
models that evaluate the credit risk of loans and borrowers and suggest appropriate interest rates
for loan applications. The quality of these models is vital to the credit risk management and
sustainability of P2 lending platforms. Drawing on the experience of financial institutions, P2 loaning
platforms adopt and develop classification algorithms to categorize borrowers into different
credit grades based on their characteristics and credit history, and to identify potential
borrowers who are likely to default [8,14].
Although standard cost-insensitive binary classification algorithms [15,16], such as logistic
regression, neural networks, and decision trees [4], are common practice in traditional credit
rating, they are not appropriate in P2 loaning for the following reasons:
• There are more than two classes of credit grades in P2 lending, and each credit grade
implies a certain level of risk. Multinomial classification should therefore be
considered in P2 credit grading.
• P2 credit data are imbalanced, meaning that the number of samples varies across credit
grades. For example, the number of ideal borrowers in the best grade or of high-risk
borrowers in the worst grade is much smaller than the number in the other grades.
• Misclassification costs are not uniform across classes in P2 loaning. In general, the cost
of committing a type II error (classifying a loan with bad credit as a good one) is
greater than that of a type I error (classifying a good one as bad) [5].
In a multinomial credit-grading scenario, classifying a sample of grade C into grade A, for
example, is more costly than classifying B into A. Therefore, standard cost-insensitive
multinomial classification, in which all errors have the same cost, is not suitable for credit rating
in P2 loaning; cost-sensitive multinomial classifiers are a better fit.
These cost-sensitive classifiers were developed for imbalanced data classification [15]. Various
cost-sensitive classifiers have been proposed for credit rating [2], with the aim of minimizing
total costs as measured by a misclassification cost matrix [10]. A well-specified cost matrix is
therefore essential for cost-sensitive classification problems.
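To make the idea of total cost concrete, the following minimal Python sketch compares accuracy with total misclassification cost for one set of predictions. The three grades, the cost values, and the confusion counts are all hypothetical numbers chosen for illustration; they are not taken from the study's data or from the cost matrix proposed later in the paper.

```python
# Illustrative only: three hypothetical grades (A, B, C) and made-up numbers.

# COST[i][j]: assumed cost of classifying a loan whose true grade is i as grade j.
# Classifying C as A (6.0) is set higher than classifying B as A (3.0).
COST = [
    [0.0, 1.0, 2.0],  # true grade A
    [3.0, 0.0, 1.0],  # true grade B
    [6.0, 3.0, 0.0],  # true grade C
]

# CONFUSION[i][j]: assumed number of loans of true grade i predicted as grade j.
CONFUSION = [
    [50, 5, 0],
    [4, 80, 6],
    [1, 7, 30],
]


def total_cost(confusion, cost):
    """Sum of count * cost over every (true grade, predicted grade) cell."""
    return sum(
        n * c
        for row_n, row_c in zip(confusion, cost)
        for n, c in zip(row_n, row_c)
    )


def accuracy(confusion):
    """Share of loans on the diagonal, i.e. classified into their true grade."""
    correct = sum(confusion[i][i] for i in range(len(confusion)))
    total = sum(sum(row) for row in confusion)
    return correct / total


if __name__ == "__main__":
    print("accuracy:", round(accuracy(CONFUSION), 3))
    print("total cost:", total_cost(CONFUSION, COST))
```

With these assumed numbers the accuracy is about 0.874 and the total cost is 50.0. Two classifiers with the same accuracy can have very different total costs, depending on which cells their errors fall into.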
Few studies in P2 loaning have examined cost-sensitive classifiers [21], and none of them has
analyzed this issue from the perspective of
multinomial classification or measured the misclassification costs of different credit grades in
P2 loaning.
Misclassification costs are losses in creditors’ earnings caused by misclassifying the credit grades of
loans. The cost equals the difference between the return of a loan when it is correctly classified and
the return of the loan when it is misclassified into another credit grade. This difference arises in one of
the following situations:
i. If a loan is classified into a better credit grade with a lower interest rate, the default risk
of the loan is underestimated and the interest rate of the loan is set lower than it should
be, which means that the interest may be insufficient to cover the risk that the lender bears.
The creditor loses potential returns, including the unpaid risk premium that the borrower
should pay for the higher-risk loan.
ii. If a loan is classified into a worse credit grade with a higher interest rate, the borrower might
be scared away or become more likely to default, which causes opportunity
costs and financial losses to lenders.
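The two situations above can be illustrated with a small numerical sketch. It is a deliberately simplified stand-in, not the set of equations the paper proposes: it assumes simple interest over a fixed term, and the function names, interest rates, and the walk-away probability are hypothetical values introduced only for this example.

```python
def underpricing_loss(principal, correct_rate, assigned_rate, years=1.0):
    """Situation i (assumed simple interest): interest forgone when a loan is
    placed in a better grade and priced at assigned_rate instead of correct_rate."""
    return principal * (correct_rate - assigned_rate) * years


def overpricing_opportunity_cost(principal, correct_rate, walk_away_prob, years=1.0):
    """Situation ii (assumed model): expected interest lost if the borrower,
    faced with a higher rate, abandons the loan with probability walk_away_prob."""
    return walk_away_prob * principal * correct_rate * years


if __name__ == "__main__":
    # Hypothetical loan of 1000 whose correct grade carries a 12% rate.
    # Misclassified into a better grade priced at 8%: about 40 of interest is forgone.
    print(underpricing_loss(1000, 0.12, 0.08))
    # Misclassified into a worse grade; assume a 25% chance the borrower walks away:
    # about 30 of expected interest is lost.
    print(overpricing_opportunity_cost(1000, 0.12, 0.25))
```

The sketch ignores default probabilities, fees, and repayment schedules, all of which a realistic cost estimate would need to account for.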
Objective of the Study
This paper proposes a multinomial cost matrix that measures misclassification costs of P2 credit
grading by considering real losses and opportunity costs associated with P2 loaning. A set of
equations and models were developed to calculate these misclassification costs. The
parameters in the proposed equations and models are designed to calculate the cost matrix and
support P2 loaning platforms’ operations. A case study using data from one of the approved
lending institutions in sub-Saharan Africa is conducted to demonstrate the
performance of the proposed cost matrix with some well-known cost-sensitive classifiers.
The results show that the proposed cost matrix not only reveals the sources of losses caused
by misclassification but also reduces the total costs for P2 platforms compared with
cost-insensitive classification algorithms.
LITERATURE REVIEW
Most classifiers aim to maximize accuracy and minimize misclassifications. Various
classification methods have been proposed for credit rating and risk management [17].
Standard classifiers treat all misclassification costs as equal, which does not hold in real
credit risk management. Many studies support the use of cost-sensitive classifiers in credit
rating. Sahin et al. (2013) proposed a cost-sensitive decision tree approach with varying
misclassification costs, which has been used successfully in credit card fraud detection to decrease
financial losses. Alejo, García, Marques, Sanchez, and Antonio-Velazquez (2013) improved the Multilayer
Perceptron neural network using three misclassification cost functions, which can be used to
improve prediction in credit rating. Reference [2] suggested example-dependent cost-sensitive
methods and proposed logistic regression and decision trees for credit scoring.
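One generic way a classifier can be made cost-sensitive is to replace the usual "pick the most probable class" rule with a minimum-expected-cost rule. The Python sketch below illustrates that general rule only; it is not the specific method of any study cited above. The cost values and class probabilities are hypothetical, and cost[i][j] is assumed to be the cost of assigning true class i to class j.

```python
def expected_costs(probs, cost):
    """Expected cost of predicting each class j: sum over i of probs[i] * cost[i][j]."""
    m = len(cost)
    return [sum(probs[i] * cost[i][j] for i in range(m)) for j in range(m)]


def min_expected_cost_class(probs, cost):
    """Index of the class whose prediction has the lowest expected cost."""
    costs = expected_costs(probs, cost)
    return min(range(len(costs)), key=costs.__getitem__)


def most_probable_class(probs):
    """Cost-insensitive baseline: index of the class with the highest probability."""
    return max(range(len(probs)), key=probs.__getitem__)


# Hypothetical costs for grades A, B, C (rows: true grade, columns: predicted grade).
COST = [
    [0.0, 1.0, 2.0],
    [3.0, 0.0, 1.0],
    [6.0, 3.0, 0.0],
]

# Hypothetical predicted probabilities that one loan belongs to grade A, B, or C.
PROBS = [0.5, 0.3, 0.2]

if __name__ == "__main__":
    print("most probable grade index:", most_probable_class(PROBS))
    print("minimum expected cost grade index:", min_expected_cost_class(PROBS, COST))
```

With these assumed inputs the most probable grade is A (index 0), but the expected costs are roughly 2.1, 1.1, and 1.3 for A, B, and C, so the cost-sensitive rule assigns grade B (index 1). The loan is pushed to a more cautious grade because mistaking a B or C loan for an A loan is expensive.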
Cost Matrices
Misclassification cost can be described by a cost matrix $C = (c_{ij})_{m \times m}$, where $c_{ij}$ indicates the cost
due to misclassifying an instance of class $i$ as class $j$, and $m$ is the number of classes. In
credit rating, the measurement of misclassification costs in $C$ is not only a basic component of
cost-sensitive classification, but also vital for high-quality credit rating. Real financial
indicators, like profit-based or financial loss-related measures, are well aligned with the