The question is one of identifying which datasets are being used to arrive at credit scores, and whether boundaries are needed beyond which the law must step in.
If the recent Cambridge Analytica revelations have taught us anything, it is that seemingly innocuous pieces of data can be used to influence voter behaviour during elections. Hardly a day goes by without some new use of data being touted as the next big breakthrough in data analytics. It is important that these claims are not taken at face value and that the insights they offer are examined before being accepted. One of the fields in which this must be done is credit rating.
The traditional way of generating credit scores involved evaluating a person’s credit history. A higher score would reflect a good credit history; the converse would be true for a person with a mediocre or poor history of repaying debts. This system does not work if an individual does not have a credit history. The conservative position in such cases would be to avoid extending credit at all. But an ever-increasing market for credit and stiff competition among lenders mean there is a need to discover newer ways of gauging an individual’s credit-worthiness.
The proliferation of data makes this process easier. In today’s data-rich world, a rating agency is never short of datasets on which to base its scores. The question then becomes one of identifying which datasets are being used to arrive at credit scores, and whether there is a need to set boundaries beyond which the law must step in.
A look at a few real-life examples helps contextualise this dilemma. In a recent interview, the COO of TransUnion Cibil stated that the firm was looking to gain access to alternative datasets to determine an individual’s credit score. These included telecom and electricity bill payment records.
A different example is that of an entity such as Lenddo, which uses several kinds of non-repayment-related data to generate credit scores. These include data from browsers, social media networks, analytics around the filling of forms, etc. Proprietary algorithms process these datasets and generate an individual’s credit score. There are also recent reports that Paytm will launch a scoring system based on an individual’s online transactions, which might go beyond simple repayment of bills owed.
These examples reflect a continuum for future models of credit rating. We started with the traditional models that looked at credit histories. Now we have models that look at repayment of other kinds and models that do not restrict themselves to repayment histories.
The use of datasets that go beyond repayment histories must be approached with a healthy dose of scepticism. These datasets have no direct connection with repayments of any kind. Given this, it is natural to question whether an individual’s circle of friends or her propensity to use block letters to fill up an application form somehow signifies a predilection towards defaulting on her loans. In these cases, it is necessary to ensure that credit rating agencies are not passing off correlation as causation. In the absence of documented evidence, there is a risk of this resulting in a self-fulfilling system, with the insidious effect of propagating existing disadvantages and prejudices in society.
The use of repayment histories other than those for formal loans, such as records of payments for utilities and telecom services, is more justifiable. These models help identify an individual’s past behaviour when it comes to paying off debts and could be a valuable input in assigning a credit score.
Having said that, there is merit in urging caution even here.
In particular, it must be asked if the model is only looking at the fact of a debt being repaid, or if it is also examining other ancillary details. For example, would it reflect on the credit score if an individual paid a telecom bill in cash rather than by wire transfer? Would an individual who pays an electricity bill on the same day that she receives it get a higher rating than someone who pays it on the last day for payment? Would the credit score depend on the amount of the bill being paid? If the answers to these questions are in the affirmative, there is cause for concern.
These scenarios signify a stretching of the continuum and make the task of determining the permissible limits of credit rating more problematic. A possible intervention in such cases could be to restrict rating agencies to limited information, namely whether or not an individual has defaulted on a payment obligation.
The situation becomes more complicated when one considers the issue of consent and privacy. On the one hand, a case can be made for the continued operation of any model of credit rating so long as an individual has consented to it. On the other hand, given the opacity involved in taking disparate datasets and generating a credit score out of them, it is worth questioning just how informed such consent can ever be.
Expanding the reach of credit to individuals who have no credit histories is important. It helps level the playing field and ensures that individuals outside the traditional marketplace for credit are not deprived of its benefits. This is also how credit rating agencies justify the use of new-fangled approaches to generating credit scores. At the same time, it is necessary to look beyond the veil of this laudable objective to understand the methodology being adopted by agencies. This should help determine the extent of regulation required in the future for this field.
Ajay Patri is a programme associate at the Takshashila Institution. He works at the intersection of technology and policy. He can be followed on Twitter @ajaybpatri.