DataCredit_With_External_Factors

Citation Author(s):
Jomark Noriega
UNIVERSIDAD NACIONAL MAYOR DE SAN MARCOS
Submitted by:
Jomark Noriega
Last updated:
Wed, 10/16/2024 - 01:01
DOI:
10.21227/qy2q-1f11

Abstract 

This dataset combines financial, demographic, temporal, and external-factor data to help predict credit delinquency. It includes loan terms, credit balances, and effective interest rates, along with client details such as salary, marital status, and profession.

In addition to tracking historical credit behavior and overdue days at different time points, the dataset incorporates critical external factors, including climate change, social unrest, and global crises like COVID-19, which may influence payment delays and financial behavior.

With this broad scope, the dataset is well-suited for building machine learning models that can improve credit risk management by analyzing the combined effects of financial, socio-demographic, and external influences.

Instructions: 

1. Columns: Each column is described with its name, data type, and meaning. Familiarize yourself with these details to understand what each field represents.

2. Data Types: Columns are classified as integer, decimal, bit, or string. This tells you how to handle the data:

Integer: Whole numbers.
Decimal: Numbers with decimal points, often for financial values.
Bit: Binary values (0 or 1).
String: Text fields, like income source codes.
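As an illustration of how the four declared types might be handled when loading the data, here is a minimal Python sketch. The schema below is an assumption: DiasVencido and Class are named in this description, but SaldoCredito and FuenteIngreso are hypothetical column names, and the type assigned to each column is a guess that should be checked against the actual column documentation.

```python
from decimal import Decimal


def parse_bit(text):
    """Parse a bit column, accepting only the strings "0" and "1"."""
    if text not in ("0", "1"):
        raise ValueError(f"expected 0 or 1, got {text!r}")
    return int(text)


# One converter per declared data type.
CONVERTERS = {
    "integer": int,
    "decimal": Decimal,  # exact decimal arithmetic for financial values
    "bit": parse_bit,
    "string": str,
}

# Hypothetical schema for illustration only (see the note above).
SCHEMA = {
    "DiasVencido": "integer",
    "SaldoCredito": "decimal",   # hypothetical column name
    "Class": "bit",
    "FuenteIngreso": "string",   # hypothetical column name
}


def parse_row(raw_row, schema=SCHEMA):
    """Convert a row of strings (e.g. from csv.DictReader) to typed values."""
    return {name: CONVERTERS[schema[name]](value) for name, value in raw_row.items()}


row = parse_row(
    {"DiasVencido": "45", "SaldoCredito": "1250.75", "Class": "1", "FuenteIngreso": "A01"}
)
```

Using Decimal rather than float for financial columns avoids binary rounding error; if speed matters more than exactness, float is a reasonable substitute.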

3. Data Origins: Each column is marked as extracted, calculated, or time series, indicating whether it holds raw values, derived metrics, or values that change over time.

4. Delinquency Data: Fields like DiasVencido and Class track overdue payments. Binary classifications help flag cases where payment delays exceed certain thresholds (e.g., 30 or 29 days).
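The threshold-based flag described above can be sketched as follows. This is an assumption about how such a flag could be derived from DiasVencido, not the dataset's documented rule: the function name is hypothetical, and whether the comparison is strict ("more than 30 days") or inclusive is not stated in this description.

```python
def is_delinquent(dias_vencido, threshold=30):
    """Return 1 when overdue days strictly exceed the threshold, else 0.

    The strict comparison is an assumption for this sketch.
    """
    return 1 if dias_vencido > threshold else 0


overdue_days = [0, 29, 30, 31, 90]

# Flags under the two example thresholds mentioned in the description.
flags_30 = [is_delinquent(d, threshold=30) for d in overdue_days]
flags_29 = [is_delinquent(d, threshold=29) for d in overdue_days]
```

With these sample values, the two thresholds disagree only for the client who is exactly 30 days overdue, which is why it matters to know which threshold a given binary column uses.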

5. External Factors: Includes variables related to external events such as COVID-19, climate change, and social unrest, useful for analyzing their impact on delinquency.

6. Normalized Data: Some fields, such as SalarioNormalizado, are adjusted relative to other values, so they may not reflect the original scale.
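This description does not say how SalarioNormalizado was computed. Purely to illustrate why a normalized field loses its original scale, the sketch below uses min-max scaling, one common choice; the function name and the sample salaries are hypothetical.

```python
def min_max_normalize(values):
    """Rescale values to the range [0, 1] relative to the observed min and max."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All values identical: no spread to rescale, so map everything to 0.0.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


salaries = [1200.0, 2500.0, 4800.0]  # made-up sample values
normalized = min_max_normalize(salaries)
```

After this kind of transformation the lowest salary becomes 0.0 and the highest 1.0, so the original amounts cannot be recovered without knowing the minimum and maximum that were used.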

7. Time Series Data: Delinquency information is available for multiple months. Ensure consistency when analyzing trends over time.
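One way to check consistency before a trend analysis is to confirm that every client has a record for each month in the study window. The sketch below assumes the data can be reduced to (client id, month) pairs; the identifiers and month labels are hypothetical, and the actual layout of the monthly fields may differ.

```python
from collections import defaultdict


def missing_months(records, expected_months):
    """Map each client id to the sorted expected months that have no record.

    Clients with complete coverage are omitted from the result.
    """
    seen = defaultdict(set)
    for client_id, month in records:
        seen[client_id].add(month)

    expected = set(expected_months)
    gaps = {}
    for client_id, months in seen.items():
        absent = expected - months
        if absent:
            gaps[client_id] = sorted(absent)
    return gaps


# Hypothetical (client id, month) pairs for illustration.
records = [
    ("C1", "2020-01"), ("C1", "2020-02"), ("C1", "2020-03"),
    ("C2", "2020-01"), ("C2", "2020-03"),
]
gaps = missing_months(records, ["2020-01", "2020-02", "2020-03"])
```

Clients that appear in the result can then be excluded or handled explicitly, so that month-to-month comparisons are not distorted by missing observations.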