Decision Trees/ Machine Learning
Durga Gaddam
August 29, 2016
Objective:
The objective of this article is to identify risky bank loans. To do so, we develop a credit approval model using C5.0 decision trees.
Decision Trees:
Decision trees are among the most widely used machine learning algorithms. The terminology used with decision trees:
- Root node: the beginning of the decision tree
- Decision nodes: points where a choice is made based on an attribute
- Branches: the potential outcomes of a decision
- Leaf nodes (terminal nodes): the end points that assign the final class
Decision trees are built with a heuristic called recursive partitioning, also known as divide and conquer: the data is split repeatedly into smaller subsets until each subset is sufficiently homogeneous.
C5.0 Decision tree algorithm:
Entropy:
C5.0 uses entropy to measure the randomness, or disorder, within a set of class values. For a segment S with c class levels, where p_i is the proportion of values falling into class i:
$$ \mathrm{Entropy}(S) = -\sum_{i=1}^{c} p_i \log_2(p_i) $$
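As a small illustration of the entropy formula, here is a minimal R sketch. It uses the standard definition with the leading minus sign. The helper name `entropy_bits` is made up for this example and is not part of the C50 package. The label vector simply mimics the 700/300 class mix of the credit data.

```r
# Shannon entropy (in bits) of a vector of class labels.
# `entropy_bits` is a hypothetical helper, not a C50 function.
entropy_bits <- function(labels) {
  p <- table(labels) / length(labels)  # class proportions p_i
  p <- p[p > 0]                        # skip empty classes (0 * log2(0) is treated as 0)
  -sum(p * log2(p))
}

# Same class mix as the credit data: 700 "No" and 300 "Yes"
default <- rep(c("No", "Yes"), times = c(700, 300))
entropy_bits(default)   # roughly 0.88 bits
```

A perfectly pure segment has entropy 0, and a two-class segment split 50/50 has the maximum entropy of 1 bit.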
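Information Gain:
C5.0 uses information gain to choose the feature to split on. The information gain for a feature F is the difference between the entropy of the segment before the split (S1) and the entropy of the partitions resulting from the split (S2):
$$ \mathrm{InfoGain}(F) = \mathrm{Entropy}(S_1) - \mathrm{Entropy}(S_2) $$
Because a split produces more than one partition, Entropy(S2) is the sum of the entropies of the n partitions P_i, each weighted by the proportion w_i of records falling into it:
$$ \mathrm{Entropy}(S_2) = \sum_{i=1}^{n} w_i \, \mathrm{Entropy}(P_i) $$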
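The information gain calculation can be sketched in a few lines of R. This is only an illustration under stated assumptions: the helpers `entropy_bits` and `info_gain` and the toy vectors are invented here, and the code handles categorical features only. It is not the internal implementation of C5.0.

```r
# Hypothetical helper: Shannon entropy (in bits) of a vector of class labels
entropy_bits <- function(labels) {
  p <- table(labels) / length(labels)
  p <- p[p > 0]
  -sum(p * log2(p))
}

# Hypothetical helper: information gain from splitting `labels` by a categorical `feature`
info_gain <- function(labels, feature) {
  parts   <- split(labels, feature, drop = TRUE)   # partitions P_i
  weights <- lengths(parts) / length(labels)       # weights w_i
  after   <- sum(weights * sapply(parts, entropy_bits))  # Entropy(S2)
  entropy_bits(labels) - after                     # Entropy(S1) - Entropy(S2)
}

# Toy data, made up for illustration: 4 "No" and 4 "Yes"
default <- c("No", "No", "No", "No", "Yes", "Yes", "Yes", "Yes")
perfect <- c("a", "a", "a", "a", "b", "b", "b", "b")  # separates the classes completely
useless <- c("a", "b", "a", "b", "a", "b", "a", "b")  # each partition stays 50/50

info_gain(default, perfect)   # 1: all disorder removed
info_gain(default, useless)   # 0: nothing learned from the split
```

A tree learner of this kind would prefer the `perfect` feature, since it yields the larger gain.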
Step1: Collecting the Data
Step2: Exploring and preparing the Data
Step3: Training the data model
Step4: Evaluating the model performance
Step5: Improving model performance
Step1: Collecting the Data
The data set used here is German credit data obtained from the UCI Machine Learning Repository: http://archive.ics.uci.edu/ml/
Step2: Exploring and preparing the Data
##library(ggplot2)
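# stringsAsFactors = TRUE keeps text columns as factors (the default changed in R 4.0)
credit <- read.csv("credit.csv", stringsAsFactors = TRUE)
str(credit)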
## 'data.frame': 1000 obs. of 21 variables:
## $ checking_balance : Factor w/ 4 levels "< 0 DM","> 200 DM",..: 1 3 4 1 1 4 4 3 4 3 ...
## $ months_loan_duration: int 6 48 12 42 24 36 24 36 12 30 ...
## $ credit_history : Factor w/ 5 levels "critical","delayed",..: 1 5 1 5 2 5 5 5 5 1 ...
## $ purpose : Factor w/ 10 levels "business","car (new)",..: 8 8 5 6 2 5 6 3 8 2 ...
## $ amount : int 1169 5951 2096 7882 4870 9055 2835 6948 3059 5234 ...
## $ savings_balance : Factor w/ 5 levels "< 100 DM","> 1000 DM",..: 5 1 1 1 1 5 4 1 2 1 ...
## $ employment_length : Factor w/ 5 levels "> 7 yrs","0 - 1 yrs",..: 1 3 4 4 3 3 1 3 4 5 ...
## $ installment_rate : int 4 2 2 2 3 2 3 2 2 4 ...
## $ personal_status : Factor w/ 4 levels "divorced male",..: 4 2 4 4 4 4 4 4 1 3 ...
## $ other_debtors : Factor w/ 3 levels "co-applicant",..: 3 3 3 2 3 3 3 3 3 3 ...
## $ residence_history : int 4 2 3 4 4 4 4 2 4 2 ...
## $ property : Factor w/ 4 levels "building society savings",..: 3 3 3 1 4 4 1 2 3 2 ...
## $ age : int 67 22 49 45 53 35 53 35 61 28 ...
## $ installment_plan : Factor w/ 3 levels "bank","none",..: 2 2 2 2 2 2 2 2 2 2 ...
## $ housing : Factor w/ 3 levels "for free","own",..: 2 2 2 1 1 1 2 3 2 2 ...
## $ existing_credits : int 2 1 1 1 2 1 1 1 1 2 ...
## $ job : Factor w/ 4 levels "mangement self-employed",..: 2 2 4 2 2 4 2 1 4 1 ...
## $ dependents : int 1 1 2 2 2 2 1 1 1 1 ...
## $ telephone : Factor w/ 2 levels "none","yes": 2 1 1 1 1 2 1 2 1 1 ...
## $ foreign_worker : Factor w/ 2 levels "no","yes": 2 2 2 2 2 2 2 2 2 2 ...
## $ default : int 1 2 1 1 2 1 1 1 1 2 ...
table(credit$checking_balance)
##
## < 0 DM > 200 DM 1 - 200 DM unknown
## 274 63 269 394
table(credit$savings_balance)
##
## < 100 DM > 1000 DM 101 - 500 DM 501 - 1000 DM unknown
## 603 48 103 63 183
## DM stands for Deutsche Mark, the former currency of Germany
summary(credit$months_loan_duration)
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 4.0 12.0 18.0 20.9 24.0 72.0
summary(credit$amount)
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 250 1366 2320 3271 3972 18420
The summaries show that loan durations range from 4 to 72 months (median 18 months) and that loan amounts range from 250 DM to 18,420 DM (median 2,320 DM).
require(ggplot2)
## Loading required package: ggplot2
## Warning: package 'ggplot2' was built under R version 3.3.1
qplot(credit$months_loan_duration, xlab="Number of Months", main = "Loan Duration",geom="histogram", binwidth=2)
credit$default <- factor(credit$default, levels=c("1","2"), labels=c("No","Yes"))
table(credit$default)
##
## No Yes
## 700 300
a <- sample(1000,900)
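The default column is the response variable: it records whether the applicant failed to meet the agreed payment terms (Yes) or not (No). Overall, 30 percent of the loans went into default.
To prepare training and test data, we split the records randomly: 800 for training the model and the remaining 200 for testing it.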
set.seed(12354)
train_sample <- sample(1000,800)
credit_train <- credit[train_sample,]
credit_test <- credit[-train_sample,]
prop.table(table(credit_train$default))
##
## No Yes
## 0.70125 0.29875
prop.table(table(credit_test$default))
##
## No Yes
## 0.695 0.305
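The split above hard-codes the sizes 1000 and 800. A slightly more general sketch derives them from the data instead. It is self-contained and therefore uses a made-up stand-in data frame `toy` rather than the real credit data. The 80 percent training share and all variable names are assumptions for illustration.

```r
set.seed(12354)

# Made-up stand-in for the credit data: 1000 rows, 30% defaults
toy <- data.frame(id = 1:1000,
                  default = factor(rep(c("No", "Yes"), times = c(700, 300))))

n_train    <- round(0.8 * nrow(toy))        # 80% of the rows go to training
train_rows <- sample(nrow(toy), n_train)    # row indices drawn without replacement
toy_train  <- toy[train_rows, ]
toy_test   <- toy[-train_rows, ]            # every row not drawn for training

# Both parts should show roughly the same share of defaults
prop.table(table(toy_train$default))
prop.table(table(toy_test$default))
```

Comparing the two proportion tables, as the article does, is a quick check that the random split did not leave one part with an unusual share of defaults.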
Step3: Training the data model
The 21st column (default) is the class we want to predict, so we exclude it from the predictors and pass it to C5.0 separately as the target.
##install.packages("C50")
##library(C50)
require(C50)
## Loading required package: C50
## Warning: package 'C50' was built under R version 3.3.1
credit_model <- C5.0(credit_train[-21], credit_train$default)
credit_model
##
## Call:
## C5.0.default(x = credit_train[-21], y = credit_train$default)
##
## Classification Tree
## Number of samples: 800
## Number of predictors: 20
##
## Tree size: 43
##
## Non-standard options: attempt to group attributes
summary(credit_model)
##
## Call:
## C5.0.default(x = credit_train[-21], y = credit_train$default)
##
##
## C5.0 [Release 2.07 GPL Edition] Tue Aug 30 19:23:22 2016
## -------------------------------
##
## Class specified by attribute `outcome'
##
## Read 800 cases (21 attributes) from undefined.data
##
## Decision tree:
##
## checking_balance in {> 200 DM,unknown}: No (369/48)
## checking_balance in {< 0 DM,1 - 200 DM}:
## :...property = real estate:
## :...months_loan_duration <= 11: No (33/1)
## : months_loan_duration > 11:
## : :...checking_balance = 1 - 200 DM: No (38/7)
## : checking_balance = < 0 DM:
## : :...age > 51: No (5)
## : age <= 51:
## : :...savings_balance = > 1000 DM: Yes (0)
## : savings_balance in {101 - 500 DM,501 - 1000 DM,
## : : unknown}: No (6/1)
## : savings_balance = < 100 DM:
## : :...personal_status = divorced male: No (2)
## : personal_status in {female,married male}: Yes (18/2)
## : personal_status = single male:
## : :...months_loan_duration > 18: Yes (3)
## : months_loan_duration <= 18:
## : :...installment_rate <= 3: No (4)
## : installment_rate > 3: Yes (4/1)
## property in {building society savings,other,unknown/none}:
## :...credit_history in {critical,delayed}:
## :...savings_balance in {> 1000 DM,101 - 500 DM,501 - 1000 DM,
## : : unknown}: No (30/4)
## : savings_balance = < 100 DM:
## : :...credit_history = delayed:
## : :...installment_rate <= 2: No (8/2)
## : : installment_rate > 2: Yes (11/1)
## : credit_history = critical:
## : :...months_loan_duration <= 27:
## : :...other_debtors in {co-applicant,
## : : : guarantor}: Yes (5/1)
## : : other_debtors = none: No (34/5)
## : months_loan_duration > 27:
## : :...age <= 32: Yes (10/1)
## : age > 32: No (3)
## credit_history in {fully repaid,fully repaid this bank,repaid}:
## :...residence_history <= 1: No (37/13)
## residence_history > 1:
## :...savings_balance = 501 - 1000 DM: Yes (3/1)
## savings_balance = > 1000 DM:
## :...age <= 27: Yes (2)
## : age > 27: No (5)
## savings_balance = unknown:
## :...existing_credits > 1: No (3)
## : existing_credits <= 1:
## : :...checking_balance = < 0 DM: Yes (12/3)
## : checking_balance = 1 - 200 DM: No (15/4)
## savings_balance = 101 - 500 DM:
## :...personal_status in {divorced male,female,
## : : married male}: Yes (14)
## : personal_status = single male:
## : :...property = other: No (5)
## : property in {building society savings,unknown/none}:
## : :...employment_length = > 7 yrs: No (2)
## : employment_length in {0 - 1 yrs,1 - 4 yrs,
## : 4 - 7 yrs,
## : unemployed}: Yes (5)
## savings_balance = < 100 DM:
## :...credit_history in {fully repaid,
## : fully repaid this bank}: Yes (26/3)
## credit_history = repaid:
## :...other_debtors in {co-applicant,
## : guarantor}: No (11/4)
## other_debtors = none:
## :...purpose in {domestic appliances,education,
## : furniture,others,
## : retraining}: Yes (21/4)
## purpose = repairs: No (2)
## purpose = business:
## :...job in {mangement self-employed,
## : : skilled employee,
## : : unemployed non-resident}: Yes (3)
## : job = unskilled resident: No (2)
## purpose = car (used):
## :...amount <= 8072: No (6/1)
## : amount > 8072: Yes (5)
## purpose = car (new):
## :...installment_rate > 3: Yes (9)
## : installment_rate <= 3:
## : :...housing in {for free,rent}: No (7/1)
## : housing = own: Yes (7/1)
## purpose = radio/tv:
## :...existing_credits > 1: Yes (2)
## existing_credits <= 1:
## :...dependents <= 1: No (11/4)
## dependents > 1: Yes (2)
##
##
## Evaluation on training data (800 cases):
##
## Decision Tree
## ----------------
## Size Errors
##
## 42 113(14.1%) <<
##
##
## (a) (b) <-classified as
## ---- ----
## 543 18 (a): class No
## 95 144 (b): class Yes
##
##
## Attribute usage:
##
## 100.00% checking_balance
## 53.88% property
## 39.75% credit_history
## 39.75% savings_balance
## 27.13% residence_history
## 20.63% months_loan_duration
## 15.88% other_debtors
## 9.63% purpose
## 7.75% age
## 7.13% personal_status
## 6.25% installment_rate
## 5.63% existing_credits
## 1.75% housing
## 1.63% dependents
## 1.38% amount
## 0.88% employment_length
## 0.63% job
##
##
## Time: 0.0 secs
Explaining the Summary of credit model:
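Here 800 training cases were used to build the tree, and the following decisions can be read from it:
- The first line of the tree says that if an applicant's checking balance is unknown or greater than 200 DM, the applicant is classified as not likely to default. The numbers in parentheses (369/48) mean that 369 training cases reached this leaf and 48 of them were classified incorrectly.
- If the checking balance is less than 0 DM or between 1 and 200 DM, the decision depends on the further attributes tested lower in the tree, starting with property.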
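The counts in the summary can be checked with a little arithmetic. The sketch below copies numbers from the output above into variables whose names (`leaf_cases`, `train_cm`, and so on) are invented for this example. It recomputes the error rate of the first leaf and of the whole tree on the training data.

```r
# A leaf annotated (n/m) received n training cases, m of them misclassified.
# First leaf of the tree: No (369/48)
leaf_cases  <- 369
leaf_errors <- 48
leaf_errors / leaf_cases        # about 13% of the cases in this leaf actually defaulted

# Training confusion matrix from the summary: rows = actual class, columns = classified as
train_cm <- matrix(c(543,  18,
                      95, 144),
                   nrow = 2, byrow = TRUE,
                   dimnames = list(actual = c("No", "Yes"),
                                   predicted = c("No", "Yes")))

train_errors <- sum(train_cm) - sum(diag(train_cm))   # off-diagonal cells: 18 + 95
train_errors                     # 113
train_errors / sum(train_cm)     # 0.14125, the 14.1% reported in the summary
```

Note that this is the error on the data the tree was trained on, so it tends to be optimistic compared with the error on unseen cases.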
Step4: Evaluating the model performance
credit_pred <- predict(credit_model, credit_test)
##library(gmodels)
require(gmodels)
## Loading required package: gmodels
## Warning: package 'gmodels' was built under R version 3.3.1
CrossTable(credit_test$default, credit_pred, prop.chisq= FALSE, prop.c=FALSE, prop.r=FALSE, dnn=c('actual default','predicted default' ))
##
##
## Cell Contents
## |-------------------------|
## | N |
## | N / Table Total |
## |-------------------------|
##
##
## Total Observations in Table: 200
##
##
## | predicted default
## actual default | No | Yes | Row Total |
## ---------------|-----------|-----------|-----------|
## No | 123 | 16 | 139 |
## | 0.615 | 0.080 | |
## ---------------|-----------|-----------|-----------|
## Yes | 42 | 19 | 61 |
## | 0.210 | 0.095 | |
## ---------------|-----------|-----------|-----------|
## Column Total | 165 | 35 | 200 |
## ---------------|-----------|-----------|-----------|
##
##
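The cross table can be condensed into a few summary numbers. The following sketch re-enters the four cell counts from the table above by hand and computes the accuracy, the error rate, and the share of actual defaults that the model caught. The matrix `test_cm` and the other variable names are introduced here for illustration only.

```r
# Cell counts copied from the CrossTable output: rows = actual, columns = predicted
test_cm <- matrix(c(123, 16,
                     42, 19),
                  nrow = 2, byrow = TRUE,
                  dimnames = list(actual = c("No", "Yes"),
                                  predicted = c("No", "Yes")))

accuracy       <- sum(diag(test_cm)) / sum(test_cm)             # (123 + 19) / 200
error_rate     <- 1 - accuracy
recall_default <- test_cm["Yes", "Yes"] / sum(test_cm["Yes", ]) # 19 of 61 actual defaults

accuracy         # 0.71
error_rate       # 0.29
recall_default   # about 0.31
```

On these counts the model classifies 142 of the 200 test cases correctly (71%), noticeably worse than the 85.9% it reached on the training data. It also catches only 19 of the 61 actual defaults, which is probably the costlier kind of mistake for a lender. With the objects from this article in the workspace, `mean(credit_pred == credit_test$default)` should give the same accuracy.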
Step5: Improving model performance
credit_boost10 <- C5.0(credit_train[-21], credit_train$default, trials=10)
credit_boost10
##
## Call:
## C5.0.default(x = credit_train[-21], y = credit_train$default, trials = 10)
##
## Classification Tree
## Number of samples: 800
## Number of predictors: 20
##
## Number of boosting iterations: 10
## Average tree size: 35.3
##
## Non-standard options: attempt to group attributes
summary(credit_boost10)
##
## Call:
## C5.0.default(x = credit_train[-21], y = credit_train$default, trials = 10)
##
##
## C5.0 [Release 2.07 GPL Edition] Tue Aug 30 19:23:22 2016
## -------------------------------
##
## Class specified by attribute `outcome'
##
## Read 800 cases (21 attributes) from undefined.data
##
## ----- Trial 0: -----
##
## Decision tree:
##
## checking_balance in {> 200 DM,unknown}: No (369/48)
## checking_balance in {< 0 DM,1 - 200 DM}:
## :...property = real estate:
## :...months_loan_duration <= 11: No (33/1)
## : months_loan_duration > 11:
## : :...checking_balance = 1 - 200 DM: No (38/7)
## : checking_balance = < 0 DM:
## : :...age > 51: No (5)
## : age <= 51:
## : :...savings_balance = > 1000 DM: Yes (0)
## : savings_balance in {101 - 500 DM,501 - 1000 DM,
## : : unknown}: No (6/1)
## : savings_balance = < 100 DM:
## : :...personal_status = divorced male: No (2)
## : personal_status in {female,married male}: Yes (18/2)
## : personal_status = single male:
## : :...months_loan_duration > 18: Yes (3)
## : months_loan_duration <= 18:
## : :...installment_rate <= 3: No (4)
## : installment_rate > 3: Yes (4/1)
## property in {building society savings,other,unknown/none}:
## :...credit_history in {critical,delayed}:
## :...savings_balance in {> 1000 DM,101 - 500 DM,501 - 1000 DM,
## : : unknown}: No (30/4)
## : savings_balance = < 100 DM:
## : :...credit_history = delayed:
## : :...installment_rate <= 2: No (8/2)
## : : installment_rate > 2: Yes (11/1)
## : credit_history = critical:
## : :...months_loan_duration <= 27:
## : :...other_debtors in {co-applicant,
## : : : guarantor}: Yes (5/1)
## : : other_debtors = none: No (34/5)
## : months_loan_duration > 27:
## : :...age <= 32: Yes (10/1)
## : age > 32: No (3)
## credit_history in {fully repaid,fully repaid this bank,repaid}:
## :...residence_history <= 1: No (37/13)
## residence_history > 1:
## :...savings_balance = 501 - 1000 DM: Yes (3/1)
## savings_balance = > 1000 DM:
## :...age <= 27: Yes (2)
## : age > 27: No (5)
## savings_balance = unknown:
## :...existing_credits > 1: No (3)
## : existing_credits <= 1:
## : :...checking_balance = < 0 DM: Yes (12/3)
## : checking_balance = 1 - 200 DM: No (15/4)
## savings_balance = 101 - 500 DM:
## :...personal_status in {divorced male,female,
## : : married male}: Yes (14)
## : personal_status = single male:
## : :...property = other: No (5)
## : property in {building society savings,unknown/none}:
## : :...employment_length = > 7 yrs: No (2)
## : employment_length in {0 - 1 yrs,1 - 4 yrs,
## : 4 - 7 yrs,
## : unemployed}: Yes (5)
## savings_balance = < 100 DM:
## :...credit_history in {fully repaid,
## : fully repaid this bank}: Yes (26/3)
## credit_history = repaid:
## :...other_debtors in {co-applicant,
## : guarantor}: No (11/4)
## other_debtors = none:
## :...purpose in {domestic appliances,education,
## : furniture,others,
## : retraining}: Yes (21/4)
## purpose = repairs: No (2)
## purpose = business:
## :...job in {mangement self-employed,
## : : skilled employee,
## : : unemployed non-resident}: Yes (3)
## : job = unskilled resident: No (2)
## purpose = car (used):
## :...amount <= 8072: No (6/1)
## : amount > 8072: Yes (5)
## purpose = car (new):
## :...installment_rate > 3: Yes (9)
## : installment_rate <= 3:
## : :...housing in {for free,rent}: No (7/1)
## : housing = own: Yes (7/1)
## purpose = radio/tv:
## :...existing_credits > 1: Yes (2)
## existing_credits <= 1:
## :...dependents <= 1: No (11/4)
## dependents > 1: Yes (2)
##
## ----- Trial 1: -----
##
## Decision tree:
##
## foreign_worker = no: No (28.3/3.2)
## foreign_worker = yes:
## :...checking_balance = unknown:
## :...installment_plan in {bank,stores}:
## : :...other_debtors in {co-applicant,guarantor}: No (3.2)
## : : other_debtors = none:
## : : :...purpose in {business,car (new),car (used),domestic appliances,
## : : : education,others,repairs,
## : : : retraining}: Yes (39.8/10.3)
## : : purpose in {furniture,radio/tv}: No (18.1/2.3)
## : installment_plan = none:
## : :...amount <= 1381: No (48.1/4.5)
## : amount > 1381:
## : :...purpose in {car (used),domestic appliances,others,
## : : retraining}: No (28.5)
## : purpose in {business,car (new),education,furniture,radio/tv,
## : : repairs}:
## : :...credit_history = delayed: Yes (24.7/11.1)
## : credit_history in {fully repaid,
## : : fully repaid this bank}: No (0.8)
## : credit_history = critical:
## : :...amount <= 6887: No (48.2/2.3)
## : : amount > 6887: Yes (5.3/0.8)
## : credit_history = repaid:
## : :...dependents > 1: No (10.2/2.3)
## : dependents <= 1:
## : :...existing_credits > 1: Yes (17.6/4)
## : existing_credits <= 1:
## : :...age <= 23: Yes (9.2/2.4)
## : age > 23: No (48.6/9.1)
## checking_balance in {< 0 DM,> 200 DM,1 - 200 DM}:
## :...other_debtors = co-applicant: Yes (24/7.8)
## other_debtors = guarantor:
## :...purpose in {business,car (new)}: Yes (6.9/0.8)
## : purpose in {car (used),domestic appliances,education,furniture,
## : others,radio/tv,repairs,retraining}: No (24.3/2.4)
## other_debtors = none:
## :...employment_length = 0 - 1 yrs: Yes (84.5/25.3)
## employment_length in {> 7 yrs,1 - 4 yrs,4 - 7 yrs,unemployed}:
## :...credit_history = delayed: No (33.6/12.5)
## credit_history in {fully repaid,
## : fully repaid this bank}: Yes (39/13.2)
## credit_history = critical:
## :...age > 39: No (23.6/2.3)
## : age <= 39:
## : :...installment_plan in {bank,stores}: Yes (7.7/1.6)
## : installment_plan = none: No (45.2/20.6)
## credit_history = repaid:
## :...savings_balance in {> 1000 DM,
## : 501 - 1000 DM}: No (10.2)
## savings_balance in {< 100 DM,101 - 500 DM,unknown}:
## :...job = mangement self-employed: No (25.1/10.2)
## job = unemployed non-resident: Yes (6.9/1.6)
## job = unskilled resident:
## :...dependents <= 1: No (28.1/6.3)
## : dependents > 1: Yes (7.7/0.8)
## job = skilled employee:
## :...checking_balance = > 200 DM: No (11.8/2.3)
## checking_balance in {< 0 DM,1 - 200 DM}:
## :...personal_status in {divorced male,
## : married male}: Yes (15.4/2.4)
## personal_status = single male:
## :...amount <= 6110: No (32.8/10.2)
## : amount > 6110: Yes (6.2)
## personal_status = female:
## :...existing_credits > 1: Yes (3.2)
## existing_credits <= 1: [S1]
##
## SubTree [S1]
##
## property in {building society savings,real estate,unknown/none}: Yes (22.3/2.4)
## property = other: No (11.1/3.2)
##
## ----- Trial 2: -----
##
## Decision tree:
##
## checking_balance = unknown:
## :...other_debtors = co-applicant: Yes (12.2/4.5)
## : other_debtors = guarantor: No (6.8/2.9)
## : other_debtors = none:
## : :...installment_plan = stores: No (15/5.6)
## : installment_plan = bank:
## : :...installment_rate <= 1: No (3.5)
## : : installment_rate > 1:
## : : :...months_loan_duration <= 16: No (12.7/1.9)
## : : months_loan_duration > 16: Yes (27.9/8.3)
## : installment_plan = none:
## : :...amount <= 1381: No (37)
## : amount > 1381:
## : :...age > 32: No (95.5/12.2)
## : age <= 32:
## : :...personal_status = divorced male: Yes (2.5/0.6)
## : personal_status in {married male,
## : : single male}: No (39.9/9.3)
## : personal_status = female:
## : :...purpose in {business,car (used),radio/tv}: No (9.3)
## : purpose in {car (new),domestic appliances,education,
## : furniture,others,repairs,
## : retraining}: Yes (29/8.1)
## checking_balance in {< 0 DM,> 200 DM,1 - 200 DM}:
## :...property = unknown/none:
## :...housing = own: Yes (24.2/2.7)
## : housing in {for free,rent}:
## : :...employment_length in {> 7 yrs,0 - 1 yrs,1 - 4 yrs,
## : : 4 - 7 yrs}: Yes (57.1/17.2)
## : employment_length = unemployed: No (19/5)
## property in {building society savings,other,real estate}:
## :...age > 47:
## :...personal_status in {divorced male,female,single male}: No (37.4/4.8)
## : personal_status = married male: Yes (3.5)
## age <= 47:
## :...purpose in {business,car (used),repairs,retraining}: No (61.5/18.4)
## purpose in {domestic appliances,education,others}: Yes (25/8.1)
## purpose = car (new):
## :...installment_rate > 2: Yes (65.1/20.4)
## : installment_rate <= 2:
## : :...telephone = none: No (20.1/2.7)
## : telephone = yes: Yes (9.7/1.9)
## purpose = radio/tv:
## :...months_loan_duration <= 8: No (6.5)
## : months_loan_duration > 8:
## : :...employment_length in {> 7 yrs,4 - 7 yrs}: No (23.9/7.5)
## : employment_length in {0 - 1 yrs,unemployed}: Yes (26.2/9.8)
## : employment_length = 1 - 4 yrs:
## : :...months_loan_duration <= 15: No (12.6/3.2)
## : months_loan_duration > 15: Yes (26.2/5.2)
## purpose = furniture:
## :...installment_plan = stores: No (8.6)
## installment_plan in {bank,none}:
## :...other_debtors = guarantor: No (5.1)
## other_debtors in {co-applicant,none}:
## :...employment_length in {> 7 yrs,
## : unemployed}: Yes (15.9/1.9)
## employment_length in {0 - 1 yrs,
## : 4 - 7 yrs}: No (26.8/7.8)
## employment_length = 1 - 4 yrs:
## :...personal_status = divorced male: No (4.6)
## personal_status in {female,married male,
## : single male}:
## :...telephone = none: Yes (23.2/6.1)
## telephone = yes: No (6.3/1.4)
##
## ----- Trial 3: -----
##
## Decision tree:
##
## checking_balance in {> 200 DM,unknown}:
## :...employment_length in {> 7 yrs,4 - 7 yrs}: No (149.6/27.5)
## : employment_length in {0 - 1 yrs,1 - 4 yrs,unemployed}:
## : :...amount > 4139: Yes (53.8/19.7)
## : amount <= 4139:
## : :...other_debtors in {co-applicant,guarantor}: Yes (14.4/4.6)
## : other_debtors = none: No (114.4/28.3)
## checking_balance in {< 0 DM,1 - 200 DM}:
## :...savings_balance in {> 1000 DM,501 - 1000 DM,unknown}:
## :...foreign_worker = no: No (2.6)
## : foreign_worker = yes:
## : :...property = building society savings: Yes (19.7/7.4)
## : property in {other,real estate,unknown/none}: No (67.3/15.4)
## savings_balance in {< 100 DM,101 - 500 DM}:
## :...months_loan_duration > 27:
## :...employment_length in {> 7 yrs,0 - 1 yrs,1 - 4 yrs,
## : : 4 - 7 yrs}: Yes (83.9/18.1)
## : employment_length = unemployed: No (9/2.3)
## months_loan_duration <= 27:
## :...credit_history in {fully repaid,
## : fully repaid this bank}: Yes (34.3/8.6)
## credit_history in {critical,delayed,repaid}:
## :...other_debtors = guarantor: No (19.8/3.2)
## other_debtors in {co-applicant,none}:
## :...personal_status in {divorced male,
## : married male}: Yes (41.9/17.1)
## personal_status = single male:
## :...savings_balance = 101 - 500 DM: No (13.6/1.5)
## : savings_balance = < 100 DM:
## : :...existing_credits > 1: No (41.6/11)
## : existing_credits <= 1:
## : :...employment_length in {> 7 yrs,
## : : unemployed}: Yes (17.6/3.2)
## : employment_length in {0 - 1 yrs,1 - 4 yrs,
## : 4 - 7 yrs}: No (41/12.9)
## personal_status = female:
## :...existing_credits > 2: Yes (5.9/0.5)
## existing_credits <= 2:
## :...amount > 8978: Yes (5.3)
## amount <= 8978:
## :...installment_plan in {bank,
## : stores}: No (5.4/1.6)
## installment_plan = none:
## :...other_debtors = co-applicant: No (2.6/0.5)
## other_debtors = none:
## :...installment_rate <= 1: No (6.9/1.2)
## installment_rate > 1: [S1]
##
## SubTree [S1]
##
## credit_history = critical: No (11.8/3.5)
## credit_history = delayed: Yes (2.2/0.5)
## credit_history = repaid:
## :...job = mangement self-employed: No (4.8)
## job in {skilled employee,unemployed non-resident,
## unskilled resident}: Yes (30.7/9.2)
##
## ----- Trial 4: -----
##
## Decision tree:
##
## months_loan_duration <= 7:
## :...amount <= 4139: No (36.7/2.3)
## : amount > 4139: Yes (4.7/0.4)
## months_loan_duration > 7:
## :...purpose in {domestic appliances,repairs}: Yes (27.7/9.1)
## purpose in {others,retraining}: No (16/4.7)
## purpose = car (used):
## :...amount <= 11054: No (65.8/11)
## : amount > 11054: Yes (5.7)
## purpose = education:
## :...housing in {for free,rent}: No (18.7/5)
## : housing = own:
## : :...age <= 44: Yes (24.5/6)
## : age > 44: No (2.3)
## purpose = business:
## :...housing = for free: No (1.5/0.4)
## : housing = rent: Yes (9.9/2)
## : housing = own:
## : :...savings_balance in {> 1000 DM,101 - 500 DM,501 - 1000 DM,
## : : unknown}: No (32.6/6.3)
## : savings_balance = < 100 DM:
## : :...installment_plan = bank: No (6.3)
## : installment_plan in {none,stores}:
## : :...personal_status in {divorced male,married male,
## : : single male}: Yes (32.4/9.2)
## : personal_status = female: No (9.1/1.1)
## purpose = radio/tv:
## :...checking_balance in {> 200 DM,unknown}: No (79.8/24.4)
## : checking_balance = < 0 DM:
## : :...months_loan_duration > 30: Yes (6)
## : : months_loan_duration <= 30:
## : : :...job in {mangement self-employed,
## : : : unemployed non-resident}: No (6.4/1)
## : : job in {skilled employee,unskilled resident}:
## : : :...housing = own: No (18.9/6.8)
## : : housing in {for free,rent}: Yes (11.2/1.5)
## : checking_balance = 1 - 200 DM:
## : :...other_debtors in {co-applicant,guarantor}: No (12.6/2.5)
## : other_debtors = none:
## : :...personal_status = divorced male: No (0)
## : personal_status = married male: Yes (9.8/1.7)
## : personal_status in {female,single male}:
## : :...existing_credits <= 1: No (29.6/8.5)
## : existing_credits > 1: Yes (8.2/2.4)
## purpose = car (new):
## :...installment_plan = stores: Yes (2.1)
## : installment_plan = bank:
## : :...age <= 60: Yes (33.3/7.8)
## : : age > 60: No (3.3)
## : installment_plan = none:
## : :...savings_balance in {> 1000 DM,101 - 500 DM}: No (19.5/5.8)
## : savings_balance in {501 - 1000 DM,unknown}: Yes (37.5/16.5)
## : savings_balance = < 100 DM:
## : :...installment_rate <= 2: No (22.9/5.2)
## : installment_rate > 2:
## : :...amount > 2329: Yes (20.7/2)
## : amount <= 2329:
## : :...checking_balance in {> 200 DM,unknown}: No (10.6)
## : checking_balance in {< 0 DM,1 - 200 DM}:
## : :...housing = for free: Yes (4.3)
## : housing in {own,rent}: No (29.1/11.7)
## purpose = furniture:
## :...installment_plan = stores: No (7.7)
## installment_plan in {bank,none}:
## :...other_debtors = guarantor: No (5.6)
## other_debtors in {co-applicant,none}:
## :...months_loan_duration <= 16:
## :...checking_balance in {< 0 DM,> 200 DM,unknown}: No (35.9/3.2)
## : checking_balance = 1 - 200 DM: Yes (14.4/4.9)
## months_loan_duration > 16:
## :...dependents > 1: Yes (8)
## dependents <= 1:
## :...housing = for free: No (3.9)
## housing in {own,rent}:
## :...savings_balance in {> 1000 DM,501 - 1000 DM,
## : unknown}: No (11.6/1.5)
## savings_balance = 101 - 500 DM: Yes (4/1)
## savings_balance = < 100 DM:
## :...job in {mangement self-employed,
## : unemployed non-resident}: Yes (10.1)
## job in {skilled employee,unskilled resident}:
## :...telephone = none: Yes (29.4/9.7)
## telephone = yes: No (9.9/2.8)
##
## ----- Trial 5: -----
##
## Decision tree:
##
## checking_balance = < 0 DM:
## :...foreign_worker = no: No (13.5/3)
## : foreign_worker = yes:
## : :...job = mangement self-employed: No (31.9/11.2)
## : job = unemployed non-resident: Yes (7.1/1.6)
## : job = unskilled resident:
## : :...employment_length in {> 7 yrs,unemployed}: No (7.9)
## : : employment_length = 0 - 1 yrs: Yes (6.4)
## : : employment_length in {1 - 4 yrs,4 - 7 yrs}:
## : : :...purpose in {business,car (used),furniture,others,repairs,
## : : : retraining}: No (8.7)
## : : purpose in {car (new),domestic appliances,education,
## : : radio/tv}: Yes (20.4/7.3)
## : job = skilled employee:
## : :...credit_history = critical: No (27.1/10.7)
## : credit_history in {delayed,fully repaid,
## : : fully repaid this bank}: Yes (32.6/7.9)
## : credit_history = repaid:
## : :...savings_balance in {> 1000 DM,501 - 1000 DM}: No (3.9)
## : savings_balance in {< 100 DM,101 - 500 DM,unknown}:
## : :...existing_credits > 1: Yes (5.8)
## : existing_credits <= 1:
## : :...other_debtors in {co-applicant,none}: Yes (74.8/20.3)
## : other_debtors = guarantor: No (3.2)
## checking_balance in {> 200 DM,1 - 200 DM,unknown}:
## :...amount > 9857: Yes (30/8.2)
## amount <= 9857:
## :...job = unemployed non-resident: No (7.3)
## job = mangement self-employed:
## :...employment_length = 4 - 7 yrs: No (6.7)
## : employment_length in {> 7 yrs,0 - 1 yrs,1 - 4 yrs,unemployed}:
## : :...savings_balance in {> 1000 DM,101 - 500 DM}: Yes (12.5/3.2)
## : savings_balance in {501 - 1000 DM,unknown}: No (16.9/5.4)
## : savings_balance = < 100 DM:
## : :...residence_history <= 1: No (5.9)
## : residence_history > 1:
## : :...other_debtors in {co-applicant,guarantor}: No (2.4)
## : other_debtors = none:
## : :...dependents > 1: Yes (2.8)
## : dependents <= 1:
## : :...housing = for free: No (3.8)
## : housing in {own,rent}: Yes (32/9.1)
## job in {skilled employee,unskilled resident}:
## :...installment_plan = stores: No (16/5.1)
## installment_plan = bank:
## :...installment_rate <= 2: No (21.9/4.2)
## : installment_rate > 2:
## : :...personal_status in {divorced male,female,
## : : single male}: Yes (37/10.1)
## : personal_status = married male: No (3.6)
## installment_plan = none:
## :...savings_balance in {> 1000 DM,unknown}:
## :...other_debtors in {co-applicant,none}: No (87/14.1)
## : other_debtors = guarantor: Yes (2.8/0.4)
## savings_balance in {< 100 DM,101 - 500 DM,501 - 1000 DM}:
## :...other_debtors = co-applicant: Yes (12.2/4.7)
## other_debtors = guarantor: No (10.7/1.6)
## other_debtors = none:
## :...age > 50: No (13)
## age <= 50:
## :...checking_balance = unknown: No (94.8/27)
## checking_balance = > 200 DM:
## :...job = unskilled resident: Yes (7.5/0.4)
## : job = skilled employee:
## : :...existing_credits <= 2: No (24/5.3)
## : existing_credits > 2: Yes (2.4)
## checking_balance = 1 - 200 DM:
## :...employment_length in {> 7 yrs,4 - 7 yrs,
## : unemployed}: No (39.4/11.2)
## employment_length = 0 - 1 yrs:
## :...housing in {for free,own}: No (24.9/7.2)
## : housing = rent: Yes (9.7/2.3)
## employment_length = 1 - 4 yrs: [S1]
##
## SubTree [S1]
##
## personal_status = divorced male: No (2.7)
## personal_status in {female,married male,single male}: Yes (26.8/6.8)
##
## ----- Trial 6: -----
##
## Decision tree:
##
## checking_balance = unknown:
## :...employment_length = unemployed: Yes (14.9/6.4)
## : employment_length = 4 - 7 yrs:
## : :...age <= 22: Yes (9.4/2.6)
## : : age > 22: No (35.6/2.4)
## : employment_length = 0 - 1 yrs:
## : :...other_debtors = co-applicant: Yes (2.9)
## : : other_debtors = guarantor: No (1.6)
## : : other_debtors = none:
## : : :...amount <= 4594: No (19/4.6)
## : : amount > 4594: Yes (10.7/0.7)
## : employment_length = 1 - 4 yrs:
## : :...installment_rate <= 1: No (11.3)
## : : installment_rate > 1:
## : : :...installment_plan in {bank,stores}: Yes (14.7/4.5)
## : : installment_plan = none: No (66.9/24.7)
## : employment_length = > 7 yrs:
## : :...property in {building society savings,real estate}: No (21)
## : property in {other,unknown/none}:
## : :...months_loan_duration > 26: No (12.5)
## : months_loan_duration <= 26:
## : :...existing_credits <= 1: No (20/4.2)
## : existing_credits > 1: Yes (22/6.8)
## checking_balance in {< 0 DM,> 200 DM,1 - 200 DM}:
## :...property = unknown/none:
## :...job = unskilled resident: Yes (8.6)
## : job in {mangement self-employed,skilled employee,
## : : unemployed non-resident}:
## : :...age <= 22: No (4.8)
## : age > 22:
## : :...housing in {own,rent}: Yes (27.9/4.9)
## : housing = for free:
## : :...installment_plan = stores: Yes (0)
## : installment_plan = bank: No (12.1/3.9)
## : installment_plan = none:
## : :...job = mangement self-employed: No (17/6.5)
## : job in {skilled employee,
## : unemployed non-resident}: Yes (28/7.6)
## property in {building society savings,other,real estate}:
## :...age > 47:
## :...installment_plan in {bank,none}: No (35.4/4.5)
## : installment_plan = stores: Yes (4.2/0.3)
## age <= 47:
## :...savings_balance in {> 1000 DM,101 - 500 DM,
## : 501 - 1000 DM}: No (62.7/22.8)
## savings_balance = unknown:
## :...amount <= 1484: Yes (15.1/2.2)
## : amount > 1484: No (30.3/4.9)
## savings_balance = < 100 DM:
## :...amount > 8072: Yes (15.2/1.2)
## amount <= 8072:
## :...purpose in {car (used),domestic appliances,furniture,
## : others,repairs}: No (86.9/28.4)
## purpose in {education,retraining}: Yes (17.4/6)
## purpose = business:
## :...installment_plan in {bank,stores}: No (6.3)
## : installment_plan = none: Yes (22/9.1)
## purpose = radio/tv:
## :...months_loan_duration > 39: Yes (5.8)
## : months_loan_duration <= 39:
## : :...other_debtors in {co-applicant,
## : : guarantor}: No (9.8/1.2)
## : other_debtors = none:
## : :...dependents > 1: Yes (7.8/1.8)
## : dependents <= 1: [S1]
## purpose = car (new):
## :...checking_balance = > 200 DM: No (4.2)
## checking_balance in {< 0 DM,1 - 200 DM}:
## :...other_debtors in {co-applicant,
## : guarantor}: Yes (11.4)
## other_debtors = none:
## :...foreign_worker = no: No (3.2)
## foreign_worker = yes:
## :...installment_plan in {bank,
## : stores}: Yes (4.3)
## installment_plan = none: [S2]
##
## SubTree [S1]
##
## employment_length in {> 7 yrs,4 - 7 yrs,unemployed}: No (8.1)
## employment_length in {0 - 1 yrs,1 - 4 yrs}:
## :...installment_plan in {bank,none}: Yes (38.8/14.6)
## installment_plan = stores: No (3.3)
##
## SubTree [S2]
##
## job in {mangement self-employed,unemployed non-resident}: Yes (3.6)
## job = unskilled resident: No (16.8/5.5)
## job = skilled employee:
## :...installment_rate <= 2: No (7/1.4)
## installment_rate > 2: Yes (19.4/3.9)
##
## ----- Trial 7: -----
##
## Decision tree:
##
## checking_balance in {> 200 DM,unknown}:
## :...foreign_worker = no: No (6.8)
## : foreign_worker = yes:
## : :...purpose in {car (used),domestic appliances,others,
## : : retraining}: No (37.1/2.8)
## : purpose in {business,car (new),education,furniture,radio/tv,repairs}:
## : :...employment_length in {> 7 yrs,4 - 7 yrs,unemployed}: No (133.4/37.2)
## : employment_length in {0 - 1 yrs,1 - 4 yrs}:
## : :...amount <= 1264: No (10.8)
## : amount > 1264: Yes (127.8/57.8)
## checking_balance in {< 0 DM,1 - 200 DM}:
## :...property = real estate:
## :...foreign_worker = no: No (4.7)
## : foreign_worker = yes:
## : :...months_loan_duration <= 11: No (21.2/1.5)
## : months_loan_duration > 11:
## : :...savings_balance in {> 1000 DM,101 - 500 DM,
## : : 501 - 1000 DM}: No (8.5)
## : savings_balance in {< 100 DM,unknown}:
## : :...age <= 48: Yes (66.1/26.6)
## : age > 48: No (6)
## property in {building society savings,other,unknown/none}:
## :...residence_history <= 1:
## :...employment_length in {> 7 yrs,0 - 1 yrs,1 - 4 yrs,
## : : 4 - 7 yrs}: No (51.2/14.8)
## : employment_length = unemployed: Yes (7.2/0.7)
## residence_history > 1:
## :...credit_history in {critical,delayed}:
## :...installment_rate <= 1: No (8.6)
## : installment_rate > 1:
## : :...savings_balance in {> 1000 DM,101 - 500 DM,
## : : unknown}: No (20.7/3.8)
## : savings_balance = 501 - 1000 DM: Yes (3.9/1.9)
## : savings_balance = < 100 DM:
## : :...credit_history = delayed: Yes (10.8/1.5)
## : credit_history = critical:
## : :...installment_plan = bank: Yes (6.4/1.8)
## : installment_plan in {none,stores}: No (46.3/20.3)
## credit_history in {fully repaid,fully repaid this bank,repaid}:
## :...employment_length in {> 7 yrs,0 - 1 yrs}: Yes (86.7/17.9)
## employment_length in {1 - 4 yrs,4 - 7 yrs,unemployed}:
## :...months_loan_duration <= 9: Yes (7.3)
## months_loan_duration > 9:
## :...installment_rate <= 1: No (17.9/7)
## installment_rate > 1:
## :...job in {mangement self-employed,
## : unemployed non-resident}: No (17.2/4.2)
## job in {skilled employee,
## unskilled resident}: Yes (93.5/33.3)
##
## ----- Trial 8: -----
##
## Decision tree:
##
## checking_balance in {< 0 DM,1 - 200 DM}:
## :...amount > 8613: Yes (31.6/6.7)
## : amount <= 8613:
## : :...savings_balance in {> 1000 DM,501 - 1000 DM}: No (27.7/8.1)
## : savings_balance = 101 - 500 DM:
## : :...personal_status in {divorced male,female,
## : : : married male}: Yes (23.9/3.8)
## : : personal_status = single male:
## : : :...property in {building society savings,
## : : : unknown/none}: Yes (18.4/6.8)
## : : property in {other,real estate}: No (10.5)
## : savings_balance = unknown:
## : :...existing_credits > 1: No (16.8)
## : : existing_credits <= 1:
## : : :...checking_balance = < 0 DM: Yes (17.1/5.7)
## : : checking_balance = 1 - 200 DM: No (23.7/8.5)
## : savings_balance = < 100 DM:
## : :...months_loan_duration > 42: Yes (17/2.6)
## : months_loan_duration <= 42:
## : :...purpose in {business,car (used),others,repairs,
## : : retraining}: No (54.4/18.4)
## : purpose in {domestic appliances,education}: Yes (15.5/4.8)
## : purpose = radio/tv:
## : :...credit_history in {critical,delayed,repaid}: No (66.5/18.1)
## : : credit_history in {fully repaid,
## : : fully repaid this bank}: Yes (6.7/0.8)
## : purpose = car (new):
## : :...other_debtors in {co-applicant,guarantor}: Yes (11.8/1)
## : : other_debtors = none:
## : : :...installment_rate <= 2: No (17.2/4)
## : : installment_rate > 2:
## : : :...property in {building society savings,
## : : : real estate}: No (25.6/9.2)
## : : property in {other,unknown/none}: Yes (19.8/1.6)
## : purpose = furniture:
## : :...other_debtors = guarantor: No (4.3)
## : other_debtors in {co-applicant,none}:
## : :...installment_plan = stores: No (3.7)
## : installment_plan in {bank,none}:
## : :...residence_history <= 1: No (11.7/3.2)
## : residence_history > 1: Yes (55.1/18.9)
## checking_balance in {> 200 DM,unknown}:
## :...foreign_worker = no: No (5.8)
## foreign_worker = yes:
## :...purpose in {domestic appliances,education,others,radio/tv,repairs,
## : retraining}: No (123.9/31.9)
## purpose = car (used):
## :...residence_history <= 1: Yes (3.2/0.6)
## : residence_history > 1: No (23.9/1.1)
## purpose = furniture:
## :...months_loan_duration > 30: Yes (6.3/1.1)
## : months_loan_duration <= 30:
## : :...dependents <= 1: No (33.2/5.1)
## : dependents > 1: Yes (3.7/1)
## purpose = business:
## :...residence_history > 3: No (8.4)
## : residence_history <= 3:
## : :...checking_balance = > 200 DM: No (3)
## : checking_balance = unknown:
## : :...amount <= 2150: No (4.7)
## : amount > 2150: Yes (24/8)
## purpose = car (new):
## :...residence_history <= 1: No (6.9)
## residence_history > 1:
## :...installment_plan in {bank,stores}: Yes (13.3/3.3)
## installment_plan = none:
## :...existing_credits > 2: Yes (3)
## existing_credits <= 2:
## :...telephone = yes: No (23.3/3.1)
## telephone = none: [S1]
##
## SubTree [S1]
##
## credit_history = critical: No (4.7)
## credit_history in {delayed,fully repaid,fully repaid this bank,
## repaid}: Yes (25.6/9.6)
##
## ----- Trial 9: -----
##
## Decision tree:
##
## checking_balance in {> 200 DM,unknown}: No (265.6/47.6)
## checking_balance in {< 0 DM,1 - 200 DM}:
## :...savings_balance in {> 1000 DM,501 - 1000 DM}: No (30.7/14)
## savings_balance = 101 - 500 DM:
## :...credit_history in {critical,delayed}: No (17.5/2.3)
## : credit_history in {fully repaid,fully repaid this bank,repaid}:
## : :...other_debtors in {co-applicant,guarantor}: Yes (4.3)
## : other_debtors = none:
## : :...personal_status in {divorced male,female,
## : : married male}: Yes (18.8/1.9)
## : personal_status = single male: No (18.1/5.4)
## savings_balance = unknown:
## :...existing_credits > 1: No (15.7)
## : existing_credits <= 1:
## : :...months_loan_duration > 42: No (8.6)
## : months_loan_duration <= 42:
## : :...foreign_worker = no: No (3.5)
## : foreign_worker = yes:
## : :...age <= 41: Yes (25.5/6.6)
## : age > 41: No (10.8/1.5)
## savings_balance = < 100 DM:
## :...job = unskilled resident:
## :...property in {building society savings,other,
## : : real estate}: No (74.3/23.8)
## : property = unknown/none: Yes (4.6)
## job in {mangement self-employed,skilled employee,
## : unemployed non-resident}:
## :...other_debtors in {co-applicant,guarantor}: No (31.6/13.7)
## other_debtors = none:
## :...residence_history <= 1: No (46.4/19.6)
## residence_history > 1:
## :...credit_history in {delayed,fully repaid,
## : fully repaid this bank}: Yes (36.8/8.4)
## credit_history = critical:
## :...property = building society savings: No (13)
## : property in {other,real estate,unknown/none}:
## : :...telephone = none: Yes (13.3/2.5)
## : telephone = yes:
## : :...property in {other,unknown/none}: No (23.2/8.7)
## : property = real estate: Yes (4.7)
## credit_history = repaid:
## :...personal_status in {divorced male,
## : married male}: Yes (10.1/0.4)
## personal_status = female:
## :...age <= 49: Yes (39.1/8.3)
## : age > 49: No (4.3)
## personal_status = single male:
## :...employment_length = > 7 yrs: Yes (13.4)
## employment_length in {0 - 1 yrs,1 - 4 yrs,
## : 4 - 7 yrs,unemployed}:
## :...installment_rate <= 3: No (23.9/6.7)
## installment_rate > 3: Yes (19/5.2)
##
##
## Evaluation on training data (800 cases):
##
## Trial        Decision Tree
## -----      ----------------
##            Size      Errors
##
##    0         42   113(14.1%)
##    1         35   178(22.3%)
##    2         34   171(21.4%)
##    3         25   180(22.5%)
##    4         45   182(22.8%)
##    5         41   155(19.4%)
##    6         44   169(21.1%)
##    7         23   228(28.5%)
##    8         38   157(19.6%)
##    9         26   149(18.6%)
## boost              46( 5.8%)   <<
##
##
##    (a)   (b)    <-classified as
##   ----  ----
##    556     5    (a): class No
##     41   198    (b): class Yes
##
##
## Attribute usage:
##
## 100.00% checking_balance
## 100.00% months_loan_duration
## 100.00% amount
## 100.00% foreign_worker
## 99.88% other_debtors
## 99.50% purpose
## 98.13% job
## 97.25% employment_length
## 93.75% savings_balance
## 93.25% installment_plan
## 91.88% age
## 79.13% credit_history
## 72.13% property
## 63.88% residence_history
## 58.25% installment_rate
## 55.50% personal_status
## 52.00% existing_credits
## 42.13% housing
## 33.38% dependents
## 20.63% telephone
##
##
## Time: 0.1 secs
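The error figures reported in the summary above can be checked by hand from the confusion matrix. The sketch below is only an illustration: the counts are typed in from the printed output, and the object names (`conf_mat`, `trial0_errors`, and so on) are made up for this example and are not part of the original analysis. Note that these numbers describe the 800 training cases, so they say nothing yet about performance on unseen data.

```r
# Confusion matrix typed in from the summary output above
# (rows = actual class, columns = predicted class).
conf_mat <- matrix(c(556,   5,
                      41, 198),
                   nrow = 2, byrow = TRUE,
                   dimnames = list(actual    = c("No", "Yes"),
                                   predicted = c("No", "Yes")))

n_cases    <- sum(conf_mat)                  # total training cases
n_errors   <- n_cases - sum(diag(conf_mat))  # off-diagonal cells are the mistakes
error_rate <- n_errors / n_cases             # boosted error on the training data

# Share of each actual class that was classified correctly.
per_class_correct <- diag(conf_mat) / rowSums(conf_mat)

# Error count of the first tree alone (trial 0), taken from the evaluation table.
trial0_errors <- 113
trial0_rate   <- trial0_errors / n_cases

print(round(100 * c(boosted = error_rate, trial0 = trial0_rate), 2))
print(round(100 * per_class_correct, 1))
```

The off-diagonal cells add up to 5 + 41 = 46 errors out of 800 cases, which is the 5.8% shown on the `boost` line, compared with 113 errors (14.1%) for the first tree on its own. Most of the remaining mistakes are actual `Yes` cases predicted as `No` (41 of 239).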