30 Nov 2017

Regression and Classification

One of the key purposes of data science is to model and predict trends. The Ames housing dataset lends itself to both regression modelling and classification.

Regression and Classification with the Ames Housing Data

You have just joined a new “full stack” real estate company in Ames, Iowa. The strategy of the firm is two-fold:

Own the entire process from the purchase of the land all the way to sale of the house, and anything in between.
Use statistical analysis to optimize investment and maximize return.

The company is still small, and though investment is substantial the short-term goals of the company are more oriented towards purchasing existing houses and flipping them as opposed to constructing entirely new houses. That being said, the company has access to a large construction workforce operating at rock-bottom prices.

This project uses the Ames housing data recently made available on Kaggle.

import numpy as np
import scipy.stats as stats
import seaborn as sns
import matplotlib.pyplot as plt
import pandas as pd
import patsy
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LassoCV
from math import log, exp, sqrt

sns.set_style('whitegrid')

%config InlineBackend.figure_format = 'retina'
%matplotlib inline

1. Estimating the value of homes from fixed characteristics.

Your superiors have outlined this year’s strategy for the company:

Develop an algorithm to reliably estimate the value of residential houses based on fixed characteristics.
Identify characteristics of houses that the company can cost-effectively change/renovate with their construction team.
Evaluate the mean dollar value of different renovations.

Then we can use that to buy houses that are likely to sell for more than the cost of the purchase plus renovations.

Your first job is to tackle #1. You have a dataset of housing sale data with a large number of features describing different aspects of each house. The data and the full description of its features are in two separate files:

housing.csv
data_description.txt

You need to build a reliable estimator for the price of the house given characteristics of the house that cannot be renovated. Some examples include:

The neighborhood
Square feet
Bedrooms, bathrooms
Basement and garage space

and many more.

Some examples of things that ARE renovate-able:

Roof and exterior features
“Quality” metrics, such as kitchen quality
“Condition” metrics, such as condition of garage
Heating and electrical components

and generally anything you deem can be modified without having to undergo major construction on the house.

Your goals:

Perform any cleaning, feature engineering, and EDA you deem necessary.
- Be sure to remove any houses that are not residential from the dataset.
- Identify fixed features that can predict price.
- Train a model on pre-2010 data and evaluate its performance on the 2010 houses.
- Characterize your model. How well does it perform? What are the best estimates of price?

Note: The EDA and feature engineering component to this project is not trivial! Be sure to always think critically and creatively. Justify your actions! Use the data description file!

# Load the data
house = pd.read_csv('./housing.csv')

# Load the data description (uncomment the print to read it)
with open('data_description.txt', 'r') as file:
    description = file.read()

# print(description)

1.1 Check shape, head and info of house data

shape = house.shape
shape

(1460, 81)

# inspect head by ensuring that all columns are displayed
pd.options.display.max_columns = 100
pd.options.display.max_rows = 100

house.head()

house.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 1460 entries, 0 to 1459
Data columns (total 81 columns):
Id               1460 non-null int64
MSSubClass       1460 non-null int64
MSZoning         1460 non-null object
LotFrontage      1201 non-null float64
LotArea          1460 non-null int64
Street           1460 non-null object
Alley            91 non-null object
LotShape         1460 non-null object
LandContour      1460 non-null object
Utilities        1460 non-null object
LotConfig        1460 non-null object
LandSlope        1460 non-null object
Neighborhood     1460 non-null object
Condition1       1460 non-null object
Condition2       1460 non-null object
BldgType         1460 non-null object
HouseStyle       1460 non-null object
OverallQual      1460 non-null int64
OverallCond      1460 non-null int64
YearBuilt        1460 non-null int64
YearRemodAdd     1460 non-null int64
RoofStyle        1460 non-null object
RoofMatl         1460 non-null object
Exterior1st      1460 non-null object
Exterior2nd      1460 non-null object
MasVnrType       1452 non-null object
MasVnrArea       1452 non-null float64
ExterQual        1460 non-null object
ExterCond        1460 non-null object
Foundation       1460 non-null object
BsmtQual         1423 non-null object
BsmtCond         1423 non-null object
BsmtExposure     1422 non-null object
BsmtFinType1     1423 non-null object
BsmtFinSF1       1460 non-null int64
BsmtFinType2     1422 non-null object
BsmtFinSF2       1460 non-null int64
BsmtUnfSF        1460 non-null int64
TotalBsmtSF      1460 non-null int64
Heating          1460 non-null object
HeatingQC        1460 non-null object
CentralAir       1460 non-null object
Electrical       1459 non-null object
1stFlrSF         1460 non-null int64
2ndFlrSF         1460 non-null int64
LowQualFinSF     1460 non-null int64
GrLivArea        1460 non-null int64
BsmtFullBath     1460 non-null int64
BsmtHalfBath     1460 non-null int64
FullBath         1460 non-null int64
HalfBath         1460 non-null int64
BedroomAbvGr     1460 non-null int64
KitchenAbvGr     1460 non-null int64
KitchenQual      1460 non-null object
TotRmsAbvGrd     1460 non-null int64
Functional       1460 non-null object
Fireplaces       1460 non-null int64
FireplaceQu      770 non-null object
GarageType       1379 non-null object
GarageYrBlt      1379 non-null float64
GarageFinish     1379 non-null object
GarageCars       1460 non-null int64
GarageArea       1460 non-null int64
GarageQual       1379 non-null object
GarageCond       1379 non-null object
PavedDrive       1460 non-null object
WoodDeckSF       1460 non-null int64
OpenPorchSF      1460 non-null int64
EnclosedPorch    1460 non-null int64
3SsnPorch        1460 non-null int64
ScreenPorch      1460 non-null int64
PoolArea         1460 non-null int64
PoolQC           7 non-null object
Fence            281 non-null object
MiscFeature      54 non-null object
MiscVal          1460 non-null int64
MoSold           1460 non-null int64
YrSold           1460 non-null int64
SaleType         1460 non-null object
SaleCondition    1460 non-null object
SalePrice        1460 non-null int64
dtypes: float64(3), int64(35), object(43)
memory usage: 924.0+ KB

house.head()

# Count the unique values in each column
house.nunique().sort_values()

CentralAir          2
Utilities           2
Street              2
Alley               2
BsmtHalfBath        3
LandSlope           3
GarageFinish        3
HalfBath            3
PavedDrive          3
PoolQC              3
FullBath            4
MasVnrType          4
BsmtExposure        4
ExterQual           4
MiscFeature         4
BsmtFullBath        4
Fence               4
KitchenQual         4
BsmtCond            4
Fireplaces          4
LandContour         4
LotShape            4
KitchenAbvGr        4
BsmtQual            4
FireplaceQu         5
Electrical          5
YrSold              5
GarageCars          5
GarageQual          5
GarageCond          5
HeatingQC           5
ExterCond           5
MSZoning            5
LotConfig           5
BldgType            5
BsmtFinType2        6
Foundation          6
RoofStyle           6
SaleCondition       6
GarageType          6
BsmtFinType1        6
Heating             6
Functional          7
RoofMatl            8
HouseStyle          8
Condition2          8
PoolArea            8
BedroomAbvGr        8
SaleType            9
Condition1          9
OverallCond         9
OverallQual        10
TotRmsAbvGrd       12
MoSold             12
Exterior1st        15
MSSubClass         15
Exterior2nd        16
3SsnPorch          20
MiscVal            21
LowQualFinSF       24
Neighborhood       25
YearRemodAdd       61
ScreenPorch        76
GarageYrBlt        97
LotFrontage       110
YearBuilt         112
EnclosedPorch     120
BsmtFinSF2        144
OpenPorchSF       202
WoodDeckSF        274
MasVnrArea        327
2ndFlrSF          417
GarageArea        441
BsmtFinSF1        637
SalePrice         663
TotalBsmtSF       721
1stFlrSF          753
BsmtUnfSF         780
GrLivArea         861
LotArea          1073
Id               1460
dtype: int64

1.2 Remove rows where MSZoning is non-residential i.e. A, C or I

#Understand the different zoning in the dataset
house['MSZoning'].value_counts()

# The dataset includes commercial properties, which are out of scope here

RL         1151
RM          218
FV           65
RH           16
C (all)      10
Name: MSZoning, dtype: int64

# Remove non-residential
house = house[house.MSZoning != 'C (all)']
house.shape
# note: we lost 10 rows. 

(1450, 81)

1.3 Inspect null values. Drop ID & columns with more than 40% null values

house.isnull().sum().sort_values(ascending= False).head(10)
# PoolQC, MiscFeature, Alley, Fence,
#FireplaceQu & LotFrontage have a lot of null values      

PoolQC          1453
MiscFeature     1406
Alley           1369
Fence           1179
FireplaceQu      690
LotFrontage      259
GarageCond        81
GarageType        81
GarageYrBlt       81
GarageFinish      81
dtype: int64

# drop columns that are irrelevant for modelling
house.drop('Id', axis=1, inplace=True)

#look for columns with more than 40% null values
null_cols = house.columns[house.isnull().sum()> len(house)*0.4]
null_cols

Index(['Alley', 'FireplaceQu', 'PoolQC', 'Fence', 'MiscFeature'], dtype='object')

#drop columns with more than 40% null values
house.drop(null_cols, axis=1, inplace=True)
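
The same threshold can also be expressed with `isnull().mean()`, which gives the fraction of missing values per column directly. A minimal sketch on a small made-up frame (the `toy` data and the `drop_sparse_columns` helper are hypothetical, not part of the project code):

```python
import numpy as np
import pandas as pd


def drop_sparse_columns(df, max_null_frac=0.4):
    """Return (df without sparse columns, list of dropped column names)."""
    null_frac = df.isnull().mean()          # fraction of missing values per column
    sparse = null_frac[null_frac > max_null_frac].index
    return df.drop(columns=sparse), list(sparse)


# toy data (made up): 'Alley' is 80% missing, 'LotFrontage' is 20% missing
toy = pd.DataFrame({
    'Alley': [np.nan, np.nan, 'Grvl', np.nan, np.nan],
    'LotFrontage': [65.0, 80.0, np.nan, 60.0, 84.0],
    'LotArea': [8450, 9600, 11250, 9550, 14260],
})

cleaned, dropped = drop_sparse_columns(toy)
print(dropped)                 # columns removed
print(list(cleaned.columns))   # columns kept
```

Returning the dropped names alongside the frame makes it easy to record which features were discarded and why.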

1.4 Assess the remaining null values

house.isnull().sum().sort_values(ascending= False).head(10)

LotFrontage     259
GarageType       81
GarageYrBlt      81
GarageCond       81
GarageQual       81
GarageFinish     81
BsmtExposure     38
BsmtFinType2     38
BsmtFinType1     37
BsmtCond         37
dtype: int64

Dropping more rows or columns would leave too little data, so the remaining null values need to be filled in.

1.5 Fill in null values

isnull = pd.DataFrame(house.isnull().sum(), columns=['null_values']).reset_index()
isnull = isnull[isnull['null_values'] != 0]
isnull

1.6 Fill numerical data with the median & categorical data with the mode

for col in isnull['index']:
    # numeric columns get the median, everything else the most frequent value
    if pd.api.types.is_numeric_dtype(house[col]):
        house[col] = house[col].fillna(house[col].median())
    else:
        house[col] = house[col].fillna(house[col].mode()[0])
# Check that all the null values have been filled
house.isnull().sum()

MSSubClass       0
MSZoning         0
LotFrontage      0
LotArea          0
Street           0
LotShape         0
LandContour      0
Utilities        0
LotConfig        0
LandSlope        0
Neighborhood     0
Condition1       0
Condition2       0
BldgType         0
HouseStyle       0
OverallQual      0
OverallCond      0
YearBuilt        0
YearRemodAdd     0
RoofStyle        0
RoofMatl         0
Exterior1st      0
Exterior2nd      0
MasVnrType       0
MasVnrArea       0
ExterQual        0
ExterCond        0
Foundation       0
BsmtQual         0
BsmtCond         0
BsmtExposure     0
BsmtFinType1     0
BsmtFinSF1       0
BsmtFinType2     0
BsmtFinSF2       0
BsmtUnfSF        0
TotalBsmtSF      0
Heating          0
HeatingQC        0
CentralAir       0
Electrical       0
1stFlrSF         0
2ndFlrSF         0
LowQualFinSF     0
GrLivArea        0
BsmtFullBath     0
BsmtHalfBath     0
FullBath         0
HalfBath         0
BedroomAbvGr     0
KitchenAbvGr     0
KitchenQual      0
TotRmsAbvGrd     0
Functional       0
Fireplaces       0
GarageType       0
GarageYrBlt      0
GarageFinish     0
GarageCars       0
GarageArea       0
GarageQual       0
GarageCond       0
PavedDrive       0
WoodDeckSF       0
OpenPorchSF      0
EnclosedPorch    0
3SsnPorch        0
ScreenPorch      0
PoolArea         0
MiscVal          0
MoSold           0
YrSold           0
SaleType         0
SaleCondition    0
SalePrice        0
dtype: int64
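
One caveat from data_description.txt: for columns such as GarageType, NA means the house has no garage rather than that the value is unknown, so filling with the mode assigns a garage to houses that lack one. A hedged alternative is to give those columns an explicit "absent" level. The sketch below uses a made-up frame, and the `absent_cols` list is an assumption to be checked against the data description:

```python
import numpy as np
import pandas as pd

# toy frame standing in for `house`; all values are made up
toy = pd.DataFrame({
    'GarageType': ['Attchd', np.nan, 'Detchd', 'Attchd'],
    'GarageYrBlt': [2003.0, np.nan, 1998.0, 2001.0],
    'LotFrontage': [65.0, np.nan, 68.0, 60.0],
})

# assumption: NA in these columns means "feature absent", not "unknown"
absent_cols = ['GarageType']

filled = toy.copy()
for col in filled.columns:
    if col in absent_cols:
        filled[col] = filled[col].fillna('None')                  # explicit "absent" level
    elif pd.api.types.is_numeric_dtype(filled[col]):
        filled[col] = filled[col].fillna(filled[col].median())    # numeric: median
    else:
        filled[col] = filled[col].fillna(filled[col].mode()[0])   # other categorical: mode

print(filled)
```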

1.7 Rename columns whose names start with a digit, to facilitate patsy later on

number = {0:'zero', 1: 'one', 2:'two', 3: 'three', 4: 'four', 5: 'five', 6: 'six', 7:'seven', 8: 'eight', 9: 'nine'}

new_col_names= []

for col in house.columns:
    # a name starting with a digit is not a valid term in a patsy formula, so spell the digit out
    if col[0].isdigit():
        new_col = number[int(col[0])] + '_' + col[1:]
        new_col_names.append(new_col)
    else:
        new_col_names.append(col)
        
house.columns = new_col_names
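
The renaming rule can also be packaged as a small function, which makes it easy to check in isolation. This is a sketch; `formula_safe` and `DIGIT_WORDS` are hypothetical names introduced here:

```python
DIGIT_WORDS = ['zero', 'one', 'two', 'three', 'four',
               'five', 'six', 'seven', 'eight', 'nine']


def formula_safe(name):
    """Spell out a leading digit so the name can be used as a term in a formula."""
    if name and name[0].isdigit():
        return DIGIT_WORDS[int(name[0])] + '_' + name[1:]
    return name


columns = ['1stFlrSF', '2ndFlrSF', '3SsnPorch', 'GrLivArea']
renamed = [formula_safe(c) for c in columns]
print(renamed)
# ['one_stFlrSF', 'two_ndFlrSF', 'three_SsnPorch', 'GrLivArea']
```

Since `DataFrame.rename` accepts a function for its `columns` argument, the same helper could be applied as `house.rename(columns=formula_safe)`.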
        

1.8 Find and handle outliers

house.describe(include='all')

# Houses whose 'LotArea' is more than 6 standard deviations away from the mean

threshold = 6 * house['LotArea'].std()
lotarea_mask = abs( house['LotArea'] - house['LotArea'].mean() ) > threshold
house[lotarea_mask]

# create a new column flagging these very large lots

house['big_house'] = 0
house.loc[lotarea_mask, 'big_house'] = 1
house['big_house'].value_counts()

0    1455
1       5
Name: big_house, dtype: int64
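
The flagging step amounts to a z-score test. A minimal sketch on synthetic lot areas (the data and the `flag_outliers` helper are made up for illustration):

```python
import numpy as np
import pandas as pd


def flag_outliers(series, n_std=6):
    """Return a 0/1 Series marking values more than n_std standard deviations from the mean."""
    z = (series - series.mean()) / series.std()
    return (z.abs() > n_std).astype(int)


# synthetic lot areas: 199 ordinary lots plus one very large one
rng = np.random.default_rng(0)
lot_area = pd.Series(np.append(rng.normal(10000, 2000, size=199), 200000.0))

big = flag_outliers(lot_area)
print(big.value_counts())
```

Keeping the extreme rows and adding an indicator column, as done above, lets the model absorb their effect without discarding data.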

1.9 Select relevant columns for analysis

# Build a reliable estimator for the price of the house given characteristics of the house that
# CANNOT be renovated


# Remove non-fixed variables such as Exterior1st, Exterior2nd, ExterQual,ExterCond,BsmtQual, BsmtCond, 
# BsmtFinType1, BsmtFinType2, HeatingQC etc
# 'CentralAir' and 'Heating' cannot be changed
# 'BsmtExposure' will be a fixed variable as the exposure cannot be changed
# Target= SalePrice 



non_fixed = ['OverallQual','OverallCond','YearRemodAdd','RoofStyle','RoofMatl','Exterior1st','Exterior2nd',
             'MasVnrType','MasVnrArea','ExterQual','ExterCond','BsmtQual','BsmtCond','BsmtFinType1','BsmtFinSF1',
             'BsmtFinType2','BsmtFinSF2','BsmtUnfSF','HeatingQC','Electrical','LowQualFinSF','KitchenQual',
             'Functional','GarageFinish','GarageQual','GarageCond','PavedDrive','MiscVal','SaleType','SaleCondition',
            'MoSold']

house_new = house.drop(non_fixed, axis=1)

#Checking the shape of the data
house_new.shape

(1460, 45)

1.10 Explore relationships between SalePrice and other variables

# clustermap creates its own figure, so the size is passed to it directly
sns.clustermap(house_new.corr(), cmap='seismic', center=0, figsize=(14, 14))

<seaborn.matrix.ClusterGrid at 0x111d4cfd0>

abs(house_new.corr()['SalePrice']).sort_values(ascending=False).head(10)

SalePrice       1.000000
GrLivArea       0.708624
GarageCars      0.640409
GarageArea      0.623431
TotalBsmtSF     0.613581
one_stFlrSF     0.605852
FullBath        0.560664
TotRmsAbvGrd    0.533723
YearBuilt       0.522897
Fireplaces      0.466929
Name: SalePrice, dtype: float64

1.11 Model: train-test-split, scale, fit,predict using Lasso

target = 'SalePrice'

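# create X, y
# every column (including SalePrice and YrSold) goes on the right-hand side,
# so both are available in X for the year-based split below
f = 'SalePrice ~ ' + ' + '.join(house_new.columns) + ' - 1'
f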

y, X = patsy.dmatrices(f, data=house_new, return_type='dataframe')

# create train (before 2010) test (2010) split
X_train = X[X['YrSold'] != 2010].drop([target,'YrSold'],axis=1)
y_train = X[X['YrSold'] != 2010][target]

X_test = X[X['YrSold'] == 2010].drop([target,'YrSold'],axis=1)
y_test = X[X['YrSold'] == 2010][target]


# scale
ss = StandardScaler()
ss.fit(X_train)
Xs_train = ss.transform(X_train)
Xs_test = ss.transform(X_test)


# fit and score
model_lasso = LassoCV(cv=10)
model_lasso.fit(Xs_train, y_train)
score = model_lasso.score(Xs_test, y_test)
y_pred_lasso = model_lasso.predict(Xs_test)
coefs = model_lasso.coef_
print(score)

0.8605892616213779
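
The split, scale and fit sequence above can also be wrapped in a scikit-learn pipeline, so that the scaler statistics always come from the training rows only. A sketch on synthetic data (the column values, coefficients and noise level are made up, and `cv=5` is an arbitrary choice):

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LassoCV
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(42)
n = 400
toy = pd.DataFrame({
    'GrLivArea': rng.normal(1500, 400, n),
    'GarageCars': rng.integers(0, 4, n),      # 0..3
    'YrSold': rng.integers(2006, 2011, n),    # 2006..2010
})
# synthetic price: linear in the two features plus noise
toy['SalePrice'] = (100 * toy['GrLivArea'] + 15000 * toy['GarageCars']
                    + rng.normal(0, 10000, n))

features = ['GrLivArea', 'GarageCars']
train = toy[toy['YrSold'] < 2010]     # fit on sales before 2010
test = toy[toy['YrSold'] == 2010]     # evaluate on 2010 sales

# the scaler is fitted inside the pipeline, on training rows only
model = make_pipeline(StandardScaler(), LassoCV(cv=5))
model.fit(train[features], train['SalePrice'])
r2 = model.score(test[features], test['SalePrice'])
print(round(r2, 3))
```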

1.12 Plot residuals

residual_plot = pd.DataFrame(list(zip(y_pred_lasso,y_test)), columns=['predict','true'])
sns.lmplot(x= 'true', y='predict', data=residual_plot)

<seaborn.axisgrid.FacetGrid at 0x1172672b0>

residual_plot.head()

	predict	true
0	153968.943858	149000.0
1	151241.501279	154000.0
2	132550.282833	134800.0
3	301865.495610	306000.0
4	175591.208878	165500.0

plt.figure(figsize=(14,5))
sns.residplot(x= 'predict', y='true', data=residual_plot)

<matplotlib.axes._subplots.AxesSubplot at 0x1a1e748898>

Because the residuals fan out as prices increase, a linear model on the raw sale price is not appropriate. We will therefore model the logarithm of the sale price instead.

1.13 Applying log function on the SalePrice

# add the log of the saleprice as a new column and drop the original saleprice
house['SalePrice_lg'] = house['SalePrice'].map(log)
non_fixed.append('SalePrice')
house_new_2 = house.drop(non_fixed, axis=1)

# Evaluate the score of the model on 'SalePrice_lg'
# create X, y
f = 'SalePrice_lg ~ ' + ' + '.join(house_new_2.columns) + ' - 1'

y, X_log = patsy.dmatrices(f, data=house_new_2, return_type='dataframe')

# create train (before 2010) test (2010) split
X_train_log = X_log[X_log['YrSold'] != 2010].drop(['SalePrice_lg','YrSold'],axis=1)
y_train_log = X_log[X_log['YrSold'] != 2010]['SalePrice_lg']

X_test_log = X_log[X_log['YrSold'] == 2010].drop(['SalePrice_lg','YrSold'],axis=1)
y_test_log = X_log[X_log['YrSold'] == 2010]['SalePrice_lg']


# scale
ss = StandardScaler()
ss.fit(X_train_log)
Xs_train_log = ss.transform(X_train_log)
Xs_test_log = ss.transform(X_test_log)


# fit and score
model_lasso_log = LassoCV(cv=10)
model_lasso_log.fit(Xs_train_log, y_train_log)
score_log = model_lasso_log.score(Xs_test_log, y_test_log)
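# note: this call uses model_lasso (fitted on SalePrice in 1.11), so these predictions are in
# dollars while y_test_log is in log dollars; the residual table in 2.1 relies on that
y_pred_lasso_log = model_lasso.predict(Xs_test_log)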
coefs_log = model_lasso_log.coef_
print(score_log)

0.8973966456429032
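
When the target is the log of the price, predictions need to be exponentiated before they can be compared with prices in dollars. A minimal sketch on synthetic data (all names and numbers here are made up, and plain linear regression stands in for the Lasso):

```python
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(1)
n = 300
area = rng.normal(1500, 300, n)
# synthetic prices with multiplicative noise, so log(price) is linear in area
price = 50000 * np.exp(0.0006 * area + rng.normal(0, 0.1, n))

X = area.reshape(-1, 1)
X_train, X_test = X[:200], X[200:]
y_train, y_test = price[:200], price[200:]

log_model = LinearRegression().fit(X_train, np.log(y_train))   # fit on log price

pred_log = log_model.predict(X_test)        # predictions in log dollars
pred_dollars = np.exp(pred_log)             # back-transformed to dollars

log_residual = np.log(y_test) - pred_log    # both sides in log units
dollar_residual = y_test - pred_dollars     # both sides in dollars

print(np.abs(log_residual).mean())
print(np.abs(dollar_residual).mean())
```

Keeping each residual on a single scale is what makes the two error figures interpretable: the first is roughly a relative error, the second an error in dollars.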

residual_plot = pd.DataFrame(list(zip(y_pred_lasso_log, y_test_log)), columns=['predict','true'])
plt.figure(figsize=(14,5))
sns.residplot(x='predict', y='true', data=residual_plot)

<matplotlib.axes._subplots.AxesSubplot at 0x1a1e71c048>

1.14 Identify top features of the housing dataset

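# pair coefficients with the columns the model was actually fitted on
# (X_log still contains YrSold and SalePrice_lg, which were dropped before fitting)
lasso_coefs = pd.DataFrame(list(zip(X_train_log.columns, coefs_log)), columns=['Variables','Coef'])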
lasso_coefs = lasso_coefs.loc[(lasso_coefs['Coef'] != 0)]
lasso_coefs.sort_values('Coef', ascending=True).plot(x='Variables',y='Coef',figsize=(6,18), kind='barh')

<matplotlib.axes._subplots.AxesSubplot at 0x1a1eceec50>

lasso_coefs['Coef_abs'] = lasso_coefs['Coef'].abs()
lasso_coefs['Coef_true'] = lasso_coefs['Coef'].map(lambda x: exp(x**2))
lasso_coefs.sort_values('Coef_abs', ascending=False).head(10)

	Variables	Coef	Coef_abs	Coef_true
95	GrLivArea	0.130983	0.130983	1.017305
91	YearBuilt	0.060790	0.060790	1.003702
105	GarageCars	0.059840	0.059840	1.003587
34	Neighborhood[T.NridgHt]	0.047916	0.047916	1.002299
82	CentralAir[T.Y]	0.032553	0.032553	1.001060
0	MSZoning[C (all)]	-0.032476	0.032476	1.001055
24	Neighborhood[T.Crawfor]	0.031324	0.031324	1.000982
96	BsmtFullBath	0.030189	0.030189	1.000912
39	Neighborhood[T.Somerst]	0.027759	0.027759	1.000771
40	Neighborhood[T.StoneBr]	0.025956	0.025956	1.000674

Top estimators of price

GrLivArea: Above grade (ground) living area square feet
GarageCars: Size of garage in car capacity
YearBuilt: Original construction date
Neighbourhood: NridgHt
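
Because the model is fitted on the log of the price with standardized features, each coefficient is the change in log price for a one-standard-deviation increase in that feature, so exp(coef) - 1 is the corresponding fractional change in price. This differs from the `exp(x**2)` used for `Coef_true` above, which is always greater than 1 and therefore hides the sign. A sketch using four coefficients copied from the table:

```python
from math import exp

# coefficients copied from the table above (log price, standardized features)
coefs = {
    'GrLivArea': 0.130983,
    'YearBuilt': 0.060790,
    'GarageCars': 0.059840,
    'MSZoning[C (all)]': -0.032476,
}

# percentage change in price for a one-standard-deviation increase in the feature
pct_change = {name: (exp(c) - 1) * 100 for name, c in coefs.items()}

for name, pct in pct_change.items():
    print('{}: {:+.1f}%'.format(name, pct))
```

Read this way, a negative coefficient shows up as a price decrease rather than being folded into a value just above 1.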

2. Determine any value of changeable property characteristics unexplained by the fixed ones.

Now that you have a model that estimates the price of a house based on its static characteristics, we can move forward with parts 2 and 3 of the plan: what are the costs/benefits of quality, condition, and renovations?

There are two specific requirements for these estimates:

The estimates of effects must be in terms of dollars added or subtracted from the house value.
The effects must be on the variance in price remaining from the first model.

The residuals from the first model (training and testing) represent the variance in price unexplained by the fixed characteristics. Of that variance in price remaining, how much of it can be explained by the easy-to-change aspects of the property?

Your goals:

Evaluate the effect in dollars of the renovate-able features.
- How would your company use this second model and its coefficients to determine whether they should buy a property or not? Explain how the company can use the two models you have built to determine if they can make money.
- Investigate how much of the variance in price remaining is explained by these features.
- Do you trust your model? Should it be used to evaluate which properties to buy and fix up?
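
One way to meet both requirements is to fit a second model on the dollar residuals of the first, using only the renovate-able features. The sketch below does this on synthetic data; the feature names, effect sizes and the use of plain linear regression are assumptions made for illustration, not results from the Ames data:

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(7)
n = 500
toy = pd.DataFrame({
    'GrLivArea': rng.normal(1500, 400, n),     # fixed feature
    'KitchenQual': rng.integers(1, 6, n),      # renovate-able, coded 1 (poor) to 5 (excellent)
})
# synthetic price in dollars: both features matter, plus noise
toy['SalePrice'] = (100 * toy['GrLivArea'] + 8000 * toy['KitchenQual']
                    + rng.normal(0, 5000, n))

# stage 1: price from fixed features only
stage1 = LinearRegression().fit(toy[['GrLivArea']], toy['SalePrice'])
residual = toy['SalePrice'] - stage1.predict(toy[['GrLivArea']])   # dollars left unexplained

# stage 2: explain the stage-1 residuals with the renovate-able feature
stage2 = LinearRegression().fit(toy[['KitchenQual']], residual)
dollars_per_step = stage2.coef_[0]          # dollars per one-step quality increase
r2_residual = stage2.score(toy[['KitchenQual']], residual)

print(round(dollars_per_step), round(r2_residual, 2))
```

Under this setup, a purchase looks worthwhile when the stage-1 estimate plus the stage-2 uplift from the planned renovations exceeds the purchase price plus the renovation cost. The sketch scores in-sample for brevity; a real evaluation would use the held-out 2010 sales.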

2.1 Evaluate residual in the fixed variable model

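# 'predict' holds dollar-scale predictions and 'true' holds log prices (see 1.13),
# so take the log of one and the exponential of the other to have both scales available
residual_plot['predict_value'] = residual_plot['predict'].map(log)   # prediction in log dollars
residual_plot['true_value'] = residual_plot['true'].map(exp)         # true price in dollars
residual_plot['residual'] = abs(residual_plot['true_value'] - residual_plot['predict'])  # absolute error in dollars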
residual_plot.head()

	predict	true	predict_value	true_value	residual
0	153968.943858	11.911702	11.944506	149000.0	4968.943858
1	151241.501279	11.944708	11.926633	154000.0	2758.498721
2	132550.282833	11.811547	11.794717	134800.0	2249.717167
3	301865.495610	12.631340	12.617737	306000.0	4134.504390
4	175591.208878	12.016726	12.075914	165500.0	10091.208878

fixed_var_residual_mean = residual_plot['residual'].mean()
fixed_var_residual_median = residual_plot['residual'].median()

# after sorting, we see that the greatest difference goes up to about 210k!
residual_plot.sort_values('residual', ascending=False).head(10)

	predict	true	predict_value	true_value	residual
111	401369.988970	13.323927	12.902639	611657.0	210287.011030
9	267974.441612	12.100712	12.498647	180000.0	87974.441612
157	275614.504709	12.154779	12.526758	190000.0	85614.504709
147	296724.420258	12.843971	12.600559	378500.0	81775.579742
155	256726.473750	12.721886	12.455766	335000.0	78273.526250
94	472883.944257	13.195614	13.066605	538000.0	65116.055743
55	333476.387722	12.885202	12.717327	394432.0	60955.612278
127	267333.678918	12.700769	12.496253	328000.0	60666.321082
120	335672.669978	12.887127	12.723892	395192.0	59519.330022
27	277972.802795	12.301383	12.535279	220000.0	57972.802795

residual_plot['residual'].describe()

count       175.000000
mean      19578.169117
std       22733.579501
min          18.966037
25%        5978.414157
50%       12536.703886
75%       25344.490982
max      210287.011030
Name: residual, dtype: float64

2.2 EDA for model with changeable features

house.columns

Index(['MSSubClass', 'MSZoning', 'LotFrontage', 'LotArea', 'Street',
       'LotShape', 'LandContour', 'Utilities', 'LotConfig', 'LandSlope',
       'Neighborhood', 'Condition1', 'Condition2', 'BldgType', 'HouseStyle',
       'OverallQual', 'OverallCond', 'YearBuilt', 'YearRemodAdd', 'RoofStyle',
       'RoofMatl', 'Exterior1st', 'Exterior2nd', 'MasVnrType', 'MasVnrArea',
       'ExterQual', 'ExterCond', 'Foundation', 'BsmtQual', 'BsmtCond',
       'BsmtExposure', 'BsmtFinType1', 'BsmtFinSF1', 'BsmtFinType2',
       'BsmtFinSF2', 'BsmtUnfSF', 'TotalBsmtSF', 'Heating', 'HeatingQC',
       'CentralAir', 'Electrical', 'one_stFlrSF', 'two_ndFlrSF',
       'LowQualFinSF', 'GrLivArea', 'BsmtFullBath', 'BsmtHalfBath', 'FullBath',
       'HalfBath', 'BedroomAbvGr', 'KitchenAbvGr', 'KitchenQual',
       'TotRmsAbvGrd', 'Functional', 'Fireplaces', 'GarageType', 'GarageYrBlt',
       'GarageFinish', 'GarageCars', 'GarageArea', 'GarageQual', 'GarageCond',
       'PavedDrive', 'WoodDeckSF', 'OpenPorchSF', 'EnclosedPorch',
       'three_SsnPorch', 'ScreenPorch', 'PoolArea', 'MiscVal', 'MoSold',
       'YrSold', 'SaleType', 'SaleCondition', 'SalePrice', 'big_house',
       'SalePrice_lg'],
      dtype='object')

# clustermap creates its own figure, so the size is passed to it directly
sns.clustermap(house.corr(), cmap='seismic', center=0, figsize=(12, 12))

<seaborn.matrix.ClusterGrid at 0x111835e48>

abs(house.corr()['SalePrice']).sort_values(ascending=False).head(10)

SalePrice       1.000000
SalePrice_lg    0.948374
OverallQual     0.790982
GrLivArea       0.708624
GarageCars      0.640409
GarageArea      0.623431
TotalBsmtSF     0.613581
one_stFlrSF     0.605852
FullBath        0.560664
TotRmsAbvGrd    0.533723
Name: SalePrice, dtype: float64

2.3 Evaluate residual

# house already holds the log of the sale price (added in 1.13), so all of its
# columns can be used directly; the log is recomputed so this cell runs on its own
house['SalePrice_lg'] = house['SalePrice'].map(log)
all_var = house
all_var_new = house.columns

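# Evaluate the score of the 'SalePrice_lg' model with ALL the features
# note: SalePrice is among these columns and is not dropped below, so the model
# sees the price itself and the score is inflated
# create X, y
f = 'SalePrice_lg ~ ' + ' + '.join(all_var_new) + ' - 1'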

y_all, X_log_all = patsy.dmatrices(f, data=all_var, return_type='dataframe')

# create train (before 2010) test (2010) split
X_train_log_all = X_log_all[X_log_all['YrSold'] != 2010].drop(['SalePrice_lg','YrSold'],axis=1)
y_train_log_all = X_log_all[X_log_all['YrSold'] != 2010]['SalePrice_lg']

X_test_log_all = X_log_all[X_log_all['YrSold'] == 2010].drop(['SalePrice_lg','YrSold'],axis=1)
y_test_log_all = X_log_all[X_log_all['YrSold'] == 2010]['SalePrice_lg']


# scale
ss = StandardScaler()
ss.fit(X_train_log_all)
Xs_train_log_all = ss.transform(X_train_log_all)
Xs_test_log_all = ss.transform(X_test_log_all)


# fit and score
model_lasso_log_all = LassoCV(cv=10)
model_lasso_log_all.fit(Xs_train_log_all, y_train_log_all)
score_log_all = model_lasso_log_all.score(Xs_test_log_all, y_test_log_all)
y_pred_lasso_log_all = model_lasso_log_all.predict(Xs_test_log_all)
coefs_log_all = model_lasso_log_all.coef_
print(score_log_all)

0.9610537276873103

residual_plot_all = pd.DataFrame(list(zip(y_pred_lasso_log_all, y_test_log_all)), columns=['predict','true'])
plt.figure(figsize=(14,5))
sns.residplot(x='predict', y='true', data=residual_plot_all)

<matplotlib.axes._subplots.AxesSubplot at 0x1117e2d30>

# absolute residual of the all-feature model, in log units
residual_plot_all['residual'] = abs(residual_plot_all['true'] - residual_plot_all['predict'])

all_var_residual_mean = residual_plot_all['residual'].mean()
all_var_residual_median = residual_plot_all['residual'].median()

residual_plot_all['residual'].describe()

count    175.000000
mean       0.054052
std        0.059223
min        0.000303
25%        0.016689
50%        0.036289
75%        0.064983
max        0.331339
Name: residual, dtype: float64

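# note: the fixed-feature residuals are in dollars while the all-feature residuals are in
# log units, so these differences are dominated by the dollar figures
print('mean diff:')  
print(fixed_var_residual_mean -  all_var_residual_mean)
print('median diff:')
print(fixed_var_residual_median - all_var_residual_median)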

mean diff:
19578.115064632624
median diff:
12536.667597680169

3. What property characteristics predict an “abnormal” sale?

The SaleCondition feature indicates the circumstances of the house sale. From the data file, we can see that the possibilities are:

   Normal	Normal Sale
   Abnorml	Abnormal Sale -  trade, foreclosure, short sale
   AdjLand	Adjoining Land Purchase
   Alloca	Allocation - two linked properties with separate deeds, typically condo with a garage unit	
   Family	Sale between family members
   Partial	Home was not completed when last assessed (associated with New Homes)

One of the executives at your company has an “in” with higher-ups at the major regional bank. His friends at the bank have made him a proposal: if he can reliably indicate what features, if any, predict “abnormal” sales (foreclosures, short sales, etc.), then in return the bank will give him first dibs on the pre-auction purchase of those properties (at a dirt-cheap price).

He has tasked you with determining (and adequately validating) which features of a property predict this type of sale.

Your task:

Determine which features predict the Abnorml category in the SaleCondition feature.
- Justify your results.

This is a challenging task that tests your ability to perform classification analysis in the face of severe class imbalance. You may find that simply running a classifier on the full dataset to predict the category ends up useless: when class imbalance is severe, classifiers often tend to simply guess the majority class.

It is up to you to determine how you will tackle this problem. I recommend doing some research to find out how others have dealt with the problem in the past. Make sure to justify your solution. Don’t worry about it being “the best” solution, but be rigorous.

Be sure to indicate which features are predictive (if any) and whether they are positive or negative predictors of abnormal sales.
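
One simple way to handle the imbalance without resampling is to reweight the classes inside the classifier. The sketch below uses scikit-learn's `class_weight='balanced'` option on synthetic data; the data-generating process and all variable names are assumptions made for illustration:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import recall_score

rng = np.random.default_rng(3)
n = 2000
X = rng.normal(size=(n, 2))
# rare positive class, more likely when the first feature is high
logit = -3.5 + 1.5 * X[:, 0]
y = (rng.random(n) < 1 / (1 + np.exp(-logit))).astype(int)

X_train, X_test = X[:1500], X[1500:]
y_train, y_test = y[:1500], y[1500:]

plain = LogisticRegression().fit(X_train, y_train)
weighted = LogisticRegression(class_weight='balanced').fit(X_train, y_train)

# recall on the rare class: the share of true positives each model finds
recall_plain = recall_score(y_test, plain.predict(X_test))
recall_weighted = recall_score(y_test, weighted.predict(X_test))
print(y.mean(), recall_plain, recall_weighted)

# the sign of each coefficient shows whether a feature raises or lowers the odds
print(weighted.coef_[0])
```

The unweighted model tends to predict the majority class almost everywhere, which is why accuracy alone is a poor guide here and recall, precision or F1 on the rare class is more informative.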

3.1 Train-Test-Split

f = 'SaleCondition + YrSold ~ '+' + '.join([c for c in house.columns if c != 'SaleCondition'])+' - 1'
f

'SaleCondition + YrSold ~ MSSubClass + MSZoning + LotFrontage + LotArea + Street + LotShape + LandContour + Utilities + LotConfig + LandSlope + Neighborhood + Condition1 + Condition2 + BldgType + HouseStyle + OverallQual + OverallCond + YearBuilt + YearRemodAdd + RoofStyle + RoofMatl + Exterior1st + Exterior2nd + MasVnrType + MasVnrArea + ExterQual + ExterCond + Foundation + BsmtQual + BsmtCond + BsmtExposure + BsmtFinType1 + BsmtFinSF1 + BsmtFinType2 + BsmtFinSF2 + BsmtUnfSF + TotalBsmtSF + Heating + HeatingQC + CentralAir + Electrical + one_stFlrSF + two_ndFlrSF + LowQualFinSF + GrLivArea + BsmtFullBath + BsmtHalfBath + FullBath + HalfBath + BedroomAbvGr + KitchenAbvGr + KitchenQual + TotRmsAbvGrd + Functional + Fireplaces + GarageType + GarageYrBlt + GarageFinish + GarageCars + GarageArea + GarageQual + GarageCond + PavedDrive + WoodDeckSF + OpenPorchSF + EnclosedPorch + three_SsnPorch + ScreenPorch + PoolArea + MiscVal + MoSold + YrSold + SaleType + SalePrice + big_house + SalePrice_lg - 1'

y, X = patsy.dmatrices(f, data=house, return_type='dataframe')

# create train test split
X_train = X[X['YrSold'] != 2010].drop(['YrSold'],axis=1)
y_train = y[y['YrSold'] != 2010]['SaleCondition[Abnorml]']

X_test = X[X['YrSold'] == 2010].drop(['YrSold'],axis=1)
y_test = y[y['YrSold'] == 2010]['SaleCondition[Abnorml]']

# scale
ss = StandardScaler()
ss.fit(X_train)
Xs_train = ss.transform(X_train)
Xs_test = ss.transform(X_test)

3.2 Revise datasets by doing undersampling/oversampling/SMOTE/SMOTEENN

# undersampling: replace majority-class samples with cluster centroids

from imblearn.under_sampling import ClusterCentroids
cc = ClusterCentroids(random_state=0)
X_resampled_under, y_resampled_under = cc.fit_resample(Xs_train, y_train)

# oversampling: duplicate minority-class samples at random

from imblearn.over_sampling import RandomOverSampler
ros = RandomOverSampler(random_state=0)
X_resampled_over, y_resampled_over = ros.fit_resample(Xs_train, y_train)

# SMOTE: synthesise new minority-class samples by interpolating between neighbours

from imblearn.over_sampling import SMOTE
X_resampled_smote, y_resampled_smote = SMOTE().fit_resample(Xs_train, y_train)

# SMOTEENN: SMOTE followed by Edited Nearest Neighbours cleaning

from imblearn.combine import SMOTEENN
smote_enn = SMOTEENN(random_state=0)
X_resampled_smoteenn, y_resampled_smoteenn = smote_enn.fit_resample(Xs_train, y_train)
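
To make the idea of random oversampling concrete, here is a minimal pure-Python sketch that duplicates minority-class rows, drawn with replacement, until every class matches the size of the largest one. It is a toy stand-in and not imbalanced-learn's implementation; the function name, rows and labels are hypothetical.

```python
import random
from collections import Counter

def random_oversample(rows, labels, seed=0):
    """Duplicate rows of the smaller classes until all classes are the same size."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())  # size of the largest class
    out_rows, out_labels = list(rows), list(labels)
    for cls, n in counts.items():
        pool = [r for r, l in zip(rows, labels) if l == cls]
        for _ in range(target - n):
            out_rows.append(rng.choice(pool))  # sample with replacement
            out_labels.append(cls)
    return out_rows, out_labels

# hypothetical data: five normal sales (0) and one abnormal sale (1)
rows = [[0.1], [0.2], [0.3], [0.4], [0.5], [5.0]]
labels = [0, 0, 0, 0, 0, 1]

res_rows, res_labels = random_oversample(rows, labels)
balanced_counts = Counter(res_labels)
print(balanced_counts)
```

Every added row is an exact copy of an existing minority row, so any later cross-validation on the resampled data can place copies of the same house in both the training and validation folds.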

3.3 Logistic Regression with cross validation

from sklearn import metrics
from sklearn.model_selection import cross_val_score, cross_val_predict
from sklearn.metrics import f1_score, make_scorer

# empty list to hold all model scores
overall_scores = []

# make F1 scorer
scorer = make_scorer(f1_score)

from sklearn.linear_model import LogisticRegressionCV

def model_lg(technique, X_resampled, y_resampled, Xs_test, y_test):

    # define model and fit; C is tuned over 50 candidate values with 5-fold CV on F1
    lg = LogisticRegressionCV(Cs=50, cv=5, scoring=scorer, penalty='l2')
    lg.fit(X_resampled, y_resampled)
    
    # mean cross-validated F1 score, averaged over all folds and all candidate C values
    avg_score = np.mean(list(lg.scores_.values()))
    
    # predict on the held-out test set for the classification report and confusion matrix
    y_pred_lg = lg.predict(Xs_test)

    # rank the non-zero coefficients by absolute size and keep the top 3
    coefs = pd.DataFrame(list(zip(X_train.columns, lg.coef_[0])), columns=['Variable','Coef'])
    coefs = coefs.loc[(coefs['Coef'] != 0)].copy()
    coefs['Coef_abs'] = coefs['Coef'].abs()
    top_3_coefs = coefs.sort_values('Coef_abs', ascending=False).head(3)
    
    # print scores    
    print(technique) 
    print("-------------------------------------------------------------")

    print(metrics.classification_report(y_test, y_pred_lg))

    print(metrics.confusion_matrix(y_test, y_pred_lg))

    print("Average f-1 score from CV:" )
    print(avg_score)

    print("top 3 coefficients: " )
    print(top_3_coefs)

    # append to overall scores
    overall_scores.append((avg_score, "Logistic Regression - ", technique))
    print('-----------------------------------------------------------------')

model_lg('undersampling', X_resampled_under, y_resampled_under, Xs_test, y_test)
model_lg('oversampling', X_resampled_over, y_resampled_over, Xs_test, y_test)
model_lg('SMOTE', X_resampled_smote, y_resampled_smote, Xs_test, y_test)
model_lg('SMOTEENN', X_resampled_smoteenn, y_resampled_smoteenn, Xs_test, y_test)

undersampling
-------------------------------------------------------------
             precision    recall  f1-score   support

        0.0       1.00      0.40      0.57       164
        1.0       0.10      1.00      0.18        11

avg / total       0.94      0.44      0.55       175

[[66 98]
 [ 0 11]]
Average f-1 score from CV:
0.5118860339070415
top 3 coefficients: 
            Variable      Coef  Coef_abs
190  SaleType[T.Oth]  0.002958  0.002958
144  Heating[T.GasA]  0.002770  0.002770
194          LotArea -0.002634  0.002634
-----------------------------------------------------------------
oversampling
-------------------------------------------------------------
             precision    recall  f1-score   support

        0.0       0.95      0.86      0.90       164
        1.0       0.12      0.27      0.16        11

avg / total       0.89      0.82      0.85       175

[[141  23]
 [  8   3]]
Average f-1 score from CV:
0.8793876680774424
top 3 coefficients: 
               Variable       Coef   Coef_abs
192          MSSubClass  15.321701  15.321701
9    LandContour[T.HLS] -14.791268  14.791268
58   BldgType[T.2fmCon] -14.507107  14.507107
-----------------------------------------------------------------
SMOTE
-------------------------------------------------------------
             precision    recall  f1-score   support

        0.0       0.95      0.87      0.91       164
        1.0       0.12      0.27      0.17        11

avg / total       0.90      0.83      0.86       175

[[143  21]
 [  8   3]]
Average f-1 score from CV:
0.8769788528485988
top 3 coefficients: 
                Variable       Coef   Coef_abs
65  HouseStyle[T.2.5Unf] -17.920607  17.920607
58    BldgType[T.2fmCon] -16.473324  16.473324
9     LandContour[T.HLS] -16.438882  16.438882
-----------------------------------------------------------------
SMOTEENN
-------------------------------------------------------------
             precision    recall  f1-score   support

        0.0       0.95      0.74      0.83       164
        1.0       0.10      0.45      0.17        11

avg / total       0.90      0.72      0.79       175

[[121  43]
 [  6   5]]
Average f-1 score from CV:
0.9382978528695868
top 3 coefficients: 
            Variable      Coef  Coef_abs
229     SalePrice_lg -3.697549  3.697549
227        SalePrice -3.626274  3.626274
189  SaleType[T.New] -3.323360  3.323360
-----------------------------------------------------------------
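
The per-class figures in the reports above can be recovered directly from the confusion matrices. The sketch below assumes scikit-learn's layout (rows are true classes, columns are predicted classes) and recomputes precision, recall and F1 for the abnormal class (1.0) from the undersampling and SMOTEENN matrices printed above; the helper name is made up for illustration.

```python
def class_metrics(confusion, positive=1):
    """Precision, recall and F1 for one class of a square confusion matrix.

    confusion[i][j] is the number of samples of true class i predicted as class j.
    """
    n = len(confusion)
    tp = confusion[positive][positive]
    fp = sum(confusion[i][positive] for i in range(n) if i != positive)
    fn = sum(confusion[positive][j] for j in range(n) if j != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1

# confusion matrices copied from the logistic regression output above
undersampling_cm = [[66, 98], [0, 11]]
smoteenn_cm = [[121, 43], [6, 5]]

under_p, under_r, under_f1 = class_metrics(undersampling_cm)
enn_p, enn_r, enn_f1 = class_metrics(smoteenn_cm)

print(round(under_p, 2), round(under_r, 2), round(under_f1, 2))
print(round(enn_p, 2), round(enn_r, 2), round(enn_f1, 2))
```

With undersampling, all 11 abnormal sales are caught (recall 1.00), but 98 normal sales are flagged as well, which is why precision for the abnormal class is only about 0.10.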

3.4 Try decision tree classifier

from sklearn.tree import DecisionTreeClassifier

def model_tree(technique, X_resampled, y_resampled, Xs_test, y_test):
    # cross-validate an unpruned decision tree on the resampled training data
    tree = DecisionTreeClassifier()
    scores = cross_val_score(tree, X_resampled, y_resampled, cv=5, scoring=scorer)
    
    # mean cross-validated F1 score
    avg_score = scores.mean()

    # refit on the full resampled training data and predict on the test set
    tree.fit(X_resampled, y_resampled)
    y_pred_tree = tree.predict(Xs_test)
    
    # print scores
    print(technique)
    print('-------------------------------------------------------------')

    print(metrics.classification_report(y_test, y_pred_tree))

    print(metrics.confusion_matrix(y_test, y_pred_tree))

    print('average f-1 score from CV: ')
    print(avg_score)

    
    # append to overall scores
    overall_scores.append((avg_score, 'Decision Tree - ',technique))
    
model_tree('undersampling', X_resampled_under, y_resampled_under, Xs_test, y_test)
model_tree('oversampling', X_resampled_over, y_resampled_over, Xs_test, y_test)
model_tree('SMOTE', X_resampled_smote, y_resampled_smote, Xs_test, y_test)
model_tree('SMOTEENN', X_resampled_smoteenn, y_resampled_smoteenn, Xs_test, y_test)

undersampling
-------------------------------------------------------------
             precision    recall  f1-score   support

        0.0       1.00      0.21      0.34       164
        1.0       0.08      1.00      0.14        11

avg / total       0.94      0.26      0.33       175

[[ 34 130]
 [  0  11]]
average f-1 score from CV: 
0.7654700854700854
oversampling
-------------------------------------------------------------
             precision    recall  f1-score   support

        0.0       0.95      0.95      0.95       164
        1.0       0.20      0.18      0.19        11

avg / total       0.90      0.90      0.90       175

[[156   8]
 [  9   2]]
average f-1 score from CV: 
0.9583570927579039
SMOTE
-------------------------------------------------------------
             precision    recall  f1-score   support

        0.0       0.94      0.92      0.93       164
        1.0       0.07      0.09      0.08        11

avg / total       0.88      0.87      0.88       175

[[151  13]
 [ 10   1]]
average f-1 score from CV: 
0.9281815909722964
SMOTEENN
-------------------------------------------------------------
             precision    recall  f1-score   support

        0.0       0.95      0.80      0.87       164
        1.0       0.11      0.36      0.17        11

avg / total       0.90      0.78      0.83       175

[[132  32]
 [  7   4]]
average f-1 score from CV: 
0.9338618029830025

3.5 Try KNN

from sklearn.neighbors import KNeighborsClassifier


def model_knn(technique, X_resampled, y_resampled, Xs_test, y_test):
    
    cv_scores = []

    # candidate values of k: odd numbers from 1 to 13
    neighbors = list(range(1,15,2))

    for k in neighbors:
        knn = KNeighborsClassifier(n_neighbors=k)
        scores = cross_val_score(knn, X_resampled, y_resampled, cv=5, scoring=scorer)
        cv_scores.append((scores.mean(), k))

    # best k and its mean cross-validated F1 score (ties go to the larger k)
    best_k = max(cv_scores)[1]
    avg_score = max(cv_scores)[0]

    # run KNN on best_k for predictions to make classification report and confusion matrix
    knn = KNeighborsClassifier(n_neighbors=best_k)
    knn.fit(X_resampled, y_resampled)
    y_pred_knn = knn.predict(Xs_test)
    
    # print scores
    print(technique)
    print('n_neighbours =') 
    print(best_k)
    print('-------------------------------------------------------------')
    print(metrics.classification_report(y_test, y_pred_knn))

    print(metrics.confusion_matrix(y_test, y_pred_knn))

    print('average f-1 score from CV:')
    print(avg_score)


    
    # append to overall scores
    overall_scores.append((avg_score, "KNN -", technique))
    return avg_score
    
model_knn('undersampling', X_resampled_under, y_resampled_under, Xs_test, y_test)
model_knn('oversampling', X_resampled_over, y_resampled_over, Xs_test, y_test)
model_knn('SMOTE', X_resampled_smote, y_resampled_smote, Xs_test, y_test)
model_knn('SMOTEENN', X_resampled_smoteenn, y_resampled_smoteenn, Xs_test, y_test)

/Users/hayatibintehamzah/anaconda2/envs/py36/lib/python3.6/site-packages/sklearn/metrics/classification.py:1135: UndefinedMetricWarning: F-score is ill-defined and being set to 0.0 due to no predicted samples.
  'precision', 'predicted', average, warn_for)


undersampling
n_neighbours =
1
-------------------------------------------------------------
             precision    recall  f1-score   support

        0.0       0.95      0.88      0.92       164
        1.0       0.17      0.36      0.24        11

avg / total       0.90      0.85      0.87       175

[[145  19]
 [  7   4]]
average f-1 score from CV:
0.2628985507246377
oversampling
n_neighbours =
1
-------------------------------------------------------------
             precision    recall  f1-score   support

        0.0       0.95      0.92      0.93       164
        1.0       0.19      0.27      0.22        11

avg / total       0.90      0.88      0.89       175

[[151  13]
 [  8   3]]
average f-1 score from CV:
0.9541175022628945
SMOTE
n_neighbours =
1
-------------------------------------------------------------
             precision    recall  f1-score   support

        0.0       0.94      0.79      0.86       164
        1.0       0.08      0.27      0.12        11

avg / total       0.89      0.76      0.81       175

[[130  34]
 [  8   3]]
average f-1 score from CV:
0.9033518216530035
SMOTEENN
n_neighbours =
1
-------------------------------------------------------------
             precision    recall  f1-score   support

        0.0       0.96      0.57      0.71       164
        1.0       0.09      0.64      0.16        11

avg / total       0.90      0.57      0.68       175

[[93 71]
 [ 4  7]]
average f-1 score from CV:
0.9807279567701113





0.9807279567701113

sorted(overall_scores, reverse=True)

[(0.9807279567701113, 'KNN -', 'SMOTEENN'),
 (0.9583570927579039, 'Decision Tree - ', 'oversampling'),
 (0.9541175022628945, 'KNN -', 'oversampling'),
 (0.9382978528695868, 'Logistic Regression - ', 'SMOTEENN'),
 (0.9338618029830025, 'Decision Tree - ', 'SMOTEENN'),
 (0.9281815909722964, 'Decision Tree - ', 'SMOTE'),
 (0.9033518216530035, 'KNN -', 'SMOTE'),
 (0.8793876680774424, 'Logistic Regression - ', 'oversampling'),
 (0.8769788528485988, 'Logistic Regression - ', 'SMOTE'),
 (0.7654700854700854, 'Decision Tree - ', 'undersampling'),
 (0.5118860339070415, 'Logistic Regression - ', 'undersampling'),
 (0.2628985507246377, 'KNN -', 'undersampling')]

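KNN with SMOTEENN gives the highest cross-validated F1 score (0.98), followed by Decision Tree with oversampling (0.96) and KNN with oversampling (0.95). These scores are computed on the resampled training data. On the 2010 test set the F1 score for the abnormal class is far lower (0.16 for KNN with SMOTEENN), so the ranking reflects the fit to the resampled data rather than test performance.

Logistic Regression with SMOTEENN gives SalePrice_lg, SalePrice and SaleType[T.New] as its top three predictors. All three coefficients are negative, so higher sale prices and new-build sale types are associated with a lower likelihood of an abnormal sale. To test the importance of each predictor, we rerun KNN with SMOTEENN with one column dropped at a time and measure the impact on the cross-validated F1 score.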
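
The drop-one-column procedure can be sketched independently of any particular model. In the minimal example below, score_fn stands in for "rescale, resample, refit and return the cross-validated F1 score"; the function names, feature names and weights are all hypothetical and chosen only so the arithmetic is easy to follow.

```python
def drop_column_impacts(columns, score_fn):
    """Score the full column set, then rescore with each column removed in turn.

    Returns (impact, column) pairs sorted from largest to smallest impact,
    where impact = baseline score - score without that column.
    """
    baseline = score_fn(columns)
    impacts = []
    for col in columns:
        remaining = [c for c in columns if c != col]
        impacts.append((baseline - score_fn(remaining), col))
    return sorted(impacts, reverse=True)

# hypothetical scorer: each feature adds a fixed amount to a base score of 0.5
toy_weights = {'feature_a': 0.30, 'feature_b': 0.20, 'feature_c': 0.00}

def toy_score(cols):
    return 0.5 + sum(toy_weights[c] for c in cols)

ranking = drop_column_impacts(list(toy_weights), toy_score)
for impact, col in ranking:
    print(col, round(impact, 2))
```

A positive impact means the score fell when the column was removed, so the column was helping; an impact near zero or below suggests the model does not rely on it.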

# drop columns

impact_on_score = []

original_score = model_knn('SMOTEENN', X_resampled_smoteenn, y_resampled_smoteenn, Xs_test, y_test)    

SMOTEENN
n_neighbours =
1
-------------------------------------------------------------
             precision    recall  f1-score   support

        0.0       0.96      0.57      0.71       164
        1.0       0.09      0.64      0.16        11

avg / total       0.90      0.57      0.68       175

[[93 71]
 [ 4  7]]
average f-1 score from CV:
0.9807279567701113

def drop_col(col):

    X_train_drop_col = X_train.drop(col, axis=1)
    X_test_drop_col = X_test.drop(col, axis=1)

    # rescale using statistics from the reduced training set only
    ss = StandardScaler()
    ss.fit(X_train_drop_col)
    Xs_train_drop_col = ss.transform(X_train_drop_col)
    Xs_test_drop_col = ss.transform(X_test_drop_col)

    # resample the reduced training set with SMOTEENN
    X_resampled_smoteenn_drop_col, y_resampled_smoteenn_drop_col = smote_enn.fit_resample(
        Xs_train_drop_col, y_train)

    # run model
    print(col , 'dropped')
    
    print('------------')
    
    new_score = model_knn('SMOTEENN', X_resampled_smoteenn_drop_col, 
                          y_resampled_smoteenn_drop_col, Xs_test_drop_col, y_test)
    
    impact = original_score - new_score
    impact_on_score.append((impact, col))

for col in list(X_train.columns):
    drop_col(col)

MSZoning[C (all)] dropped
------------
SMOTEENN
n_neighbours =
1
-------------------------------------------------------------
             precision    recall  f1-score   support

        0.0       0.94      0.57      0.71       164
        1.0       0.07      0.45      0.12        11

avg / total       0.89      0.57      0.67       175

[[94 70]
 [ 6  5]]
average f-1 score from CV:
0.9811269384486723
MSZoning[FV] dropped
------------
SMOTEENN
n_neighbours =
1
-------------------------------------------------------------
             precision    recall  f1-score   support

        0.0       0.96      0.57      0.71       164
        1.0       0.09      0.64      0.16        11

avg / total       0.90      0.57      0.68       175

[[93 71]
 [ 4  7]]
average f-1 score from CV:
0.9807279567701113
MSZoning[RH] dropped
------------
SMOTEENN
n_neighbours =
1
-------------------------------------------------------------
             precision    recall  f1-score   support

        0.0       0.95      0.57      0.71       164
        1.0       0.08      0.55      0.14        11

avg / total       0.89      0.57      0.68       175

[[94 70]
 [ 5  6]]
average f-1 score from CV:
0.9811335195065155
MSZoning[RL] dropped
------------
SMOTEENN
n_neighbours =
1
-------------------------------------------------------------
             precision    recall  f1-score   support

        0.0       0.96      0.57      0.72       164
        1.0       0.09      0.64      0.16        11

avg / total       0.90      0.58      0.68       175

[[94 70]
 [ 4  7]]
average f-1 score from CV:
0.9807361630079079
MSZoning[RM] dropped
------------
SMOTEENN
n_neighbours =
1
-------------------------------------------------------------
             precision    recall  f1-score   support

        0.0       0.96      0.57      0.72       164
        1.0       0.09      0.64      0.16        11

avg / total       0.90      0.58      0.68       175

[[94 70]
 [ 4  7]]
average f-1 score from CV:
0.9811401209597689
Street[T.Pave] dropped
------------
SMOTEENN
n_neighbours =
1
-------------------------------------------------------------
             precision    recall  f1-score   support

        0.0       0.96      0.57      0.71       164
        1.0       0.09      0.64      0.16        11

avg / total       0.90      0.57      0.68       175

[[93 71]
 [ 4  7]]
average f-1 score from CV:
0.9807279567701113
LotShape[T.IR2] dropped
------------
SMOTEENN
n_neighbours =
1
-------------------------------------------------------------
             precision    recall  f1-score   support

        0.0       0.96      0.57      0.71       164
        1.0       0.09      0.64      0.16        11

avg / total       0.90      0.57      0.68       175

[[93 71]
 [ 4  7]]
average f-1 score from CV:
0.9815341363558903
LotShape[T.IR3] dropped
------------
SMOTEENN
n_neighbours =
1
-------------------------------------------------------------
             precision    recall  f1-score   support

        0.0       0.96      0.57      0.71       164
        1.0       0.09      0.64      0.16        11

avg / total       0.90      0.57      0.68       175

[[93 71]
 [ 4  7]]
average f-1 score from CV:
0.9807296020138458
LotShape[T.Reg] dropped
------------
SMOTEENN
n_neighbours =
1
-------------------------------------------------------------
             precision    recall  f1-score   support

        0.0       0.95      0.58      0.72       164
        1.0       0.08      0.55      0.14        11

avg / total       0.90      0.58      0.68       175

[[95 69]
 [ 5  6]]
average f-1 score from CV:
0.9803834187868767
LandContour[T.HLS] dropped
------------
SMOTEENN
n_neighbours =
1
-------------------------------------------------------------
             precision    recall  f1-score   support

        0.0       0.96      0.58      0.72       164
        1.0       0.09      0.64      0.16        11

avg / total       0.91      0.58      0.69       175

[[95 69]
 [ 4  7]]
average f-1 score from CV:
0.9811402921824467
LandContour[T.Low] dropped
------------
SMOTEENN
n_neighbours =
1
-------------------------------------------------------------
             precision    recall  f1-score   support

        0.0       0.96      0.56      0.71       164
        1.0       0.09      0.64      0.16        11

avg / total       0.90      0.57      0.67       175

[[92 72]
 [ 4  7]]
average f-1 score from CV:
0.9807279567701113
LandContour[T.Lvl] dropped
------------
SMOTEENN
n_neighbours =
1
-------------------------------------------------------------
             precision    recall  f1-score   support

        0.0       0.96      0.57      0.71       164
        1.0       0.09      0.64      0.16        11

avg / total       0.90      0.57      0.68       175

[[93 71]
 [ 4  7]]
average f-1 score from CV:
0.9803322455152383
Utilities[T.NoSeWa] dropped
------------
SMOTEENN
n_neighbours =
1
-------------------------------------------------------------
             precision    recall  f1-score   support

        0.0       0.96      0.57      0.71       164
        1.0       0.09      0.64      0.16        11

avg / total       0.90      0.57      0.68       175

[[93 71]
 [ 4  7]]
average f-1 score from CV:
0.9807279567701113
LotConfig[T.CulDSac] dropped
------------
SMOTEENN
n_neighbours =
1
-------------------------------------------------------------
             precision    recall  f1-score   support

        0.0       0.96      0.57      0.72       164
        1.0       0.09      0.64      0.16        11

avg / total       0.90      0.58      0.68       175

[[94 70]
 [ 4  7]]
average f-1 score from CV:
0.9819615169333844
LotConfig[T.FR2] dropped
------------
SMOTEENN
n_neighbours =
1
-------------------------------------------------------------
             precision    recall  f1-score   support

        0.0       0.96      0.57      0.71       164
        1.0       0.09      0.64      0.16        11

avg / total       0.90      0.57      0.68       175

[[93 71]
 [ 4  7]]
average f-1 score from CV:
0.9807296020138458
LotConfig[T.FR3] dropped
------------
SMOTEENN
n_neighbours =
1
-------------------------------------------------------------
             precision    recall  f1-score   support

        0.0       0.96      0.57      0.71       164
        1.0       0.09      0.64      0.16        11

avg / total       0.90      0.57      0.68       175

[[93 71]
 [ 4  7]]
average f-1 score from CV:
0.9811302188632206
LotConfig[T.Inside] dropped
------------
SMOTEENN
n_neighbours =
1
-------------------------------------------------------------
             precision    recall  f1-score   support

        0.0       0.95      0.59      0.73       164
        1.0       0.08      0.55      0.14        11

avg / total       0.90      0.59      0.69       175

[[97 67]
 [ 5  6]]
average f-1 score from CV:
0.9831580637631545
LandSlope[T.Mod] dropped
------------
SMOTEENN
n_neighbours =
1
-------------------------------------------------------------
             precision    recall  f1-score   support

        0.0       0.96      0.56      0.71       164
        1.0       0.09      0.64      0.16        11

avg / total       0.90      0.57      0.67       175

[[92 72]
 [ 4  7]]
average f-1 score from CV:
0.9827708551336316
LandSlope[T.Sev] dropped
------------
SMOTEENN
n_neighbours =
1
-------------------------------------------------------------
             precision    recall  f1-score   support

        0.0       0.96      0.56      0.71       164
        1.0       0.09      0.64      0.16        11

avg / total       0.90      0.57      0.67       175

[[92 72]
 [ 4  7]]
average f-1 score from CV:
0.9807195943862063
Neighborhood[T.Blueste] dropped
------------
SMOTEENN
n_neighbours =
1
-------------------------------------------------------------
             precision    recall  f1-score   support

        0.0       0.96      0.57      0.71       164
        1.0       0.09      0.64      0.16        11

avg / total       0.90      0.57      0.68       175

[[93 71]
 [ 4  7]]
average f-1 score from CV:
0.9807279567701113
Neighborhood[T.BrDale] dropped
------------
SMOTEENN
n_neighbours =
1
-------------------------------------------------------------
             precision    recall  f1-score   support

        0.0       0.96      0.57      0.71       164
        1.0       0.09      0.64      0.16        11

avg / total       0.90      0.57      0.68       175

[[93 71]
 [ 4  7]]
average f-1 score from CV:
0.9799299834221289
Neighborhood[T.BrkSide] dropped
------------
SMOTEENN
n_neighbours =
1
-------------------------------------------------------------
             precision    recall  f1-score   support

        0.0       0.96      0.57      0.72       164
        1.0       0.09      0.64      0.16        11

avg / total       0.90      0.58      0.68       175

[[94 70]
 [ 4  7]]
average f-1 score from CV:
0.9787081843046497
Neighborhood[T.ClearCr] dropped
------------
SMOTEENN
n_neighbours =
1
-------------------------------------------------------------
             precision    recall  f1-score   support

        0.0       0.96      0.57      0.71       164
        1.0       0.09      0.64      0.16        11

avg / total       0.90      0.57      0.68       175

[[93 71]
 [ 4  7]]
average f-1 score from CV:
0.9807163139716577
Neighborhood[T.CollgCr] dropped
------------
SMOTEENN
n_neighbours =
1
-------------------------------------------------------------
             precision    recall  f1-score   support

        0.0       0.95      0.57      0.71       164
        1.0       0.08      0.55      0.14        11

avg / total       0.89      0.57      0.68       175

[[94 70]
 [ 5  6]]
average f-1 score from CV:
0.9815490052348915
Neighborhood[T.Crawfor] dropped
------------
SMOTEENN
n_neighbours =
1
-------------------------------------------------------------
             precision    recall  f1-score   support

        0.0       0.96      0.57      0.72       164
        1.0       0.09      0.64      0.16        11

avg / total       0.90      0.58      0.68       175

[[94 70]
 [ 4  7]]
average f-1 score from CV:
0.9791352605152799
Neighborhood[T.Edwards] dropped
------------
SMOTEENN
n_neighbours =
1
-------------------------------------------------------------
             precision    recall  f1-score   support

        0.0       0.96      0.57      0.71       164
        1.0       0.09      0.64      0.16        11

avg / total       0.90      0.57      0.68       175

[[93 71]
 [ 4  7]]
average f-1 score from CV:
0.9815843296071067
Neighborhood[T.Gilbert] dropped
------------
SMOTEENN
n_neighbours =
1
-------------------------------------------------------------
             precision    recall  f1-score   support

        0.0       0.95      0.59      0.72       164
        1.0       0.08      0.55      0.14        11

avg / total       0.90      0.58      0.69       175

[[96 68]
 [ 5  6]]
average f-1 score from CV:
0.9791132330972907
Neighborhood[T.IDOTRR] dropped
------------
SMOTEENN
n_neighbours =
1
-------------------------------------------------------------
             precision    recall  f1-score   support

        0.0       0.96      0.57      0.71       164
        1.0       0.09      0.64      0.16        11

avg / total       0.90      0.57      0.68       175

[[93 71]
 [ 4  7]]
average f-1 score from CV:
0.9811218564793155
Neighborhood[T.MeadowV] dropped
------------
SMOTEENN
n_neighbours =
1
-------------------------------------------------------------
             precision    recall  f1-score   support

        0.0       0.96      0.57      0.71       164
        1.0       0.09      0.64      0.16        11

avg / total       0.90      0.57      0.68       175

[[93 71]
 [ 4  7]]
average f-1 score from CV:
0.9803359486249974
Neighborhood[T.Mitchel] dropped
------------
SMOTEENN
n_neighbours =
1
-------------------------------------------------------------
             precision    recall  f1-score   support

        0.0       0.96      0.58      0.72       164
        1.0       0.09      0.64      0.16        11

avg / total       0.91      0.58      0.69       175

[[95 69]
 [ 4  7]]
average f-1 score from CV:
0.9791397803828457
Neighborhood[T.NAmes] dropped
------------
SMOTEENN
n_neighbours =
1
-------------------------------------------------------------
             precision    recall  f1-score   support

        0.0       0.97      0.57      0.72       164
        1.0       0.10      0.73      0.18        11

avg / total       0.91      0.58      0.69       175

[[94 70]
 [ 3  8]]
average f-1 score from CV:
0.9827658950530523
Neighborhood[T.NPkVill] dropped
------------
SMOTEENN
n_neighbours =
1
-------------------------------------------------------------
             precision    recall  f1-score   support

        0.0       0.96      0.56      0.71       164
        1.0       0.09      0.64      0.16        11

avg / total       0.90      0.57      0.67       175

[[92 72]
 [ 4  7]]
average f-1 score from CV:
0.9799299834221289
Neighborhood[T.NWAmes] dropped
------------
SMOTEENN
n_neighbours =
1
-------------------------------------------------------------
             precision    recall  f1-score   support

        0.0       0.95      0.59      0.73       164
        1.0       0.08      0.55      0.14        11

avg / total       0.90      0.59      0.69       175

[[97 67]
 [ 5  6]]
average f-1 score from CV:
0.9835488392039189
Neighborhood[T.NoRidge] dropped
------------
SMOTEENN
n_neighbours =
1
-------------------------------------------------------------
             precision    recall  f1-score   support

        0.0       0.95      0.59      0.72       164
        1.0       0.08      0.55      0.14        11

avg / total       0.90      0.58      0.69       175

[[96 68]
 [ 5  6]]
average f-1 score from CV:
0.9795077912751008
Neighborhood[T.NridgHt] dropped
------------
SMOTEENN
n_neighbours =
1
-------------------------------------------------------------
             precision    recall  f1-score   support

        0.0       0.96      0.56      0.71       164
        1.0       0.09      0.64      0.16        11

avg / total       0.90      0.57      0.67       175

[[92 72]
 [ 4  7]]
average f-1 score from CV:
0.9803273399207365
Neighborhood[T.OldTown] dropped
------------
SMOTEENN
n_neighbours =
1
-------------------------------------------------------------
             precision    recall  f1-score   support

        0.0       0.95      0.57      0.71       164
        1.0       0.08      0.55      0.14        11

avg / total       0.89      0.57      0.68       175

[[94 70]
 [ 5  6]]
average f-1 score from CV:
0.9799265568523323
Neighborhood[T.SWISU] dropped
------------
SMOTEENN
n_neighbours =
1
-------------------------------------------------------------
             precision    recall  f1-score   support

        0.0       0.96      0.56      0.71       164
        1.0       0.09      0.64      0.16        11

avg / total       0.90      0.57      0.67       175

[[92 72]
 [ 4  7]]
average f-1 score from CV:
0.9803306002715036
Neighborhood[T.Sawyer] dropped
------------
SMOTEENN
n_neighbours =
1
-------------------------------------------------------------
             precision    recall  f1-score   support

        0.0       0.95      0.59      0.73       164
        1.0       0.08      0.55      0.14        11

avg / total       0.90      0.59      0.69       175

[[97 67]
 [ 5  6]]
average f-1 score from CV:
0.9811466932708296
Neighborhood[T.SawyerW] dropped
------------
SMOTEENN
n_neighbours =
1
-------------------------------------------------------------
             precision    recall  f1-score   support

        0.0       0.96      0.59      0.73       164
        1.0       0.09      0.64      0.16        11

avg / total       0.91      0.59      0.69       175

[[96 68]
 [ 4  7]]
average f-1 score from CV:
0.9803322455152383
Neighborhood[T.Somerst] dropped
------------
SMOTEENN
n_neighbours =
1
-------------------------------------------------------------
             precision    recall  f1-score   support

        0.0       0.96      0.57      0.71       164
        1.0       0.09      0.64      0.16        11

avg / total       0.90      0.57      0.68       175

[[93 71]
 [ 4  7]]
average f-1 score from CV:
0.9803173322930967
Neighborhood[T.StoneBr] dropped
------------
SMOTEENN
n_neighbours =
1
-------------------------------------------------------------
             precision    recall  f1-score   support

        0.0       0.96      0.59      0.73       164
        1.0       0.09      0.64      0.16        11

avg / total       0.91      0.59      0.69       175

[[96 68]
 [ 4  7]]
average f-1 score from CV:
0.9807279567701113
Neighborhood[T.Timber] dropped
------------
SMOTEENN
n_neighbours =
1
-------------------------------------------------------------
             precision    recall  f1-score   support

        0.0       0.95      0.59      0.72       164
        1.0       0.08      0.55      0.14        11

avg / total       0.90      0.58      0.69       175

[[96 68]
 [ 5  6]]
average f-1 score from CV:
0.9835354798179541
Neighborhood[T.Veenker] dropped
------------
SMOTEENN
n_neighbours =
1
-------------------------------------------------------------
             precision    recall  f1-score   support

        0.0       0.96      0.57      0.71       164
        1.0       0.09      0.64      0.16        11

avg / total       0.90      0.57      0.68       175

[[93 71]
 [ 4  7]]
average f-1 score from CV:
0.980315697122283
Condition1[T.Feedr] dropped
------------
SMOTEENN
n_neighbours =
1
-------------------------------------------------------------
             precision    recall  f1-score   support

        0.0       0.96      0.57      0.72       164
        1.0       0.09      0.64      0.16        11

avg / total       0.90      0.58      0.68       175

[[94 70]
 [ 4  7]]
average f-1 score from CV:
0.9811318641069553
Condition1[T.Norm] dropped
------------
SMOTEENN
n_neighbours =
1
-------------------------------------------------------------
             precision    recall  f1-score   support

        0.0       0.96      0.57      0.72       164
        1.0       0.09      0.64      0.16        11

avg / total       0.90      0.58      0.68       175

[[94 70]
 [ 4  7]]
average f-1 score from CV:
0.981527555298047
Condition1[T.PosA] dropped
------------
SMOTEENN
n_neighbours =
1
-------------------------------------------------------------
             precision    recall  f1-score   support

        0.0       0.96      0.55      0.70       164
        1.0       0.09      0.64      0.15        11

avg / total       0.90      0.55      0.66       175

[[90 74]
 [ 4  7]]
average f-1 score from CV:
0.9803273399207365
Condition1[T.PosN] dropped
------------
SMOTEENN
n_neighbours =
1
-------------------------------------------------------------
             precision    recall  f1-score   support

        0.0       0.96      0.56      0.71       164
        1.0       0.09      0.64      0.16        11

avg / total       0.90      0.57      0.67       175

[[92 72]
 [ 4  7]]
average f-1 score from CV:
0.981128583692407
Condition1[T.RRAe] dropped
------------
SMOTEENN
n_neighbours =
1
-------------------------------------------------------------
             precision    recall  f1-score   support

        0.0       0.96      0.58      0.72       164
        1.0       0.09      0.64      0.16        11

avg / total       0.91      0.58      0.69       175

[[95 69]
 [ 4  7]]
average f-1 score from CV:
0.9803306002715036
Condition1[T.RRAn] dropped
------------
SMOTEENN
n_neighbours =
1
-------------------------------------------------------------
             precision    recall  f1-score   support

        0.0       0.96      0.57      0.71       164
        1.0       0.09      0.64      0.16        11

avg / total       0.90      0.57      0.68       175

[[93 71]
 [ 4  7]]
average f-1 score from CV:
0.9807163139716577
Condition1[T.RRNe] dropped
------------
SMOTEENN
n_neighbours =
1
-------------------------------------------------------------
             precision    recall  f1-score   support

        0.0       0.96      0.57      0.71       164
        1.0       0.09      0.64      0.16        11

avg / total       0.90      0.57      0.68       175

[[93 71]
 [ 4  7]]
average f-1 score from CV:
0.9803273399207365
Condition1[T.RRNn] dropped
------------
SMOTEENN
n_neighbours =
1
-------------------------------------------------------------
             precision    recall  f1-score   support

        0.0       0.96      0.57      0.71       164
        1.0       0.09      0.64      0.16        11

avg / total       0.90      0.57      0.68       175

[[93 71]
 [ 4  7]]
average f-1 score from CV:
0.9807279567701113
Condition2[T.Feedr] dropped
------------
SMOTEENN
n_neighbours =
1
-------------------------------------------------------------
             precision    recall  f1-score   support

        0.0       0.96      0.57      0.71       164
        1.0       0.09      0.64      0.16        11

avg / total       0.90      0.57      0.68       175

[[93 71]
 [ 4  7]]
average f-1 score from CV:
0.9803273399207365
Condition2[T.Norm] dropped
------------
SMOTEENN
n_neighbours =
1
-------------------------------------------------------------
             precision    recall  f1-score   support

        0.0       0.96      0.57      0.71       164
        1.0       0.09      0.64      0.16        11

avg / total       0.90      0.57      0.68       175

[[93 71]
 [ 4  7]]
average f-1 score from CV:
0.9807279567701113
Condition2[T.PosA] dropped
------------
SMOTEENN
n_neighbours =
1
-------------------------------------------------------------
             precision    recall  f1-score   support

        0.0       0.96      0.57      0.71       164
        1.0       0.09      0.64      0.16        11

avg / total       0.90      0.57      0.68       175

[[93 71]
 [ 4  7]]
average f-1 score from CV:
0.9807279567701113
Condition2[T.PosN] dropped
------------
SMOTEENN
n_neighbours =
1
-------------------------------------------------------------
             precision    recall  f1-score   support

        0.0       0.96      0.57      0.71       164
        1.0       0.09      0.64      0.16        11

avg / total       0.90      0.57      0.68       175

[[93 71]
 [ 4  7]]
average f-1 score from CV:
0.9807279567701113
Condition2[T.RRAe] dropped
------------
SMOTEENN
n_neighbours =
1
-------------------------------------------------------------
             precision    recall  f1-score   support

        0.0       0.96      0.57      0.72       164
        1.0       0.09      0.64      0.16        11

avg / total       0.90      0.58      0.68       175

[[94 70]
 [ 4  7]]
average f-1 score from CV:
0.9807279567701113
Condition2[T.RRAn] dropped
------------
SMOTEENN
n_neighbours =
1
-------------------------------------------------------------
             precision    recall  f1-score   support

        0.0       0.96      0.57      0.71       164
        1.0       0.09      0.64      0.16        11

avg / total       0.90      0.57      0.68       175

[[93 71]
 [ 4  7]]
average f-1 score from CV:
0.9807279567701113
Condition2[T.RRNn] dropped
------------
SMOTEENN
n_neighbours =
1
-------------------------------------------------------------
             precision    recall  f1-score   support

        0.0       0.96      0.57      0.71       164
        1.0       0.09      0.64      0.16        11

avg / total       0.90      0.57      0.68       175

[[93 71]
 [ 4  7]]
average f-1 score from CV:
0.9803273399207365
BldgType[T.2fmCon] dropped
------------
SMOTEENN
n_neighbours =
1
-------------------------------------------------------------
             precision    recall  f1-score   support

        0.0       0.96      0.56      0.71       164
        1.0       0.09      0.64      0.16        11

avg / total       0.90      0.57      0.67       175

[[92 72]
 [ 4  7]]
average f-1 score from CV:
0.9803273399207365
BldgType[T.Duplex] dropped
------------
SMOTEENN
n_neighbours =
1
-------------------------------------------------------------
             precision    recall  f1-score   support

        0.0       0.95      0.57      0.71       164
        1.0       0.08      0.55      0.14        11

avg / total       0.89      0.57      0.68       175

[[94 70]
 [ 5  6]]
average f-1 score from CV:
0.9823253724999208
BldgType[T.Twnhs] dropped
------------
SMOTEENN
n_neighbours =
1
-------------------------------------------------------------
             precision    recall  f1-score   support

        0.0       0.96      0.57      0.71       164
        1.0       0.09      0.64      0.16        11

avg / total       0.90      0.57      0.68       175

[[93 71]
 [ 4  7]]
average f-1 score from CV:
0.9799348590432156
BldgType[T.TwnhsE] dropped
------------
SMOTEENN
n_neighbours =
1
-------------------------------------------------------------
             precision    recall  f1-score   support

        0.0       0.96      0.57      0.71       164
        1.0       0.09      0.64      0.16        11

avg / total       0.90      0.57      0.68       175

[[93 71]
 [ 4  7]]
average f-1 score from CV:
0.9811302188632206
HouseStyle[T.1.5Unf] dropped
------------
SMOTEENN
n_neighbours =
1
-------------------------------------------------------------
             precision    recall  f1-score   support

        0.0       0.96      0.57      0.71       164
        1.0       0.09      0.64      0.16        11

avg / total       0.90      0.57      0.68       175

[[93 71]
 [ 4  7]]
average f-1 score from CV:
0.9795277413928009
HouseStyle[T.1Story] dropped
------------
SMOTEENN
n_neighbours =
1
-------------------------------------------------------------
             precision    recall  f1-score   support

        0.0       0.96      0.59      0.73       164
        1.0       0.09      0.64      0.16        11

avg / total       0.91      0.59      0.69       175

[[96 68]
 [ 4  7]]
average f-1 score from CV:
0.9819415359541172
HouseStyle[T.2.5Fin] dropped
------------
SMOTEENN
n_neighbours =
1
-------------------------------------------------------------
             precision    recall  f1-score   support

        0.0       0.96      0.56      0.71       164
        1.0       0.09      0.64      0.16        11

avg / total       0.90      0.57      0.67       175

[[92 72]
 [ 4  7]]
average f-1 score from CV:
0.9803322455152383
HouseStyle[T.2.5Unf] dropped
------------
SMOTEENN
n_neighbours =
1
-------------------------------------------------------------
             precision    recall  f1-score   support

        0.0       0.96      0.57      0.71       164
        1.0       0.09      0.64      0.16        11

avg / total       0.90      0.57      0.68       175

[[93 71]
 [ 4  7]]
average f-1 score from CV:
0.9803273399207365
HouseStyle[T.2Story] dropped
------------
SMOTEENN
n_neighbours =
1
-------------------------------------------------------------
             precision    recall  f1-score   support

        0.0       0.96      0.59      0.73       164
        1.0       0.09      0.64      0.16        11

avg / total       0.91      0.59      0.70       175

[[97 67]
 [ 4  7]]
average f-1 score from CV:
0.9811384251010173
HouseStyle[T.SFoyer] dropped
------------
SMOTEENN
n_neighbours =
1
-------------------------------------------------------------
             precision    recall  f1-score   support

        0.0       0.96      0.55      0.70       164
        1.0       0.09      0.64      0.15        11

avg / total       0.90      0.56      0.67       175

[[91 73]
 [ 4  7]]
average f-1 score from CV:
0.9819450071532604
HouseStyle[T.SLvl] dropped
------------
SMOTEENN
n_neighbours =
1
-------------------------------------------------------------
             precision    recall  f1-score   support

        0.0       0.95      0.58      0.72       164
        1.0       0.08      0.55      0.14        11

avg / total       0.90      0.58      0.68       175

[[95 69]
 [ 5  6]]
average f-1 score from CV:
0.9807228950295009
RoofStyle[T.Gable] dropped
------------
SMOTEENN
n_neighbours =
1
-------------------------------------------------------------
             precision    recall  f1-score   support

        0.0       0.97      0.57      0.72       164
        1.0       0.10      0.73      0.18        11

avg / total       0.91      0.58      0.69       175

[[94 70]
 [ 3  8]]
average f-1 score from CV:
0.9815728023306051
RoofStyle[T.Gambrel] dropped
------------
SMOTEENN
n_neighbours =
1
-------------------------------------------------------------
             precision    recall  f1-score   support

        0.0       0.96      0.56      0.71       164
        1.0       0.09      0.64      0.16        11

avg / total       0.90      0.57      0.67       175

[[92 72]
 [ 4  7]]
average f-1 score from CV:
0.9815292005417817
RoofStyle[T.Hip] dropped
------------
SMOTEENN
n_neighbours =
1
-------------------------------------------------------------
             precision    recall  f1-score   support

        0.0       0.97      0.57      0.72       164
        1.0       0.10      0.73      0.18        11

avg / total       0.91      0.58      0.68       175

[[93 71]
 [ 3  8]]
average f-1 score from CV:
0.9815793031316451
RoofStyle[T.Mansard] dropped
------------
SMOTEENN
n_neighbours =
1
-------------------------------------------------------------
             precision    recall  f1-score   support

        0.0       0.96      0.56      0.71       164
        1.0       0.09      0.64      0.16        11

avg / total       0.90      0.57      0.67       175

[[92 72]
 [ 4  7]]
average f-1 score from CV:
0.9807279567701113
RoofStyle[T.Shed] dropped
------------
SMOTEENN
n_neighbours =
1
-------------------------------------------------------------
             precision    recall  f1-score   support

        0.0       0.96      0.57      0.71       164
        1.0       0.09      0.64      0.16        11

avg / total       0.90      0.57      0.68       175

[[93 71]
 [ 4  7]]
average f-1 score from CV:
0.9807279567701113
RoofMatl[T.CompShg] dropped
------------
SMOTEENN
n_neighbours =
1
-------------------------------------------------------------
             precision    recall  f1-score   support

        0.0       0.96      0.57      0.71       164
        1.0       0.09      0.64      0.16        11

avg / total       0.90      0.57      0.68       175

[[93 71]
 [ 4  7]]
average f-1 score from CV:
0.9807279567701113
RoofMatl[T.Membran] dropped
------------
SMOTEENN
n_neighbours =
1
-------------------------------------------------------------
             precision    recall  f1-score   support

        0.0       0.96      0.57      0.71       164
        1.0       0.09      0.64      0.16        11

avg / total       0.90      0.57      0.68       175

[[93 71]
 [ 4  7]]
average f-1 score from CV:
0.9807279567701113
RoofMatl[T.Metal] dropped
------------
SMOTEENN
n_neighbours =
1
-------------------------------------------------------------
             precision    recall  f1-score   support

        0.0       0.96      0.57      0.71       164
        1.0       0.09      0.64      0.16        11

avg / total       0.90      0.57      0.68       175

[[93 71]
 [ 4  7]]
average f-1 score from CV:
0.9807279567701113
RoofMatl[T.Roll] dropped
------------
SMOTEENN
n_neighbours =
1
-------------------------------------------------------------
             precision    recall  f1-score   support

        0.0       0.96      0.57      0.71       164
        1.0       0.09      0.64      0.16        11

avg / total       0.90      0.57      0.68       175

[[93 71]
 [ 4  7]]
average f-1 score from CV:
0.9807279567701113
RoofMatl[T.Tar&Grv] dropped
------------
SMOTEENN
n_neighbours =
1
-------------------------------------------------------------
             precision    recall  f1-score   support

        0.0       0.96      0.57      0.71       164
        1.0       0.09      0.64      0.16        11

avg / total       0.90      0.57      0.68       175

[[93 71]
 [ 4  7]]
average f-1 score from CV:
0.9807279567701113
RoofMatl[T.WdShake] dropped
------------
SMOTEENN
n_neighbours =
1
-------------------------------------------------------------
             precision    recall  f1-score   support

        0.0       0.96      0.56      0.71       164
        1.0       0.09      0.64      0.16        11

avg / total       0.90      0.57      0.67       175

[[92 72]
 [ 4  7]]
average f-1 score from CV:
0.9807279567701113
RoofMatl[T.WdShngl] dropped
------------
SMOTEENN
n_neighbours =
1
-------------------------------------------------------------
             precision    recall  f1-score   support

        0.0       0.96      0.57      0.71       164
        1.0       0.09      0.64      0.16        11

avg / total       0.90      0.57      0.68       175

[[93 71]
 [ 4  7]]
average f-1 score from CV:
0.9807279567701113
Exterior1st[T.AsphShn] dropped
------------
SMOTEENN
n_neighbours =
1
-------------------------------------------------------------
             precision    recall  f1-score   support

        0.0       0.96      0.57      0.71       164
        1.0       0.09      0.64      0.16        11

avg / total       0.90      0.57      0.68       175

[[93 71]
 [ 4  7]]
average f-1 score from CV:
0.9807279567701113
Exterior1st[T.BrkComm] dropped
------------
SMOTEENN
n_neighbours =
1
-------------------------------------------------------------
             precision    recall  f1-score   support

        0.0       0.96      0.56      0.71       164
        1.0       0.09      0.64      0.16        11

avg / total       0.90      0.57      0.67       175

[[92 72]
 [ 4  7]]
average f-1 score from CV:
0.9811402921824467
Exterior1st[T.BrkFace] dropped
------------
SMOTEENN
n_neighbours =
1
-------------------------------------------------------------
             precision    recall  f1-score   support

        0.0       0.96      0.57      0.71       164
        1.0       0.09      0.64      0.16        11

avg / total       0.90      0.57      0.68       175

[[93 71]
 [ 4  7]]
average f-1 score from CV:
0.9815292005417817
Exterior1st[T.CBlock] dropped
------------
SMOTEENN
n_neighbours =
1
-------------------------------------------------------------
             precision    recall  f1-score   support

        0.0       0.96      0.57      0.71       164
        1.0       0.09      0.64      0.16        11

avg / total       0.90      0.57      0.68       175

[[93 71]
 [ 4  7]]
average f-1 score from CV:
0.9807279567701113
Exterior1st[T.CemntBd] dropped
------------
SMOTEENN
n_neighbours =
1
-------------------------------------------------------------
             precision    recall  f1-score   support

        0.0       0.96      0.57      0.72       164
        1.0       0.09      0.64      0.16        11

avg / total       0.90      0.58      0.68       175

[[94 70]
 [ 4  7]]
average f-1 score from CV:
0.9807279567701113
Exterior1st[T.HdBoard] dropped
------------
SMOTEENN
n_neighbours =
1
-------------------------------------------------------------
             precision    recall  f1-score   support

        0.0       0.95      0.57      0.71       164
        1.0       0.08      0.55      0.14        11

avg / total       0.89      0.57      0.67       175

[[93 71]
 [ 5  6]]
average f-1 score from CV:
0.9835304731420024
Exterior1st[T.ImStucc] dropped
------------
SMOTEENN
n_neighbours =
1
-------------------------------------------------------------
             precision    recall  f1-score   support

        0.0       0.96      0.57      0.71       164
        1.0       0.09      0.64      0.16        11

avg / total       0.90      0.57      0.68       175

[[93 71]
 [ 4  7]]
average f-1 score from CV:
0.9807279567701113
Exterior1st[T.MetalSd] dropped
------------
SMOTEENN
n_neighbours =
1
-------------------------------------------------------------
             precision    recall  f1-score   support

        0.0       0.95      0.57      0.71       164
        1.0       0.08      0.55      0.14        11

avg / total       0.89      0.57      0.67       175

[[93 71]
 [ 5  6]]
average f-1 score from CV:
0.9835521398472137
Exterior1st[T.Plywood] dropped
------------
SMOTEENN
n_neighbours =
1
-------------------------------------------------------------
             precision    recall  f1-score   support

        0.0       0.96      0.59      0.73       164
        1.0       0.09      0.64      0.16        11

avg / total       0.91      0.59      0.69       175

[[96 68]
 [ 4  7]]
average f-1 score from CV:
0.9807129323106946
Exterior1st[T.Stone] dropped
------------
SMOTEENN
n_neighbours =
1
-------------------------------------------------------------
             precision    recall  f1-score   support

        0.0       0.96      0.56      0.71       164
        1.0       0.09      0.64      0.16        11

avg / total       0.90      0.57      0.67       175

[[92 72]
 [ 4  7]]
average f-1 score from CV:
0.9807279567701113
Exterior1st[T.Stucco] dropped
------------
SMOTEENN
n_neighbours =
1
-------------------------------------------------------------
             precision    recall  f1-score   support

        0.0       0.96      0.57      0.71       164
        1.0       0.09      0.64      0.16        11

avg / total       0.90      0.57      0.68       175

[[93 71]
 [ 4  7]]
average f-1 score from CV:
0.9799267230713617
Exterior1st[T.VinylSd] dropped
------------
SMOTEENN
n_neighbours =
1
-------------------------------------------------------------
             precision    recall  f1-score   support

        0.0       0.95      0.57      0.71       164
        1.0       0.08      0.55      0.14        11

avg / total       0.89      0.57      0.68       175

[[94 70]
 [ 5  6]]
average f-1 score from CV:
0.9787330283955857
Exterior1st[T.Wd Sdng] dropped
------------
SMOTEENN
n_neighbours =
1
-------------------------------------------------------------
             precision    recall  f1-score   support

        0.0       0.96      0.57      0.71       164
        1.0       0.09      0.64      0.16        11

avg / total       0.90      0.57      0.68       175

[[93 71]
 [ 4  7]]
average f-1 score from CV:
0.9815472182147733
Exterior1st[T.WdShing] dropped
------------
SMOTEENN
n_neighbours =
1
-------------------------------------------------------------
             precision    recall  f1-score   support

        0.0       0.96      0.57      0.72       164
        1.0       0.09      0.64      0.16        11

avg / total       0.90      0.58      0.68       175

[[94 70]
 [ 4  7]]
average f-1 score from CV:
0.981525773971985
Exterior2nd[T.AsphShn] dropped
------------
SMOTEENN
n_neighbours =
1
-------------------------------------------------------------
             precision    recall  f1-score   support

        0.0       0.96      0.56      0.71       164
        1.0       0.09      0.64      0.16        11

avg / total       0.90      0.57      0.67       175

[[92 72]
 [ 4  7]]
average f-1 score from CV:
0.9811269384486723
Exterior2nd[T.Brk Cmn] dropped
------------
SMOTEENN
n_neighbours =
1
-------------------------------------------------------------
             precision    recall  f1-score   support

        0.0       0.96      0.57      0.71       164
        1.0       0.09      0.64      0.16        11

avg / total       0.90      0.57      0.68       175

[[93 71]
 [ 4  7]]
average f-1 score from CV:
0.9799299834221289
Exterior2nd[T.BrkFace] dropped
------------
SMOTEENN
n_neighbours =
1
-------------------------------------------------------------
             precision    recall  f1-score   support

        0.0       0.96      0.57      0.72       164
        1.0       0.09      0.64      0.16        11

avg / total       0.90      0.58      0.68       175

[[94 70]
 [ 4  7]]
average f-1 score from CV:
0.9815292005417817
Exterior2nd[T.CBlock] dropped
------------
SMOTEENN
n_neighbours =
1
-------------------------------------------------------------
             precision    recall  f1-score   support

        0.0       0.96      0.57      0.71       164
        1.0       0.09      0.64      0.16        11

avg / total       0.90      0.57      0.68       175

[[93 71]
 [ 4  7]]
average f-1 score from CV:
0.9807279567701113
Exterior2nd[T.CmentBd] dropped
------------
SMOTEENN
n_neighbours =
1
-------------------------------------------------------------
             precision    recall  f1-score   support

        0.0       0.96      0.57      0.72       164
        1.0       0.09      0.64      0.16        11

avg / total       0.90      0.58      0.68       175

[[94 70]
 [ 4  7]]
average f-1 score from CV:
0.9807279567701113
Exterior2nd[T.HdBoard] dropped
------------
SMOTEENN
n_neighbours =
1
-------------------------------------------------------------
             precision    recall  f1-score   support

        0.0       0.95      0.57      0.71       164
        1.0       0.08      0.55      0.14        11

avg / total       0.89      0.57      0.68       175

[[94 70]
 [ 5  6]]
average f-1 score from CV:
0.9843466358419981
Exterior2nd[T.ImStucc] dropped
------------
SMOTEENN
n_neighbours =
1
-------------------------------------------------------------
             precision    recall  f1-score   support

        0.0       0.96      0.57      0.71       164
        1.0       0.09      0.64      0.16        11

avg / total       0.90      0.57      0.68       175

[[93 71]
 [ 4  7]]
average f-1 score from CV:
0.9803322455152383
Exterior2nd[T.MetalSd] dropped
------------
SMOTEENN
n_neighbours =
1
-------------------------------------------------------------
             precision    recall  f1-score   support

        0.0       0.95      0.57      0.71       164
        1.0       0.08      0.55      0.14        11

avg / total       0.89      0.57      0.67       175

[[93 71]
 [ 5  6]]
average f-1 score from CV:
0.9843666684472622
Exterior2nd[T.Other] dropped
------------
SMOTEENN
n_neighbours =
1
-------------------------------------------------------------
             precision    recall  f1-score   support

        0.0       0.96      0.57      0.71       164
        1.0       0.09      0.64      0.16        11

avg / total       0.90      0.57      0.68       175

[[93 71]
 [ 4  7]]
average f-1 score from CV:
0.9807279567701113
Exterior2nd[T.Plywood] dropped
------------
SMOTEENN
n_neighbours =
1
-------------------------------------------------------------
             precision    recall  f1-score   support

        0.0       0.96      0.57      0.72       164
        1.0       0.09      0.64      0.16        11

avg / total       0.90      0.58      0.68       175

[[94 70]
 [ 4  7]]
average f-1 score from CV:
0.9799365644799721
Exterior2nd[T.Stone] dropped
------------
SMOTEENN
n_neighbours =
1
-------------------------------------------------------------
             precision    recall  f1-score   support

        0.0       0.96      0.57      0.71       164
        1.0       0.09      0.64      0.16        11

avg / total       0.90      0.57      0.68       175

[[93 71]
 [ 4  7]]
average f-1 score from CV:
0.9803273399207365
Exterior2nd[T.Stucco] dropped
------------
SMOTEENN
n_neighbours =
1
-------------------------------------------------------------
             precision    recall  f1-score   support

        0.0       0.96      0.57      0.71       164
        1.0       0.09      0.64      0.16        11

avg / total       0.90      0.57      0.68       175

[[93 71]
 [ 4  7]]
average f-1 score from CV:
0.9799216210382238
Exterior2nd[T.VinylSd] dropped
------------
SMOTEENN
n_neighbours =
1
-------------------------------------------------------------
             precision    recall  f1-score   support

        0.0       0.95      0.57      0.71       164
        1.0       0.08      0.55      0.14        11

avg / total       0.89      0.57      0.68       175

[[94 70]
 [ 5  6]]
average f-1 score from CV:
0.9787330283955857
Exterior2nd[T.Wd Sdng] dropped
------------
SMOTEENN
n_neighbours =
1
-------------------------------------------------------------
             precision    recall  f1-score   support

        0.0       0.96      0.57      0.71       164
        1.0       0.09      0.64      0.16        11

avg / total       0.90      0.57      0.68       175

[[93 71]
 [ 4  7]]
average f-1 score from CV:
0.9811433007221039
Exterior2nd[T.Wd Shng] dropped
------------
SMOTEENN
n_neighbours =
1
-------------------------------------------------------------
             precision    recall  f1-score   support

        0.0       0.96      0.57      0.71       164
        1.0       0.09      0.64      0.16        11

avg / total       0.90      0.57      0.68       175

[[93 71]
 [ 4  7]]
average f-1 score from CV:
0.9803371211363251
MasVnrType[T.BrkFace] dropped
------------
SMOTEENN
n_neighbours =
1
-------------------------------------------------------------
             precision    recall  f1-score   support

        0.0       0.96      0.60      0.74       164
        1.0       0.10      0.64      0.17        11

avg / total       0.91      0.61      0.71       175

[[99 65]
 [ 4  7]]
average f-1 score from CV:
0.9787228795572556
MasVnrType[T.None] dropped
------------
SMOTEENN
n_neighbours =
1
-------------------------------------------------------------
             precision    recall  f1-score   support

        0.0       0.96      0.59      0.73       164
        1.0       0.09      0.64      0.16        11

avg / total       0.91      0.59      0.69       175

[[96 68]
 [ 4  7]]
average f-1 score from CV:
0.9807394434224562
MasVnrType[T.Stone] dropped
------------
SMOTEENN
n_neighbours =
1
-------------------------------------------------------------
             precision    recall  f1-score   support

        0.0       0.96      0.58      0.72       164
        1.0       0.09      0.64      0.16        11

avg / total       0.91      0.58      0.69       175

[[95 69]
 [ 4  7]]
average f-1 score from CV:
0.9815292005417817
ExterQual[T.Fa] dropped
------------
SMOTEENN
n_neighbours =
1
-------------------------------------------------------------
             precision    recall  f1-score   support

        0.0       0.96      0.57      0.71       164
        1.0       0.09      0.64      0.16        11

avg / total       0.90      0.57      0.68       175

[[93 71]
 [ 4  7]]
average f-1 score from CV:
0.9803273399207365
ExterQual[T.Gd] dropped
------------
SMOTEENN
n_neighbours =
1
-------------------------------------------------------------
             precision    recall  f1-score   support

        0.0       0.95      0.58      0.72       164
        1.0       0.08      0.55      0.14        11

avg / total       0.90      0.58      0.68       175

[[95 69]
 [ 5  6]]
average f-1 score from CV:
0.9815205399560677
ExterQual[T.TA] dropped
------------
SMOTEENN
n_neighbours =
1
-------------------------------------------------------------
             precision    recall  f1-score   support

        0.0       0.95      0.58      0.72       164
        1.0       0.08      0.55      0.14        11

avg / total       0.90      0.58      0.68       175

[[95 69]
 [ 5  6]]
average f-1 score from CV:
0.9815357815996248
ExterCond[T.Fa] dropped
------------
SMOTEENN
n_neighbours =
1
-------------------------------------------------------------
             precision    recall  f1-score   support

        0.0       0.96      0.56      0.71       164
        1.0       0.09      0.64      0.16        11

avg / total       0.90      0.57      0.67       175

[[92 72]
 [ 4  7]]
average f-1 score from CV:
0.9815292005417817
ExterCond[T.Gd] dropped
------------
SMOTEENN
n_neighbours =
1
-------------------------------------------------------------
             precision    recall  f1-score   support

        0.0       0.96      0.57      0.71       164
        1.0       0.09      0.64      0.16        11

avg / total       0.90      0.57      0.68       175

[[93 71]
 [ 4  7]]
average f-1 score from CV:
0.9807462563909152
ExterCond[T.Po] dropped
------------
SMOTEENN
n_neighbours =
1
-------------------------------------------------------------
             precision    recall  f1-score   support

        0.0       0.96      0.57      0.71       164
        1.0       0.09      0.64      0.16        11

avg / total       0.90      0.57      0.68       175

[[93 71]
 [ 4  7]]
average f-1 score from CV:
0.9807279567701113
ExterCond[T.TA] dropped
------------
SMOTEENN
n_neighbours =
1
-------------------------------------------------------------
             precision    recall  f1-score   support

        0.0       0.96      0.57      0.71       164
        1.0       0.09      0.64      0.16        11

avg / total       0.90      0.57      0.68       175

[[93 71]
 [ 4  7]]
average f-1 score from CV:
0.9787379240804537
Foundation[T.CBlock] dropped
------------
SMOTEENN
n_neighbours =
1
-------------------------------------------------------------
             precision    recall  f1-score   support

        0.0       0.96      0.57      0.71       164
        1.0       0.09      0.64      0.16        11

avg / total       0.90      0.57      0.68       175

[[93 71]
 [ 4  7]]
average f-1 score from CV:
0.9819347402346583
Foundation[T.PConc] dropped
------------
SMOTEENN
n_neighbours =
1
-------------------------------------------------------------
             precision    recall  f1-score   support

        0.0       0.96      0.57      0.72       164
        1.0       0.09      0.64      0.16        11

avg / total       0.90      0.58      0.68       175

[[94 70]
 [ 4  7]]
average f-1 score from CV:
0.9811450265542708
Foundation[T.Slab] dropped
------------
SMOTEENN
n_neighbours =
1
-------------------------------------------------------------
             precision    recall  f1-score   support

        0.0       0.96      0.57      0.71       164
        1.0       0.09      0.64      0.16        11

avg / total       0.90      0.57      0.68       175

[[93 71]
 [ 4  7]]
average f-1 score from CV:
0.9799267230713617
Foundation[T.Stone] dropped
------------
SMOTEENN
n_neighbours =
1
-------------------------------------------------------------
             precision    recall  f1-score   support

        0.0       0.96      0.57      0.71       164
        1.0       0.09      0.64      0.16        11

avg / total       0.90      0.57      0.68       175

[[93 71]
 [ 4  7]]
average f-1 score from CV:
0.9803273399207365
Foundation[T.Wood] dropped
------------
SMOTEENN
n_neighbours =
1
-------------------------------------------------------------
             precision    recall  f1-score   support

        0.0       0.96      0.56      0.71       164
        1.0       0.09      0.64      0.16        11

avg / total       0.90      0.57      0.67       175

[[92 72]
 [ 4  7]]
average f-1 score from CV:
0.9803273399207365
BsmtQual[T.Fa] dropped
------------
SMOTEENN
n_neighbours =
1
-------------------------------------------------------------
             precision    recall  f1-score   support

        0.0       0.96      0.57      0.71       164
        1.0       0.09      0.64      0.16        11

avg / total       0.90      0.57      0.68       175

[[93 71]
 [ 4  7]]
average f-1 score from CV:
0.9795277413928009
BsmtQual[T.Gd] dropped
------------
SMOTEENN
n_neighbours =
1
-------------------------------------------------------------
             precision    recall  f1-score   support

        0.0       0.96      0.59      0.73       164
        1.0       0.09      0.64      0.16        11

avg / total       0.91      0.59      0.69       175

[[96 68]
 [ 4  7]]
average f-1 score from CV:
0.9811433007221039
BsmtQual[T.TA] dropped
------------
SMOTEENN
n_neighbours =
1
-------------------------------------------------------------
             precision    recall  f1-score   support

        0.0       0.96      0.59      0.73       164
        1.0       0.09      0.64      0.16        11

avg / total       0.91      0.59      0.70       175

[[97 67]
 [ 4  7]]
average f-1 score from CV:
0.9815488838539185
BsmtCond[T.Gd] dropped
------------
SMOTEENN
n_neighbours =
1
-------------------------------------------------------------
             precision    recall  f1-score   support

        0.0       0.96      0.57      0.72       164
        1.0       0.09      0.64      0.16        11

avg / total       0.90      0.58      0.68       175

[[94 70]
 [ 4  7]]
average f-1 score from CV:
0.979916517677139
BsmtCond[T.Po] dropped
------------
SMOTEENN
n_neighbours =
1
-------------------------------------------------------------
             precision    recall  f1-score   support

        0.0       0.96      0.57      0.71       164
        1.0       0.09      0.64      0.16        11

avg / total       0.90      0.57      0.68       175

[[93 71]
 [ 4  7]]
average f-1 score from CV:
0.9807279567701113
BsmtCond[T.TA] dropped
------------
SMOTEENN
n_neighbours =
1
-------------------------------------------------------------
             precision    recall  f1-score   support

        0.0       0.96      0.57      0.72       164
        1.0       0.09      0.64      0.16        11

avg / total       0.90      0.58      0.68       175

[[94 70]
 [ 4  7]]
average f-1 score from CV:
0.9799133137189774
BsmtExposure[T.Gd] dropped
------------
SMOTEENN
n_neighbours =
1
-------------------------------------------------------------
             precision    recall  f1-score   support

        0.0       0.96      0.57      0.72       164
        1.0       0.09      0.64      0.16        11

avg / total       0.90      0.58      0.68       175

[[94 70]
 [ 4  7]]
average f-1 score from CV:
0.9807279567701113
BsmtExposure[T.Mn] dropped
------------
SMOTEENN
n_neighbours =
1
-------------------------------------------------------------
             precision    recall  f1-score   support

        0.0       0.95      0.59      0.72       164
        1.0       0.08      0.55      0.14        11

avg / total       0.90      0.58      0.69       175

[[96 68]
 [ 5  6]]
average f-1 score from CV:
0.9839476949559781
BsmtExposure[T.No] dropped
------------
SMOTEENN
n_neighbours =
1
-------------------------------------------------------------
             precision    recall  f1-score   support

        0.0       0.96      0.58      0.72       164
        1.0       0.09      0.64      0.16        11

avg / total       0.91      0.58      0.69       175

[[95 69]
 [ 4  7]]
average f-1 score from CV:
0.9819247556505459
BsmtFinType1[T.BLQ] dropped
------------
SMOTEENN
n_neighbours =
1
-------------------------------------------------------------
             precision    recall  f1-score   support

        0.0       0.95      0.57      0.71       164
        1.0       0.08      0.55      0.14        11

avg / total       0.89      0.57      0.68       175

[[94 70]
 [ 5  6]]
average f-1 score from CV:
0.983533814178809
BsmtFinType1[T.GLQ] dropped
------------
SMOTEENN
n_neighbours =
1
-------------------------------------------------------------
             precision    recall  f1-score   support

        0.0       0.96      0.59      0.73       164
        1.0       0.09      0.64      0.16        11

avg / total       0.91      0.59      0.69       175

[[96 68]
 [ 4  7]]
average f-1 score from CV:
0.9835337941807076
BsmtFinType1[T.LwQ] dropped
------------
SMOTEENN
n_neighbours =
1
-------------------------------------------------------------
             precision    recall  f1-score   support

        0.0       0.96      0.58      0.72       164
        1.0       0.09      0.64      0.16        11

avg / total       0.91      0.58      0.69       175

[[95 69]
 [ 4  7]]
average f-1 score from CV:
0.9823418314641238
BsmtFinType1[T.Rec] dropped
------------
SMOTEENN
n_neighbours =
1
-------------------------------------------------------------
             precision    recall  f1-score   support

        0.0       0.95      0.58      0.72       164
        1.0       0.08      0.55      0.14        11

avg / total       0.90      0.58      0.68       175

[[95 69]
 [ 5  6]]
average f-1 score from CV:
0.9811334297512702
BsmtFinType1[T.Unf] dropped
------------
SMOTEENN
n_neighbours =
1
-------------------------------------------------------------
             precision    recall  f1-score   support

        0.0       0.96      0.57      0.72       164
        1.0       0.09      0.64      0.16        11

avg / total       0.90      0.58      0.68       175

[[94 70]
 [ 4  7]]
average f-1 score from CV:
0.9799265568523323
BsmtFinType2[T.BLQ] dropped
------------
SMOTEENN
n_neighbours =
1
-------------------------------------------------------------
             precision    recall  f1-score   support

        0.0       0.97      0.57      0.72       164
        1.0       0.10      0.73      0.18        11

avg / total       0.91      0.58      0.68       175

[[93 71]
 [ 3  8]]
average f-1 score from CV:
0.9803273399207365
BsmtFinType2[T.GLQ] dropped
------------
SMOTEENN
n_neighbours =
1
-------------------------------------------------------------
             precision    recall  f1-score   support

        0.0       0.96      0.57      0.71       164
        1.0       0.09      0.64      0.16        11

avg / total       0.90      0.57      0.68       175

[[93 71]
 [ 4  7]]
average f-1 score from CV:
0.9807279567701113
BsmtFinType2[T.LwQ] dropped
------------
SMOTEENN
n_neighbours =
1
-------------------------------------------------------------
             precision    recall  f1-score   support

        0.0       0.95      0.56      0.70       164
        1.0       0.08      0.55      0.13        11

avg / total       0.89      0.56      0.67       175

[[92 72]
 [ 5  6]]
average f-1 score from CV:
0.9807345076083477
BsmtFinType2[T.Rec] dropped
------------
SMOTEENN
n_neighbours =
1
-------------------------------------------------------------
             precision    recall  f1-score   support

        0.0       0.96      0.57      0.71       164
        1.0       0.09      0.64      0.16        11

avg / total       0.90      0.57      0.68       175

[[93 71]
 [ 4  7]]
average f-1 score from CV:
0.9803189775368313
BsmtFinType2[T.Unf] dropped
------------
SMOTEENN
n_neighbours =
1
-------------------------------------------------------------
             precision    recall  f1-score   support

        0.0       0.97      0.57      0.72       164
        1.0       0.10      0.73      0.18        11

avg / total       0.91      0.58      0.68       175

[[93 71]
 [ 3  8]]
average f-1 score from CV:
0.9823486238266085
Heating[T.GasA] dropped
------------
SMOTEENN
n_neighbours =
1
-------------------------------------------------------------
             precision    recall  f1-score   support

        0.0       0.96      0.57      0.71       164
        1.0       0.09      0.64      0.16        11

avg / total       0.90      0.57      0.68       175

[[93 71]
 [ 4  7]]
average f-1 score from CV:
0.9803273399207365
Heating[T.GasW] dropped
------------
SMOTEENN
n_neighbours =
1
-------------------------------------------------------------
             precision    recall  f1-score   support

        0.0       0.96      0.57      0.71       164
        1.0       0.09      0.64      0.16        11

avg / total       0.90      0.57      0.68       175

[[93 71]
 [ 4  7]]
average f-1 score from CV:
0.9803273399207365
Heating[T.Grav] dropped
------------
SMOTEENN
n_neighbours =
1
-------------------------------------------------------------
             precision    recall  f1-score   support

        0.0       0.96      0.57      0.71       164
        1.0       0.09      0.64      0.16        11

avg / total       0.90      0.57      0.68       175

[[93 71]
 [ 4  7]]
average f-1 score from CV:
0.9807263215992975
Heating[T.OthW] dropped
------------
SMOTEENN
n_neighbours =
1
-------------------------------------------------------------
             precision    recall  f1-score   support

        0.0       0.96      0.57      0.71       164
        1.0       0.09      0.64      0.16        11

avg / total       0.90      0.57      0.68       175

[[93 71]
 [ 4  7]]
average f-1 score from CV:
0.9807279567701113
Heating[T.Wall] dropped
------------
SMOTEENN
n_neighbours =
1
-------------------------------------------------------------
             precision    recall  f1-score   support

        0.0       0.96      0.57      0.71       164
        1.0       0.09      0.64      0.16        11

avg / total       0.90      0.57      0.68       175

[[93 71]
 [ 4  7]]
average f-1 score from CV:
0.9803273399207365
HeatingQC[T.Fa] dropped
------------
SMOTEENN
n_neighbours =
1
-------------------------------------------------------------
             precision    recall  f1-score   support

        0.0       0.96      0.57      0.71       164
        1.0       0.09      0.64      0.16        11

avg / total       0.90      0.57      0.68       175

[[93 71]
 [ 4  7]]
average f-1 score from CV:
0.9791120800202279
HeatingQC[T.Gd] dropped
------------
SMOTEENN
n_neighbours =
1
-------------------------------------------------------------
             precision    recall  f1-score   support

        0.0       0.96      0.57      0.71       164
        1.0       0.09      0.64      0.16        11

avg / total       0.90      0.57      0.68       175

[[93 71]
 [ 4  7]]
average f-1 score from CV:
0.9811251571226103
HeatingQC[T.Po] dropped
------------
SMOTEENN
n_neighbours =
1
-------------------------------------------------------------
             precision    recall  f1-score   support

        0.0       0.96      0.57      0.71       164
        1.0       0.09      0.64      0.16        11

avg / total       0.90      0.57      0.68       175

[[93 71]
 [ 4  7]]
average f-1 score from CV:
0.9807279567701113
HeatingQC[T.TA] dropped
------------
SMOTEENN
n_neighbours =
1
-------------------------------------------------------------
             precision    recall  f1-score   support

        0.0       0.96      0.57      0.72       164
        1.0       0.09      0.64      0.16        11

avg / total       0.90      0.58      0.68       175

[[94 70]
 [ 4  7]]
average f-1 score from CV:
0.9819496421480475
CentralAir[T.Y] dropped
------------
SMOTEENN
n_neighbours =
1
-------------------------------------------------------------
             precision    recall  f1-score   support

        0.0       0.95      0.57      0.71       164
        1.0       0.08      0.55      0.14        11

avg / total       0.89      0.57      0.68       175

[[94 70]
 [ 5  6]]
average f-1 score from CV:
0.981539041950392
Electrical[T.FuseF] dropped
------------
SMOTEENN
n_neighbours =
1
-------------------------------------------------------------
             precision    recall  f1-score   support

        0.0       0.96      0.57      0.71       164
        1.0       0.09      0.64      0.16        11

avg / total       0.90      0.57      0.68       175

[[93 71]
 [ 4  7]]
average f-1 score from CV:
0.9803238831313333
Electrical[T.FuseP] dropped
------------
SMOTEENN
n_neighbours =
1
-------------------------------------------------------------
             precision    recall  f1-score   support

        0.0       0.96      0.57      0.71       164
        1.0       0.09      0.64      0.16        11

avg / total       0.90      0.57      0.68       175

[[93 71]
 [ 4  7]]
average f-1 score from CV:
0.9807279567701113
Electrical[T.Mix] dropped
------------
SMOTEENN
n_neighbours =
1
-------------------------------------------------------------
             precision    recall  f1-score   support

        0.0       0.96      0.57      0.71       164
        1.0       0.09      0.64      0.16        11

avg / total       0.90      0.57      0.68       175

[[93 71]
 [ 4  7]]
average f-1 score from CV:
0.9807279567701113
Electrical[T.SBrkr] dropped
------------
SMOTEENN
n_neighbours =
1
-------------------------------------------------------------
             precision    recall  f1-score   support

        0.0       0.95      0.56      0.70       164
        1.0       0.08      0.55      0.13        11

avg / total       0.89      0.56      0.67       175

[[92 72]
 [ 5  6]]
average f-1 score from CV:
0.9823286935386258
KitchenQual[T.Fa] dropped
------------
SMOTEENN
n_neighbours =
1
-------------------------------------------------------------
             precision    recall  f1-score   support

        0.0       0.96      0.57      0.71       164
        1.0       0.09      0.64      0.16        11

avg / total       0.90      0.57      0.68       175

[[93 71]
 [ 4  7]]
average f-1 score from CV:
0.9791336253444662
KitchenQual[T.Gd] dropped
------------
SMOTEENN
n_neighbours =
1
-------------------------------------------------------------
             precision    recall  f1-score   support

        0.0       0.95      0.57      0.71       164
        1.0       0.08      0.55      0.14        11

avg / total       0.89      0.57      0.67       175

[[93 71]
 [ 5  6]]
average f-1 score from CV:
0.9819296914646547
KitchenQual[T.TA] dropped
------------
SMOTEENN
n_neighbours =
1
-------------------------------------------------------------
             precision    recall  f1-score   support

        0.0       0.95      0.57      0.71       164
        1.0       0.08      0.55      0.14        11

avg / total       0.89      0.57      0.67       175

[[93 71]
 [ 5  6]]
average f-1 score from CV:
0.9827375366895785
Functional[T.Maj2] dropped
------------



---------------------------------------------------------------------------

KeyboardInterrupt                         Traceback (most recent call last)

<ipython-input-74-63688dae373d> in <module>()
      1 for col in list(X_train.columns):
----> 2     drop_col(col)


<ipython-input-73-898cd284afc1> in drop_col(col)
     20 
     21     new_score = model_knn('SMOTEENN', X_resampled_smoteenn_drop_col, 
---> 22                           y_resampled_smoteenn_drop_col, Xs_test_drop_col, y_test)
     23 
     24     impact = original_score - new_score


<ipython-input-69-01f5cec8cf5d> in model_knn(technique, X_resampled, y_resampled, Xs_test, y_test)
     11     for k in neighbors:
     12         knn = KNeighborsClassifier(n_neighbors=k)
---> 13         scores = cross_val_score(knn, X_resampled, y_resampled, cv=5, scoring=scorer)
     14         cv_scores.append((scores.mean(), k))
     15 


KeyboardInterrupt: 
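Each run in the log above prints a classification report followed by a confusion matrix. As a quick check on how the two relate, here is a minimal sketch that recomputes the per-class precision, recall and F1 from the matrix that appears most often in the log. It assumes the usual scikit-learn layout (rows are true labels, columns are predicted labels); the helper name `class_metrics` is made up for illustration and is not part of the original notebook.

```python
def class_metrics(cm, label):
    """Precision, recall and F1 for one class of a square confusion matrix.

    Assumes rows are true labels and columns are predicted labels.
    """
    tp = cm[label][label]
    predicted = sum(row[label] for row in cm)  # column total: everything predicted as `label`
    actual = sum(cm[label])                    # row total: the support of `label`
    precision = tp / predicted if predicted else 0.0
    recall = tp / actual if actual else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Confusion matrix copied from the log above
cm = [[93, 71],
      [4, 7]]

for label in (0, 1):
    p, r, f = class_metrics(cm, label)
    print(f"class {label}: precision={p:.2f} recall={r:.2f} f1={f:.2f}")
# class 0: precision=0.96 recall=0.57 f1=0.71
# class 1: precision=0.09 recall=0.64 f1=0.16
```

The minority class has only 11 rows in the test set, so a single extra hit or miss moves its recall by about 9 points. That is why the reports switch between 0.55, 0.64 and 0.73 as individual columns are dropped.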

impact_on_score

[(-0.00039898167856100564, 'MSZoning[C (all)]'),
 (0.0, 'MSZoning[FV]'),
 (-0.00040556273640424134, 'MSZoning[RH]'),
 (-8.206237796626326e-06, 'MSZoning[RL]'),
 (-0.0004121641896576156, 'MSZoning[RM]'),
 (0.0, 'Street[T.Pave]'),
 (-0.0008061795857789988, 'LotShape[T.IR2]'),
 (-1.6452437345826354e-06, 'LotShape[T.IR3]'),
 (0.00034453798323452745, 'LotShape[T.Reg]'),
 (-0.0004123354123354295, 'LandContour[T.HLS]'),
 (0.0, 'LandContour[T.Low]'),
 (0.00039571125487292136, 'LandContour[T.Lvl]'),
 (0.0, 'Utilities[T.NoSeWa]'),
 (-0.0012335601632731397, 'LotConfig[T.CulDSac]'),
 (-1.6452437345826354e-06, 'LotConfig[T.FR2]'),
 (-0.00040226209310934014, 'LotConfig[T.FR3]'),
 (-0.002430106993043246, 'LotConfig[T.Inside]'),
 (-0.00204289836352034, 'LandSlope[T.Mod]'),
 (8.36238390500288e-06, 'LandSlope[T.Sev]'),
 (0.0, 'Neighborhood[T.Blueste]'),
 (0.0007979733479823725, 'Neighborhood[T.BrDale]'),
 (0.002019772465461589, 'Neighborhood[T.BrkSide]'),
 (1.1642798453559422e-05, 'Neighborhood[T.ClearCr]'),
 (-0.0008210484647802607, 'Neighborhood[T.CollgCr]'),
 (0.0015926962548313828, 'Neighborhood[T.Crawfor]'),
 (-0.0008563728369954671, 'Neighborhood[T.Edwards]'),
 (0.0016147236728205616, 'Neighborhood[T.Gilbert]'),
 (-0.00039389970920422623, 'Neighborhood[T.IDOTRR]'),
 (0.0003920081451138646, 'Neighborhood[T.MeadowV]'),
 (0.0015881763872656052, 'Neighborhood[T.Mitchel]'),
 (-0.0020379382829410764, 'Neighborhood[T.NAmes]'),
 (0.0007979733479823725, 'Neighborhood[T.NPkVill]'),
 (-0.0028208824338076255, 'Neighborhood[T.NWAmes]'),
 (0.0012201654950104723, 'Neighborhood[T.NoRidge]'),
 (0.0004006168493747575, 'Neighborhood[T.NridgHt]'),
 (0.0008013999177789444, 'Neighborhood[T.OldTown]'),
 (0.000397356498607615, 'Neighborhood[T.SWISU]'),
 (-0.0004187365007183308, 'Neighborhood[T.Sawyer]'),
 (0.00039571125487292136, 'Neighborhood[T.SawyerW]'),
 (0.00041062447701456506, 'Neighborhood[T.Somerst]'),
 (0.0, 'Neighborhood[T.StoneBr]'),
 (-0.0028075230478428193, 'Neighborhood[T.Timber]'),
 (0.0004122596478283169, 'Neighborhood[T.Veenker]'),
 (-0.0004039073368440338, 'Condition1[T.Feedr]'),
 (-0.0007995985279357631, 'Condition1[T.Norm]'),
 (0.0004006168493747575, 'Condition1[T.PosA]'),
 (-0.0004006269222956993, 'Condition1[T.PosN]'),
 (0.000397356498607615, 'Condition1[T.RRAe]'),
 (1.1642798453559422e-05, 'Condition1[T.RRAn]'),
 (0.0004006168493747575, 'Condition1[T.RRNe]'),
 (0.0, 'Condition1[T.RRNn]'),
 (0.0004006168493747575, 'Condition2[T.Feedr]'),
 (0.0, 'Condition2[T.Norm]'),
 (0.0, 'Condition2[T.PosA]'),
 (0.0, 'Condition2[T.PosN]'),
 (0.0, 'Condition2[T.RRAe]'),
 (0.0, 'Condition2[T.RRAn]'),
 (0.0004006168493747575, 'Condition2[T.RRNn]'),
 (0.0004006168493747575, 'BldgType[T.2fmCon]'),
 (-0.001597415729809537, 'BldgType[T.Duplex]'),
 (0.0007930977268956196, 'BldgType[T.Twnhs]'),
 (-0.00040226209310934014, 'BldgType[T.TwnhsE]'),
 (0.0012002153773104096, 'HouseStyle[T.1.5Unf]'),
 (-0.0012135791840058863, 'HouseStyle[T.1Story]'),
 (0.00039571125487292136, 'HouseStyle[T.2.5Fin]'),
 (0.0004006168493747575, 'HouseStyle[T.2.5Unf]'),
 (-0.0004104683309060775, 'HouseStyle[T.2Story]'),
 (-0.0012170503831491208, 'HouseStyle[T.SFoyer]'),
 (5.061740610323717e-06, 'HouseStyle[T.SLvl]'),
 (-0.000844845560493801, 'RoofStyle[T.Gable]'),
 (-0.0008012437716704568, 'RoofStyle[T.Gambrel]'),
 (-0.0008513463615338335, 'RoofStyle[T.Hip]'),
 (0.0, 'RoofStyle[T.Mansard]'),
 (0.0, 'RoofStyle[T.Shed]'),
 (0.0, 'RoofMatl[T.CompShg]'),
 (0.0, 'RoofMatl[T.Membran]'),
 (0.0, 'RoofMatl[T.Metal]'),
 (0.0, 'RoofMatl[T.Roll]'),
 (0.0, 'RoofMatl[T.Tar&Grv]'),
 (0.0, 'RoofMatl[T.WdShake]'),
 (0.0, 'RoofMatl[T.WdShngl]'),
 (0.0, 'Exterior1st[T.AsphShn]'),
 (-0.0004123354123354295, 'Exterior1st[T.BrkComm]'),
 (-0.0008012437716704568, 'Exterior1st[T.BrkFace]'),
 (0.0, 'Exterior1st[T.CBlock]'),
 (0.0, 'Exterior1st[T.CemntBd]'),
 (-0.002802516371891106, 'Exterior1st[T.HdBoard]'),
 (0.0, 'Exterior1st[T.ImStucc]'),
 (-0.0028241830771024157, 'Exterior1st[T.MetalSd]'),
 (1.5024459416701497e-05, 'Exterior1st[T.Plywood]'),
 (0.0, 'Exterior1st[T.Stone]'),
 (0.000801233698749515, 'Exterior1st[T.Stucco]'),
 (0.0019949283745255286, 'Exterior1st[T.VinylSd]'),
 (-0.000819261444662045, 'Exterior1st[T.Wd Sdng]'),
 (-0.0007978172018737739, 'Exterior1st[T.WdShing]'),
 (-0.00039898167856100564, 'Exterior2nd[T.AsphShn]'),
 (0.0007979733479823725, 'Exterior2nd[T.Brk Cmn]'),
 (-0.0008012437716704568, 'Exterior2nd[T.BrkFace]'),
 (0.0, 'Exterior2nd[T.CBlock]'),
 (0.0, 'Exterior2nd[T.CmentBd]'),
 (-0.0036186790718868433, 'Exterior2nd[T.HdBoard]'),
 (0.00039571125487292136, 'Exterior2nd[T.ImStucc]'),
 (-0.0036387116771509076, 'Exterior2nd[T.MetalSd]'),
 (0.0, 'Exterior2nd[T.Other]'),
 (0.0007913922901391368, 'Exterior2nd[T.Plywood]'),
 (0.0004006168493747575, 'Exterior2nd[T.Stone]'),
 (0.0008063357318874864, 'Exterior2nd[T.Stucco]'),
 (0.0019949283745255286, 'Exterior2nd[T.VinylSd]'),
 (-0.0004153439519926083, 'Exterior2nd[T.Wd Sdng]'),
 (0.0003908356337861685, 'Exterior2nd[T.Wd Shng]'),
 (0.0020050772128556993, 'MasVnrType[T.BrkFace]'),
 (-1.1486652344960824e-05, 'MasVnrType[T.None]'),
 (-0.0008012437716704568, 'MasVnrType[T.Stone]'),
 (0.0004006168493747575, 'ExterQual[T.Fa]'),
 (-0.0007925831859564303, 'ExterQual[T.Gd]'),
 (-0.0008078248295135815, 'ExterQual[T.TA]'),
 (-0.0008012437716704568, 'ExterCond[T.Fa]'),
 (-1.829962080390768e-05, 'ExterCond[T.Gd]'),
 (0.0, 'ExterCond[T.Po]'),
 (0.0019900326896575837, 'ExterCond[T.TA]'),
 (-0.0012067834645470565, 'Foundation[T.CBlock]'),
 (-0.00041706978415956275, 'Foundation[T.PConc]'),
 (0.000801233698749515, 'Foundation[T.Slab]'),
 (0.0004006168493747575, 'Foundation[T.Stone]'),
 (0.0004006168493747575, 'Foundation[T.Wood]'),
 (0.0012002153773104096, 'BsmtQual[T.Fa]'),
 (-0.0004153439519926083, 'BsmtQual[T.Gd]'),
 (-0.0008209270838072102, 'BsmtQual[T.TA]'),
 (0.0008114390929722104, 'BsmtCond[T.Gd]'),
 (0.0, 'BsmtCond[T.Po]'),
 (0.0008146430511338787, 'BsmtCond[T.TA]'),
 (0.0, 'BsmtExposure[T.Gd]'),
 (-0.0032197381858668495, 'BsmtExposure[T.Mn]'),
 (-0.0011967988804346685, 'BsmtExposure[T.No]'),
 (-0.002805857408697765, 'BsmtFinType1[T.BLQ]'),
 (-0.002805837410596368, 'BsmtFinType1[T.GLQ]'),
 (-0.0016138746940125293, 'BsmtFinType1[T.LwQ]'),
 (-0.0004054729811588942, 'BsmtFinType1[T.Rec]'),
 (0.0008013999177789444, 'BsmtFinType1[T.Unf]'),
 (0.0004006168493747575, 'BsmtFinType2[T.BLQ]'),
 (0.0, 'BsmtFinType2[T.GLQ]'),
 (-6.550838236418777e-06, 'BsmtFinType2[T.LwQ]'),
 (0.0004089792332799824, 'BsmtFinType2[T.Rec]'),
 (-0.0016206670564972159, 'BsmtFinType2[T.Unf]'),
 (0.0004006168493747575, 'Heating[T.GasA]'),
 (0.0004006168493747575, 'Heating[T.GasW]'),
 (1.6351708137518628e-06, 'Heating[T.Grav]'),
 (0.0, 'Heating[T.OthW]'),
 (0.0004006168493747575, 'Heating[T.Wall]'),
 (0.0016158767498833937, 'HeatingQC[T.Fa]'),
 (-0.0003972003524990164, 'HeatingQC[T.Gd]'),
 (0.0, 'HeatingQC[T.Po]'),
 (-0.0012216853779362102, 'HeatingQC[T.TA]'),
 (-0.000811085180280724, 'CentralAir[T.Y]'),
 (0.00040407363877792424, 'Electrical[T.FuseF]'),
 (0.0, 'Electrical[T.FuseP]'),
 (0.0, 'Electrical[T.Mix]'),
 (-0.0016007367685145768, 'Electrical[T.SBrkr]'),
 (0.0015943314256450236, 'KitchenQual[T.Fa]'),
 (-0.0012017346945434326, 'KitchenQual[T.Gd]'),
 (-0.0020095799194672637, 'KitchenQual[T.TA]')]

impact_list = impact_on_score.copy()

# Rank the (impact, feature) pairs from most positive to most negative impact
sorted(impact_list, reverse=True)

[(0.002019772465461589, 'Neighborhood[T.BrkSide]'),
 (0.0020050772128556993, 'MasVnrType[T.BrkFace]'),
 (0.0019949283745255286, 'Exterior2nd[T.VinylSd]'),
 (0.0019949283745255286, 'Exterior1st[T.VinylSd]'),
 (0.0019900326896575837, 'ExterCond[T.TA]'),
 (0.0016158767498833937, 'HeatingQC[T.Fa]'),
 (0.0016147236728205616, 'Neighborhood[T.Gilbert]'),
 (0.0015943314256450236, 'KitchenQual[T.Fa]'),
 (0.0015926962548313828, 'Neighborhood[T.Crawfor]'),
 (0.0015881763872656052, 'Neighborhood[T.Mitchel]'),
 (0.0012201654950104723, 'Neighborhood[T.NoRidge]'),
 (0.0012002153773104096, 'HouseStyle[T.1.5Unf]'),
 (0.0012002153773104096, 'BsmtQual[T.Fa]'),
 (0.0008146430511338787, 'BsmtCond[T.TA]'),
 (0.0008114390929722104, 'BsmtCond[T.Gd]'),
 (0.0008063357318874864, 'Exterior2nd[T.Stucco]'),
 (0.0008013999177789444, 'Neighborhood[T.OldTown]'),
 (0.0008013999177789444, 'BsmtFinType1[T.Unf]'),
 (0.000801233698749515, 'Foundation[T.Slab]'),
 (0.000801233698749515, 'Exterior1st[T.Stucco]'),
 (0.0007979733479823725, 'Neighborhood[T.NPkVill]'),
 (0.0007979733479823725, 'Neighborhood[T.BrDale]'),
 (0.0007979733479823725, 'Exterior2nd[T.Brk Cmn]'),
 (0.0007930977268956196, 'BldgType[T.Twnhs]'),
 (0.0007913922901391368, 'Exterior2nd[T.Plywood]'),
 (0.0004122596478283169, 'Neighborhood[T.Veenker]'),
 (0.00041062447701456506, 'Neighborhood[T.Somerst]'),
 (0.0004089792332799824, 'BsmtFinType2[T.Rec]'),
 (0.00040407363877792424, 'Electrical[T.FuseF]'),
 (0.0004006168493747575, 'Neighborhood[T.NridgHt]'),
 (0.0004006168493747575, 'HouseStyle[T.2.5Unf]'),
 (0.0004006168493747575, 'Heating[T.Wall]'),
 (0.0004006168493747575, 'Heating[T.GasW]'),
 (0.0004006168493747575, 'Heating[T.GasA]'),
 (0.0004006168493747575, 'Foundation[T.Wood]'),
 (0.0004006168493747575, 'Foundation[T.Stone]'),
 (0.0004006168493747575, 'Exterior2nd[T.Stone]'),
 (0.0004006168493747575, 'ExterQual[T.Fa]'),
 (0.0004006168493747575, 'Condition2[T.RRNn]'),
 (0.0004006168493747575, 'Condition2[T.Feedr]'),
 (0.0004006168493747575, 'Condition1[T.RRNe]'),
 (0.0004006168493747575, 'Condition1[T.PosA]'),
 (0.0004006168493747575, 'BsmtFinType2[T.BLQ]'),
 (0.0004006168493747575, 'BldgType[T.2fmCon]'),
 (0.000397356498607615, 'Neighborhood[T.SWISU]'),
 (0.000397356498607615, 'Condition1[T.RRAe]'),
 (0.00039571125487292136, 'Neighborhood[T.SawyerW]'),
 (0.00039571125487292136, 'LandContour[T.Lvl]'),
 (0.00039571125487292136, 'HouseStyle[T.2.5Fin]'),
 (0.00039571125487292136, 'Exterior2nd[T.ImStucc]'),
 (0.0003920081451138646, 'Neighborhood[T.MeadowV]'),
 (0.0003908356337861685, 'Exterior2nd[T.Wd Shng]'),
 (0.00034453798323452745, 'LotShape[T.Reg]'),
 (1.5024459416701497e-05, 'Exterior1st[T.Plywood]'),
 (1.1642798453559422e-05, 'Neighborhood[T.ClearCr]'),
 (1.1642798453559422e-05, 'Condition1[T.RRAn]'),
 (8.36238390500288e-06, 'LandSlope[T.Sev]'),
 (5.061740610323717e-06, 'HouseStyle[T.SLvl]'),
 (1.6351708137518628e-06, 'Heating[T.Grav]'),
 (0.0, 'Utilities[T.NoSeWa]'),
 (0.0, 'Street[T.Pave]'),
 (0.0, 'RoofStyle[T.Shed]'),
 (0.0, 'RoofStyle[T.Mansard]'),
 (0.0, 'RoofMatl[T.WdShngl]'),
 (0.0, 'RoofMatl[T.WdShake]'),
 (0.0, 'RoofMatl[T.Tar&Grv]'),
 (0.0, 'RoofMatl[T.Roll]'),
 (0.0, 'RoofMatl[T.Metal]'),
 (0.0, 'RoofMatl[T.Membran]'),
 (0.0, 'RoofMatl[T.CompShg]'),
 (0.0, 'Neighborhood[T.StoneBr]'),
 (0.0, 'Neighborhood[T.Blueste]'),
 (0.0, 'MSZoning[FV]'),
 (0.0, 'LandContour[T.Low]'),
 (0.0, 'Heating[T.OthW]'),
 (0.0, 'HeatingQC[T.Po]'),
 (0.0, 'Exterior2nd[T.Other]'),
 (0.0, 'Exterior2nd[T.CmentBd]'),
 (0.0, 'Exterior2nd[T.CBlock]'),
 (0.0, 'Exterior1st[T.Stone]'),
 (0.0, 'Exterior1st[T.ImStucc]'),
 (0.0, 'Exterior1st[T.CemntBd]'),
 (0.0, 'Exterior1st[T.CBlock]'),
 (0.0, 'Exterior1st[T.AsphShn]'),
 (0.0, 'ExterCond[T.Po]'),
 (0.0, 'Electrical[T.Mix]'),
 (0.0, 'Electrical[T.FuseP]'),
 (0.0, 'Condition2[T.RRAn]'),
 (0.0, 'Condition2[T.RRAe]'),
 (0.0, 'Condition2[T.PosN]'),
 (0.0, 'Condition2[T.PosA]'),
 (0.0, 'Condition2[T.Norm]'),
 (0.0, 'Condition1[T.RRNn]'),
 (0.0, 'BsmtFinType2[T.GLQ]'),
 (0.0, 'BsmtExposure[T.Gd]'),
 (0.0, 'BsmtCond[T.Po]'),
 (-1.6452437345826354e-06, 'LotShape[T.IR3]'),
 (-1.6452437345826354e-06, 'LotConfig[T.FR2]'),
 (-6.550838236418777e-06, 'BsmtFinType2[T.LwQ]'),
 (-8.206237796626326e-06, 'MSZoning[RL]'),
 (-1.1486652344960824e-05, 'MasVnrType[T.None]'),
 (-1.829962080390768e-05, 'ExterCond[T.Gd]'),
 (-0.00039389970920422623, 'Neighborhood[T.IDOTRR]'),
 (-0.0003972003524990164, 'HeatingQC[T.Gd]'),
 (-0.00039898167856100564, 'MSZoning[C (all)]'),
 (-0.00039898167856100564, 'Exterior2nd[T.AsphShn]'),
 (-0.0004006269222956993, 'Condition1[T.PosN]'),
 (-0.00040226209310934014, 'LotConfig[T.FR3]'),
 (-0.00040226209310934014, 'BldgType[T.TwnhsE]'),
 (-0.0004039073368440338, 'Condition1[T.Feedr]'),
 (-0.0004054729811588942, 'BsmtFinType1[T.Rec]'),
 (-0.00040556273640424134, 'MSZoning[RH]'),
 (-0.0004104683309060775, 'HouseStyle[T.2Story]'),
 (-0.0004121641896576156, 'MSZoning[RM]'),
 (-0.0004123354123354295, 'LandContour[T.HLS]'),
 (-0.0004123354123354295, 'Exterior1st[T.BrkComm]'),
 (-0.0004153439519926083, 'Exterior2nd[T.Wd Sdng]'),
 (-0.0004153439519926083, 'BsmtQual[T.Gd]'),
 (-0.00041706978415956275, 'Foundation[T.PConc]'),
 (-0.0004187365007183308, 'Neighborhood[T.Sawyer]'),
 (-0.0007925831859564303, 'ExterQual[T.Gd]'),
 (-0.0007978172018737739, 'Exterior1st[T.WdShing]'),
 (-0.0007995985279357631, 'Condition1[T.Norm]'),
 (-0.0008012437716704568, 'RoofStyle[T.Gambrel]'),
 (-0.0008012437716704568, 'MasVnrType[T.Stone]'),
 (-0.0008012437716704568, 'Exterior2nd[T.BrkFace]'),
 (-0.0008012437716704568, 'Exterior1st[T.BrkFace]'),
 (-0.0008012437716704568, 'ExterCond[T.Fa]'),
 (-0.0008061795857789988, 'LotShape[T.IR2]'),
 (-0.0008078248295135815, 'ExterQual[T.TA]'),
 (-0.000811085180280724, 'CentralAir[T.Y]'),
 (-0.000819261444662045, 'Exterior1st[T.Wd Sdng]'),
 (-0.0008209270838072102, 'BsmtQual[T.TA]'),
 (-0.0008210484647802607, 'Neighborhood[T.CollgCr]'),
 (-0.000844845560493801, 'RoofStyle[T.Gable]'),
 (-0.0008513463615338335, 'RoofStyle[T.Hip]'),
 (-0.0008563728369954671, 'Neighborhood[T.Edwards]'),
 (-0.0011967988804346685, 'BsmtExposure[T.No]'),
 (-0.0012017346945434326, 'KitchenQual[T.Gd]'),
 (-0.0012067834645470565, 'Foundation[T.CBlock]'),
 (-0.0012135791840058863, 'HouseStyle[T.1Story]'),
 (-0.0012170503831491208, 'HouseStyle[T.SFoyer]'),
 (-0.0012216853779362102, 'HeatingQC[T.TA]'),
 (-0.0012335601632731397, 'LotConfig[T.CulDSac]'),
 (-0.001597415729809537, 'BldgType[T.Duplex]'),
 (-0.0016007367685145768, 'Electrical[T.SBrkr]'),
 (-0.0016138746940125293, 'BsmtFinType1[T.LwQ]'),
 (-0.0016206670564972159, 'BsmtFinType2[T.Unf]'),
 (-0.0020095799194672637, 'KitchenQual[T.TA]'),
 (-0.0020379382829410764, 'Neighborhood[T.NAmes]'),
 (-0.00204289836352034, 'LandSlope[T.Mod]'),
 (-0.002430106993043246, 'LotConfig[T.Inside]'),
 (-0.002802516371891106, 'Exterior1st[T.HdBoard]'),
 (-0.002805837410596368, 'BsmtFinType1[T.GLQ]'),
 (-0.002805857408697765, 'BsmtFinType1[T.BLQ]'),
 (-0.0028075230478428193, 'Neighborhood[T.Timber]'),
 (-0.0028208824338076255, 'Neighborhood[T.NWAmes]'),
 (-0.0028241830771024157, 'Exterior1st[T.MetalSd]'),
 (-0.0032197381858668495, 'BsmtExposure[T.Mn]'),
 (-0.0036186790718868433, 'Exterior2nd[T.HdBoard]'),
 (-0.0036387116771509076, 'Exterior2nd[T.MetalSd]')]

[(0.0044766157707263332, 'GarageFinish[T.RFn]'),
 (0.0044717006085596145, 'Neighborhood[T.NAmes]'),
 (0.0040776544953117222, 'MasVnrType[T.BrkFace]')]

Going by these three scores, GarageFinish (RFn, rough finished), Neighborhood (NAmes, North Ames) and MasVnrType (BrkFace, brick face) have the greatest impact on predicting whether a sale will be Abnorml.
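
A signed sort places strongly negative scores at the very bottom of the list, so "greatest impact" can be easier to read off when the pairs are ranked by absolute value instead. The following is a minimal sketch of that idea, assuming `impact_list` holds `(impact, feature)` tuples as in the output above. The helper name `top_by_magnitude` is hypothetical, and only a handful of pairs from the output are reproduced for illustration.

```python
# Illustrative subset of (impact, feature) pairs copied from the output above.
impact_list = [
    (0.002019772465461589, 'Neighborhood[T.BrkSide]'),
    (0.0020050772128556993, 'MasVnrType[T.BrkFace]'),
    (0.0, 'Street[T.Pave]'),
    (-0.0020379382829410764, 'Neighborhood[T.NAmes]'),
    (-0.0036387116771509076, 'Exterior2nd[T.MetalSd]'),
]


def top_by_magnitude(pairs, n=3):
    """Return the n pairs with the largest absolute impact, largest first.

    Hypothetical helper, not part of the original notebook.
    """
    return sorted(pairs, key=lambda pair: abs(pair[0]), reverse=True)[:n]


# Print the strongest effects, keeping the sign so the direction stays visible.
for impact, feature in top_by_magnitude(impact_list):
    print(f"{feature:30s} {impact:+.6f}")
```

Ranking this way treats a large drop in score the same as a large gain, which may or may not be what is wanted; the signed sort is still the better view when only improvements matter.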