Some Graphical Comparisons and Results (with Fitted Polynomials) of Naive Bayes, Logistic Regression and Linear Regression

This article provides results, graphical plots, and analysis of Linear Regression, Logistic Regression, and Naive Bayes on three kinds of datasets, namely, (i) Linearly Separable, (ii) Non-Linearly Separable, and (iii) Banana Data. This article can help new mentors, academicians, and students understand the process of experimentation, analysis, and result presentation in Machine Learning tasks. It also plots the classifier hyperplane along with the distinct classes in this 2-class problem. The results are not benchmarks; they are elaborated for experimental setups in labs and are meant to be followed by mentors teaching Machine Learning lab work. For benchmark results, look into peer-reviewed research papers.

RESULTS & PLOTS

Here are some results obtained after dividing the data into a 30-70 ratio of testing and training, respectively. This was done for 5 folds, and finally the average was taken. This has been done for all three classifiers, whose results are discussed below.
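A minimal MATLAB sketch of this protocol is given below; the variable names (data, labels) and the helpers trainClassifier and evaluateAccuracy are placeholders for illustration, not the original script:

    % Sketch: random 70-30 split repeated over 5 folds, then averaged.
    N = size(data, 1);
    nTrain = round(0.7 * N);
    acc = zeros(5, 1);
    for fold = 1:5
        idx = randperm(N);                        % fresh random shuffle per fold
        trainIdx = idx(1:nTrain);
        testIdx  = idx(nTrain+1:end);
        model = trainClassifier(data(trainIdx,:), labels(trainIdx));              % placeholder
        acc(fold) = evaluateAccuracy(model, data(testIdx,:), labels(testIdx));    % placeholder
    end
    meanAccuracy = mean(acc);                     % the averaged value reported in the tables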

The following table shows results for linearly separable data for the three classifiers: (i) Linear Regression, (ii) Logistic Regression, and (iii) Naïve Bayes. The results are averages over 5 folds and are comparable.

Plots of Original Data

Following are the plots of the given data.

Linearly Separable Data

Non-Linearly Separable Data

Banana Data

LS Data Results (Note: the values are averages over 5 folds)

                     TP     TN   FP   FN    Precision  Recall  F-Value  Accuracy
Linear Regression    750    750  0    0     1          1       1        1
Logistic Regression  749.6  750  0    0.4   1          0.9995  0.9997   0.9997
Naïve Bayes          750    750  0    0     1          1       1        1
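For reference, the metric columns in these tables follow the standard confusion-matrix definitions; a small MATLAB helper (not the original code) would be:

    % Standard metrics from the confusion-matrix counts.
    precision = TP / (TP + FP);
    recall    = TP / (TP + FN);
    fValue    = 2 * precision * recall / (precision + recall);
    accuracy  = (TP + TN) / (TP + TN + FP + FN);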

The following table shows results for non-linearly separable data for the three classifiers: (i) Linear Regression, (ii) Logistic Regression, and (iii) Naïve Bayes. The best results are with Linear and Logistic Regression.

NLS Data Results (Note: the values are averages over 5 folds)

                     TP     TN     FP    FN    Precision  Recall  F-Value  Accuracy
Linear Regression    735.5  734.8  15.2  14.6  0.9797     0.9805  0.9801   0.9801
Logistic Regression  733.8  733.2  16.8  16.2  0.9776     0.9784  0.9780   0.9780
Naïve Bayes          721    726    29    24    0.965      0.965   0.888    0.96

The following table shows results for banana data for the three classifiers: (i) Linear Regression, (ii) Logistic Regression, and (iii) Naïve Bayes. The results are averages over 5 folds, and Naïve Bayes gives the best results.

Banana Data Results (Note: the values are averages over 5 folds)

                     TP     TN     FP     FN     Precision  Recall  F-Value  Accuracy
Linear Regression    626.4  633.8  122.2  116.6  0.8370     0.8431  0.8399   0.8407
Logistic Regression  665    515    240    78     0.7425     0.8950  0.8094   0.7876
Naïve Bayes          666    666    96     72     0.88       0.88    0.888    0.88

The Surfaces and Figures

The surfaces and figures of the discriminant are given below.

The surface for Linearly Separable Data using linear regression

Below are the two plots of the linearly separable data surface generated by linear regression. The two plots have different axis coordinates.

The surface for Non-Linearly Separable Data using linear regression

Below are the two plots of the non-linearly separable data surface generated by linear regression. The two plots have different axis coordinates.

The surface for Banana Data using linear regression

Below are the two plots of the banana data surface generated by linear regression. The two plots have different axis coordinates.

The Surface of LS Data using Logistic Regression

Below are the two plots of the linearly separable data surface generated by logistic regression. The two plots are from different runs, with different training and testing data.

The following are surfaces generated over two runs.

The Surface of NLS Data with Logistic Regression

Below are the two plots of the non-linearly separable data surface generated by logistic regression.

The Surface of Banana Data with Logistic Regression

Below are the two plots of the banana data surface generated by logistic regression. The two plots are from different runs, with different training and testing data.

(The outputs are over two runs)

More Results: On Segmented Data

Now training samples of sizes 500, 1000, 1500, …, 4500 are analyzed, with 5000 − (training size) used as the testing size. I iterate over partitions, incrementing the training fraction each time from 10% of 5000 to 20% of 5000, and so on up to 90% of 5000, with the remainder used for testing. This is how we get training sizes of 500, 1000, 1500, …, 4500 and testing sizes of 5000 − (training size).

The time is averaged over 5 such samples for each given training size. Following are the experimental results of execution times for the different algorithms and datasets.
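A rough sketch of this timing loop, assuming hypothetical trainClassifier and predictLabels helpers, could look like:

    % Sketch: average execution time over 5 random samples per training size.
    N = 5000;
    trainSizes = 500:500:4500;
    avgTime = zeros(size(trainSizes));
    for s = 1:numel(trainSizes)
        times = zeros(5, 1);
        for rep = 1:5
            idx = randperm(N);
            trainIdx = idx(1:trainSizes(s));
            testIdx  = idx(trainSizes(s)+1:end);   % testing size = 5000 - training size
            tic;
            model = trainClassifier(data(trainIdx,:), labels(trainIdx));   % placeholder
            predictLabels(model, data(testIdx,:));                         % placeholder
            times(rep) = toc;
        end
        avgTime(s) = mean(times);                  % average over the 5 samples
    end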


RESULTS

Logistic Regression

I have converted the data to the 1 and 0 classes instead of the 1 and -1 classes before using logit. Following are the results of logistic regression on all three data samples: (i) Banana Data, (ii) NLS Data, and (iii) LS Data.
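The label conversion is a one-liner in MATLAB, assuming the labels are stored in a vector Y:

    Y(Y == -1) = 0;   % map the negative class from -1 to 0 before the logit fit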

In the table below, the training sizes are given; the corresponding testing sizes are 5000 − (training size).

Training Size  4500    4000    3500    3000    2500    2000    1500    1000    500
Banana Data    3.5992  2.7031  2.8467  2.6833  1.4648  1.1170  1.0087  0.6808  0.4955
NLS Data       3.5631  3.3242  3.4129  3.1384  3.3511  1.5047  1.4308  0.8716  0.5782
LS Data        0.4260  0.3063  0.1808  0.6101  0.3813  0.0842  0.2085  0.0642  0.0364

Linear Regression

Following are the results of linear regression on all three data samples: (i) Banana Data, (ii) NLS Data, and (iii) LS Data.

Training Size  4500  4000    3500  3000  2500    2000    1500  1000  500
Banana Data    0     0       0     0     0       0.0633  0     0     0
NLS Data       0     0.0989  0     0     0.1900  0.0570  0     0     0
LS Data        0     0       0     0     0       0.0566  0     0     0

Naïve Bayes

To use Naïve Bayes, I have used the discretizer present in Weka before applying Naïve Bayes. Following are the results of Naïve Bayes on all three data samples: (i) Banana Data, (ii) NLS Data, and (iii) LS Data.
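The original pipeline used Weka's discretize filter; as a rough MATLAB analogue (the 10-bin equal-width setting is an assumption, not the Weka configuration used), each feature can be binned like this:

    % Equal-width discretization of every feature column into 10 bins.
    nBins = 10;
    dataDisc = zeros(size(data));
    for j = 1:size(data, 2)
        dataDisc(:, j) = discretize(data(:, j), nBins);
    end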

Training Size  4500  4000  3500  3000  2500  2000  1500  1000  500
Banana Data    0     0     0.02  0.02  0.02  0     0     0.02  0
NLS Data       0     0     0     0.02  0.02  0.02  0     0.02  0.02
LS Data        0     0.02  0     0     0.02  0     0     0     0.02

First-Order Polynomial Fitting for NLS Data:

I have used polyfit and polyval for this part.

Each fit is linear in the number of samples, i.e., O(N).
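The basic polyfit/polyval pattern used throughout this part is the following, where x is the transformed feature described for each case below and y is the corresponding target:

    p = polyfit(x, y, 1);              % degree-1 fit: p = [slope, intercept]
    xg = linspace(min(x), max(x), 100);
    yg = polyval(p, xg);               % evaluate the fitted polynomial on a grid
    plot(x, y, '.', xg, yg, '-');      % data points and fitted line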

  1. Linear Regression for degree-1 polynomial fitting with NLS Data

I have computed x = x1*x1 + y1*y1

and used polyfit; the parameters for the degree-1 polyfit are:

-0.0237x + 1.4974. Graph below.

With x1, the parameters for the degree-1 polyfit are:

-0.2301x + 1.5416. Graph below.

With x2, the parameters for the degree-1 polyfit are:

1.709x + 2.0415.

  2. Logistic Regression with degree-1 polynomial fitting for NLS Data

I have computed x = x1*x1 + y1*y1

and used polyfit; the parameters for the degree-1 polyfit are:

-0.0081x + 1.2010

With x1, the parameters for the degree-1 polyfit are:

-0.1628x + 1.4918

With x2, the parameters for the degree-1 polyfit are:

-0.1684x + 1.4518

  3. Logistic Regression with degree-2 polynomial fitting for NLS Data

x = x1*x1 + y1*y1

and used polyfit; the parameters for the degree-2 polyfit are:

0x^2 - 0.0151x + 1.4139

This is almost linear.

With x1, the parameters for the degree-2 polyfit are:

-0.0041x^2 - 0.1254x + 1.4189

This is almost linear.

With x2, the parameters for the degree-2 polyfit are:

0.0003x^2 - 0.17702x + 1.5425

This is also almost a straight line.

So a polynomial of degree 1 fits well for logistic regression, and similar results were obtained for linear regression.

Parameters of the degree-1 polyfit for each choice of x:

Linear Regression    x = x1*x1+y1*y1      x = x1               x = x2
NLS Data             -0.0237x + 1.4974    -0.2301x + 1.5416    1.709x + 2.0415
LS Data              -0.0018x + 1.2346    -0.1477x + 1.5013    -0.1794x + 1.9014
Banana Data          -0.0129x + 2.2202    -0.2264x - 1.4032    0.9265x - 1.6075

Logistic Regression  x = x1*x1+y1*y1      x = x1               x = x2
NLS Data             -0.0081x + 1.2010    -0.1628x + 1.4918    -0.1684x + 1.4518
LS Data              -0.0020x + 1.0548    -0.0794x + 1.2986    -0.0785x + 1.2847
Banana Data          -0.0021x + 0.8505    0.0438x + 0.5558     0.0191x + 0.5530

Newton's Logistic Regression

The following are the results obtained using the logistic regression classifier trained with Newton's method. I have split the datasets into 70% training and 30% testing randomly, in 5 folds. Details of the update computation are given below.

Here the first weight represents the bias:

    % Core of one Newton update; W_Old(1) is the bias weight.
    sum = sum + W_Old(j+1) * trainingSet(t,j);    % accumulate the weighted feature sum for sample t

    P(t) = 1 / (1 + exp(-1*sum));                 % sigmoid probability for sample t

    Z = (X') * (Y - P)';                          % gradient of the log-likelihood

    W = diag(P .* (1 - P));                       % diagonal weighting matrix

    Hessian = X' * W * X;                         % Hessian of the log-likelihood

    etaMatrix(1:3) = eta;                         % step size for the bias and two feature weights

    W_New = W_Old + etaMatrix .* (Hessian \ Z)';  % damped Newton step

Its derivation and complete form, in two ways, are given as:
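In matrix form, the update implemented by the snippet above is the standard damped Newton-Raphson step for the logistic log-likelihood (this restates the code, with \eta the step size):

    P = \sigma(Xw), \qquad Z = X^{\top}(Y - P), \qquad H = X^{\top}\,\mathrm{diag}\big(P(1-P)\big)\,X

    w_{\mathrm{new}} = w_{\mathrm{old}} + \eta\, H^{-1} Z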

    Parameters used for the computations:

    numIterations = 10000;   % maximum number of iterations
    eta = 0.5 and 0.2;       % both values tested
    errorBound = 0.0001;
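Putting these pieces together, a minimal self-contained version of the Newton loop might look like the sketch below (an assumed reconstruction, not the exact original script; X is assumed to carry a leading column of ones for the bias):

    % Newton-Raphson logistic regression (sketch).
    function w = newtonLogistic(X, Y, eta, numIterations, errorBound)
        w = zeros(size(X, 2), 1);             % w(1) is the bias weight
        for it = 1:numIterations
            P = 1 ./ (1 + exp(-X * w));       % sigmoid probabilities
            Z = X' * (Y - P);                 % gradient
            W = diag(P .* (1 - P));           % weighting matrix
            H = X' * W * X;                   % Hessian
            step = eta * (H \ Z);             % damped Newton step
            w = w + step;
            if norm(step) < errorBound        % converged
                break;
            end
        end
    end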

The following are the final weights for the three datasets:

             W0       W1       W2
LS Data      8.4137   -0.4298  -0.4125
NLS Data     26.2879  -2.2458  -2.1134
Banana Data  0.6495   0.3533   -0.1829

The following are the TP, TN, FP, FN, precision, recall, F-value, and accuracy for the three datasets:

             TP     TN     FP     FN     Precision  Recall  F-Value  Accuracy
LS Data      750    750    0      0      1          1       1        1
NLS Data     737.6  732.8  17.2   12.4   0.9772     0.9835  0.9835   0.9803
Banana Data  630.6  647.4  108.6  112.4  0.8531     0.8487  0.8509   0.8526

The decision boundaries of the three datasets are as follows:

The following are the figures showing (i) the decision surface and (ii) the decision surface and data together. Note that different colors are used to show the different classes.
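The 2D boundary plots can be reproduced from the final weights in the table above; a minimal sketch for the LS Data weights (the variable names data and labels are assumptions):

    W = [8.4137, -0.4298, -0.4125];           % [W0 W1 W2] from the weights table
    pos = data(labels == 1, :);
    neg = data(labels == 0, :);
    plot(pos(:,1), pos(:,2), 'g+'); hold on;  % green + : positive class
    plot(neg(:,1), neg(:,2), 'b.');           % blue    : negative class
    xg = linspace(min(data(:,1)), max(data(:,1)), 100);
    yg = -(W(1) + W(2)*xg) / W(3);            % solve W0 + W1*x + W2*y = 0 for y
    plot(xg, yg, 'r-');                       % decision boundary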

  1. Linearly Separable Data (green with the + sign is the positive class, blue is the negative class)

The following figure is for the plane obtained by Newton's method for linearly separable data. Red dots show the predicted line projected in 2D.

The following figure shows the data for the linearly separable case.

The following figure is for the plane obtained by Newton's method for linearly separable data, plotted in 3D.

  2. NLS Data (green with the + sign is the positive class, blue is the negative class)

The following figure shows the data for the non-linearly separable case.

The following figure is of Newton's method for non-linearly separable data. Red dots show the predicted line projected in 2D.

This figure is for the plane obtained by Newton's method for non-linearly separable data, over two runs.

  3. Banana Data

The following figure is for the separating plane obtained by Newton's method for banana data.

The following figure is for the plane obtained by Newton's method for banana data, projected to 2D.

This figure is for the plane obtained by Newton's method for banana data. Red dots show the predicted line projected in 2D.

The results are self-explanatory with the help of the image data. They are not benchmarks; they are elaborated for experimental setups in labs and are meant to be followed by mentors teaching Machine Learning lab work. For benchmark results, look into peer-reviewed research papers.

