This article presents results, plots and analysis of Linear Regression, Logistic Regression and Naïve Bayes on three kinds of datasets: (i) Linearly Separable, (ii) Non-Linearly Separable and (iii) Banana Data. It is intended to help new mentors, academicians and students understand the process of experimentation, analysis and result presentation in a Machine Learning task. It also plots the classifier hyperplane along with the distinct classes in this 2-class problem. The results are not benchmarks; they are elaborated for experimental setups in labs and are meant to be followed by mentors teaching Machine Learning lab work. For benchmark results, look into peer-reviewed research papers.

# RESULTS & PLOTS

Here are some results obtained after dividing the data into a 30-70 ratio of testing and training data respectively. This was done for 5 folds, and the average was then taken. __This has been done for all three classifiers, whose results are discussed below.__
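The split-and-average procedure described above can be sketched as follows. This is a minimal illustration, not the original experiment code: the synthetic data, the variable names and the nearest-class-mean placeholder classifier are all assumptions, and "5 folds" is read here as five independent random 70-30 splits.

```matlab
% Hypothetical data: 5000 rows, two features, labels in {-1, +1}.
N = 5000;
X = [randn(N/2, 2) + 2; randn(N/2, 2) - 2];
y = [ones(N/2, 1); -ones(N/2, 1)];

numRuns = 5;                       % the "5 folds" of the text
nTrain  = round(0.7 * N);          % 70% training, 30% testing
TP = zeros(numRuns, 1); TN = TP; FP = TP; FN = TP;

for r = 1:numRuns
    idx = randperm(N);             % fresh random shuffle for every run
    Xtr = X(idx(1:nTrain), :);      ytr = y(idx(1:nTrain));
    Xte = X(idx(nTrain+1:end), :);  yte = y(idx(nTrain+1:end));

    % Placeholder classifier (an assumption): nearest class mean.
    muPos = mean(Xtr(ytr ==  1, :), 1);
    muNeg = mean(Xtr(ytr == -1, :), 1);
    dPos  = sum((Xte - muPos).^2, 2);
    dNeg  = sum((Xte - muNeg).^2, 2);
    pred  = ones(size(yte));
    pred(dNeg < dPos) = -1;

    % Confusion counts for this run.
    TP(r) = sum(pred ==  1 & yte ==  1);
    TN(r) = sum(pred == -1 & yte == -1);
    FP(r) = sum(pred ==  1 & yte == -1);
    FN(r) = sum(pred == -1 & yte ==  1);
end

avgCounts = mean([TP, TN, FP, FN], 1);   % averaged counts can be fractional
fprintf('TP %.1f  TN %.1f  FP %.1f  FN %.1f\n', avgCounts);
```

Averaging the counts over runs is why values such as 749.6 can appear in a confusion table even though each individual run produces whole numbers.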

The following table shows results for linearly separable data for the three classifiers: (i) Linear Regression, (ii) Logistic Regression and (iii) Naïve Bayes. The results are averages over 5 folds and are comparable.

__Plots of original Data__

Following are the plots of given data.

### Linearly Separable Data

### Non-Linearly Separable Data

LS Data Results (note: the values are averages over 5 folds)

| Classifier | TP | TN | FP | FN | Precision | Recall | F-Value | Accuracy |
|---|---|---|---|---|---|---|---|---|
| Linear Regression | 750 | 750 | 0 | 0 | 1 | 1 | 1 | 1 |
| Logistic Regression | 749.6 | 750 | 0 | 0.4 | 1 | 0.9995 | 0.9997 | 0.9997 |
| Naïve Bayes | 750 | 750 | 0 | 0 | 1 | 1 | 1 | 1 |
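For readers who want to check the derived columns, the sketch below recomputes precision, recall, F-value and accuracy from the averaged confusion counts of the logistic regression row above. It uses the standard definitions. The original experiments may instead have averaged the per-fold metrics, which can give slightly different values.

```matlab
% Averaged confusion counts for logistic regression on the LS data (from the table).
TP = 749.6; TN = 750; FP = 0; FN = 0.4;

precision = TP / (TP + FP);                               % correct positives among predicted positives
recall    = TP / (TP + FN);                               % correct positives among actual positives
fValue    = 2 * precision * recall / (precision + recall); % harmonic mean of precision and recall
accuracy  = (TP + TN) / (TP + TN + FP + FN);              % fraction of all samples classified correctly

fprintf('Precision %.4f, Recall %.4f, F-Value %.4f, Accuracy %.4f\n', ...
        precision, recall, fValue, accuracy);
% prints: Precision 1.0000, Recall 0.9995, F-Value 0.9997, Accuracy 0.9997
```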

The following table shows results for non-linearly separable data for the three classifiers: (i) Linear Regression, (ii) Logistic Regression and (iii) Naïve Bayes. The best results are with Linear and Logistic Regression.

NLS Data Results (note: the values are averages over 5 folds)

| Classifier | TP | TN | FP | FN | Precision | Recall | F-Value | Accuracy |
|---|---|---|---|---|---|---|---|---|
| Linear Regression | 735.5 | 734.8 | 15.2 | 14.6 | 0.9797 | 0.9805 | 0.9801 | 0.9801 |
| Logistic Regression | 733.8 | 733.2 | 16.8 | 16.2 | 0.9776 | 0.9784 | 0.9780 | 0.9780 |
| Naïve Bayes | 721 | 726 | 29 | 24 | 0.965 | 0.965 | 0.888 | 0.96 |

The following table shows results for banana data for the three classifiers (i) Linear Regression (ii) Logistic Regression and (iii) Naïve Bayes. The results are averages over 5 folds and Naive Bayes gives the best results.

Banana Data Results (note: the values are averages over 5 folds)

| Classifier | TP | TN | FP | FN | Precision | Recall | F-Value | Accuracy |
|---|---|---|---|---|---|---|---|---|
| Linear Regression | 626.4 | 633.8 | 122.2 | 116.6 | 0.8370 | 0.8431 | 0.8399 | 0.8407 |
| Logistic Regression | 665 | 515 | 240 | 78 | 0.7425 | 0.8950 | 0.8094 | 0.7876 |
| Naïve Bayes | 666 | 666 | 96 | 72 | 0.88 | 0.88 | 0.888 | 0.88 |

## The Surface and figures

The surfaces and figures of the discriminant are given below.

### The surface for Linearly Separable Data using linear regression

Below are the two plots of linearly separable data surface generated by linear regression. The two plots have different axis coordinates.

### The surface for Non-Linearly Separable Data using linear regression

Below are the two plots of non linearly separable data surface generated by linear regression. The two plots have different axis coordinates.

### The surface for Banana Data using linear regression

Below are the two plots of banana data surface generated by linear regression. The two plots have different axis coordinates.

__The Surface of LS Data using Logistic Regression__

Below are the two plots of the linearly separable data surface generated by logistic regression. The two plots come from different runs with different training and testing data.

__The following are the surfaces generated over two runs.__

__The Surface of NLS Data with Logistic Regression__

Below are the two plots of non linearly separable data surface generated by logistic regression.

__The Surface of Banana Data with Logistic Regression__

Below are the two plots of the banana data surface generated by logistic regression. The two plots come from different runs with different training and testing data.

(The outputs are over two runs.)

## More Results: On Segmented Data

Here, training sizes of (500, 1000, 1500, …, 4500) are analyzed, with {5000 - training size} as the testing size. I iterate over the partition (train-test ratio), increasing the training fraction each time from 0.1 of 5000 to 0.2 of 5000 and so on up to 0.9 of 5000, with the remainder used for testing. This gives training sizes of (500, 1000, 1500, …, 4500) and testing sizes of 5000 - {training size}.
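The partition sizes described above can be generated as in the short sketch below. The variable names are illustrative assumptions. Only the total of 5000 samples and the step of one tenth come from the text.

```matlab
N = 5000;                          % total number of samples, as stated above
fractions  = (1:9) / 10;           % training fractions 0.1, 0.2, ..., 0.9
trainSizes = round(fractions * N); % 500, 1000, ..., 4500
testSizes  = N - trainSizes;       % 4500, 4000, ..., 500

% One random partition for a given training size (here the first one).
idx      = randperm(N);
trainIdx = idx(1:trainSizes(1));
testIdx  = idx(trainSizes(1)+1:end);

disp([trainSizes; testSizes]);
```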

**The times are averaged over 5 such samples for each of the given training sizes.** The following are the execution times obtained for the different algorithms and data sets.
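One way to collect such averaged execution times is sketched below with `tic` and `toc`. The data and the timed step (a least-squares fit) are placeholders chosen for illustration. The original timing code is not shown in this article, so the details here are assumptions.

```matlab
% Hypothetical data standing in for one of the three datasets.
N = 5000;
X = randn(N, 2);
y = sign(X(:, 1) + X(:, 2));

trainSizes = 500:500:4500;
numRepeats = 5;                        % 5 random samples per training size
avgTime    = zeros(size(trainSizes));

for k = 1:numel(trainSizes)
    t = zeros(numRepeats, 1);
    for r = 1:numRepeats
        idx      = randperm(N);
        trainIdx = idx(1:trainSizes(k));
        A = [ones(trainSizes(k), 1), X(trainIdx, :)];
        tic;
        w = A \ y(trainIdx);           % the step being timed (placeholder training step)
        t(r) = toc;
    end
    avgTime(k) = mean(t);              % average over the 5 samples
end

disp([trainSizes(:), avgTime(:)]);
```

Only the training call sits between `tic` and `toc`, so shuffling and data preparation do not inflate the measurement.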

## RESULTS

### Logistic Regression

I converted the class labels to 1 and 0, instead of 1 and -1, before using logit. The following are the results of logistic regression on all three data samples: (i) Banana Data, (ii) NLS Data and (iii) LS Data.
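The label conversion mentioned above amounts to a one-line mapping. The sketch below uses a small hypothetical label vector. The 0/1 coding is needed because the logistic output lies strictly between 0 and 1.

```matlab
y   = [1; -1; -1; 1; 1];               % hypothetical labels in {-1, +1}
y01 = (y + 1) / 2;                     % maps -1 -> 0 and +1 -> 1

sigmoid = @(z) 1 ./ (1 + exp(-z));     % logistic function, output in (0, 1)
disp([y, y01]);
```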

In the table below, the training sizes are given; the corresponding testing sizes are 5000 - {training size}.

| Training Sample | 4500 | 4000 | 3500 | 3000 | 2500 | 2000 | 1500 | 1000 | 500 |
|---|---|---|---|---|---|---|---|---|---|
| Banana Data | 3.5992 | 2.7031 | 2.8467 | 2.6833 | 1.4648 | 1.1170 | 1.0087 | 0.6808 | 0.4955 |
| NLS Data | 3.5631 | 3.3242 | 3.4129 | 3.1384 | 3.3511 | 1.5047 | 1.4308 | 0.8716 | 0.5782 |
| LS Data | 0.4260 | 0.3063 | 0.1808 | 0.6101 | 0.3813 | 0.0842 | 0.2085 | 0.0642 | 0.0364 |

### Linear Regression

Following are the results of linear regression of all the three data samples (i)Banana Data (ii) NLS Data and (iii) LS Data

| Training Sample | 4500 | 4000 | 3500 | 3000 | 2500 | 2000 | 1500 | 1000 | 500 |
|---|---|---|---|---|---|---|---|---|---|
| Banana Data | 0 | 0 | 0 | 0 | 0 | 0.0633 | 0 | 0 | 0 |
| NLS Data | 0 | 0.0989 | 0 | 0 | 0.1900 | 0.0570 | 0 | 0 | 0 |
| LS Data | 0 | 0 | 0 | 0 | 0 | 0.0566 | 0 | 0 | 0 |
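For context, a common way to use linear regression as a two-class classifier is to fit the ±1 labels by least squares and threshold the output at zero. The sketch below illustrates that idea on hypothetical data. Whether the experiments above used exactly this formulation is an assumption.

```matlab
% Hypothetical two-class data with labels in {-1, +1}.
n = 200;
X = [randn(n, 2) + 3; randn(n, 2) - 3];
y = [ones(n, 1); -ones(n, 1)];

A = [ones(2*n, 1), X];       % prepend a column of ones for the bias
w = A \ y;                   % least-squares solution of A*w ~ y

scores = A * w;              % real-valued regression output
pred   = ones(2*n, 1);
pred(scores < 0) = -1;       % threshold at zero to obtain a class label

trainAcc = mean(pred == y);
fprintf('w = [%.4f %.4f %.4f], training accuracy = %.4f\n', w, trainAcc);
```

The fit is obtained in a single linear solve rather than by iteration.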

### Naïve Bayes

To use Naïve Bayes, I first applied the discretizer available in Weka. The following are the results of Naïve Bayes on all three data samples: (i) Banana Data, (ii) NLS Data and (iii) LS Data.

| Training Sample | 4500 | 4000 | 3500 | 3000 | 2500 | 2000 | 1500 | 1000 | 500 |
|---|---|---|---|---|---|---|---|---|---|
| Banana Data | 0 | 0 | 0.02 | 0.02 | 0.02 | 0 | 0 | 0.02 | 0 |
| NLS Data | 0 | 0 | 0 | 0.02 | 0.02 | 0.02 | 0 | 0.02 | 0.02 |
| LS Data | 0 | 0.02 | 0 | 0 | 0.02 | 0 | 0 | 0 | 0.02 |
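The discretize-then-classify idea can be illustrated without Weka. The sketch below is a stand-in, not the Weka pipeline used for the results above: it assumes equal-width binning with 10 bins and a count-based Naïve Bayes with Laplace smoothing, and the Weka filter actually used may discretize differently.

```matlab
% Hypothetical data: two features, labels in {-1, +1}.
n = 300;
X = [randn(n, 2) + 2; randn(n, 2) - 2];
y = [ones(n, 1); -ones(n, 1)];

numBins = 10;                                   % assumed bin count
lo = min(X, [], 1);
hi = max(X, [], 1);
% Equal-width discretization: map each value to a bin index in 1..numBins.
B = floor((X - lo) ./ (hi - lo) * numBins) + 1;
B = min(B, numBins);                            % the maximum falls into the last bin

classes = [1; -1];
logPost = zeros(size(X, 1), numel(classes));
for c = 1:numel(classes)
    inClass = (y == classes(c));
    logPost(:, c) = log(mean(inClass));         % log prior of class c
    for j = 1:size(X, 2)
        counts = accumarray(B(inClass, j), 1, [numBins, 1]);
        probs  = (counts + 1) / (sum(counts) + numBins);   % Laplace smoothing
        logPost(:, c) = logPost(:, c) + log(probs(B(:, j)));
    end
end

[~, best] = max(logPost, [], 2);                % most probable class per sample
pred      = classes(best);
trainAcc  = mean(pred == y);
fprintf('Training accuracy: %.4f\n', trainAcc);
```

Training reduces to counting bin occupancies per class and feature.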

**First Order Polynomial Fitting for NLS Data**

I have used polyfit and polyval for this part. Each is linear in the number of samples, i.e. O(N).

**Linear Regression for degree 1 polynomial fitting with NLS Data**

I have computed `x = x1*x1 + y1*y1` and used polyfit. The parameters of the degree 1 fit are:

-0.0237x + 1.4974 (graph below)

With x1, the parameters of the degree 1 fit are:

-0.2301x + 1.5416 (graph below)

With x2, the parameters of the degree 1 fit are:

1.709x + 2.0415
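The use of `polyfit` and `polyval` on the radial feature can be sketched as follows. The ring-shaped data, the label coding (1 and 2) and the 1.5 threshold are assumptions made for illustration. Only the feature `x1*x1 + y1*y1` and the degree 1 fit come from the text.

```matlab
% Hypothetical ring-shaped data: inner ring is class 1, outer ring is class 2.
n    = 200;
th   = linspace(0, 2*pi, n)';
rIn  = 1 + 0.1 * cos(5 * th);
rOut = 3 + 0.1 * cos(5 * th);
x1 = [rIn .* cos(th); rOut .* cos(th)];
y1 = [rIn .* sin(th); rOut .* sin(th)];
t  = [ones(n, 1); 2 * ones(n, 1)];     % targets: class labels 1 and 2 (assumed coding)

x = x1 .* x1 + y1 .* y1;               % the radial feature from the text
p = polyfit(x, t, 1);                  % p(1) = slope, p(2) = intercept
fitted = polyval(p, x);                % evaluate the fitted line at every sample

pred = 1 + (fitted > 1.5);             % threshold midway between the two labels
acc  = mean(pred == t);
fprintf('fit: %.4fx + %.4f, accuracy = %.4f\n', p(1), p(2), acc);
```

With the squared radius as the single input, classes that are not linearly separable in (x1, y1) can become separable by a threshold on a straight-line fit.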

**Logistic Regression with degree 1 polynomial fitting for NLS Data**

I have computed `x = x1*x1 + y1*y1` and used polyfit. The parameters of the degree 1 fit are:

-0.0081x + 1.2010

With x1, the parameters of the degree 1 fit are:

-0.1628x + 1.4918

With x2, the parameters of the degree 1 fit are:

-0.1684x + 1.4518

**Logistic Regression with degree 2 polynomial fitting for NLS Data**

With `x = x1*x1 + y1*y1`, the parameters of the degree 2 fit from polyfit are:

0x^2 - 0.0151x + 1.4139

This is almost linear.

With x1, the parameters of the degree 2 fit are:

-0.0041x^2 - 0.1254x + 1.4189

This is almost linear.

With x2, the parameters of the degree 2 fit are:

0.0003x^2 - 0.17702x + 1.5425

This is also almost a straight line.

**So a polynomial of degree 1 is an adequate fit for logistic regression, and similar results were obtained for linear regression.**
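The "almost linear" argument can be checked numerically by fitting both degrees and comparing the leading coefficient and the residuals. The sketch below does this on hypothetical, nearly linear targets. The data are an assumption and are not taken from the experiments.

```matlab
x = linspace(-5, 5, 101)';
t = 1.5 - 0.15 * x + 0.01 * sin(3 * x);   % hypothetical, nearly linear targets

p1 = polyfit(x, t, 1);                    % [slope, intercept]
p2 = polyfit(x, t, 2);                    % [quadratic, linear, constant]

res1 = norm(t - polyval(p1, x));          % residual of the degree 1 fit
res2 = norm(t - polyval(p2, x));          % residual of the degree 2 fit

fprintf('degree 2 leading coefficient: %.2e\n', p2(1));
fprintf('residual norms: degree 1 = %.4f, degree 2 = %.4f\n', res1, res2);
```

When the quadratic coefficient is close to zero and the residual barely improves, the extra degree adds nothing and the degree 1 model is the simpler choice.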

| Linear Regression | Parameters of polyfit for degree 1 with `x = x1*x1 + y1*y1` | Parameters of polyfit for degree 1 with x = x1 | Parameters of polyfit for degree 1 with x = x2 |
|---|---|---|---|
| NLS Data | -0.0237x + 1.4974 | -0.2301x + 1.5416 | 1.709x + 2.0415 |
| LS Data | -0.0018x + 1.2346 | -0.1477x + 1.5013 | -0.1794x + 1.9014 |
| Banana Data | -0.0129x + 2.2202 | -0.2264x - 1.4032 | 0.9265x - 1.6075 |

| Logistic Regression | Parameters of polyfit for degree 1 with `x = x1*x1 + y1*y1` | Parameters of polyfit for degree 1 with x = x1 | Parameters of polyfit for degree 1 with x = x2 |
|---|---|---|---|
| NLS Data | -0.0081x + 1.2010 | -0.1628x + 1.4918 | -0.1684x + 1.4518 |
| LS Data | -0.0020x + 1.0548 | -0.0794x + 1.2986 | -0.0785x + 1.2847 |
| Banana Data | -0.0021x + 0.8505 | 0.0438x + 0.5558 | 0.0191x + 0.5530 |

# Newton's Logistic Regression

The following are the results obtained using logistic regression with Newton's method as the classifier. I have split each dataset randomly into 70% training and 30% testing, in 5 folds. Below are the details of how we choose these 5 folds.

**Here the first weight represents the bias**

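    % X is N-by-3 with a leading column of ones; Y and P are 1-by-N rows.
    for t = 1:N
        s = W_Old(1);                                 % bias term (first weight)
        for j = 1:2
            s = s + W_Old(j+1) * trainingSet(t, j);   % weighted sum of the features
        end
        P(t) = 1 / (1 + exp(-s));                     % sigmoid output for sample t
    end
    Z = X' * (Y - P)';                                % gradient of the log-likelihood
    W = diag(P .* (1 - P));
    Hessian = X' * W * X;
    etaMatrix(1:3) = eta;
    W_New = W_Old + etaMatrix .* (Hessian \ Z)';      % damped Newton update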

Its derivation and complete form, in two ways, are given as:
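A standard form of this update, written with the same symbols as the code above, is sketched here. It is a reconstruction from the textbook Newton-Raphson derivation for logistic regression, not necessarily the exact derivation the author had in mind. The assumptions are that $X$ has one row $\mathbf{x}_t^{\top}$ per sample with a leading 1, that $Y$ and $P$ are column vectors of 0/1 labels and predicted probabilities, and that $\eta$ is the step size (with $\eta = 1$ giving the plain Newton step).

```latex
\begin{align}
p_t &= \frac{1}{1 + \exp(-\mathbf{w}^{\top}\mathbf{x}_t)}, \\
\ell(\mathbf{w}) &= \sum_{t=1}^{N} \bigl[\, y_t \log p_t + (1 - y_t)\log(1 - p_t) \,\bigr], \\
Z = \nabla \ell(\mathbf{w}) &= X^{\top}(Y - P), \\
\text{Hessian} = -\nabla^{2}\ell(\mathbf{w}) &= X^{\top} W X,
  \qquad W = \operatorname{diag}\bigl(p_t\,(1 - p_t)\bigr), \\
\mathbf{w}_{\text{new}} &= \mathbf{w}_{\text{old}} + \eta\,(X^{\top} W X)^{-1} X^{\top}(Y - P).
\end{align}
```

The last line matches `W_New = W_Old + eta * (Hessian \ Z)`: the sign is positive because the log-likelihood is maximized and its second derivative is $-X^{\top} W X$.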

**Parameters used for computations:**

numIteration = 10000 (maximum number of iterations)

eta = 0.5 and 0.2 (both tested)

errorBound = 0.0001
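Putting the update and these parameters together, a vectorized version of the training loop might look like the sketch below. The synthetic data are hypothetical, and the stopping rule (norm of the weight change below `errorBound`) is an assumption, since the text does not say which quantity the bound applies to.

```matlab
% Hypothetical overlapping two-class data with labels in {0, 1}.
n = 500;
F = [randn(n, 2) + 1; randn(n, 2) - 1];
Y = [ones(n, 1); zeros(n, 1)];
X = [ones(2*n, 1), F];                   % first column gives the bias weight

maxIter    = 10000;                      % parameters quoted above
eta        = 0.5;
errorBound = 1e-4;

w = zeros(3, 1);
for it = 1:maxIter
    P = 1 ./ (1 + exp(-X * w));          % predicted probabilities
    Z = X' * (Y - P);                    % gradient of the log-likelihood
    H = X' * (X .* (P .* (1 - P)));      % X' * W * X without forming diag(W)
    step = eta * (H \ Z);                % damped Newton step
    w = w + step;
    if norm(step) < errorBound           % assumed stopping rule
        break;
    end
end

pred = double(X * w > 0);                % probability above 0.5 means class 1
acc  = mean(pred == Y);
fprintf('stopped after %d iterations, w = [%.4f %.4f %.4f], accuracy = %.4f\n', ...
        it, w, acc);
```

Scaling the rows of `X` by `P .* (1 - P)` avoids building the N-by-N diagonal matrix, which matters once N reaches several thousand samples.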

__The following are the final weights for the three datasets:__

| Dataset | W0 | W1 | W2 |
|---|---|---|---|
| LS Data | 8.4137 | -0.4298 | -0.4125 |
| NLS Data | 26.2879 | -2.2458 | -2.1134 |
| Banana Data | 0.6495 | 0.3533 | -0.1829 |
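These weights define the decision boundary directly: the classifier outputs probability 0.5 where W0 + W1·x1 + W2·x2 = 0. The sketch below turns the LS Data weights from the table into the corresponding line in the (x1, x2) plane. The range chosen for x1 is arbitrary and only for illustration.

```matlab
w = [8.4137, -0.4298, -0.4125];      % LS Data weights from the table: [W0 W1 W2]

% Boundary: W0 + W1*x1 + W2*x2 = 0   =>   x2 = -(W0 + W1*x1) / W2
slope     = -w(2) / w(3);
intercept = -w(1) / w(3);
boundary  = @(x1) slope * x1 + intercept;

x1 = linspace(0, 20, 5);             % arbitrary range for illustration
x2 = boundary(x1);
disp([x1; x2]);
fprintf('x2 = %.4f * x1 + %.4f\n', slope, intercept);
```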

__The following are the TP, TN, FP, FN, precision, recall, F-value and accuracy for the three datasets:__

| Dataset | TP | TN | FP | FN | Precision | Recall | F-Value | Accuracy |
|---|---|---|---|---|---|---|---|---|
| LS Data | 750 | 750 | 0 | 0 | 1 | 1 | 1 | 1 |
| NLS Data | 737.6 | 732.8 | 17.2 | 12.4 | 0.9772 | 0.9835 | 0.9835 | 0.9803 |
| Banana Data | 630.6 | 647.4 | 108.6 | 112.4 | 0.8531 | 0.8487 | 0.8509 | 0.8526 |

__The decision boundaries of the three datasets are as follows:__

The following figures show (i) the decision surface and (ii) the decision surface together with the data. Note that different colors are used to show the different classes.

__Linearly Separable Data__

(Green with a + sign is the positive class and blue is the negative class.)

The following figure shows the plane from Newton's method for linearly separable data. The red dots show the predicted line projected in 2D.

The following figure shows the linearly separable data.

The following figure shows the plane from Newton's method for linearly separable data, plotted in 3D.

__NLS Data__

(Green with a + sign is the positive class and blue is the negative class.)

The following figure shows the non-linearly separable data.

The following figure shows the result of Newton's method for NLS data. The red dots show the predicted line projected in 2D.

This figure shows the plane from Newton's method for non-linearly separable data over two runs.

__Banana Data__

The following figure shows the separating plane from Newton's method for banana data.

The following figure shows the plane from Newton's method for banana data, projected to 2D.

This figure shows the plane from Newton's method for banana data. The red dots show the predicted line projected in 2D.

The results are self-explanatory with the help of the figures. They are not benchmarks; they are elaborated for experimental setups in labs and are meant to be followed by mentors teaching Machine Learning lab work. For benchmark results, look into peer-reviewed research papers.