Glimpse of Regression Codes in Matlab

Preprocessing: The input was given as a text file. I first imported it into a spreadsheet (.xls) and then copied the data into three .mat files.

NOTE: The derivation for linear and logistic regression is added at the end of the document.

Linear Regression Classifier

I split each dataset randomly into 70% training and 30% testing data, and repeated this five times (the "5 folds"). I used the MATLAB function crossvalind to create each 70-30 split. Finally, I averaged the results over the five runs.
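The repeated holdout procedure can be sketched as below. This is only an illustration under stated assumptions: `randperm` is used as a stand-in for `crossvalind` (which needs the Bioinformatics Toolbox), the dataset is a toy placeholder, and the per-run score is a dummy value where the real code would store the test accuracy.

```matlab
% Sketch: five random 70-30 holdout splits, then an average over the runs.
% randperm stands in for crossvalind; data and score are hypothetical.
numSamples = 20;                     % hypothetical dataset size
data       = (1:numSamples)';        % toy one-feature dataset
numRuns    = 5;                      % number of random splits
testFrac   = 0.3;                    % 30% held out for testing
numTest    = round(testFrac * numSamples);
scores     = zeros(numRuns, 1);

for r = 1:numRuns
    perm     = randperm(numSamples);       % random order of sample indices
    testIdx  = perm(1:numTest);            % 30% for testing
    trainIdx = perm(numTest+1:end);        % remaining 70% for training

    trainingSet = data(trainIdx, :);
    testSet     = data(testIdx, :);

    % Placeholder score: the real code would train on trainingSet and
    % store the accuracy measured on testSet here.
    scores(r) = size(testSet, 1) / numSamples;
end

avgScore = mean(scores);             % average over the five runs
```

With `crossvalind`, the split inside the loop would instead come from `[train, test] = crossvalind('HoldOut', numSamples, 0.3)`, which returns logical index vectors.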

Brief discussion about Linear Regression Classifier

The form I used for the linear classifier is as follows:


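    % Design matrix: a column of ones (bias) followed by the features
    X(1:trainLengthRow, 1) = 1;
    X(1:trainLengthRow, 2:trainLengthCol+1) = trainingSet;

    % Normal equations; Y is the column vector of training labels
    Z = X' * X;
    weights = Z \ (X' * Y);    % same as inv(Z) * (X' * Y), but more stable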

Here the first weight represents the bias.
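As a small worked example of this form, the sketch below fits the weights on toy data and classifies the same points. The data values are hypothetical, and the decision rule (class +1 when the linear output is at least 0, otherwise class -1) is an assumption, since the post does not state the threshold.

```matlab
% Toy 1-D data with labels -1/+1 (hypothetical values)
trainingSet    = [1; 2; 3; 6; 7; 8];
trainingLabels = [-1; -1; -1; 1; 1; 1];

[trainLengthRow, trainLengthCol] = size(trainingSet);

% Design matrix: first column of ones for the bias, then the features
X = ones(trainLengthRow, trainLengthCol + 1);
X(:, 2:end) = trainingSet;
Y = trainingLabels;

% Least-squares weights from the normal equations
weights = (X' * X) \ (X' * Y);

% Assumed decision rule: +1 if the linear output is >= 0, else -1
predicted = ones(trainLengthRow, 1);
predicted(X * weights < 0) = -1;
```

Here `weights(1)` is the bias and `weights(2)` is the slope for the single feature.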

Code Description

The code for classification using linear regression is written in MATLAB. It creates five random splits with a 70-30 ratio of training to testing data.

The methods in the code are as follows:

  The processing starts by reading the data from the .mat file and creating five splits in a 70-30 ratio for training and testing.

  1. train1LinearReg

Method to train the linear regression model.


trainingSet: The training set

trainingLabels: The training labels corresponding to the training set

---- Return Type ----

weights: weights computed by linear regression as explained above

  2. testLinearReg

This function evaluates the trained weights on the test set by comparing the computed outputs with the test labels.


testSet: The Test Set for testing

testLabels: Corresponding test labels

weights: weights computed from the training


correctlyClassified: Number of correctly classified samples

unClassified: Up to 10 misclassified samples (at most 5 per class)

v: Vector that stores TP; TN; FP; FN; P (precision); R (recall); F (F-measure); accuracy

count0: Number of misclassified class -1 samples stored, up to a maximum of 5

count1: Number of misclassified class +1 samples stored, up to a maximum of 5
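The entries of v can be computed from the true and predicted labels as sketched below. The label vectors are hypothetical, and the formulas are the standard definitions of precision, recall, F-measure and accuracy, which I am assuming match the ones used in the original code.

```matlab
% Hypothetical true labels and predictions, classes -1 and +1
testLabels = [ 1;  1;  1; -1; -1; -1; -1;  1];
predicted  = [ 1;  1; -1; -1; -1;  1; -1;  1];

% Confusion counts, treating +1 as the positive class
TP = sum(predicted ==  1 & testLabels ==  1);
TN = sum(predicted == -1 & testLabels == -1);
FP = sum(predicted ==  1 & testLabels == -1);
FN = sum(predicted == -1 & testLabels ==  1);

P = TP / (TP + FP);                        % precision
R = TP / (TP + FN);                        % recall
F = 2 * P * R / (P + R);                   % F-measure
accuracy = (TP + TN) / numel(testLabels);

% Same ordering as described for v
v = [TP; TN; FP; FN; P; R; F; accuracy];
```

Note that P, R and F are undefined (division by zero) when a split has no predicted or no actual positives, so real code may need a guard for that case.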

Logistic Regression

The logistic regression model is trained with the following parameters:

     numIteration = 1000;     % maximum number of iterations

     eta = 0.5;               % learning rate

     errorBound = 0.0001;     % error bound

I tested various values of eta. The results reported here are for eta = 0.5.

I have also written two versions of the logistic regression code: one using the expanded (loop-based) approach and another using shortened matrix manipulations. The results are similar for both. The code is for the two-class problem.

The matrix version is as follows:

    P(1:trainLengthRow) = 0;
    Y = trainingLabels';
    X(1:trainLengthRow, 1:trainLengthCol+1) = 0;
    X(1:trainLengthRow, 1) = 1;    % bias column
    X(1:trainLengthRow, 2:trainLengthCol+1) = trainingSet(1:trainLengthRow, 1:trainLengthCol);

    % Predicted probabilities (W_Old is a row vector, bias first).
    % For sample t this is 1/(1 + exp(-s)), where
    % s = W_Old(1) + sum over j of W_Old(j+1) * trainingSet(t,j)
    P = (1 ./ (1 + exp(-(X * W_Old'))))';

    Z = X' * (Y - P)';

    % computing the new weights
    W_New = W_Old + eta * Z';
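Put together with the parameters listed earlier, the update can be run in a loop as in the sketch below. Several things here are assumptions rather than details taken from the original code: the toy data, the 0/1 label encoding, the zero initial weights, and the stopping rule (stop when the weight change is smaller than errorBound).

```matlab
% Toy data with 0/1 labels (assumed encoding, hypothetical values)
trainingSet    = [-1; -0.5; 0.5; 1];
trainingLabels = [0; 0; 1; 1];

numIteration = 1000;
eta          = 0.5;
errorBound   = 0.0001;

[trainLengthRow, trainLengthCol] = size(trainingSet);
X = [ones(trainLengthRow, 1), trainingSet];   % bias column, then features
Y = trainingLabels';                          % row vector of labels
W_Old = zeros(1, trainLengthCol + 1);         % assumed initial weights

for iter = 1:numIteration
    P = (1 ./ (1 + exp(-(X * W_Old'))))';     % predicted probabilities
    Z = X' * (Y - P)';                        % gradient of the log-likelihood
    W_New = W_Old + eta * Z';                 % gradient ascent step

    % Assumed stopping rule: weight change below errorBound
    converged = norm(W_New - W_Old) < errorBound;
    W_Old = W_New;
    if converged
        break;
    end
end

% Class 1 when the predicted probability is at least 0.5
predicted = double(1 ./ (1 + exp(-(X * W_Old'))) >= 0.5);
```

On linearly separable data such as this toy set, the weights keep growing instead of settling, so the iteration cap is what ends the loop in practice.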


Code Details

The code settings are the same as described in the code details for linear regression. I used crossvalind with a holdout parameter of 0.3, which gives 30% testing and 70% training data, and repeated this five times. The methods for training and testing the logistic regression are described below.

Both implementations have the same method interface, with a slight difference in how the weights are stored. The formulas are given above and in the Appendix.

Method  TrainLogRegr:-

This method trains the logistic regression model.


trainingSet: the training set

trainingLabels: the labels corresponding to the training set

weights: the initial weights

weight0: the initial bias weight

---- Return Types ----

weights: the final weights obtained from training

weight0: The bias weight

Method  TestLogRegr:-

This is the method that is called to test the accuracy of the methods


testSet: the set of samples to be considered for testing

testLabels: the labels corresponding to testset

weight0, weights: the bias weight and feature weights of the trained logistic regression model

---- Return Values ----

correctlyClassified: The number of correctly classified samples

unClassified: The array containing 5 misclassified data samples from each class

v: The vector that returns the computed values of TP; TN; FP; FN; P; R; F; accuracy
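The testing step can be sketched as below for the version that stores the bias separately as weight0. The sample values, the weights, the 0/1 label encoding and the 0.5 probability threshold are all assumptions made for illustration.

```matlab
% Hypothetical test data and trained parameters
testSet    = [-2; -1; 1; 2];
testLabels = [0; 0; 1; 0];          % the last sample will be misclassified
weight0    = 0;                     % hypothetical bias weight
weights    = 1.5;                   % hypothetical weights, one per feature

% Predicted probability of class 1 for every test sample
scores    = weight0 + testSet * weights(:);
prob      = 1 ./ (1 + exp(-scores));
predicted = double(prob >= 0.5);    % assumed 0.5 threshold

correctlyClassified = sum(predicted == testLabels);

% Collect at most 5 misclassified samples from each class
maxPerClass = 5;
misIdx = find(predicted ~= testLabels);
mis0 = misIdx(testLabels(misIdx) == 0);
mis1 = misIdx(testLabels(misIdx) == 1);
mis0 = mis0(1:min(maxPerClass, numel(mis0)));
mis1 = mis1(1:min(maxPerClass, numel(mis1)));
unClassified = testSet([mis0; mis1], :);
```

In the matrix version the bias would be the first entry of a single weight vector instead, and the scores would be computed from a test matrix with a leading column of ones.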

Published by nidhk

