YOUR SMILE MEANS THE WORLD TO ME MY LOVE

YOUR SMILE MEANS THE WORLD TO ME MY LOVE

Your smile means the world to me my love
When you smile I smile
My heart opens up..
What a world I enter into
When I hear you giggle
I am lost in your smile
I can't find myself

Your smile means the world to me my man
When you smile I smile
My heart opens up..
Colored sparkles raining through clouds
is all I see…
And is all I want to be wrapped into
yes the colored sparkles……..

There is no difference between you and me my love
I have merged in colors of your spirit my love
Yes you are so manly
I am so womanly
I felt that you were being cruel to me
But I am trying to understand the way you express yourself to me
Give me time baby

Good that you know me now
I can never hurt you my love
For you mean the world to me
For your world means the universe to me
For all that is dear to you is dear to me….
My love

Oh my love…. I thank you for all the love you showered on me and could have offered me……. Your smile means the world to me, oh my love…

YOUR LOVE MEANS THE SMILE ON MY LIPS MY LOVE.. YOUR LOVE MEANS LIFE TO ME MY LOVE AND IT MEANS EVERYTHING TO ME


Photo by Maria Orlova on Pexels.com

THIS AIR I AM SENDING YOU MY LOVE

THIS AIR I AM SENDING YOU MY LOVE

This Air I am sending you my love

For remember I shall always love you

Through the day

From the golden sunlight that falls here

From the snow covered mountains

From the leaves filled with dew

From the bank of this sparkling river

From the strands of my brown hair

This air touches it all

This air has it all..

Through the night

From the white moonlight

From the waves that rise at night

From the gaps of silence that surround

From the depth of my heart

From the echoes of your voice that I hear in this voidness

This Air I am sending you my dear

For remember I shall always love you

With this air I am sending you my love oh my love

Catch it

Feel it

Smell it

It's for you my love – all my love

With this air I am sending you my love

MY LOST LOVE !!

My Lost Love



This unforeseen ring in my heart

This Feeling yes this feeling keeps ringing my heart

I want to hear your voice, but I can't

I want to see you, but I can't

I want to… I just can't explain



But yes the thought of your name

Makes me jump

It brings this joy I never had

This smile that never was mine

This dance I do

The songs that I sing

The love I share

The bliss I bring

All by just my heart rhyming your Name

Say that you thought of me

Say that you remembered me Just Now

Call my name

At least once

If nothing else just come to me once





But yes the thought of your name

Makes me Me

It brings the world to me

Like these flowers I blossom

Like the rains I smile

I fly like breeze

Let me dream

For this dream brings me a New Life,

A life of Hopes

A life of desires

All by just my heart rhyming your Name



This unforeseen ring in my heart

Is ringing again and again

How Shall I answer it

Please tell…

Do tell…

Where are you

My Lost Love !!

A delicious way to eat Gourd with bread!

Recipe To Serve Two!

STEP 1.

  • Heat 4 tablespoons of mustard oil in a frying pan
  • When the oil starts to give off vapours, add cumin (1/4 tablespoon)
  • Add 2 chopped onions to the frying pan and fry till they turn light golden
  • Then add 1 chopped tomato to the frying pan
  • Mix it well
  • Put some water in the frying pan and let it cook till the water dries up and the oil separates from the mixture

STEP 2

  • Peel 1 gourd – I took a long one; adjust the quantity as per its size
  • Grind the peeled gourd
  • Put it in the frying pan. If the grinding takes time, turn the heat off first; otherwise keep the ingredients ready for cooking
  • Add salt, red chilli, ground ginger, chopped garlic, turmeric powder and spices (cinnamon, pepper…) as per taste, and/or green chilli for more flavour

STEP 3

  • Mix the ingredients in the frying pan
  • Add 1/4 glass of water to the frying pan
  • Keep the lid of the frying pan closed for a few minutes, opening it now and then to mix
  • Do this till the mixture is dry; it should not take more than 5 minutes of closed-lid heating. Do keep checking by opening the lid.
  • And here it is, ready to eat with bread – I made wheat bread in another pan, some with oil and others without oil.

Dried mixture in frying pan

Some water added and heated with closed lid on slow flame

Dish is ready, dry and hot to be served

Here it is Ready to Eat with Bread!

~Cabbage-Carrot Fried Mix for a Low Calorie High Nutrition and Delicious Lunch for Diet Freaks~

Many times we want to eat nutritious vegetables, but it becomes quite mundane to eat the same things daily, especially uncooked vegetables. So here is a way to eat delicious food – full of vegetables, with just a little oil and no other carbohydrates! Just vegetables, spices and taste!

It can be taken as it is – if you are on a diet – or you may stuff it in a burger or any kind of bread!

STEP1

Here are the ingredients for Cabbage-Carrot Fried Mix

  1. Two chopped onions, one chopped tomato, half a finely chopped cabbage, three small greenish tomatoes (if available, otherwise use a normal tomato), one finely chopped carrot
  2. Ground garlic and ginger as per taste
  3. Spices as per taste

STEP 2

  1. Take a few spoonfuls of cooking oil in a frying pan – as per taste – I took 4 spoons
  2. Heat the oil
  3. Put in half a tablespoon of cumin and half a tablespoon of black pepper
  4. Put the chopped onions in the pan and fry till they turn slightly golden
  5. Put the chopped tomatoes in it
  6. Keep stirring, add some water, and let the tomatoes become soft

STEP 3

  1. Put chopped cabbage in frying pan
  2. Put salt, red chilli, other spices as per taste
  3. Put chopped ginger and garlic as per taste
  4. Mix it and put water in it
  5. Heat it and put the grated carrot in the pan
  6. Heat it till the cabbage and carrot become soft
  7. Keep adding water and stirring the mixture in the pan
  8. Keep the lid of the pan closed for better results

STEP 4

  1. Cook till the vegetables are as soft as you like – crunchy, mashed up, or super soft
  2. Dry the pan by evaporating the remaining water over heat
  3. Once dried, the fried mix is ready to eat
  4. Take it out in a bowl and mix coriander leaves on top
  5. Serve it hot

Sumptuous Brinjal Fry Mix!

Here is a way to eat iron-rich brinjal. It is ready to eat with any bread in just 3 simple steps! Serve it hot!

Ingredients: Two onions, two tomatoes, one brinjal, spices, oil.

STEP 1

  • Clean a brinjal of moderate size
  • Heat it directly on the burner
  • Keep changing sides
  • Make sure the whole brinjal gets heated on the burner
  • Check by putting a fork into the brinjal. The fork should go in fully and easily. This shows the brinjal is ready
  • Take it off the burner
  • Peel off the outer skin of the brinjal
  • Wash it
  • Mash the brinjal

STEP 2

  • Heat oil as per taste in fry pan
  • Put a pinch of cumin in hot oil
  • Chop two onions and put them in the frying pan before the cumin burns black
  • Fry till the onions are golden in shade

STEP 3

  • Now put the finely chopped tomatoes in the frying pan
  • Put salt, chilli and spices as per taste. These come in the market as a ready-to-use packet of mixed vegetable spices.
  • Mix and fry the mixture
  • Put the mashed brinjal in pan
  • Mix it well and thoroughly so that the mixture is even
  • Once it is fully steamed, the dish is ready to be served hot!

Nutritious and delicious Cauliflower Fried Mix!

Recipe

Simple steps to eat this delicious vegetable fried mix.

Ingredients: Two onions, two tomatoes, one cauliflower, two capsicums, spices.

  • Heat oil as per taste in fry pan
  • Put a pinch of cumin in hot oil
  • Chop two onions and put them in the frying pan before the cumin burns black
  • Fry till the onions are golden in shade
  • Put two finely chopped tomatoes in it
  • Put salt, pepper, red chilli, green chilli, ginger powder, a pinch of turmeric powder, a pinch of fennel powder, ground coriander seeds, cinnamon powder, cloves and other spices as per taste. These come in the market as a ready-to-use packet of mixed vegetable spices.
  • Mix and fry the mixture
  • Put some water in it and let it dry till the oil separates from the mixture
  • Grind one cauliflower and put it in the frying pan
  • Grind two capsicums and put them in the same pan
  • Mix well
  • Put some water in the pan so as to soak the mixture; don't add too much water, or the dish will get over-mashed
  • Simmer on a low flame with the lid of the frying pan closed
  • Check if the mixture is soft; once it is, dry the remaining water by heating on a high flame
  • The dish is ready!

Eat with any kind of bread or on its own. Do serve it Hot!

Feature Selection Techniques using Evolutionary Algorithms

This article describes the application of Evolutionary Algorithms to the task of Feature Selection. In particular, the algorithms studied in this article are Particle Swarm Optimization (PSO) and the Genetic Algorithm (GA). The results are based on the particular parameters used in experimentation. Here several parameters are analyzed for this problem on the following datasets:

  1. Leukemia dataset (LIBSVM Data: Classification (Binary Class) (ntu.edu.tw))
  2. Colon Cancer dataset (LIBSVM Data: Classification (Binary Class) (ntu.edu.tw))
  3. Educational data mining dataset: KDD Cup 2010 bridge-to-algebra (kddb) dataset.

The purpose of this article is to show a comparative analysis of experiments and that the choice of kernel depends on the dataset (the kind of problem), the number of iterations performed, the parameters used and, more specifically, the aim of the problem. Here the aim is feature selection, which means reducing the dataset to a smaller number of features while retaining accuracy. The fitness function for this problem is a weighted mean of accuracy and the number of features. In the following experiments the weights are chosen so that the number of selected features remains at or below 20. Hence a decrease in accuracy is noticed below, given that the number of epochs is not changed.
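To make the weighted fitness concrete, here is a minimal Matlab sketch of such a fitness function (an illustration only, not the exact code used in these experiments; it assumes the LIBSVM Matlab interface svmtrain/svmpredict is on the path, and the names fitnessFS, wAcc and wFeat are placeholders):

% Weighted fitness: combines classification error and fraction of features kept.
% mask is the binary chromosome/particle; wAcc and wFeat are the weights that
% are tuned so that the selected subset stays at or below 20 features.
function f = fitnessFS(mask, trainData, trainLabels, testData, testLabels, wAcc, wFeat)
    sel = find(mask > 0.5);                      % indices of selected features
    if isempty(sel)
        f = Inf;                                 % penalize empty subsets
        return;
    end
    model = svmtrain(trainLabels, trainData(:, sel), '-t 0 -q');   % -t 0: linear kernel
    [~, accStats, ~] = svmpredict(testLabels, testData(:, sel), model);
    acc = accStats(1);                           % accuracy in percent
    % weighted mean of error and feature fraction; GA/PSO minimize this value
    f = wAcc * (100 - acc) + wFeat * 100 * numel(sel) / numel(mask);
end

GA or PSO then searches over the binary mask, trading a small loss in accuracy for a much smaller feature subset.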

Further, the results depend on a lot of parameters, and this article illustrates results from one experimental setup; they are not standard results. The results are not benchmarks: they are elaborated for experimental setups in labs, to be followed by mentors teaching Machine Learning lab work. For benchmark results, look into peer-reviewed research papers. The following SVM kernels are experimented with:

  1. Linear
  2. Radial Basis Function (RBF)
  3. Polynomial Function
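As a rough illustration of how the three kernels can be swapped in and out, the sketch below uses the LIBSVM Matlab interface (assumed to be installed; trainData, trainLabels, testData and testLabels are placeholders, and the gamma/degree values are illustrative, not the ones used in the experiments):

% Try the three kernel types on the same train/test split
kernels = {'-t 0 -q', ...                % linear
           '-t 2 -g 0.01 -q', ...        % RBF with gamma = 0.01
           '-t 1 -d 3 -q'};              % polynomial of degree 3
for k = 1:numel(kernels)
    model = svmtrain(trainLabels, trainData, kernels{k});
    [~, acc, ~] = svmpredict(testLabels, testData, model);
    fprintf('kernel %d: accuracy = %.4f%%\n', k, acc(1));
end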

This article focuses only on the way experiments are to be performed and analyzed, along with some results; it plays no role in benchmarking or in changing existing theories. It is elaborated so that academicians and students can learn the art of experimentation for feature selection using SVM, GA and PSO.

Experiment 1: Leukemia Dataset, Linear SVM

Linear Kernel.

The final accuracy reached was:   100

Number of features selected: 9

The following shows the graph of fitness function versus epochs

The final accuracy reached was: 100

Experiment 2: Leukemia Dataset, RBF Kernel

Radial Basis Function kernel. The accuracy attained is 97.222. The following shows the graph of the fitness function versus epochs.

The accuracy reached was 97.222 for 38 features in 2000 epochs. The best 10 features were obtained for the SVM. After 1400 epochs there was not much decrease in the value of the optimizing function.

Experiment 3: Leukemia Dataset, Polynomial Kernel

The fitness function chosen gives higher weightage to the number of selected features than to accuracy. The final accuracy reached was: 87.5

iterations: 2000

Number of features: 10

The following shows the graph of fitness function versus epochs

Experiment 4: Colon Cancer Dataset, Linear SVM

The final accuracy reached was:  98.3871

Number of features: 19

The top 19 best features were obtained for the linear SVM. After 700 epochs there was not much decrease in the value of the fitness function. The following shows the graph of the fitness function versus epochs.

The following graph shows the number of features in red, the accuracy in blue, and the fitness value being minimized in green. It is clear that after 800 epochs no further reduction in the number of features occurs; hence the algorithm has converged to 19 features as optimal.

Experiment 5: Colon Cancer Dataset, RBF Kernel

Radial Basis Kernel. The accuracy obtained is 91.935

The fitness function chosen gives higher weightage to the number of selected features than to accuracy. The final accuracy reached was 91.935 for 14 features. After 700 epochs there was not much decrease in the value of the optimizing function.

Experiment 6: Colon Cancer Dataset, Polynomial SVM

The fitness chosen is with higher weightage to selected features than to accuracy.

The final accuracy reached was:   76

iterations: 1000

Number of features obtained: 12

The best 12 features were obtained. After 900 epochs there was not much decrease in the value of the fitness function. The following shows the graph of the fitness function versus epochs.

The following graph shows the number of features in red, the accuracy in blue, and the fitness value being minimized in green. It is clear that after 900 epochs not much further reduction of features occurs; hence the algorithm has converged to 12 features as optimal.

Experiment 7: Huge Dataset of 30,000,000 Features

Total data testing accuracy: 87.4392%

Accuracy after selecting all 30000 features with non-zero entries (nnz): 88.4387%

The total number of features is approximately 3 crore (30 million). It took 2 days on the given system to calculate the F-scores. Looking closely at the data, it has a lot of sparseness, so only those features were taken which are non-zero, i.e. non-sparse features, since fully sparse features won't contribute to the discriminant. This has been used as a filtering method. A lot of time was spent in calculating F-scores to filter the data.
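A minimal sketch of this filtering step is shown below (assumptions: X is an n-by-d sparse data matrix and y holds +1/-1 labels; the variable names are placeholders and the biased variance is used only for ranking):

% 1) Drop fully sparse (all-zero) columns; 2) rank the rest by F-score
nnzPerFeature = full(sum(X ~= 0, 1));
keep = find(nnzPerFeature > 0);                     % non-sparse features only
Xk = X(:, keep);

pos = (y == 1);  neg = ~pos;
muAll = full(mean(Xk, 1));
muPos = full(mean(Xk(pos, :), 1));
muNeg = full(mean(Xk(neg, :), 1));
varPos = full(mean(Xk(pos, :).^2, 1)) - muPos.^2;   % biased per-column variance
varNeg = full(mean(Xk(neg, :).^2, 1)) - muNeg.^2;

fscore = ((muPos - muAll).^2 + (muNeg - muAll).^2) ./ (varPos + varNeg + eps);
[~, order] = sort(fscore, 'descend');               % best-scoring features first
top5000 = keep(order(1:min(5000, numel(order))));

The top-ranked columns (5000, 10000, and so on, as in the results below) are then passed to the SVM classifiers.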

Accuracy with the non-sparse data arranged in increasing value of F-scores:

Total accuracy:

5000 features: 88.4387
10000 features: 88.0772
15000 features: 87.7453
20000 features: 87.4188
25000 features: 87.1324
30000 features: 86.9255

Results of different SVM classifiers by varying  the dimensions in the range from 1000 to 30000 in steps of 5000.

Features | Liblinear SVM | RBF SVM | Polynomial SVM
5000     | 88.4387       | 88.772  | 88.7725
10000    | 88.0772       | 88.772  | 88.7725
15000    | 87.7453       | 88.772  | 88.7725
20000    | 87.4188       | 88.772  | 88.7725
25000    | 87.1324       | 88.772  | 88.7725
30000    | 86.9255       | 88.772  | 88.7725

Plot of the Results of different SVM classifiers by varying  the dimensions in the range from 1000 to 30000 in steps of 5000

More reduction in data features after 5000 features using PSO

PSO with C1=3, C2=3

Minimum number of features obtained: 415

population size=5

Accuracy: 75

Iterations:500

Convergence graph is given as follows:

PSO with alpha = 0.8, beta = 0.2, C1=3, C2=3

Iterations:500

popsize=5

testing accuracy: 83.66

Minimum number of features obtained: 868

More reduction in data features after 5000 features using GA

GA 500 epochs

population size=5

Accuracy reached: 88.2

Minimum number of features: 1999

More reduction in data features after 5000 features using GA two times

Accuracy after double application of GA: 79.398%

Minimum features: 199

More reduction in data features after 5000 features using GA with 3000 epochs

GA 3000 epochs

Reduced minimum number of features obtained: 367; more could be reduced with more epochs

Accuracy: 86.563%

Convergence graph is as follows:

More reduction in data features after 5000 features using the Forward Selection Wrapper Method

Forward selection:

Reduced minimum number of features: 197

Testing accuracy: 88.6904

More reduction in data features after 5000 features using Backward Selection Wrapper Method

This method did not perform well in reducing the number of features much.

These results show there is a tradeoff between accuracy and the number of features selected. While accuracy also depends on the number of epochs, the optimal value of the fitness function changes when the fitness function itself is changed; here it is taken as a weighted mean targeting higher accuracy and a smaller feature subset on the training data. This also shows that fitting a more complex hyperplane may take more time to converge, as seen in the experimentation. The best method for further reduction after filtering appears to be forward selection. Finally, the results depend on a lot of parameters, including the filtering and wrapper techniques used for pre-processing and post-processing, and this article illustrates results from one experimental setup; they are not standard results.
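For completeness, here is a minimal sketch of the greedy forward-selection wrapper referred to above (evalSubset is a placeholder function handle that trains a classifier, e.g. a linear SVM, on the given columns and returns validation accuracy; the actual implementation may differ):

% Greedy forward selection: add the single feature that most improves accuracy,
% stop when nothing improves or maxFeat features have been selected.
function selected = forwardSelect(X, y, maxFeat, evalSubset)
    remaining = 1:size(X, 2);
    selected  = [];
    bestAcc   = -Inf;
    while numel(selected) < maxFeat && ~isempty(remaining)
        accs = zeros(1, numel(remaining));
        for i = 1:numel(remaining)                   % try adding each remaining feature
            accs(i) = evalSubset(X(:, [selected remaining(i)]), y);
        end
        [acc, idx] = max(accs);
        if acc <= bestAcc, break; end                % stop when no candidate improves accuracy
        bestAcc  = acc;
        selected = [selected remaining(idx)];        % keep the best candidate
        remaining(idx) = [];
    end
end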

Function Optimization using Genetic Algorithm and Particle Swarm Optimization-Results and Comparison

In this article some experiments and their results are discussed for the minimization of the Rastrigin function, a famous mathematical function used in the evaluation of optimization techniques. The experiments are performed on a certain setup, set of parameters and system. The optimization has been performed using a Genetic Algorithm (GA) and Particle Swarm Optimization (PSO). The fitness function is the given function to be optimized. This article focuses only on the way experiments are to be performed and analyzed in a student lab and by mentors teaching Machine Learning; the results play no role in benchmarking or in changing existing theories. It is elaborated so that academicians and students can learn the art of optimizing a function with the help of GA and PSO.

This function is given by f(x) = 10n + Σᵢ₌₁ⁿ [ xᵢ² − 10 cos(2πxᵢ) ].

The function is usually evaluated on xᵢ ∈ [-5.12, 5.12], for all i = 1, …, n. It has been tried on 5 different values of n between 10 and 100, as follows, taking N as the size of the chromosome for the GA and PSO implementations and a population size of 10.

The Matlab implementation of fitness function is

%Fitness function: Rastrigin value for each row of x (one candidate solution per row)
function z = testfunction(x)
[M, N] = size(x);
z = [];
for j = 1:M
    s = 0;                                   % reset the sum for each candidate
    for i = 1:N
        s = s + (x(j,i) * x(j,i) - 10 * cos(2 * 3.14 * x(j,i)));
    end
    z = [z; s + 10 * N];                     % f = 10*N + sum(x_i^2 - 10*cos(2*pi*x_i))
end
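For example, the fitness of a random population of 10 candidates with N = 20 can be evaluated as follows (a usage sketch; the GA/PSO code calls the same function on its population in each iteration):

x = -5.12 + 10.24 * rand(10, 20);   % 10 candidate solutions, N = 20, inside [-5.12, 5.12]
z = testfunction(x);                % column vector of 10 Rastrigin values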

The following table shows the experimental results.

N  | GA final minimum after 500 iterations       | PSO, C1=1, C2=2 | PSO, C1=3, C2=3 | PSO, C1=3, C2=2 | PSO, C1=2, C2=3
10 | 27.8663 (500 epochs) / 9.1 (1000 epochs)    | 12.723          | 4.0002          | 8.4609          | 7.0004
20 | 113.9635 (500 epochs)                       | 74.584          | 13.001          | 31.879          | 29.679
40 | 399.6707 (500 epochs) / 235 (1000 epochs)   | 191.02          | 43.23           | 127.1           | 115.58
60 | 918.9643 (500 epochs) / 619 (1000 epochs)   | 372.2           | 96.065          | 193.74          | 189.75
80 | 1392.1043 (500 epochs)                      | 489.16          | 220.28          | 314.04          | 286.96

(Each PSO column gives the global minimum value after 500 iterations for that C1/C2 setting.)

The general analysis of the results is that PSO performs better than GA here, given the same number of iterations, though the GA solutions decrease gradually while the PSO solutions oscillate. Within PSO, c1=3, c2=3 seems to perform best, and c1=1, c2=2 seems to perform worst in reaching the global minimum. More details and comments are given below.

Number of chromosomes: 10

One-point crossover GA, with a 5% mutation rate.

Maximum number of iterations: 500.
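For reference, the role of C1 and C2 is in the particle velocity update. A minimal sketch of a standard PSO step is given below (assumptions: X and V are popSize-by-N position and velocity matrices, pBest holds each particle's best position so far, gBest is the best row found by the swarm, and w is an inertia weight; the implementation used here may differ in details):

w = 0.7;  c1 = 3;  c2 = 3;          % inertia and acceleration coefficients (example values)
r1 = rand(size(X));  r2 = rand(size(X));
V = w .* V + c1 .* r1 .* (pBest - X) + c2 .* r2 .* (repmat(gBest, size(X, 1), 1) - X);
X = X + V;
X = min(max(X, -5.12), 5.12);       % keep particles inside the Rastrigin domain

A larger C1 weights the pull towards each particle's own best position (cognitive term), while a larger C2 weights the pull towards the swarm's global best (social term), which is why the different C1/C2 pairs above converge and oscillate differently.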

Graphical Visualization of Results

Command for plots:

plot(ga.iters, ga.minc, ':', pso1.iters, pso1.minc, 'green', pso2.iters, pso2.minc, 'blue', pso3.iters, pso3.minc, 'red', pso4.iters, pso4.minc, 'black');

Here is plot for N=80

Green: PSO1, with parameters c1=1, c2=2: oscillates, and convergence is not as fast as with the other parameter settings.

Blue: PSO2, with parameters c1=3, c2=3: oscillates and moves towards the minimum.

Red: PSO3, with parameters c1=3, c2=2: oscillates and moves towards the minimum.

Black: PSO4, with parameters c1=2, c2=3: oscillates and moves towards the minimum.

Dots: GA, still decreasing and more stable, though its global minimum is not lower than PSO's.

Comparing GA and PSO, GA does not oscillate and stays stable, though it does not attain its minimum as fast.

PSO1 with c1=1, c2=2 does not seem to be as fast in attaining the minimum as the other parameter settings, which are almost equivalent to one another. Here is the separate result for c1=1, c2=2.

Plot of the three competing models is as follows:

Results for N=60

Green: PSO1, with parameters c1=1, c2=2: oscillates, and convergence is not as fast as with the other parameter settings.

Blue: PSO2, with parameters c1=3, c2=3: oscillates and moves towards the minimum.

Red: PSO3, with parameters c1=3, c2=2: oscillates and moves towards the minimum.

Black: PSO4, with parameters c1=2, c2=3: oscillates and moves towards the minimum.

Dots: GA, still decreasing and more stable, though its global minimum is not lower than PSO's.

Comparing GA and PSO, GA does not oscillate and stays stable, though it does not attain its minimum as fast.

Plot of the three competing models is as follows:

PSO1 with c1=1, c2=2 does not seem to be as fast in attaining the minimum as the other parameter settings, which are almost equivalent to one another. Here is the separate result for c1=1, c2=2.

Results for N=40

Green: PSO1, with parameters c1=1, c2=2: oscillates, and convergence is not as fast as with the other parameter settings.

Blue: PSO2, with parameters c1=3, c2=3: oscillates and moves towards the minimum.

Red: PSO3, with parameters c1=3, c2=2: oscillates and moves towards the minimum.

Black: PSO4, with parameters c1=2, c2=3: oscillates and moves towards the minimum.

Dots: GA, still decreasing and more stable, though its global minimum is not lower than PSO's.

Comparing GA and PSO, GA does not oscillate as much and stays stable, though it does not attain its minimum as fast.

PSO1 with c1=1, c2=2 does not seem to be as fast in attaining the minimum as the other parameter settings, which are almost equivalent to one another. Here is the separate result for c1=1, c2=2.

Results for N=20

Green: PSO1, with parameters c1=1, c2=2: oscillates, and convergence is not as fast as with the other parameter settings.

Blue: PSO2, with parameters c1=3, c2=3: oscillates and moves towards the minimum.

Red: PSO3, with parameters c1=3, c2=2: oscillates and moves towards the minimum.

Black: PSO4, with parameters c1=2, c2=3: oscillates and moves towards the minimum.

Dots: GA, still decreasing and more stable, though its global minimum is not lower than PSO's.

Comparing GA and PSO, GA does not oscillate and stays stable, though it does not attain its minimum as fast.

PSO1 with c1=1, c2=2 does not seem to be as fast in attaining the minimum as the other parameter settings, which are almost equivalent to one another. Here is the separate result for c1=1, c2=2.

Results for N=11

Green: PSO1, with parameters c1=1, c2=2: oscillates, and convergence is not as fast as with the other parameter settings.

Blue: PSO2, with parameters c1=3, c2=3: oscillates and moves towards the minimum.

Red: PSO3, with parameters c1=3, c2=2: oscillates and moves towards the minimum.

Black: PSO4, with parameters c1=2, c2=3: oscillates and moves towards the minimum.

Dots: GA, still decreasing and more stable, though its global minimum is not lower than PSO's.

Comparing GA and PSO, GA does not oscillate and stays stable, though it does not attain its minimum as fast.

PSO1 with c1=1, c2=2 does not seem to be as fast in attaining the minimum as the other parameter settings, which are almost equivalent to one another. Here is the separate result for c1=1, c2=2.

The experiments are performed on a certain setup, set of parameters and system. The general impression and analysis are given at the end of each experiment. Changing the parameters, the mutation functions, or the population used to create new chromosomes in the case of GA drastically affects the results obtained. In a similar manner, for PSO, several parameters and initializations affect the results apart from the C1 and C2 values experimented with above. The results are not benchmarks; they are elaborated for experimental setups in labs, to be followed by mentors teaching Machine Learning lab work. For benchmark results, look into peer-reviewed research papers.

2-class Image Recognition Task using Backpropagation, Regularized Neural Networks, Logistic Regression and Naive Bayes – Code, Results and Analysis for the Given Implementation

This article is for education and learning purposes. The aim is to understand how to start experimentation in the area of Neural Networks, how to compare results, and what to consider while doing experiments. It is a handy tool for those self-learning Machine Learning, and a good start for educators who want to teach in these areas and want to know the art of setting assignments and what to expect. Further, the results are not benchmarks; they are elaborated for the explanations given ahead. For benchmark results, look into peer-reviewed research papers.

Here we have implemented the backpropagation algorithm and tested it on a subset of the original MNIST dataset for the 2-class problem of digit image recognition.

Backpropagation with MNIST data for the binary case of digits 3 and 8

Contents

1. Preprocessing the available data

2. Code Implementation Details

   i. BackPropogation38

   ii. TrainBP

   iii. TestBP

3. Evaluations and Results

This article focuses only on the way experiments are to be performed and analyzed, along with some results; it plays no role in benchmarking or in changing existing theories. It is elaborated so that academicians and students can learn how Neural Networks can be used for two-class classification of image data. Further, the results vary with changes in the number of hidden layers, initial weights, and other settings.

1.     Preprocessing the available data.

I have written a script to collect the subset of data for the digits 3 and 8, called mnist_38_2.mat. The script takes the training data, training labels, testing data and testing labels and combines them into one data file. It extracts the 3 and 8 digits from the training and testing data and creates a single .mat file containing the training and the testing data along with their labels.
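A sketch of such an extraction script is given below (assumptions: the source .mat file and its variable names trainData/trainLabels/testData/testLabels are hypothetical and depend on the MNIST copy used; labels are digits 0-9 with one image per row):

S = load('mnist_all_digits.mat');                      % hypothetical source file

trIdx = (S.trainLabels == 3) | (S.trainLabels == 8);   % keep only digits 3 and 8
teIdx = (S.testLabels  == 3) | (S.testLabels  == 8);

% relabel: digit 3 -> 0, digit 8 -> 1, and append the label as the last column
train = [S.trainData(trIdx, :), double(S.trainLabels(trIdx) == 8)];
test  = [S.testData(teIdx, :),  double(S.testLabels(teIdx)  == 8)];

save('mnist_38_2.mat', 'train', 'test');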

Note: I have performed far fewer epochs in the experiments due to constraints of the computing device used; this is very few compared to what is feasible when GPUs are used. But all this is for illustrative purposes only. You can ask your students to perform more iterations to reach a higher accuracy and even to test various numbers of hidden layers.

2.     Code Implementation Details

Here are the implementation details of the backpropagation algorithm that I have written for the binary classification of digit 3 versus digit 8. The code is written in Matlab; the file name is backpropogation_38. Appropriate self-explanatory comments are given in the code. Here is a brief summary of the algorithm being implemented.

The methods are as follows:

i.                    BackPropogation38

This is the main method of the file, from where execution starts; it handles the two-class problem (for more than two classes the code changes). The positive class is digit 3, encoded as 0, and the other class, digit 8, is encoded as 1. In this method the data is read from the file mnist38All.mat and then 10 folds of data are created. Experiments were performed with crossvalind using a holdout of 0.1, 0.2 and 0.3, so that the testing set contains 10%, 20% or 30% of the elements and the remaining ones go to training. Further, vtot is defined as a vector which stores the average values of the following quantities for the final result: [TP, TN, FP, FN, Precision, Recall, F-measure, Accuracy]. The numbers of hidden and output neurons are defined in this method; experiments were done with the number of hidden neurons equal to 100.

Inside the loop over the ten folds, a call to the backpropagation training function is made on each training set generated, and then the evaluation is performed on the corresponding testing set: trainBP is for training and testBP is for testing in the for loop.

ii.                   TrainBP

This is the main function that does the training of the network. Its parameters and return values are listed below, followed by the details of the procedure:

—————parameters————–

trainingSet:  The training data as obtained by crossvalind                                                      

num_Hidden: The number of hidden nodes

num_Output: The number of output nodes

trainingLabels: The labels of the training data; in the case of one output the number of columns is one, else it equals the number of output nodes

—————-Return Arguments——————–

weights_1_ij:weights from input to hidden

weights_2_ij:weights from hidden to output

biasInput: the bias from input to hidden

biasHidden: the bias from hidden to output

—————-Details———————-

Here in the training the parameters such as the learning rate are assigned; the learning rate is set equal to 1/sqrt(iteration). Then each training pattern is presented in the loop, one by one, in each iteration. The maximum number of iterations is set to 5000 due to the computationally large size of the dataset, which takes a long time per iteration. The loop over iterations runs until either the maximum number of iterations is reached or the error in the computed values is less than the permissible error of 0.001. Further, for each input the updated weights are computed, as listed below.

  1. S1(j) = S1(j) + weights_1_ij(i,j) * x(i) ;   w1.x, the net input at the jth hidden neuron
  2. S2(j) = S2(j) + weights_2_ij(i,j) * h(i) ;   w2.h, the net input at the jth output neuron
  3. delta_2_weights_2_ij(j) = O(j)*(1-O(j))*(Y(k)-O(j)) ;   delta 2 for each output neuron
  4. sum = sum + delta_2_weights_2_ij(l) * weights_2_ij(j,l) ;
  5. delta_1_weights_1_ij(j) = h(j)*(1-h(j))*sum ;   delta 1 for each hidden neuron
  6. weights_1_ij(i,j) = weights_1_ij(i,j) + eta * delta_1_weights_1_ij(j) * x(i) ;   weight update in the input-to-hidden layer
  7. weights_2_ij(i,j) = weights_2_ij(i,j) + eta * delta_2_weights_2_ij(j) * h(i) ;   weight update in the hidden-to-output layer

iii.                TestBP

This method is for testing the backpropagation network trained above; the implementation uses an iterative methodology (non-matrix-based weight updates). Its parameters and return values are listed below, followed by the details of the procedure:

—————parameters————–

weights_1_ij:weights from input to hidden

weights_2_ij:weights from hidden to output

biasInput: the bias from input to hidden

biasHidden: the bias from hidden to output

testingSet: the testing set as created by the 10-fold crossvalind

testLabel: the testing labels corresponding to the testingSet

num_Hidden: The number of hidden layer neurons

num_Output: number of output layer nodes

—————Return Arguments——————–

correctlyClassified

count3: The number of misclassified class 3 elements

count8: The number of misclassified class 8 elements

unClassified: a matrix containing 5 misclassified data elements from each of the given classes

v: a vector returning TP, TN, FP, FN (i.e. the confusion matrix) along with precision, recall, F-value and accuracy

—————-Details———————-

In this method each pattern in the testingSet is evaluated. The net input to the hidden layer is calculated, the activation function is applied, and the hidden-layer outputs are obtained. Then the net input to the output layer is computed and the activation function is applied to get the network output. The network output is compared with the expected output to evaluate TP, FP, TN, FN, precision, recall and accuracy.
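Spelled out, the quantities collected in v are computed from the confusion-matrix counts as follows (a short sketch of the standard formulas):

precision = TP / (TP + FP);
recall    = TP / (TP + FN);
fValue    = 2 * precision * recall / (precision + recall);
accuracy  = (TP + TN) / (TP + TN + FP + FN);
v = [TP, TN, FP, FN, precision, recall, fValue, accuracy];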

3. Evaluations and Results

The results are evaluated as explained above with the maximum number of epochs equal to 50, as it was taking a long time to compute the results. The above three functions are executed and the results computed; 100 hidden neurons were used.

The experiments conducted and their results are as follows:

                                   | Number of Epochs | TP  | TN | FP  | FN | Precision | Recall | F-Value | Accuracy
0.1 holdout                        | 3000             | 682 | 0  | 714 | 0  | 0.4885    | 1      | 0.4885  | 0.4885
Average of 10 folds, for each fold | 100              | 682 | 0  | 714 | 0  | 0.4885    | 1      | 0.4885  | 0.4885

Only these experiments were conducted because of the time it takes to run for a large number of epochs.

Once the settings of the initial weights and the training and testing data values were changed, the number of epochs required changed drastically and the accuracy increased considerably. Here are the results.

                                                                     | Number of Epochs | TP  | TN  | FP  | FN | Precision | Recall | F-Value | Accuracy
Average over all folds and 100 epochs with 100 hidden layer neurons | 100              | 679 | 486 | 228 | 3  | 0.74      | 0.99   | 0.846   | 0.83

The following accuracy was obtained, but since it was taking too much time, all ten folds could not be computed. This accuracy was computed twice, with 350 epochs. I have attached the final weights for this run with the code.

                                              | TP  | TN  | FP | FN | Precision | Recall | F-Value | Accuracy
Backpropagation with 100 hidden layer neurons | 682 | 697 | 17 | 0  | 0.9757    | 1      | 0.9877  | 0.9878

Code

function BackPropogation38()
disp('..Starting BackPropogation38 Algorithm....');

%reading data    

A = load('mnist38All.mat');

%v=[ TP,TN,FP,FN, Precision, Recall, F-measure, accuracy] 
vtot=[0,    0,     0,     0,  0, 0,0 ,0];

data =[A.train;A.test];

 %10 folds with 10-90 ratio of testing and training
  for i = 1:10

    P=.1;        
    groups=data(:,785);
    [train,test] = crossvalind('holdout',groups, P);
    train1= data(train, 1: 785);
    test1=data(test, 1: 785);               
    num_Hidden = 100;
    num_Output = 1;  % two for binary case else 10.


    [rowtrain,coltrain]=size(train1);
    [rowtest,coltest]= size(test1);

    %initilizating weights

    trainingLabels(1:rowtrain) = train1(:,coltrain);
    trainingSet(1:rowtrain,1:coltrain-1)=0;
    trainingSet(1:rowtrain,1:coltrain-1) = train1(1:rowtrain,1:coltrain-1);
    trainingLabels(1:rowtrain)= train1(1:rowtrain,coltrain);

    testSet(1:rowtest,1:coltest-1)=0;
    testLabels(1:rowtest)=test1(:,coltrain);
    testSet(1:rowtest,1:coltest-1) = test1(1:rowtest,1:coltest-1);
    testLabels(1:rowtest)= test1(1:rowtest,coltest);

    for n1=1:rowtrain
        for n2=1:coltrain-1
            if trainingSet(n1,n2) >0
            trainingSet(n1,n2)=1;
            end
        end
    end


   for n1=1:rowtest
        for n2=1:coltest-1
            if testSet(n1,n2) >0
            testSet(n1,n2)=1;
            end
        end
    end


    [weights_1_ij, weights_2_ij, biasInput, biasHidden] = trainBP(trainingSet, num_Hidden,num_Output, trainingLabels,testLabels,testSet);


    [correctlyClassified,count3,count8,unClassified,v] =    testBP(testSet,weights_1_ij, weights_2_ij,biasInput, biasHidden, testLabels, num_Hidden,num_Output);

    %v stores the output containg TP.FP etc , vtot the total of all
    %such over 10 folds
    vtot = vtot + v;

    correctlyClassified

    count3
    count8


  end
%computing average of 10 folds
vtot = vtot ./i

end

function [weights_1_ij, weights_2_ij, biasInput, biasHidden] = trainBP(trainingSet, num_Hidden, num_Output, trainingLabels, testLabels, testSet)

%learning rate
eta =1;
maxEpochs=100;
errorBound=0.001;

[trainLengthRow, trainLengthCol] = size(trainingSet);
num_Input = trainLengthCol;       

Y = trainingLabels;

%assingning initial weights
weights_1_ij(1:num_Input,1:num_Hidden) = 0.01  ;
weights_2_ij(1:num_Hidden,1:num_Output) = 0.01 ;                    
biasInput(1:num_Hidden) = 0.01  ;
biasHidden(1:num_Output) = 0.01  ;

delta_1_weights_1_ij(1:num_Hidden)= 0;
delta_2_weights_2_ij(1:num_Output)= 0;

epochs =1;
error = 1;

while((epochs < maxEpochs) && (error > errorBound))

    eta=1/sqrt(epochs);
    error=0;
    epochs = epochs+1 

for k =( 1: trainLengthRow )
x = trainingSet(k,:) ;

S1(1:num_Hidden)=0;
S2(1:num_Output)=0;

    %calculating weights
    for j =(1:num_Hidden)        
        for i =(1:num_Input)
            S1(j) = S1(j) + weights_1_ij(i,j) * x(i) ;
        end;    
        S1(j) = S1(j) + biasInput(j) * 1;
    end;

    h(1:num_Hidden) = 0;

    for j =(1:num_Hidden)
    h(j)= 1/(1+exp(-1*S1(j)));
    end;


    for j =(1:num_Output)        
        for i =(1:num_Hidden)
            S2(j) = S2(j) + weights_2_ij(i,j) * h(i) ;
        end;    
        S2(j) = S2(j) + biasHidden(j) * 1;
    end;


    O(1:num_Output) = 0;
    for j =(1:num_Output)
    O(j)= 1/(1+exp(-1*S2(j)));
    end;        

    %calculating weights
    for j =(1:num_Output)
    delta_2_weights_2_ij(j) = O(j)*(1-O(j))*(Y(k)-O(j));
    end;


    for j =(1:num_Hidden)
        sum = 0;
        for l=(1:num_Output)
            sum = sum + delta_2_weights_2_ij(l) * weights_2_ij(j,l) ;
        end;
    delta_1_weights_1_ij(j) = h(j)*(1-h(j))*sum;        
    end;


    %updating weights
    %calculating new weights        
    for i =(1:num_Input)
         for j =(1:num_Hidden)                        
            weights_1_ij(i,j)  = weights_1_ij(i,j) + eta * delta_1_weights_1_ij(j) * x(i) ;
        end;    
    end;        

    %computing bias
    for j =(1:num_Output)                        
            biasHidden(j)  = biasHidden(j) +  delta_2_weights_2_ij(j) * 1 ;
    end;            


    %updating weights
    %calculating new weights
    for i =(1:num_Hidden)        
        for j =(1:num_Output)
            weights_2_ij(i,j)  = weights_2_ij(i,j) + eta * delta_2_weights_2_ij(j) * h(i) ;
        end;    
    end; 

   %computing bias
    for j =(1:num_Hidden)                        
            biasInput(j)  = biasInput(j) +  delta_1_weights_1_ij(j) * 1 ;
    end;            

% error as output approaching target
error = error + sqrt( (O(1)- Y(k)) * (O(1)- Y(k)) );

end 
if mod(epochs, 10) == 0
save('weights.mat','weights_1_ij','weights_2_ij','biasInput','biasHidden','testLabels','testSet','trainingSet','trainingLabels');
end

end
end

function [correctlyClassified,count3,count8,unClassified,v] = testBP(testingSet,weights_1_ij, weights_2_ij,biasInput, biasHidden, testLabel, num_Hidden,num_Output)

correctlyClassified = 0;
count3 = 0; count8=0;   TP=0;    TN=0;     FP=0;     FN =0; P=0; R=0; F=0;

[testLengthRow,testLengthCol]=size(testingSet);
unClassified(1:10 ,1: testLengthCol) = 0;

% checking accuracy by  number of correctly classified

for k=(1: testLengthRow )
    x=testingSet(k,:);
    S1(1:num_Hidden)=0;
    S2(1:num_Output)=0;
    num_Input =testLengthCol;

    %calculating
    for j =(1:num_Hidden)
        for i =(1:num_Input)
            S1(j) = S1(j) + weights_1_ij(i,j) * x(i) ;
        end;
        S1(j) = S1(j) + biasInput(j) * 1;
    end;

    h(1:num_Hidden) = 0;
    for j =(1:num_Hidden)
        h(j)= 1/(1+exp(-1*S1(j)));
    end;

    for j =(1:num_Output)
        for i =(1:num_Hidden)
            S2(j) = S2(j) + weights_2_ij(i,j) * h(i) ;
        end;
        S2(j) = S2(j) + biasHidden(j) * 1;
    end;


    O(1:num_Output) = 0;
    for j =(1:num_Output)
        O(j)= 1/(1+exp(-1*S2(j)));
    end;


    %    error as output approaching target
    if sqrt( (round(O(1))- (testLabel(k))) * (round(O(1))- (testLabel(k)) )) == 0
        % correctly classified examples
        correctlyClassified=correctlyClassified+1;

        %compute  TP, TN
        if(testLabel(k)==1)
            TP = TP+1;
        else
            TN = TN +1;
        end

    else
        % wrongly classified examples
        if(testLabel(k)==1)
            FN = FN+1;
        else
            FP = FP +1;
        end
        %storing 5 misclassified  classes from each class
        if(count8<5 && testLabel(k)==0)
            count8 = count8 + 1;
            unClassified(count8,1: testLengthCol) = testingSet(k,1: testLengthCol);
        end
        if(count3<5 && testLabel(k)==1 )
            count3 = count3 + 1;
            unClassified(count3+5,1: testLengthCol) = testingSet(k,1: testLengthCol);                
        end
    end

end

  k

  %compute Precision, Recall and F-value from the confusion-matrix counts
  P = TP/(TP+FP);
  R = TP/(TP+FN);
  F = 2*P*R/(P+R);

  %v stores [TP, TN, FP, FN, Precision, Recall, F-value, Accuracy]
v=[TP,    TN,     FP,     FN,     P,      R,      F,    correctlyClassified/testLengthRow]
disp('TP,    TN,     FP,     FN,     Precision,    Recall,    F-value,    Accuracy');


 unClassified;
 accuracy = correctlyClassified/testLengthRow  ;   
 accuracy

end

Backpropagation algorithm for 10 digits learning

Applying the BPNN classifier on the original multiclass data with the same experimental settings as in the binary case above, the results are not as high, and only a few epochs could be performed, again due to limits on the computing capacity of the systems used for experimentation. This is because the complete Neural Network and the initial weights have to change with the change in the problem. The outputs are the 10 classes, each represented as a combination of 1's and 0's over the ten output neurons. Conclusion: parameter experiments for the 10-class data still need to be performed.
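A small sketch of the 1-of-10 target encoding described above (assumption: labels is a column vector of digits 0-9; each row of Y becomes one target pattern):

Y = zeros(numel(labels), 10);
for k = 1:numel(labels)
    Y(k, labels(k) + 1) = 1;      % digit d activates output neuron d+1
end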

Misclassified Images for 2 class digit recognition problem

Code was added to each of the algorithms to store 5 misclassified samples from each class in a .mat file. A script in this folder reads each misclassified row from this .mat file and converts it into a 28 x 28 matrix in image form. The script is as follows:

%This is a script that reads data which was unclassified and stored in a
%mat file and displays the image file corresponding to that data.

data = load('mat38unclassified.mat');
A = data.unClassified;
[r,c] = size(A);
%read all r data that are present in unclassified mat   file
for t=1 : r
 k=1;   
 x(1:c)=A(t,1:c);
a(1:28,1:28)=0;

%convert it back to matrix form
for i=1: 28
        for j =1: 28
            a(i,j)= x(k);
            k=k+1;
        end
end

% display each of the figure separately
figure(t); 
imshow(a);
end

The following are misclassified examples from the 3-and-8-digit backpropagation program for the 2-class problem.

The following are misclassified “3” digits.

Comparison with other ML Techniques

Method              | TP  | TN  | FP | FN | Precision | Recall | F-Value | Accuracy
Logistic regression | 936 | 990 | 20 | 38 | 0.9791    | 0.9610 | 0.9699  | 0.9708
Naïve Bayes         | 938 | 903 | 72 | 71 | 0.928     | 0.928  | 0.928   | 0.92

Comment: Logistic Regression is performing the best, and Naïve Bayes is also performing well.

The regularized neural network (RNN) takes a large amount of time to learn, so its execution time is very high.

Naïve Bayes requires the extra overhead of discretizing the data. Otherwise, Naïve Bayes is a fast algorithm in terms of execution time, as its output is computed directly and does not require iterations.

The accuracy in these experiments was limited by execution time and system restrictions, which prevented further optimization of the error and hence a further increase in accuracy. This does not mean that a particular method does not perform well; it is just an experimental setup, for education and learning purposes, showing how to go ahead with experimentation.

Weka was used for Naïve Bayes, and since the settings cannot be changed there, the true positive rate and false positive rate come out to be 0.92 and 0.926 respectively, which corresponds to the upper right-hand region of the ROC graph.

Naive Bayes ROC curve: using the Weka GUI it is not possible to create a full ROC curve, though the (TPR, FPR) point is given as (.926, .92). Also see the screenshot below for details. Only a two-point plot is possible.

ROC for implemented Logistic Regression:

Note again: far fewer epochs were performed in the experiments due to constraints of the computing device used; this is very few compared to what is feasible when GPUs are used. But all this is for illustrative purposes only. Also, an iterative approach has been used in the code implemented for illustration; the modern approach uses matrix computations for efficiency in processing, given the advances in processors and the computing facilities available for matrix computation.