OpenCV for Machine Learning Using C++

In this article, I will give you some examples of using OpenCV and C++ for machine learning (KNN, SVM, BOW).

If you have any trouble while installing OpenCV with C++, I made this video to show you how you can install it easily.

KNN with OpenCV

Definition

In artificial intelligence, more precisely in machine learning, the k-nearest neighbors (KNN) method is a supervised learning method.

In this framework, we have a training database made of N “input-output” pairs. To estimate the output associated with a new input x, the k-nearest neighbors method takes into account, with equal weight, the k training samples whose inputs are closest to x according to a chosen distance. Since this algorithm is based on distance, normalization can improve its accuracy.

Method

So, to do the classification with the KNN method, we must first declare the training samples, that is, the input data.

After declaring the zeros and ones matrices, we fill them with random values drawn from a normal distribution using OpenCV's RNG class.

Now we need to concatenate the two classes so we can do the learning.

The training

So, to do the training, you must first create the model using this line of code:

Now we loop over the whole image (the 500×500-pixel plane we chose for drawing the zones of the different classes and, of course, for placing the points). For each pixel, the model's output is either 0 for the first class or 1 for the second. By computing the distances, we fill each pixel with the color of the class it belongs to, which lets us visualize the boundary.

Here, the find_nearest function computes the distances between the query point and the training samples to find its neighbors, and returns 0 for the first class or 1 for the second. Finally, we use an if/else block to give each class its own color.

With these steps, the training is done, the two classes are found, and of course the boundary is drawn, as you can see here:

Now, if we want to display the samples we used for training, we just put each point in its place; this way we can see whether the training went well or not.

So here is the output image after the representation of the samples:

We can see that the boundary separates the two classes well; a few points were not classified correctly, but overall the learning is good.

The test

For the test, we declared a matrix of 200 elements, computed the nearest neighbors of each sample, and added a small loop to count the well-classified and misclassified samples for the two classes.

Here is the representation of the test samples:

Evaluation

Now we have to evaluate our model by looking at the well-classified and misclassified samples.

To understand what we are going to do, I will give you a small example.

In our case, we have two classes, the first one containing the green points and the other one containing the red points.

To make the evaluation, we have to designate one class as positive and the other as negative, so here is an example:

We will say that the green dots represent the people who are not sick (the positive class) and the red dots the people who are sick (the negative class).

  • If a sick person is classified in the sick class (with the red dots), this is a true negative (true because the classification is correct, negative because the person is sick).
  • If a sick person is classified with the people who are not sick, this is a false positive (the person was wrongly predicted as not sick).
  • If a person who is not sick is classified with the people who are not sick (the green dots), this is a true positive.
  • If a person who is not sick is classified with the sick people, this is a false negative (the person was wrongly predicted as sick).

Starting from this principle, we evaluate our result so that we can compute the accuracy and the other metrics…

TP: true positive

TN: true negative

FP: false positive

FN: false negative

Since 16 of the green points and 9 of the red points are misclassified, and each class contains 200 points in total, we can fill in the table easily.

Now using these values, we can calculate the accuracy…

The Accuracy

The Recall

The Specificity

The error of each class

Class 1:

Class 2:

Total error

SVM with OpenCV

Definition

Support vector machines (SVMs), also known as large-margin separators, are a family of supervised learning techniques designed to solve discrimination and regression problems. SVMs are a generalization of linear classifiers.

SVMs can be used to solve discrimination problems, i.e. deciding which class a sample belongs to, or regression problems, i.e. predicting the numerical value of a variable. Solving either problem requires building a function h that maps an input vector x to an output y: y = h(x)

In our case, we will classify two classes containing random points, as we did for the KNN method.

Method

In this method, we will do almost the same thing we did for KNN, but in two distinct parts. The first is the training, using two classes of 100 random elements drawn, as before, from a normal distribution; then we generate separate test samples of 200 elements and run the prediction on them.

If we always reused the training samples, we could not study the performance of the generated model.

So, to do the learning, we again create two matrices for the two classes, concatenate them, and feed them to the SVM model (the same thing we did before).

To do the learning, we have to execute the following line:

But before that, we have to initialize the model parameters (we used the RBF kernel):

Note that when launching the training we also pass the vector of class labels: this is supervised learning, so we have to give the class of each sample to the model.

Display of the boundary

After the training, we need to display the data as we already did for KNN, so we can see the boundary and check whether the model chose it well according to the training data we had.

We notice that the boundary really separates the two classes; some points were a little difficult to classify, but overall the learning went well.

Test

Now we need to test our model by declaring other test samples and running them directly through the model's predict function to predict which class each point belongs to.

So to do this, we declared two more matrices for the two classes and each matrix contains 200 test items.

And finally, it is just a matter of passing the concatenated matrix (which contains the data from both classes) with the following lines:

So here is the test result:

Evaluation

Since 16 of the green points and 9 of the red points are misclassified, and each class contains 200 points in total, we can fill in the table easily.

The Accuracy

The recall

The specificity

Errors of each class

Class 1:

Class 2:

Total error

Conclusion

To conclude, the values obtained in the evaluation are very satisfactory: the error is very small and the accuracy is high, which means the model has learned well, and an SVM with this configuration will work well for this type of application.

Bag of words

Introduction

After seeing how to use SVM on a basic case, we will try to apply an SVM model to images to perform classification.

To do this, we have a database of 4 classes with around forty images for each class.

But the problem is that a single image contains many features, so if we used the images as they are, the input matrix would hold a huge number of values, making the learning process very slow, and we would not even know whether they are the right features.

That’s why we need to find a method that helps us extract the necessary information from the image and we’ll do the learning on this information. Because if we don’t do that, then the input vectors will be whole images.

So the idea is to compute descriptors that contain only the important information of the image; here we will use the BOW (bag of words) method with SIFT.

The SIFT method extracts the points of interest of an image (contours, circles, and so on). If we keep only these points of interest, the input vectors become much smaller and contain only what is necessary, so the learning will be very fast.

Creation of the dictionary

Before doing the learning, we must find the points of interest and put them into a single matrix called the dictionary. From this dictionary, we group the images that share the majority of their points of interest (during the learning).

To extract the points, we will use OpenCV functions that do the job. We decided to take 10 images of each class for the training, so the dictionary must hold the information for those 10 images. Here are the lines of code we used to extract the points of interest:

These lines of code extract the points and put them into the descriptor, and finally we add this descriptor to the dictionary, which we save and reuse in the learning part.

The training

Once we have extracted the points of interest and filled the dictionary, we need to compute, for each training image, the histogram of visual words it shares with the dictionary.

These histograms are what we use for the training, with the following line of code:

And for the parameters of the model, we used the same ones as for the 2-D points, including the RBF kernel.

So after the training is finished, we have to save the model so that we can use it later in the test part.

The test

Now we test our model using OpenCV's predict function. As before, we do not feed the whole image to the model: we apply the same SIFT algorithm to extract the points of interest of the test image, and the resulting descriptor is what goes into predict.

So for the test I chose 15 images of each class so that we can build the confusion matrix simply.

After applying this method to all the test images, I got the following confusion matrix:

This matrix means that:

  • For “Accordion”: of the 15 test images, 10 were well classified and 5 misclassified.
  • For “Airplanes”: 11 well classified and 4 misclassified.
  • For “Anchor”: 5 well classified and 10 misclassified.
  • For “Ant”: 8 well classified and 7 misclassified.

We notice that the model did better on ‘accordion’, ‘airplanes’ and ‘ant’ than on ‘anchor’: the anchor looks a lot like the airplane, so their points of interest were very close, which made it difficult to tell these two objects apart at test time.

Evaluation

In order to make the evaluation, we will calculate the errors of each class and the global error. Even so, we can already notice that the model performs less well.

After calculating the local errors, we can see that the model is not really efficient, which is due to several factors:

  • The classes are visually very close, so the extracted points of interest are quite similar across the 4 classes.
  • For the training data, we used only 10 images per class, whereas more would be needed for the model to learn from many examples.
  • Likewise for the test: we tested on only 15 images; perhaps with more test data the error would decrease a little.

Conclusion

The use of SVMs with SIFT is a very good approach, because it avoids learning on all the pixels of the images and gives the model only the features needed to classify an image. In this case, however, it was not enough to obtain a good model with good classification; perhaps the descriptors were not well chosen. This is one of the problems we face with machine learning algorithms that do not use neural networks: we have to choose the features ourselves (that is, plug in external algorithms that extract them). With a neural network, all this work is automatic, and feature extraction is done by the network during training.

Unfortunately, we cannot use a neural network in our case because of the lack of data; and with so little data, I think even a neural network would not give more accurate results.

You can find the code at this link.

Happy learning!
