Image of flowers with only 12 colors
I was pretty busy with a lot of things, but the wish to try again was always somewhere in the back of my head. A couple of weeks ago I reviewed an online course on Machine Learning that I followed on Coursera. This is probably one of the best courses I have followed, but as with all things, if you don't use the skills you learned frequently, you lose them. Still, I remembered some of the topics, and what I particularly wanted to learn again was clustering.
In clustering you let your algorithm search for groups of points lying close together. I don't want to explain the whole subject here, and since the course explains it so well I would suggest watching the videos there.
I believe they have now opened the course in such a way that you can follow it at any moment and don't have to wait for a new session to start. If you ever want to learn about fitting data, machine learning, or Matlab/Octave programming, this is a great place to start.
Back to my small project.
Finding Clusters in RGB space
Every pixel in an image has three color values: red, green and blue. I view these as points in a cube whose sides go from 0 to 255. If there is, for example, a lot of sky in an image, there will be a lot of points with a light blue color close to each other in this cube. I want to find that cluster of points. Time for an example.
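To make the "points in a cube" idea concrete, here is a minimal Python/NumPy sketch (not the code I used for the pictures) that turns an image into a list of RGB points. The function name `image_to_points` and the tiny 2x2 test image are made up for illustration.

```python
import numpy as np

def image_to_points(image):
    """Flatten a (height, width, 3) RGB image into an (n_pixels, 3) array of points."""
    image = np.asarray(image)
    # Each row is one pixel: a point (red, green, blue) inside the 0..255 cube.
    return image.reshape(-1, 3).astype(np.float64)

# Hypothetical 2x2 image: three light blue "sky" pixels and one green pixel.
image = np.array([[[135, 206, 235], [130, 200, 230]],
                  [[140, 210, 240], [30, 160, 40]]], dtype=np.uint8)

points = image_to_points(image)
print(points.shape)  # (4, 3)
```

The three sky pixels end up as three points lying close together in the cube, which is exactly the kind of group the clustering should find.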
original picture
picture with only 12 colors
In the pictures you can see the result of my algorithm. I have looked for clusters in the RGB cube, 12 in this case. In the end every pixel belongs to one of the clusters and the original color is replaced with the mean color of the cluster.
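The procedure described above can be sketched roughly like this in Python with NumPy. This is an illustration under my own assumptions, not the original code: the function name `quantize`, the number of steps and the seed are all made up, and it assumes the image has at least `n_colors` pixels.

```python
import numpy as np

def quantize(image, n_colors=12, n_steps=10, seed=0):
    """Replace every pixel by the mean color of the cluster it belongs to."""
    image = np.asarray(image)
    pixels = image.reshape(-1, 3).astype(np.float64)
    rng = np.random.default_rng(seed)
    # Start from randomly chosen pixels of the image itself.
    centers = pixels[rng.choice(len(pixels), size=n_colors, replace=False)]
    for _ in range(n_steps):
        # Squared distance from every pixel to every center: (n_pixels, n_colors).
        dist = np.stack([((pixels - c) ** 2).sum(axis=1) for c in centers], axis=1)
        labels = dist.argmin(axis=1)
        # Move each center to the mean color of the pixels assigned to it.
        for k in range(n_colors):
            members = pixels[labels == k]
            if len(members) > 0:  # an empty cluster keeps its old center
                centers[k] = members.mean(axis=0)
    quantized = centers[labels].round().astype(np.uint8)
    return quantized.reshape(image.shape)

# Usage on a random test image (a stand-in for a real photo).
test_image = np.random.default_rng(1).integers(0, 256, size=(20, 30, 3), dtype=np.uint8)
result = quantize(test_image, n_colors=12)
print(len(np.unique(result.reshape(-1, 3), axis=0)))  # at most 12
```

Note that the loop runs a fixed number of steps, just like my own quick version, so it makes no promise that the clusters have settled.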
The result looks pretty good in my opinion. I could make a nice paint by numbers picture with this :)
But how does it perform on other images? And does it always give the same result?
That second question is not a trivial one. I choose my starting points (the initial mean colors of the clusters) at random by taking points from the image (the RGB values of pixels in the original image), so two runs do not have to start from the same place.
Example 2: Room for improvement
The top image is the original image and the three images after that are from three different runs. When you inspect them closely you notice that they are quite different. Why is this happening?
First of all, I only let the algorithm run for a short time, and I'm not even checking whether it has finished. By finished I mean that a new step in the calculation no longer changes the answer.
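A check for "finished" could look something like the sketch below: keep stepping until the assignment of points to clusters no longer changes, with a step limit as a safety net. The function name `kmeans_until_stable` and the `max_steps` parameter are my own invention for this example.

```python
import numpy as np

def kmeans_until_stable(pixels, centers, max_steps=100):
    """Iterate until no point changes cluster; returns (centers, labels, steps, converged)."""
    pixels = np.asarray(pixels, dtype=np.float64)
    centers = np.array(centers, dtype=np.float64)  # work on a copy
    labels = None
    for step in range(1, max_steps + 1):
        dist = np.stack([((pixels - c) ** 2).sum(axis=1) for c in centers], axis=1)
        new_labels = dist.argmin(axis=1)
        # Finished: this step assigned every point to the same cluster as before.
        if labels is not None and np.array_equal(new_labels, labels):
            return centers, labels, step, True
        labels = new_labels
        for k in range(len(centers)):
            members = pixels[labels == k]
            if len(members) > 0:
                centers[k] = members.mean(axis=0)
    return centers, labels, max_steps, False

# Two dark and two bright points, with both starting centers placed in the dark group.
points = np.array([[0, 0, 0], [10, 10, 10], [250, 250, 250], [240, 240, 240]])
centers, labels, steps, converged = kmeans_until_stable(points, points[:2])
print(converged, labels)
```

Even with this unlucky start the two centers end up in the middle of the dark pair and the bright pair, and the loop stops by itself instead of after an arbitrary number of steps.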
Second of all, the starting points determine the outcome. I would have to do this calculation a lot of times and then take a sort of average of the end results, and maybe throw away bad results.
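One simple way to deal with the random start, sketched below, is a bit different from the averaging idea: run the whole thing several times and keep only the run where the points lie closest to their cluster means (the smallest total squared distance). That is a swapped-in technique, not something I have tried on these images yet, and the names `run_kmeans` and `best_of_restarts` are made up for the example.

```python
import numpy as np

def run_kmeans(pixels, n_colors, rng, n_steps=20):
    """One run from random starting pixels; returns (centers, labels, cost)."""
    centers = pixels[rng.choice(len(pixels), size=n_colors, replace=False)].copy()
    for _ in range(n_steps):
        dist = np.stack([((pixels - c) ** 2).sum(axis=1) for c in centers], axis=1)
        labels = dist.argmin(axis=1)
        for k in range(n_colors):
            members = pixels[labels == k]
            if len(members) > 0:
                centers[k] = members.mean(axis=0)
    # Cost: total squared distance between each point and its cluster mean.
    cost = ((pixels - centers[labels]) ** 2).sum()
    return centers, labels, cost

def best_of_restarts(pixels, n_colors=12, n_restarts=10, seed=0):
    """Repeat the run with different random starts and keep the lowest-cost result."""
    rng = np.random.default_rng(seed)
    pixels = np.asarray(pixels, dtype=np.float64)
    best = None
    for _ in range(n_restarts):
        result = run_kmeans(pixels, n_colors, rng)
        if best is None or result[2] < best[2]:
            best = result
    return best

# Usage on random points standing in for the pixels of an image.
pts = np.random.default_rng(2).integers(0, 256, size=(200, 3))
centers, labels, cost = best_of_restarts(pts, n_colors=4, n_restarts=8, seed=5)
print(centers.shape, cost)
```

Picking the best run throws away the bad results automatically, but it does not average anything, so it is only one possible reading of the plan above.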
Hope to come back soon with a Part III to this topic.