Loving Data and Everything Else

Tuesday, July 1, 2014

Reducing the number of colors in an image

OK first all, one of my most favorite astronaut pictures showing Anna Lee Fischer. To me it looks it's from a movie or something, it's just that cool.

Anna Lee Fischer

Long story short, I was wondering if I could reduce the number of colors (the grays) in this picture. Just for fun and exercise. I could try an image/photo editor, but it's always more fun to do it yourself right.

I went to work in Matlab and to just show you the end result first (before all long boring programming details). Tadaa!

Only 5 grays left

How it's done

The main idea I had was finding the colors that are most dominant. We therefore look at the histogram of the image as shown below. Values run from 0 to 255 on the x-axis and the y-axis represents the number of occurrences of a certain gray value in the image.

Something I find strange are the very regular holes in the histogram where a certain gray value apparently never occurs. Apart from that it's not a smooth line, there are about 5 local maxima visible.

How do we find those?

Finding the peaks

I first removed the holes by using the average of the gray values around it. After that I applied a Gaussian smoothing. Before I tell you about how I find the peaks I'll first show the result of smoothing and finding the peaks.

You can see that the line is now very smooth and there are small red lines to show where the peaks are found. I find it a very satisfactory result. It may seem like the first and last line are just a bit off, but that is only because the steps are integers and the lines are placed on the integers. So i.e. to me and you the peak should be at 18.7, but it is found at 19.

The short explanation of how the peaks are found is this. I let the program walk along the line until it reaches the top. That means the next position has a lower value instead of a higher value. After that we continue to find the valley, so look for a position where the next element is higher instead of lower. And we go back to finding the next peak. I'm only interesting in the peaks of course. I don't tell the program how many peaks there are, just to stop when it's at the end of the histogram.

Changing the colors

To create the new image I take the value of every pixel in the original image and look to which peak value it is closest. Then it is changed to that peak value with a preference to the lowest/darkest value. The original image was pretty noisy so you can see this in the end result.

When you use this method you have no control over how many colors the end result will have. This can be an advantage or a disadvantage depending on your needs. But I like it that we don't put in any previous knowledge and the program itself finds the most dominant values.

I guess this whole process or parts of it has a name, but it was still nice to built myself from scratch.

I'll try how the program works with color images, which will need some adjustments.

Tuesday, June 24, 2014

Presidents in books

I just read about Ngram Viewer. A Google project where you can search for words of phrases in books Google has digitalized. Here I search for presidents of the United States, from FDR till Jimmy Carter. Non interactive picture:

Monday, June 23, 2014

Summation Trick for Object Recognition

In this article I want to explain a feature of the blob detection I built.

I've been working on my autonomous boat project and what I needed was a way to detect where the boat was and where it was going. I chose to put bright stickers (green and blue) on the boat. Here's a nice picture of me testing out the detection algorithm.

Program recognizes a green and blue spot on the paper

Binary images

The beginning of finding these object is simple. The video images from the webcam have three color channels (red, green and blue). So the first step in this process is to look at pixels and decide if we find them particularly green (I'll talk about green, but you can replace it with blue and you get the same story). If this is the case we assign a 1 to that pixel and otherwise we assign a zero, so we get a binary image.

There is a big possibility that there are some pixels that are green, but don't belong to the sticker I'm looking for. We do know that the middle of the sticker is probably surrounded by a lot of green pixels.

Think of it this way, if I could add up all the pixels in my binary (1's and 0's) image around some pixel I'll know if I'm looking at a green area or not. If the sum is small it was just some stray green pixel, if the sum is really high I've found a really green area. I'll try and find the pixel which gives me the most green neighbors and decide that that pixel is the middle of sticker (or close enough).

Sound easy, but it felt like the computation of all those sums could be to heavy to perform in a real-time video. To estimate how many calculations that is. Say we have a 800 by 600 video image (small) and we look at an area of 10 by 10 pixels around every pixel (small again). That means roughly 800*600*10*10=4.8 million pixel additions!

I don't know if Processing can handle that at 30 frames per seconds, but I didn't want to try when I already knew a way to speed it up by a factor of 25. Not possible?

Less is more

I'll explain the method I used in one dimension first before I explain the 2D image method.

Imagine you have a vector of numbers i.e. U = [2, 5, 7, 1, 3, 2, 8, 6 , 7, 2, ....] and you want the sum of 10 consecutive numbers in this vector, for every position.

To make it clear I give the positions a name 1, 2, 3, etc.

So this sum I am looking for is

V=[1+2+3+4+5+6+7+8+9+10, 2+3+4+5+6+7+8+9+10+11, ....].

Instead of doing this directly we take an intermediate step. We construct the following vector:

As you can notice the first sum I am looking for can already be found in position ten of this new vector. In position eleven we find 1+2+3+4+5+6+7+8+9+10+11, which is almost the second sum we are looking for. So have so subtract 1 which is found in position one of our new vector.

Last example: in element twelve we find 1+2+3+4+5+6+7+8+9+10+11+12, where the 1+2 are too much. This is found in the second element of the vector w.

As you hopefully understand know. You only have to look at two numbers in our intermediate vector and subtract them to find the sum of four numbers. How much does that save in total calculations?

Building the intermediate vector costs 2*n additions

pseudocode:

w(1) = v(1)

for i = 2:n

w(i) = w(i-1) + v(i)

endfor

And after that the summed vector s takes another 2*n additions

s(1) = w(10)

for i = 2:n
s(i) = w(i+10)-w(i-1)

endfor

So that brings the total to 4*n numbers that need to be added. If we had done it in the normal fashion it would have been 10*n additions.

The more numbers that needed to be added, the more you save.

More of that

But what about 2D? We want the sum in a whole around a pixel. Again we have to take an intermediate step to reduce the number of additions. I usually think of an image as a big matrix, so imagine one where every position has number as a name.

In this example the rows are 100 elements long, but that doesn't matter. We'll call the new intermediate matrix N and the first row will look very familiar now:

[1, 1+2, 1+2+3, 1+2+3+4, ....]

The second will be filled as follows [1+101, 1+2+101+102, 1+2+3+101+102+103, ...]. What we have done is adding elements again from the second row in M, but also add the element that lies above in N. The very last element in the matrix N while have the sum of all the elements of M if we continue like this.

One example should show how easy it is to find the sum of an area now. Say we want the sum of the elements in squares 102, 103, 202 and 203. This is a simple example that doesn't speed up, but it is easier to understand.

Let's look the third element in the third row. It has these four numbers added we are looking for. But also a couple to much (1,2,3, 101 and 201). The first can be removed by subtracting the square in blue. 101 and 201 can be removed by substracting the square in purple. You see that we subtract 1 twice, but we can correct this by adding the red square.

Ok, in this example we add and subtract four numbers when we could have added the four numbers directly.

But if you wanted to have the sum of a ten by ten area the calculation would still only involve four numbers ,instead of a hundred (tadaa factor 25!!).

For a position i,j in an image for an area of n by n pixels the sum becomes:

For me it's an old trick, but maybe someone finds it useful. I still find it satisfying when I find an opportunity to use it.

Thursday, May 22, 2014

Stop Motion Movie

I have always loved these movies where you see the clouds rushing by. Where a scene is played fast forward through a day, seeing plants open up, are traffic increasing and decreasing over time.
The way that is done is usually taking a picture from the same position with some time interval and then putting them all in a movie. Or you can slowly change the position of your camera.

Anyway, I was working with the webcam last week for my object recognition project so I had the knowledge fresh in my head. I quickly made a program that finds your webcam(s) and then takes pictures with fixed time intervals (adjustable in whole minutes).
I tested it out on the view outside my window hoping for the bad weather that was said to be coming. It did not come. So here you can see a movie of pictures with a one minute interval of clouds (and sometimes a bird).

For those who are interested, I ran this program in Processing 2.2.1 (64 bit) and that worked fine, some earlier version were giving me problems. My own program can be found here, but you'll have to change certain things to make it fit your own needs.

Webcams

I attached an extra webcam to my laptop, so there is a choice of two. But besides that, the list Processing shows gives different sizes and frame rates for you webcam(s). The list starts with 0, but you can fill in any number from the list and you should get an image.
Find the line that says:

cam = new Capture(this, cameras[0]);

Maybe there's a different number then 0, but that's the place to change it. You should find it around line 26 in the code. I picked a very low resolution for testing, so the video looks bad in my opinion. For future video's I think the only limitation is the size of my hard drive.

Other tweeking

The time step, called 'step' in the code was set to 1, but you can change it to any whole positive number between 1 and 59 you like.
It could be necessary to change the screen size, because that is set to 'size(640,480)' at the moment.
Files are save as 'hour+minute.jpg' in a directory called '/images'. You could change that to your own preferences.

Making the movie

In the original pictures I had the camera wasn't positioned horizontally and I had to rotate them. I did this in Matlab because the rotation was the same for every image and using a script was a lot faster then using Photoshop or some other image program.
When my images where rotated correctly and square again I loaded everything in Windows Movie Maker, which is downloadable for free. Ones they were all loaded it selected them all and under 'Edit' you can set the 'Duration' of every frame. I set it to 0.25 seconds which gave me a smooth running video.

I saved the movie and uploaded it to YouTube. What I think is pretty neat about YouTube is that editing is really simple. There is music to pick from and it even suggests things like video stabilization.

So I hope this was useful for somebody. Nothing is stopping you from placing your webcam somewhere and making a cool movie.

EDIT:
After a request I have modified the program so you take pictures any multiple of seconds.
Click here to download.

Tuesday, May 20, 2014

Do It Yourself Tracking Software

Introduction

In a preview I showed a radio controlled boat where I hacked the remote control. With an Arduino Uno you can send a signal to the remote which then sends it to the boat. A small video of my daughter trying it out to show it works.

My ultimate goal is to make the boat autonomous. I want to use a webcam to determine the position of the boat and then be able to steer in the right direction. So that could be making sure it doesn't crash into the wall or that it follows a certain route, I haven't made up my mind about that yet.

Choosing the system

When I started with this project I was thinking about all the obstacles and steps I would find along the way. But I soon realized that I was making the rules here. I could choose what system I was using and how difficult it was going to be. That sounds a bit vague so I guess I'll give an example. When I saw the remote control I immediately was thinking about the Arduino. Nowadays there are quite a few ways for computer-machine interaction but since I have some experience with the Arduino I knew I would have less problems implementing it.

One of the bigger challenges was using a webcam and it's images to send the right commands to the Arduino and so the boat. I followed part of a course last year on Coursera called: Creative Programming for Digital Media & Mobile Apps. All the apps they worked with were built in Processing which has a very low entry level and let's you make Android apps and JavaScript apps. It's really meant for fun user interfaces.
Bonus: It's really similar to the Arduino language and even the programs look similar. So if you know one, the other will be a cinch.
And yes you can access the webcam with processing and use the images to do whatever you want.

Object tracking

That was a long intro, bear with me.
Processing doesn't have object recognition functions built in and I figured any program you could find on the internet would not be suitable for what I wanted. So I decided to make it myself which is always more fun.
Accessing the webcam wasn't that hard (there are examples in Processing) but I needed some boundaries.
What did I want to recognize? The boat, but I didn't feel like building a Machine Learning algorithm that knows what the boat looks like and finds. Possible I guess, but pretty hard work. I wanted something simple.
I made the choice of placing a colored sticker (or maybe two) on the boat, let's say red. I intend to place the webcam above the water and I can choose what color bottom of the water will have. I could make sure that the only red thing in the webcam image is the red sticker. This is what I talked about earlier, I can make the choices of how difficult the problem is going to be. I have reduced the problem to finding a red dot in an image.

Finding the red dot

To give you a break from all this reading, here is a demo of my Processing program using the webcam on my laptop to follow a small red object.

In my hand I'm holding a small piece of red plastic. The program takes a webcam image and analyses it in two steps.
First it performs a threshold. Using the fact that red objects are red :) it assigns a '1' to pixels that have lot's of red and little green and blue (i.e. white pixels have lot's of red, green and blue) and it assigns a '0' to all the other pixels.
Next it calculates how many pixels in the neighborhood are found red and finds the pixel with the most neighboring red pixels. That neighborhood can be quite big, for instance this was done with 10 by 10 pixels.
The way these calculations are done can and will be topic of a separate blog entry.

To show the result it draws a white circle at the found position.
You can vaguely see image it works with around the white circle. That represents the number of pixels around every pixel. If you would have a stepfunction in your image the result of counting the neighboring pixels would look like something below.

Sketch of how the counting of neighboring pixels would look like

It reminds me of a Gaussian blur. Most important is that you see that the middle of the original platform results in a high peak in the second image. Important is choosing the right neighborhood size as you can imagine.

In the near future

Keeping the main goal in sight, tracking and controlling the boat there is still some work to be done.
I intend to place two stickers one the boat, a blue one and a green one and track them both. With the position of these two stickers you can determine the position and orientation of the boat. If you use previous positions and orientations you can calculate the speed and direction of the boat.

Then I have to think of some rules the boat has to follow like: don't touch the walls, or go around in circles.

So far the program is perfectly capable of doing all the calculations in real time, and I hope that it still does so when all these features are added.

To be continued...

The boat

Thursday, May 1, 2014

Friend speed

How fast can I get Facebook Friends?

41,23 friends per hour!

Awesome! You must wonder what that means and where I get this from.

A while ago I registered for a Facebook profile because:
1) All the cool kids are doing it
2) I want to watch holiday and baby pictures from friends
3) I wanted to do an experiment

Everyone I spoke about opening a FB account told me you get friends really fast. So I set out to see if this was true.

The experiment

When you don't change anything on the original settings you get an e-mail message about everything that is happening. That's why people switch this off immediately. You're on FB 24/7, you don't need see the messages twice, right?
I left the settings as were and registered every friend request and all the request I sent myself. The graph below shows you how that looks like. Yes, this was tedious work.

Most of it is quite clear I guess. On the horizontal axes we see the time in hours, and on the vertical axes the total number of friends, and the number of requests I sent myself. The numbers in the top are just the days, so we start on day one somewhere in the morning.
Funny thing is, within one minute of registering I received two friend requests without any other action then signing up.That means that those people somehow saw that I had joined, not sure how that works.
I sent maybe two or three requests, but I waited to see what would happen. The number of friends is rising, what happened?

Well in this case one of my first friends made friend suggestions and since I was in need of friends I clicked them all I guess. During day two I didn't do anything and I noticed that the rate at which my number of friends was growing was slowing down.

Sending requests

Therefore I sent out a lot of requests on day three, which you can clearly see from the quick rise of the blue line. And yes! I got an immediate response and my total number of friends sky rocketed. Let's zoom in on that event.

Total number of friends after sending lots of requests.

In this picture we can see that the total number of friends start to grow quite fast until.... people go to sleep. You can see it stops at about hour 72 which is the end of day 3. A bit more than eight hours later it picks up again.

Friend speed

So I mentioned speed, or 'friend speed' as I like to call it. How do we calculate that? There are a couple of ways to calculate that. You could for instance look in every hour how many friends were added. Downside is, that in most hours no friends were added at all and in some others maybe four of five. This would be a very non informative graph.
A second method is calculating the time between two confirmations. Plus, you get a value for all the data points. Minus, some friends sent their confirmation at the same time (the same minute). That means the time was zero between them and so the speed was infinite at that moments. Infinite works very bad in graphs.

To solve that problem I let R (used in the graph making) draw something of an average through the data points, which is called an LOESS fit as can be seen in blue in the image below.

Blue line depicts a LOESS fit on the friend datapoints.

I think you lose 90% of your crowd with every graph you use, so I'm down to 0.1% of my readers.
On this line I performed this calculation I just mentioned of calculating the time difference between two events which should be an approximation of the speed. In the image below you can see the calculated 'speed' compared with the total friends data.

As my number of friends is rising very promising we see a terrifying event. The speed of gaining friends is dropping. We see what we expect, at nightfall the friend speed drops to almost zero, and during day 4 the speed is very low. But there is something wrong in my opinion about the speed data. Of course we used this LOESS fit as an approximation of our data which will give rise to errors. But, in this graph it looks like during day 3 the friend speed is decreasing at a constant rate. And for those who love differentiating and integrating and all, that would mean that the total number of friend should be parabolic. I therefore don't trust this graph all too much and tried it in a third way.

Tangents

Since some regions in the graph seem almost linear it's easy to draw a tangent line there and use that to estimate the speed.

Tangent lines to estimate the friend speed.

In day 3 it looked there were two areas. One in the beginning consisting of people who responded as soon they got the friend requests on their mobile device and some hours later people checking in later.
In the first section the climbed was huge and I estimated it to about 41,23 f/h !! (friends per hour of course). During the rest of the day it settled to a more moderate 3,66 f/h and the next day it dropped to a sluggish 0,70 f/h.

Conclusions

So what the message to take home? When you ask to become friends of FB, they respond pretty fast. The number of friends has to reach a maximum because you only know that many people and most friends you gain comes from the people you invite yourself. So automatically to speed slow down. Good thing too, if the speed stayed of about 40 friends per hour I would so may friends by now I wouldn't be able to keep up with the news feed.

Lesson two: you can't really say anything on such a small dataset and especially on something that is not reproducible. So there is no real science here, I just wanted to make pretty graphs.

(note 2 self: repeat experiment until friends no longer respond)

Tuesday, April 1, 2014

Work in Progress

Teaser update

I haven't posted in a long time and that is partly because I was busy on some stuff I really want to post here.
So this post is a teaser about projects I hope to finish soon.

Sudoku Solver

The first project is a Sudoku Solver. I have liked Sudoku's for a long time and love to philosophize about some interesting features. I know that making a Sudoku Solver is like a first year's Computer Science assignment, but I decided to try and figure it out by myself without any help from other sources.

So far the program is able to solve what you would call Hard Sudoku's, but not the most difficult ones.
Most puzzles only require you to take away the possibilities to find the answers.
For the hardest Sudoku's you have to take more than one thinking step to find more answers. I have plans to improve the algorithm and I believe it will be able to solve any (solvable) Sudoku.

I've made it in Matlab, which is really handy to work with on your own computer, but not so nice to show to other people across the internet. So I still have to do a lot of work to get it online.

Computer Controlled Boat

My first Arduino project was a Led Cube which was a lot of fun, but I was dying for something new and more challenging. When I found a small RC boat at my work I immediately knew what I wanted to do.
I screwed open the remote control and looked where the buttons were giving their signals. When you place a voltage on the right pins you get the same effect as pressing a button. The Arduino is perfect for that.
You can allow the Arduino to send signals to the remote control depending on some input. We could use the keyboard to control the boat for instance. I decided to use Processing, which is very similar to the Arduino language, to communicate with the Arduino via the USB cable.
The whole setup looks this:

On the left we see the Arduino Uno, on the right the opened remote control and to connect the two we see a breadboard in the middle. On the top we see the boat in question upside down so the rotors can turn while testing.

To make this project completely awesome I'll connect a webcam to my laptop to monitor the position and direction of the boat to make it autonomous.
This project is going pretty well so I hope to finish it in a couple of days.

So stay tuned!

Pages

Tuesday, July 1, 2014

How it's done

Finding the peaks

Changing the colors

Tuesday, June 24, 2014

Monday, June 23, 2014

Binary images

Less is more

More of that

Thursday, May 22, 2014

Webcams

Other tweeking

Making the movie

Tuesday, May 20, 2014

Introduction

Choosing the system

Object tracking

Finding the red dot

In the near future

Thursday, May 1, 2014

How fast can I get Facebook Friends?

The experiment

Sending requests

Friend speed

Tangents

Conclusions

Tuesday, April 1, 2014

Teaser update

Sudoku Solver

Computer Controlled Boat