Caplin recently organised its sixth company Hackday, which, we all agree, was the best so far. You can get a glimpse of what it was like in this video.
Caplin Hackdays are twenty-four-hour events where teams of three people get together and try to make something useful, or just something cool. In this post I'll show you what the team I was part of built.
We decided to dive into unknown territory and do something none of us had ever done before. After much brainstorming we decided to tackle augmented reality by improving the game of table tennis. Not really something Caplin would do as a business any time soon, but it was very refreshing to work on something completely new and push ourselves out of our comfort zone of web apps.
Our idea was to use a projector to draw the trail of the ping-pong ball and to show where the ball hit the table. To do that we needed a device that could tell us the position of an object, so we ordered a Microsoft Kinect, which is usually used with the Xbox console.
We wanted to do everything in JavaScript (a combination of browser and Node.js), as that was the language we were all most familiar with. But it turned out that getting the Kinect to talk to our Node.js app was really, really hard, and once we managed it, we had no idea what the data it was giving us meant. We realised that the open-source projects for the Kinect just weren't mature enough, so we decided to use C# to communicate with the Kinect and send the data to the browser using websockets. It also didn't help that the Kinect only arrived three hours after we started, which cost us a lot of time.
After a very frustrating start, we managed to get data from the Kinect sensor and made our first breakthrough: we could track the ball. It's always a very gratifying feeling when you make something that crosses that magical software-physical world boundary.
We didn't use any ready-made libraries for object detection; instead we coded a simple algorithm that scanned the image provided by the Kinect for variations of the colour orange. This gave us the X and Y coordinates of the ball (the Kinect was mounted directly above the table, giving it a bird's-eye view). We sent these coordinates to the browser via websockets, and since we had been developing animations in parallel, we were able to draw the trail of the ball on the screen. We did all the drawing in the browser, using the Two.js library to draw on a canvas.
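To give a flavour of the detection, here's a minimal sketch of that colour scan. Our real version ran in C# over the raw Kinect colour frame, but the idea translates directly to JavaScript; the thresholds and the RGBA-buffer input (as you'd get from a canvas `getImageData` call) are illustrative assumptions, not our actual values.

```javascript
// Sketch of the orange-ball scan. `pixels` is a flat RGBA byte array,
// e.g. ctx.getImageData(0, 0, width, height).data.
// The colour thresholds below are placeholders, not the ones we used.
function findBall(pixels, width, height) {
  var sumX = 0, sumY = 0, count = 0;
  for (var y = 0; y < height; y++) {
    for (var x = 0; x < width; x++) {
      var i = (y * width + x) * 4;
      var r = pixels[i], g = pixels[i + 1], b = pixels[i + 2];
      // Crude test for "orange": strong red, medium green, little blue.
      if (r > 180 && g > 60 && g < 160 && b < 80) {
        sumX += x;
        sumY += y;
        count++;
      }
    }
  }
  if (count === 0) return null; // ball not in view
  // Average the matching pixels to approximate the ball's centre.
  return { x: sumX / count, y: sumY / count };
}
```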
We faced our next challenge right after that. The projector wasn't mounted on top of the table but to the side, which meant the projected image was skewed, so we had to spend a lot of time calibrating the projection against the coordinates of our animations. We used a combination of CSS transforms and coordinate adjustments to get the projection roughly right. Unfortunately we were unable to mount the projector directly above the table, which would have avoided most of these problems.
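A rough sketch of the kind of adjustment involved, assuming a simple linear mapping plus a CSS transform on the drawing surface; all the numbers and the element id below are placeholders, since our real values came out of trial and error:

```javascript
// Linear mapping from Kinect image coordinates to projector coordinates.
// Scale factors and offsets are placeholders found by calibration.
function cameraToProjector(point) {
  var scaleX = 2.1, scaleY = 1.9;
  var offsetX = -40, offsetY = 25;
  return {
    x: point.x * scaleX + offsetX,
    y: point.y * scaleY + offsetY
  };
}

// A CSS transform on the canvas container to counteract the skew from
// the side-mounted projector. The element id and values are hypothetical.
var container = document.getElementById('stage');
container.style.transform = 'perspective(800px) rotateY(8deg) scale(1.05)';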
After that we started exploring the Kinect's depth sensor, which eventually allowed us to detect when the ball hit the table and draw an expanding circle around the hit area. The Kinect provides a stream of frames containing depth information (in millimetres) for each pixel, which let us work out how far away the ball was (in combination with the X and Y coordinates we were already detecting). Again, it wasn't perfect: lighting (which was ever-changing) seems to affect the readings the Kinect gives you, and the first-generation Kinect we were using isn't as accurate as we would have liked. However, we achieved a good enough effect for the demo.
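The bounce test itself reduces to a simple comparison. Our actual check ran in C# against the depth frame, but sketched in JavaScript it looks roughly like this; the table depth and tolerance values are assumptions:

```javascript
// Sketch of the bounce test. The Kinect reports depth per pixel in
// millimetres, so a bounce is simply "the ball's depth is within a
// small tolerance of the table surface's depth".
var TABLE_DEPTH_MM = 1850; // placeholder: table-top depth at calibration
var TOLERANCE_MM = 30;     // placeholder: sensor noise plus ball radius

function isBounce(ballDepthMm) {
  return Math.abs(ballDepthMm - TABLE_DEPTH_MM) < TOLERANCE_MM;
}
```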
Having finished our main objectives, we spent some time polishing the demo and adding some wacky features such as party mode and charts. You can see the final result in this video:
The FP: Futuristic Ping Pong from Caplin Systems on Vimeo.
A note on the technologies used
As mentioned, we used the excellent Two.js library to draw on a canvas.
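Here's a minimal sketch of how a trail can be drawn with Two.js; the fading logic is simplified from what we actually shipped, and the colours and sizes are illustrative:

```javascript
// Set up a fullscreen Two.js instance drawing into the page.
var two = new Two({ fullscreen: true }).appendTo(document.body);
var trail = [];

// Add a dot to the trail wherever the ball was last seen.
function addTrailPoint(x, y) {
  var dot = two.makeCircle(x, y, 8);
  dot.fill = '#ff6600';
  dot.noStroke();
  trail.push(dot);
}

// Fade and prune old dots on every animation frame.
two.bind('update', function () {
  for (var i = trail.length - 1; i >= 0; i--) {
    trail[i].opacity -= 0.02;
    if (trail[i].opacity <= 0) {
      trail[i].remove();
      trail.splice(i, 1);
    }
  }
}).play();
```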
For the Kinect interaction we first tried getting it to work in Node.js, but after a couple of hours we realised it just wasn't going to happen: the available projects were badly documented or simply didn't work. So we fired up Visual Studio and built our C# program on top of the examples that come with the Kinect SDK.
It had been years since I last worked in the Microsoft stack, but the smoothness of everything blew me away all over again. Everything just works. Microsoft really knows how to make developer tools.
To communicate between our C# app and the browser we used Fleck, a websocket server library for .NET. Again, things just worked.
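On the browser side, a plain WebSocket client is all you need to receive Fleck's messages. The port and message shape below are assumptions for illustration, and the snippet reuses the hypothetical `cameraToProjector` and `addTrailPoint` helpers from the earlier sketches:

```javascript
// Browser side of the Fleck connection: a standard WebSocket client.
// The port and the JSON message format are assumed for illustration.
var socket = new WebSocket('ws://localhost:8181');

socket.onmessage = function (event) {
  var msg = JSON.parse(event.data); // e.g. { x: 312, y: 240, bounce: false }
  var p = cameraToProjector({ x: msg.x, y: msg.y });
  addTrailPoint(p.x, p.y);
  if (msg.bounce) {
    // draw the expanding circle around the hit area (omitted here)
  }
};
```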
I'll finish this post by saying that I am extremely proud of my team, and of myself, for pulling this off. None of us had ever worked with augmented reality, the Kinect, or computer vision before. I would also like to thank Caplin for giving us the opportunity to work on something different. I can't remember the last time I had so much fun building something.