The recent release of OpenKinect, the open-source drivers for the Microsoft Kinect, has unleashed a creative tsunami. Just days after the OpenKinect announcement, scores of digital artists, researchers and hobbyists have started to demonstrate a dazzling array of projects, including Radiohead-style point clouds, skeletal tracking, multi-touch, etc. No doubt there is much more to come.
While I too have been buoyed up by this creative wave, I have a different need. As a robotics hobbyist, I want to use the depth maps produced by the Kinect device to simplify a robot’s task of recognizing objects in its view. To do that, I need to convert the Kinect’s depth map output into a 3D point cloud that can be stored, manipulated and analyzed. In other words, I want to be able to study a point cloud, make annotations if need be (e.g., to mark out surfaces or put bounding boxes around objects) and, more importantly, to try out various algorithms on the dataset. Doing this type of analysis is virtually impossible with a streaming video feed.
While searching online for solutions, I used these criteria for screening:
- the tool should be simple to use and “lightweight” (i.e., no huge software bundle to download and install)
- modular, in the spirit of the Unix tools approach
- leverage readily available tools, open-source if possible
What I came up with is the process described below. Source code is at the end of this post.
1-2-3 Step Process
1. Get a snapshot or sequence of snapshots from the depth camera.
I created the glgrab program using the OpenKinect sample glview.c code as a starting point and added code to dump the depth image. Presently, this program is bare bones. It waits ~3 s before grabbing an image frame from the depth camera and saves it to a file called <prefix>_nnn.bin, where <prefix> is the output name prefix that you specify and nnn is the file index starting at 001. Make sure to move your last saved images before rerunning glgrab, or they will be overwritten!
Use the -c flag to change how long the program waits before saving depth images, and the -n flag to change how many frames to save (default: 1). Type glgrab -h to see the help info.
UPDATE 1/24/2011: I modified my glgrab.c program to work with the latest OpenKinect git build (2ea3ebb4b2be5d0472a8).
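If you want to inspect a saved frame before converting it, a few lines of NumPy will do. This sketch assumes glgrab writes the frame as a flat dump of 640×480 little-endian 16-bit depth readings; check glgrab.c if your dump format differs, and note that the function name load_depth_bin is just illustrative:

```python
import numpy as np

WIDTH, HEIGHT = 640, 480  # Kinect depth camera resolution


def load_depth_bin(path):
    """Load a raw depth frame saved by glgrab.

    Assumes the file is a flat dump of 640x480 little-endian
    uint16 values (the 11-bit Kinect depth readings, 0-2047).
    """
    raw = np.fromfile(path, dtype="<u2")
    return raw.reshape(HEIGHT, WIDTH)
```

From there, pylab’s imshow() gives a quick visual sanity check of the frame.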
2. Convert the depth image into a point cloud file. I wrote a Python script that takes as input the filename of the saved image from Step 1 and the name of the output 3D point cloud file.
python depth2cloud.py <image_bin_file> <cloud_name.ply>
The cloud is saved in the PLY file format, which is very easy to understand and can be loaded directly into MeshLab.
NOTE: depth2cloud.py requires the pylab module. See this instruction page on how to install pylab.
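For reference, the conversion itself boils down to two pieces: mapping the raw 11-bit depth readings to meters, and back-projecting each pixel through a pinhole camera model. The sketch below is not the actual depth2cloud.py, just a minimal illustration; the intrinsics (FX, CX, …) and the raw-to-meters formula are community-measured approximations from the OpenKinect effort, not calibrated values for any particular unit:

```python
import numpy as np

# Approximate Kinect depth-camera intrinsics (community-measured,
# not calibrated for your specific device).
FX = FY = 594.21
CX, CY = 339.5, 242.7


def raw_depth_to_meters(raw):
    """Convert 11-bit Kinect depth readings to meters using a commonly
    cited OpenKinect-era approximation. Readings outside the sensor's
    usable range come back as NaN."""
    raw = raw.astype(np.float64)
    z = 1.0 / (raw * -0.0030711016 + 3.3309495161)
    z[(raw >= 2047) | (z <= 0)] = np.nan
    return z


def depth_to_cloud(depth_raw):
    """Back-project a (480, 640) raw depth frame into an (N, 3) array
    of XYZ points via the pinhole camera model, dropping invalid pixels."""
    h, w = depth_raw.shape
    z = raw_depth_to_meters(depth_raw)
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    x = (u - CX) * z / FX
    y = (v - CY) * z / FY
    pts = np.dstack((x, y, z)).reshape(-1, 3)
    return pts[~np.isnan(pts[:, 2])]


def write_ply(path, pts):
    """Write points as an ASCII PLY file that MeshLab can open."""
    with open(path, "w") as f:
        f.write("ply\nformat ascii 1.0\n")
        f.write("element vertex %d\n" % len(pts))
        f.write("property float x\nproperty float y\nproperty float z\n")
        f.write("end_header\n")
        for x, y, z in pts:
            f.write("%f %f %f\n" % (x, y, z))
```

The ASCII PLY header written above is the whole trick to Step 3: MeshLab reads the vertex count and the three float properties, then the points follow one per line.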
3. Load the point cloud. Launch MeshLab and click on File -> Open, then select the PLY file. For navigation tips, click on Help -> On screen quick help.
Here are examples of point clouds from images I took inside my house. Note that even though some details are lost, the flat surfaces are well-defined. This should make image segmentation and surface extraction easier.
Point cloud showing various items on the living room floor.
Globe and various items on shelf.
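As a taste of the kind of analysis those well-defined flat surfaces enable, here is a minimal RANSAC plane fit one could run on a saved cloud to pull out the dominant surface (floor, wall, tabletop). It is a sketch under simple assumptions (a fixed distance tolerance, no normals), not production segmentation code:

```python
import numpy as np


def fit_dominant_plane(pts, iters=200, tol=0.02, seed=None):
    """Find the largest planar surface in an (N, 3) point cloud with a
    minimal RANSAC loop: repeatedly fit a plane through three random
    points and keep the plane with the most points within `tol` meters.
    Returns (normal, d, inlier_mask) for the plane n . p + d = 0."""
    rng = np.random.default_rng(seed)
    best = (None, None, np.zeros(len(pts), dtype=bool))
    for _ in range(iters):
        a, b, c = pts[rng.choice(len(pts), 3, replace=False)]
        n = np.cross(b - a, c - a)
        norm = np.linalg.norm(n)
        if norm < 1e-9:  # the three sampled points were collinear
            continue
        n = n / norm
        d = -n.dot(a)
        inliers = np.abs(pts @ n + d) < tol
        if inliers.sum() > best[2].sum():
            best = (n, d, inliers)
    return best
```

Repeating the fit on the points outside the inlier mask peels off the next-largest surface, which is one simple route to segmenting a room scan.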
Here’s the source code. To build glgrab:
- Unzip and copy the contents to the libfreenect/examples folder. The included CMakeLists.txt adds glgrab as a build target.
- cd ../build, then rerun cmake .. and make.
You should now see the glgrab program in the build/bin folder along with glview and the other OpenKinect binaries.
To use depth2cloud.py, you’ll need Python and the pylab module installed.
If you don’t have your Kinect yet or simply want to see sample point clouds, here are the raw images and PLY files for the two examples above. I included the raw images so you can try to generate the PLY files yourself.
Hope you find this useful.