Qi Pan, a Cambridge University researcher, has developed ProForma, a tool that turns any normal webcam into a 3D scanner. I got to interview him and talk about his product, which, if it works, could be one of the holy grails of mass customization: it would enable anyone to inexpensively turn things digital and then reproduce them.

Joris Peels: So how did you get started on this project?

Qi Pan: At the start of my PhD, I was interested in real-time 3D modeling of outdoor scenes. However, several months in, I realised that current processing power wasn't enough to model outdoor scenes well (due to occlusions, lack of texture, etc.). I therefore turned my attention to smaller objects, which would stand a better chance on current hardware. Smaller objects, though, always sit in an environment that you wouldn't want to model, which led me to the idea of using a fixed camera and separating the object using motion. All of the design choices made in the system were then tailored towards making everything as fast as possible, whilst still producing a reasonable output.

Joris Peels: How long did it take?

Qi Pan: [...] develop, although not all of that time was spent on development (time was also spent on publications and attending various conferences).

Joris Peels: What was hard to do?

Qi Pan: [...] into a real-time system. The problem with real-time is that if any one part of the system is not working well, your system just doesn't work, full stop. You therefore need to make sure all parts are well optimised and producing the right output at the right time for the other components. When designing each component, the utmost care had to be taken to ensure that we were doing things as efficiently as possible, using the best available algorithms (or inventing our own if none existed).

Joris Peels: How does it work exactly?

Qi Pan: The first stage is a tracker, which uses the partial 3D model we've constructed to work out the position and orientation of the object relative to the camera.
This stage also tracks the position of interest points (areas of high contrast change) in the images from frame to frame. Only interest points on the object are tracked, as there is a mathematical constraint on the motion of points on a rigid object (based on epipolar geometry). Once a significant enough motion is detected, a keyframe plus the interest point tracks are passed to the reconstruction stage.

The reconstruction stage takes these feature tracks and triangulates 3D positions in order to form a cloud of points. This is then meshed using a 3D Delaunay tetrahedralisation. This, however, merely partitions the convex hull of the points into tetrahedra, so we need to employ a carving algorithm to remove incorrect tetrahedra from concavities in the object. We formulated a very efficient probabilistic carving algorithm to achieve this, which allows us to obtain the surface of the object based on the interest points we've seen in each keyframe.

This method requires a partial 3D model to track from, which isn't available right at the start of reconstruction (but is later). Our initialisation step therefore differs slightly from normal operation. We assume that at least part of the object falls within a large circle at the centre of the image. We track interest points inside this circle, and use rigid body motion constraints to ascertain the orientation and position of the object relative to the camera. Amazingly, this is possible even if we have no idea about the 3D positions of the interest points we are tracking! Once we have this initial orientation and position, the system works as above.

Joris Peels: But can I take a thing and will you then give me a mesh?

Qi Pan: [...] points, so the object must have enough areas of high contrast change.

Joris Peels: What are some of the limitations?

Qi Pan: This system is of course only a first step in generic object reconstruction, and as such has a few limitations. One limitation is the inability to model objects or parts of objects without enough texture.
This is something we are working on: we are seeking to combine other cues to complement our interest point based approach. The technique as it stands can only be used to model rigid objects, due to the rigid body assumption being used for segmentation. This approach can in theory be applied to modeling entire scenes, but then we come up against the problems of the environment not being textured enough in areas, occlusion and needing more processing power.

Joris Peels: You will be working more on it in the future?

Qi Pan: [...] concept and just the tip of the iceberg in terms of what we can achieve.

Joris Peels: Will there be a tool that people can download? When?

Qi Pan: I'm currently porting the software to the newest libraries (which unfortunately means reimplementing lots of stuff from scratch), but in a few months' time we aim to release a Linux-based demo, which will hopefully be followed by a Windows-based demo after that.
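
For readers curious about the reconstruction stage described in the interview, where tracked interest points are triangulated into a cloud of 3D points, here is a minimal Python sketch of the general idea. It is not ProForma's code. It assumes the simplest textbook variant, midpoint triangulation, in which one interest point seen in two keyframes gives two viewing rays and the 3D position is taken as the point halfway between the rays where they pass closest to each other. It also assumes the rays are already expressed in the object's coordinate frame (the camera is fixed and the object moves, so relative to the object the camera centre differs between keyframes). The function names and the example numbers are hypothetical.

```python
"""Midpoint triangulation of one interest point seen in two keyframes.

Illustrative sketch only: names and values are hypothetical, and this is
a textbook method, not the triangulation used inside ProForma.
"""


def dot(a, b):
    """Dot product of two 3-vectors."""
    return sum(x * y for x, y in zip(a, b))


def along(origin, direction, distance):
    """Point reached by moving `distance` times `direction` from `origin`."""
    return tuple(o + distance * d for o, d in zip(origin, direction))


def triangulate_midpoint(c1, d1, c2, d2, eps=1e-12):
    """Return the 3D point closest to the rays c1 + s*d1 and c2 + t*d2.

    c1, c2: camera centres (relative to the object) for the two keyframes.
    d1, d2: viewing-ray directions through the tracked interest point.
    """
    w = tuple(p - q for p, q in zip(c1, c2))
    a, b, c = dot(d1, d1), dot(d1, d2), dot(d2, d2)
    d, e = dot(d1, w), dot(d2, w)

    # The denominator vanishes when the rays are parallel, i.e. the object
    # has not moved enough between keyframes to recover depth.
    denom = a * c - b * b
    if abs(denom) < eps:
        raise ValueError("rays are (nearly) parallel; depth is undefined")

    # Ray parameters that minimise the distance between the two rays.
    s = (b * e - c * d) / denom
    t = (a * e - b * d) / denom

    p1 = along(c1, d1, s)  # closest point on the first ray
    p2 = along(c2, d2, t)  # closest point on the second ray
    return tuple(0.5 * (x + y) for x, y in zip(p1, p2))


if __name__ == "__main__":
    # An interest point at (0, 0, 5) seen from two viewpoints one unit apart.
    point = triangulate_midpoint(
        (0.0, 0.0, 0.0), (0.0, 0.0, 1.0),
        (1.0, 0.0, 0.0), (-1.0, 0.0, 5.0),
    )
    print(point)  # -> (0.0, 0.0, 5.0)
```

Running this for every interest point track would give the kind of point cloud that is then meshed and carved. The parallel-ray check also hints at why a keyframe is only passed on after a significant enough motion: with too little motion, the two rays barely differ and depth cannot be recovered reliably.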