EM023 » Mono-Camera 3D modeling of an image
In photography, the aperture is the opening in the lens through which light travels. It determines the cone angle of the rays that come to a focus on the image plane. One of the effects of aperture is depth of field: the portion of a photograph that appears sharp from front to back. The further an object is from the focus plane, the blurrier it appears in the photo. Conversely, if we can measure the sharpness of an object through digital image processing, we can estimate the relative distance of every object, and even surface, in an image. With the parallel computing power of an FPGA, the system can quickly produce an elevation map of an area, which can be used for better image analysis, urban planning, and disaster risk assessment.
Design Introduction: Researchers have used compute-intensive software to extract information from an image, such as detection and recognition. One such task is constructing a 3D projection of an object. Most of the algorithms used have their own limitations. Multi-camera approaches are very effective for 3D extraction but come at a high cost. A single camera reduces the cost but has limitations of its own. 3D extraction from a single camera uses the aperture to perceive depth in an image. One of the main disadvantages of this approach is that it can only detect depth at the edges of an object. It is also difficult to determine whether an object is in front of or behind the focus plane, since the aperture has the same effect on both sides of the plane for equal distances from it.
Objective: The main objective of this project is to solve these problems and improve 3D extraction using a single camera and an algorithm of our own design. Instead of taking one picture, we take at least two pictures of the same scene, each focused at a different distance. In theory, this determines whether an object is in front of or behind the focus plane. It also no longer requires the edges of an object, since the algorithm extracts depth by comparing the two images.
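The two-focus comparison can be sketched as follows. This is our own illustrative reduction, assuming per-pixel blur maps from both captures are already available; `side_of_focus` and its sign convention are hypothetical names, not part of the design:

```python
import numpy as np

def side_of_focus(blur_near, blur_far):
    """Classify each pixel relative to the two focus planes by
    comparing the two blur maps.

    blur_near: blur measured in the shot focused close to the camera
    blur_far:  blur measured in the shot focused farther away

    Returns +1 where the pixel is sharper in the near-focused shot
    (object is closer), -1 where it is sharper in the far-focused
    shot (object is farther), and 0 where the blur is equal.
    """
    return np.sign(blur_far - blur_near).astype(int)

# Toy example with hypothetical blur values for three pixels
blur_near = np.array([0.1, 0.5, 0.9])   # sharp, medium, very blurred
blur_far  = np.array([0.9, 0.5, 0.1])
print(side_of_focus(blur_near, blur_far))
```

The sign of the blur difference is exactly the information a single capture cannot provide, since one shot alone blurs both sides of the focus plane identically.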
Application scope and target user: With the portability and highly parallel computing power of the DE10-Nano, this project can be used, together with a drone, for fast risk assessment of an area during disasters. Given the parameters of the camera used, users can extract relative and absolute distances within the image, quickly creating a 3D model of the area. The same concept can be used for mapping cities for urban planning. It can also serve security systems, for example detecting any moving creature within an area and tracking its absolute coordinates. A depth map is also relevant to autonomous navigation, as it gives machine vision ubiquitous proximity-sensing capabilities. Since depth is independent of color, another use of the depth map is background extraction: depth perception can isolate the background layer, which is separated from the objects in the image by a distance gap.
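The background-extraction idea reduces to thresholding the depth map. A minimal sketch, assuming a depth map is already available; the function name and cutoff value are hypothetical:

```python
import numpy as np

def extract_foreground(image, depth, max_depth):
    """Keep pixels whose depth is below max_depth; zero out the rest.
    image: HxW grayscale array; depth: HxW array of relative depths.
    max_depth is a user-chosen cutoff separating objects from background."""
    mask = depth < max_depth
    return np.where(mask, image, 0), mask

# 2x2 toy image: left column close to the camera, right column far
image = np.array([[10, 20], [30, 40]])
depth = np.array([[1.0, 9.0], [1.5, 8.0]])
fg, mask = extract_foreground(image, depth, max_depth=5.0)
print(fg)
```

Because the mask comes from depth rather than color, it works even when foreground and background share similar colors.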
Image Acquisition: The system uses a DSLR camera with an aperture chosen according to the distance of the objects the user wants to capture. An aperture of f/2 is used for objects at short range from the camera, while f/16 is used for landscape scenes.
1. Capture two images of the same scene using different focus planes and save them in SRAM for image processing in the next stage.
2. Implement the depth-perception algorithm inside the FPGA using blur detection.
3. Compare the blur measurements of corresponding parts of the two images to extract depth.
4. Create a depth matrix of the image.
5. Convert the depth measurements to a black-and-white color scale and send them through VGA for display on a screen.
6. The output can be displayed on a VGA projector or PC monitor, or saved to a memory card for offline viewing.
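The depth-to-grayscale conversion above can be sketched as a linear rescale to 8-bit values; the exact mapping is our assumption, since the text does not fix it:

```python
import numpy as np

def depth_to_gray(depth):
    """Linearly rescale a depth matrix to 0..255 for display on a
    grayscale VGA framebuffer (0 = nearest value in the map)."""
    d = depth.astype(float)
    lo, hi = d.min(), d.max()
    if hi == lo:                      # flat depth map: avoid divide-by-zero
        return np.zeros_like(d, dtype=np.uint8)
    return ((d - lo) / (hi - lo) * 255).astype(np.uint8)

depth = np.array([[0.0, 2.0], [1.0, 4.0]])
print(depth_to_gray(depth))
```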
One way to assess the performance of the system is to compare the accuracy of the depth map extracted by the system against the actual depth map of the scene. Depth accuracy relies on the algorithm used to measure the blur of the image. Different algorithms will be compared to determine which works best for our system. Depth perception can be improved further by comparing two depth maps of the same scene taken with different focus planes.
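A comparison harness for candidate blur measures can be sketched as below: a mean-absolute-neighbour-difference score (one of the candidates) should rank a synthetic sharp patch above a blurred copy of itself. All names here are our own, and the box filter is used only to fabricate a blurred test input:

```python
import numpy as np

def mean_pixel_difference(img):
    """Simple sharpness score: mean absolute neighbour difference,
    taken as the larger of the horizontal and vertical means.
    Higher = sharper."""
    f = img.astype(float)
    dx = np.abs(np.diff(f, axis=1)).mean()
    dy = np.abs(np.diff(f, axis=0)).mean()
    return max(dx, dy)

def box_blur(img):
    """3x3 box filter, used here only to create a blurred test image."""
    f = img.astype(float)
    out = f.copy()
    out[1:-1, 1:-1] = sum(f[1 + di:f.shape[0] - 1 + di, 1 + dj:f.shape[1] - 1 + dj]
                          for di in (-1, 0, 1) for dj in (-1, 0, 1)) / 9.0
    return out

rng = np.random.default_rng(0)
sharp = rng.integers(0, 256, size=(32, 32)).astype(float)
blurred = box_blur(sharp)
print(mean_pixel_difference(sharp) > mean_pixel_difference(blurred))  # True
```

The same harness can rank any other candidate metric by checking that its ordering of sharp versus blurred inputs is correct.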
With the help of the SoC, this algorithm can be implemented easily within the DE10-Nano kit. The FPGA can also provide an output that can be interfaced to MATLAB or other software with advanced data presentation: a new image file format that contains the image itself together with its depth map. MATLAB should then be able to create a 3D-like projection of the image.
The camera aperture is the opening in the lens through which light passes to enter the camera, as shown in the two figures. It is responsible for the sharpness or blurriness of objects away from the focal plane (green line), giving the image a depth effect. In the first figure, the red point source is in the focal plane. Its light travels through the camera and converges on the sensor plane (violet line). Objects in the focal plane are the sharpest in the image. The blue point source is slightly away from the focal plane, so its light converges at a point that is not on the sensor plane, as shown in the figure. This produces blurriness of the object. In the second figure, the camera is set to a wider aperture, amplifying the blur. If we can measure this blurriness, we can reverse-engineer this idea and estimate the relative distance of every object in the image to the focal plane.
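This geometry can be put into numbers with the standard thin-lens circle-of-confusion approximation (our addition, not taken from the figures): the blur-circle diameter grows with the aperture diameter and with the object's distance from the focal plane, which is exactly why a wider aperture amplifies the blur.

```python
def blur_circle_diameter(f, N, focus_dist, obj_dist):
    """Thin-lens circle-of-confusion diameter (same units as f).
    f: focal length, N: f-number (aperture diameter = f/N),
    focus_dist: distance to the focal plane (must exceed f),
    obj_dist: distance to the object.

        c = (f/N) * |obj - focus| / obj * f / (focus - f)
    """
    A = f / N                          # physical aperture diameter
    return A * abs(obj_dist - focus_dist) / obj_dist * f / (focus_dist - f)

# 50 mm lens focused at 2 m; object at 3 m, behind the focal plane
wide   = blur_circle_diameter(50.0, 1.8, 2000.0, 3000.0)   # f/1.8
narrow = blur_circle_diameter(50.0, 16.0, 2000.0, 3000.0)  # f/16
print(wide > narrow)  # True: wider aperture gives a larger blur circle
```

An object exactly in the focal plane yields a zero-diameter blur circle, matching the red point source in the first figure.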
2. Blur Measurement
The blurMetric by Crété-Roffet et al. was the first algorithm implemented in our design. It computes horizontally and vertically filtered versions of the source image, builds a matrix of differences between each pixel and its neighbors, and compares the two images. This gives a good estimate of the depth of the source image, but the time it takes to process one image is quite long. We greatly reduced this computing time by considering only the difference between each pixel and its neighbors.
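The flavour of that metric can be illustrated with the rough sketch below: re-blur the image with a strong low-pass filter and measure how much neighbour-to-neighbour variation the re-blurring removes. Little removed variation means the input was already blurred. This is our own simplified approximation, not the exact published formulation:

```python
import numpy as np

def crete_style_blur(img, k=9):
    """Rough sketch of a Crete-Roffet-style no-reference blur score.
    Returns a value in [0, 1]; higher = blurrier (approximation only).
    k: width of the 1-D horizontal averaging filter."""
    f = img.astype(float)
    w = f.shape[1] - k + 1
    # horizontally low-pass filtered copy (valid region only, for brevity)
    blurred = np.stack([f[:, i:w + i] for i in range(k)]).mean(axis=0)
    d_orig = np.abs(np.diff(f[:, :w], axis=1))      # variation before
    d_blur = np.abs(np.diff(blurred, axis=1))       # variation after
    removed = np.maximum(0.0, d_orig - d_blur).sum()
    total = d_orig.sum()
    return 1.0 - removed / total if total > 0 else 1.0

rng = np.random.default_rng(1)
sharp = rng.integers(0, 256, size=(16, 64)).astype(float)
print(crete_style_blur(sharp) < crete_style_blur(sharp * 0 + 128))  # True
```

Even in this reduced form, the metric needs a full filtering pass plus two difference passes per image, which is why the plain neighbour-difference shortcut is so much cheaper.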
1. Image Acquisition
Although the camera interface and image acquisition are not included in our scope, it is important to know what kind of camera should be used. Since blur measurement depends heavily on how blurred an object appears in the image, it is better to amplify the blurring effect of depth by using a wide aperture (f/1.2 to f/1.8). This way, even a slight distance from the focal plane yields a high blur measurement. The first figure is a good example of this: its focal plane is the front of the eyeglasses, and we can easily observe that objects less than 1 meter away from it are already blurred.
2. After image acquisition, the image is processed pixel by pixel. The blur value is computed as the larger of the horizontal and vertical differences between each pixel and its neighbor. Mathematically: blur = max[ |image(x,y) − image(x+1,y)|, |image(x,y) − image(x,y+1)| ]. The second image is the result of this process.
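A vectorized sketch of this per-pixel measure, with absolute differences assumed and the last row and column padded with zero difference (a boundary choice of our own):

```python
import numpy as np

def blur_map(img):
    """Per-pixel measure: max of the absolute difference with the
    right neighbour and with the neighbour below.  High values mark
    sharp edges near the focal plane; low values mark blurred areas."""
    f = img.astype(float)
    dx = np.zeros_like(f)
    dy = np.zeros_like(f)
    dx[:, :-1] = np.abs(f[:, :-1] - f[:, 1:])   # |image(x,y) - image(x+1,y)|
    dy[:-1, :] = np.abs(f[:-1, :] - f[1:, :])   # |image(x,y) - image(x,y+1)|
    return np.maximum(dx, dy)

img = np.array([[0, 10, 10],
                [0, 10, 10],
                [5,  5,  5]])
print(blur_map(img))
```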
The third image is the reciprocal of all the values of the matrix, which makes depth more visible. The focal plane appears as the dark pixels, and pixels become lighter as distance from the plane increases.
3. The next stage of data processing averages the values of every 50×50-pixel block. The part of the image containing the focal plane is boxed in blue and has darker pixels. The red boxes mark parts of the image slightly away from the focal plane, while the yellow boxes mark objects that are already far from the focal plane.
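The 50×50 averaging stage can be sketched with a reshape trick. Cropping partial edge tiles is our assumption, since the text does not say how they are handled; a small 4×4 input with 2×2 tiles is used so the result is easy to check:

```python
import numpy as np

def block_average(mat, block=50):
    """Average the map over non-overlapping block x block tiles.
    Edges that do not fill a whole tile are cropped."""
    h = (mat.shape[0] // block) * block
    w = (mat.shape[1] // block) * block
    m = mat[:h, :w].astype(float)
    return m.reshape(h // block, block, w // block, block).mean(axis=(1, 3))

mat = np.arange(16).reshape(4, 4)
print(block_average(mat, block=2))  # [[ 2.5  4.5] [10.5 12.5]]
```

Averaging over tiles also reduces the matrix the FPGA must ship out by a factor of block² (2500× for 50×50 tiles).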
The success of this project relies on how effectively the algorithm measures the blurriness of an image segment at a lower computational cost. The most basic blur detection available on the internet is the pixel-difference method. Another is the method implemented by Crété-Roffet et al., which is more effective at the cost of higher processing time. We decided to improve the pixel-difference method by taking both the vertical and horizontal differences and averaging each 50×50 segment of the image.
2. Depth of field
The main goal of this project is to give computer vision some perception of depth. This way, a computer has an idea of object location and distance, which is important for the applications mentioned in earlier chapters of this paper.
The easiest way to implement the DSP is through Python on a Linux operating system, and the DE10 is capable of booting Linux. For prototyping, the image was loaded onto the device by transferring the file from an external camera to the DE10 through a flash disk; this can later be replaced with a live feed from a camera interfaced to the DE10. After the file transfer, everything else is software.
The source file is first loaded and represented in Python as a matrix. DSP steps such as the pixel difference and averaging can then be performed. The whole process takes a considerable amount of time, but if implemented directly on the DE10 without Linux, this computational time would be greatly reduced.
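The whole prototype flow, pixel difference, block averaging, and grayscale rescaling, can be sketched end to end on a synthetic image. The block size, edge cropping, and the linear 0–255 mapping are choices of our own for illustration:

```python
import numpy as np

def pipeline(img, block=4):
    """Synthetic-image sketch of the prototype: difference map ->
    block averaging -> 8-bit grayscale depth map."""
    f = img.astype(float)
    # 1. per-pixel measure: max of horizontal/vertical neighbour difference
    dx = np.abs(np.diff(f, axis=1))[:-1, :]
    dy = np.abs(np.diff(f, axis=0))[:, :-1]
    blur = np.maximum(dx, dy)
    # 2. average over non-overlapping block x block tiles (edges cropped)
    h = (blur.shape[0] // block) * block
    w = (blur.shape[1] // block) * block
    avg = blur[:h, :w].reshape(h // block, block, w // block, block).mean(axis=(1, 3))
    # 3. rescale to 0..255 for grayscale display
    lo, hi = avg.min(), avg.max()
    span = hi - lo if hi > lo else 1.0
    return ((avg - lo) / span * 255).astype(np.uint8)

rng = np.random.default_rng(2)
img = rng.integers(0, 256, size=(16, 16))   # stands in for the camera file
depth = pipeline(img)
print(depth.shape)  # (3, 3)
```

Every step is element-wise or tile-local, which is what makes the same flow a good fit for a direct FPGA implementation without the Linux layer.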