Annual: 2019

AP028 »
Real time, Low power CNN accelerated Multichannel Face Recognition
📁Machine Learning
👤Maheshwari Natarajan
 (zoho corp)
📅Oct 09, 2019
Regional Final





Description

Face recognition is becoming a vital part of surveillance, but the heavy computation needed to process live streams makes the task difficult.
Our project therefore aims to reduce computation time through CNN acceleration and parallelism, which lets us achieve real-time face recognition.
It also performs face recognition on multiple live streams and uses an improved computation method to achieve low power consumption.

Demo Video

  • URL: https://youtu.be/VxW6wwWefwc

  • Project Proposal

    1. High-level Project Description

               Technology keeps improving with new inventions. It brings enormous advantages by making our work easier, but it is also misused by some people to commit crimes such as passport forgery, voter fraud and terrorism. To reduce these crimes, we need to identify unusual activity by suspected people, which can be done by recognising their faces in CCTV footage. Unusual activity on roadsides, photo morphing in documents such as passports and driving licences, criminals moving through public places, and unauthorized people entering restricted industrial areas can all be detected with this technique. Beyond crime control, it has other applications such as attendance, tracking, authorized entry and finding lost persons. Its main uses are surveillance, attendance and tracking.

                This facial recognition system takes images or videos as input, detects the faces in them and recognizes each person against predefined data (512-dimensional facial feature vectors). We have built a multichannel face recognition system that takes input from multiple channels and runs inference on them simultaneously, even on real-time video. Existing techniques run inference for a single input on one processor, whereas this system lets a single machine run inference for a larger number of cameras. The system has to perform a large number of computations, so by using an FPGA as the target device for the compute-intensive parts we achieve near real-time performance. Using the FPGA also gives us a low-power inference system.

                    The system processes its input frame by frame. The face detection neural network is loaded through the FPGA plugin, and each frame is processed on the FPGA to detect faces. The detected faces are then passed to MobileNet for feature extraction. The extracted features are compared with each enrolled person's facial vector using cosine similarity, and the faces are recognized. In our experiments, both the face detection and the feature extraction networks achieved their lowest latency together with increased throughput when running up to 8 inference requests asynchronously, with an accuracy of around 92%. With this system we can track a person across multiple streams, prevent illegal entry, track attendance and speed up offline video processing. It is a heterogeneous application that can run on different platforms such as CPU, GPU, Movidius and FPGA.
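The comparison step described above can be illustrated with a minimal sketch. It is not the project's code: the `identify` helper, the gallery names, the 0.5 threshold and the 4-dimensional toy vectors (standing in for 512-dimensional features) are all assumptions made for illustration.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # a zero vector has no direction; treat as no match
    return dot / (norm_a * norm_b)

def identify(probe, gallery, threshold=0.5):
    """Return (name, score) of the best match, or (None, score) if below threshold.

    `threshold` is a hypothetical value, not one reported by the project.
    """
    best_name, best_score = None, -1.0
    for name, reference in gallery.items():
        score = cosine_similarity(probe, reference)
        if score > best_score:
            best_name, best_score = name, score
    if best_score < threshold:
        return None, best_score
    return best_name, best_score

# Toy 4-dimensional vectors stand in for the 512-dimensional features.
gallery = {
    "alice": [1.0, 0.0, 0.0, 0.0],
    "bob": [0.0, 1.0, 0.0, 0.0],
}
probe = [0.9, 0.1, 0.0, 0.0]
name, score = identify(probe, gallery)
```

Because cosine similarity depends only on the direction of the vectors, it is insensitive to the overall scale of the feature vector, which is one common reason to prefer it over raw Euclidean distance for face embeddings.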

    2. Block Diagram

     

         This is the overall workflow diagram of our project. Input comes from various sources such as RTSP streams, videos and images. The input is preprocessed, and the frame-level data is passed to the face detection neural network. This network is based on the Single Shot Detection (SSD) architecture, and its inference runs on the OpenVINO Starter Kit using asynchronous requests to improve parallelism.
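The idea of keeping several inference requests in flight at once can be sketched roughly as follows. This does not use the OpenVINO API: `detect_faces` is a hypothetical placeholder that only sleeps to mimic device latency, the channel names are invented, and a thread pool stands in for the pool of asynchronous inference requests.

```python
import time
from concurrent.futures import ThreadPoolExecutor

NUM_INFER_REQUESTS = 8  # the write-up reports results for up to 8 requests

def detect_faces(frame):
    """Placeholder for a device inference call; sleeps to mimic latency."""
    time.sleep(0.01)
    channel, index = frame
    return {"channel": channel, "frame": index, "faces": []}

def run_pipeline(frames, workers=NUM_INFER_REQUESTS):
    """Keep up to `workers` frames in flight and return results in input order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(detect_faces, frames))

# Frames from two hypothetical channels, interleaved as (channel, frame_index).
frames = [(channel, i) for i in range(8) for channel in ("cam0", "cam1")]

start = time.perf_counter()
results = run_pipeline(frames)
elapsed = time.perf_counter() - start  # compare against workers=1 to see the effect
```

With a single worker each frame waits for the previous one to finish; with several requests in flight the waiting time of one request overlaps with the work of the others, which is where the throughput gain comes from.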

          The detected faces are then passed to the feature extraction neural network, which runs on the host CPU and produces a vector of size 512. This feature vector is compared, using cosine similarity, with the stored data of the people to be identified. As a result we get the recognised person's name, the location (taken from the location of the CCTV camera) and the time.

    3. Intel FPGA Virtues in Your Project

    FPGA Neural Network Execution

    Computation is parallelized on the FPGA according to the number of inference requests it receives. The FPGA has thus been used effectively to overcome the difficulties encountered on other platforms.

    The OpenVINO Model Optimizer generates the optimized and quantized FP16 intermediate representation. Precision reduction allows more processing to be done in parallel.
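    The effect of reducing precision to FP16 can be seen in a small sketch. This is not the Model Optimizer's conversion: it only uses Python's `struct` module to pack a few made-up weight values as IEEE 754 half-precision and compares the size and rounding error with single precision.

```python
import struct

def to_fp16_bytes(values):
    """Pack floats as little-endian IEEE 754 half-precision (2 bytes each)."""
    return struct.pack(f"<{len(values)}e", *values)

def from_fp16_bytes(blob):
    """Unpack a half-precision byte string back into Python floats."""
    count = len(blob) // 2
    return list(struct.unpack(f"<{count}e", blob))

# Hypothetical weight values, chosen only for illustration.
weights = [0.1, -0.25, 0.333333, 1.5]

fp32_blob = struct.pack(f"<{len(weights)}f", *weights)  # 4 bytes per value
fp16_blob = to_fp16_bytes(weights)                      # 2 bytes per value
restored = from_fp16_bytes(fp16_blob)

# Largest absolute rounding error introduced by the FP16 round trip.
max_error = max(abs(a - b) for a, b in zip(weights, restored))
```

    Each value takes half the storage, so twice as many values fit in the same memory or transfer, at the cost of a small rounding error on values that are not exactly representable in FP16.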