👤SAMEER BAIG MOHAMMAD
(RAJIV GANDHI UNIVERSITY OF KNOWLEDGE TECHNOLOGIES, NUZVID)
📅Oct 11, 2019
Road accidents have been on the rise worldwide for the past few years. Even developed countries such as the US are not exempt from this problem: the US reported roughly 40,000 accident deaths in 2018. This shows the severity of a problem that should be addressed immediately.
Road accidents lead to many grave consequences: untimely deaths, permanent injuries, loss of earnings, and more. The primary causes of these accidents are distracted and drowsy driving, speeding, violation of traffic rules, and drunk driving. To overcome these problems, we aim to design an efficient and reliable autonomous car.
An autonomous car senses its environment and makes decisions in accordance with traffic rules without human intervention. Our design builds on Convolutional Neural Networks (CNNs), machine learning, computer vision, and image/video processing. In our project, the autonomous system will detect lanes and obstacles, calculate the distance from an obstacle to the car, sound the horn when a pedestrian is detected, and control the car's acceleration based on the information given to the system.
Simultaneously, provision will be made for live audio streaming (especially for the visually impaired) and for displaying a video stream to the people inside the vehicle.
The parallel architecture of the FPGA will be utilized to reach quick decisions with the CNN.
If time permits, we plan to implement destination arrival using GPS.
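The distance-from-obstacle calculation can be sketched with the standard pinhole-camera similar-triangles relation; the focal length and obstacle width below are illustrative assumptions, not measured values from our design:

```python
def estimate_distance_m(focal_length_px: float,
                        real_width_m: float,
                        width_in_image_px: float) -> float:
    """Pinhole-camera estimate: distance = f * W / w.

    focal_length_px   -- camera focal length in pixels (from calibration)
    real_width_m      -- known real-world width of the obstacle (metres)
    width_in_image_px -- width of the obstacle's bounding box in pixels
    """
    return focal_length_px * real_width_m / width_in_image_px

# A car ~1.8 m wide appearing 120 px wide, with a 700 px focal length:
print(estimate_distance_m(700, 1.8, 120))  # ~10.5 m
```

In practice the bounding-box width would come from the CNN detector, and the focal length from a one-time camera calibration.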
(Tokyo Institute of Technology)
📅Oct 15, 2019
This project is an FPGA implementation of an accurate, real-time monocular depth estimator. Monocular depth estimation infers depth from a single RGB image. Estimating depth is important for scene understanding and improves the performance of 3D object detection and semantic segmentation. There are also many applications requiring depth estimation, such as robotics, 3D modeling, and driving-automation systems. Monocular depth estimation is especially effective in applications where stereo images, optical flow, or point clouds cannot be used. Moreover, it offers the possibility of replacing an expensive radar sensor with a general-purpose RGB camera.
We choose CNN (Convolutional Neural Network)-based monocular depth estimation because stereo estimation requires larger resources, while CNN schemes can achieve accurate, dense estimation. Estimating depth from 2D images is easy for humans, but implementing an accurate system under limited device resources is difficult, because CNN schemes require a massive number of multiplications. To handle this, we apply 4- and 8-bit quantization to the CNN and weight pruning for the FPGA implementation.
Our CNN-based estimator is demonstrated on the OpenVINO Starter Kit and a Jetson TX2 GPU board to compare performance, inference speed, and energy efficiency.
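The 4- and 8-bit quantization step can be illustrated with simple symmetric linear quantization; this is a generic sketch under that assumption, not the project's actual quantizer:

```python
def quantize(weights, bits):
    """Symmetric linear quantization: map floats to signed integers in
    [-(2^(bits-1)-1), 2^(bits-1)-1], returning them with the scale."""
    qmax = (1 << (bits - 1)) - 1          # 127 for 8 bits, 7 for 4 bits
    scale = max(abs(w) for w in weights) / qmax
    q = [round(w / scale) for w in weights]
    return q, scale

def dequantize(q, scale):
    """Recover approximate float weights from integers and scale."""
    return [v * scale for v in q]

w = [0.5, -1.27, 0.02, 1.0]
q8, s8 = quantize(w, 8)   # 8-bit: small reconstruction error
q4, s4 = quantize(w, 4)   # 4-bit: larger error, much smaller weights
```

With 8 bits the reconstruction error is bounded by half the scale per weight; dropping to 4 bits shrinks storage further at the cost of a coarser grid.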
(University of Moratuwa)
📅Sep 23, 2019
Traffic congestion is a widespread problem that annually costs billions of dollars, citizens' valuable time, and, in some cases, invaluable human lives. Utilizing our custom-designed CNN accelerator, we propose an edge-computing solution to this problem that is both cost-effective and scalable. For developing countries like Sri Lanka, our vision-based traffic control on FPGA would be an ideal solution, as described below.
In most countries, traffic flow is controlled by traffic lights with pre-set timers. In Sri Lanka, this often causes congestion during peak hours as the system is not sensitive to the traffic levels in each lane of an intersection. To solve this, the traffic policemen usually turn off the lights and manually control the traffic during peak hours. However, the policemen are unable to visually judge the level of traffic in each lane from their vantage point close to the ground.
An automated solution to this problem would be vision-based traffic sensing. However, the neural networks that excel at machine-vision tasks require powerful GPUs or dedicated hardware, and laying cables along the road to transmit video feeds to control centers would require expensive infrastructure, which is infeasible for a developing country like Sri Lanka.
Therefore, we present an FPGA implementation of a traffic-sensing algorithm based on object detection as a cost-effective, scalable edge solution. We use YOLOv2, a state-of-the-art CNN for object detection, accelerated by our custom CNN accelerator, with post-processing done on the ARM processor.
Custom CNN Accelerator Design:
A unique aspect of our project is that we design and implement a brand-new, highly parallelized CNN accelerator whose single core at 100 MHz can run a 384 × 384 RGB image through YOLOv2 (a 23-layer state-of-the-art object-detection CNN with 2 billion floating-point multiplications, 6 million comparisons, and 8 billion additions) within 0.2 seconds. Multiple such cores can be implemented in parallel or in series inside an FPGA to further improve throughput. The architecture can also be used to accelerate several other neural networks with slight modifications.
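As a sanity check on these figures (assuming the 100 MHz clock, 0.2 s per frame, and the multiplication/addition counts quoted above), the sustained throughput the core must deliver works out as follows:

```python
def required_ops_per_cycle(total_ops, clock_hz, seconds):
    """Average operations the core must complete per clock cycle
    to finish `total_ops` within `seconds` at `clock_hz`."""
    cycles = clock_hz * seconds
    return total_ops / cycles

mults, adds = 2e9, 8e9                   # YOLOv2 figures quoted above
print(required_ops_per_cycle(mults + adds, 100e6, 0.2))  # 500.0
```

That is, on the order of 500 arithmetic operations every cycle, which is why a highly parallelized datapath (and multiple cores) is needed.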
(University of Moratuwa)
📅Oct 11, 2019
Reportedly, a considerable number of people worldwide suffer from speech disorders such as muteness, apraxia (childhood/acquired), and aphasia. These may occur due to brain damage, stroke, head injury, a tumor, or any other illness that affects the brain, vocal cords, mouth, or tongue. Our device mainly focuses on the community suffering from conditions that cannot be cured by speech-language pathologists.
Existing solutions include image-processing techniques, in which image frames are processed and decoded into text, and text-to-speech converters, which turn typed text into electronic vocalizations. However, camera-based devices must adapt to varying or sudden lighting conditions while maintaining image quality, and accuracy can drop in such conditions. Moreover, the text-to-speech method obstructs eye contact during conversation, so it does not give the person a natural conversational experience.
To address these problems and give users a real-time experience when communicating with one another, we propose a system designed to recognize gestures (sign language/fingerspelling) using electromyography (EMG) and inertial measurement unit (IMU) sensors. The DE10-Nano kit will read these sensors, and a pre-trained Deep Neural Network (DNN) will be used for inference. Since inference runs on the FPGA board itself, the need for a separate computational device is eliminated, making our device a portable, real-time system. Ultimately, this will enable people in need to communicate efficiently.
The output is given in both voice and text formats, and it will support up to five spoken languages: English, Chinese, French, Hindi, and Arabic. An Arduino Nano will be used to interface the output devices: a speaker and an HC-05 Bluetooth module. The transcript of the translated sign language can be viewed on a mobile device connected via Bluetooth.
Possible future extensions of this work include accepting additional sign languages as input and supporting output languages beyond the five built-in ones. Community support will be very useful in scaling this to multiple sign languages and speech.
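A typical front end for such sensor-driven gesture recognition is windowed feature extraction before the DNN; the window length and the choice of mean/RMS features below are illustrative assumptions, not the project's fixed design:

```python
import math

def window_features(samples, window=50):
    """Split one sensor channel into fixed-size windows and compute
    the mean and RMS per window -- common EMG/IMU features."""
    feats = []
    for i in range(0, len(samples) - window + 1, window):
        w = samples[i:i + window]
        mean = sum(w) / window
        rms = math.sqrt(sum(x * x for x in w) / window)
        feats.append((mean, rms))
    return feats

# 100 samples of one channel -> two (mean, RMS) pairs fed to the DNN
print(window_features([1.0] * 100))  # [(1.0, 1.0), (1.0, 1.0)]
```

The same windowing would be applied per EMG/IMU channel, and the concatenated feature vectors form the DNN input.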
👤Joseph Thambi Nelapati
(RAJIV GANDHI UNIVERSITY OF KNOWLEDGE AND TECHNOLOGIES, NUZVID)
📅Oct 14, 2019
Linking technology with agriculture is one of the key areas that deserves the utmost concern.
Agriculture is the mainstay of the Indian economy, but crops face serious problems, namely pests, fungal diseases, and water scarcity, which cause huge losses. That is why farmers monitor their crops day by day.
It is observed that, in most cases, farmers diagnose a disease by visual identification and then take action to overcome the problem. Farmers who are aware of these problems can handle the situation easily, but those without proper knowledge may take improper action. That can destroy the entire crop and cause the farmer a huge loss.
This results in a giant waste of human work, time, and money.
The primary solution to these problems is ‘Digital Farming’.
Digital Farming applies precision-location methods and decision-quality agronomic information to illuminate, predict, and affect the continuum of cultivation issues across the farm. It builds on digital image processing, convolutional neural networks, and machine-learning techniques. Using these techniques, our system will act as a guide, assisting farmers with information about the water level, atmospheric humidity, temperature, plant growth, and plants affected by pests in the field over the entire life span of the crop. Simultaneously, it will suggest immediate measures to overcome those problems and recommend the best pesticides to the farmer.
This complete image-processing and machine-learning setup will run on the DE10-Nano FPGA board, using its parallelism and pipelined architecture to achieve high speed and accuracy of assessment.
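The advisory side of the system can be sketched as threshold rules over the sensed values; every threshold and message below is an illustrative placeholder, not an agronomic recommendation:

```python
# Hypothetical thresholds: each sensed quantity maps to a limit
# (normalized 0..1) and the alert raised when the reading falls below it.
THRESHOLDS = {
    "water_level": (0.3, "Irrigate: water level is low"),
    "humidity":    (0.2, "Humidity is low: risk of plant stress"),
}

def advise(readings):
    """Return alerts for every reading below its threshold."""
    alerts = []
    for key, (limit, message) in THRESHOLDS.items():
        if readings.get(key, 1.0) < limit:
            alerts.append(message)
    return alerts

print(advise({"water_level": 0.1, "humidity": 0.5}))
```

In the full system, the pest and disease signals would come from the CNN's image classifications rather than scalar sensor readings, but the advisory pattern is the same.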
👤Narayan Raval D
(LD college of engineering)
📅Oct 08, 2019
Lie detection is an evolving subject. The polygraph technique is the most widely used so far, but it requires physical contact. This project proposes lie detection by extracting facial expressions using image processing. Each captured image is decomposed into facial parts such as the eyes, eyebrows, and nose. Each facial part is then studied to determine emotions: eyebrows raised and pulled together, raised upper eyelids, and lips stretched horizontally back toward the ears signify fear, while eyebrows drawn down and together and narrowed lips show anger. These emotions are aggregated to determine whether a person is lying. An interrogation video or live feed is broken down into facial images of the particular individual; the emotions collected from these images are processed against general face-reading criteria to evaluate the subject's truthfulness.
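The aggregation step, turning many per-frame emotion labels into one verdict, could look like the following; the emotion labels, the set treated as deception indicators, and the threshold are all hypothetical choices for illustration:

```python
from collections import Counter

# Hypothetical: emotions treated as deception indicators
DECEPTIVE = {"fear", "contempt"}

def verdict(frame_emotions, threshold=0.5):
    """Aggregate per-frame emotion labels into a verdict.
    Returns True (likely lying) when deceptive emotions make up
    at least `threshold` of the analyzed frames."""
    counts = Counter(frame_emotions)
    deceptive = sum(counts[e] for e in DECEPTIVE)
    return deceptive / len(frame_emotions) >= threshold

print(verdict(["fear", "fear", "neutral", "anger"]))  # True (2/4 >= 0.5)
```

A real system would weight emotions by detection confidence and baseline the subject's neutral expression rather than use a fixed threshold.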
(Tokyo Institute of Technology)
📅Oct 08, 2019
This project presents an accurate, fast, and energy-efficient object detector with a thermal camera on an FPGA for surveillance systems. A thermal camera outputs pixel values that represent heat (temperature) as gray-scale images. Since thermal cameras, unlike visible-range cameras, do not depend on ambient light, object detection with a thermal camera is reliable regardless of the surroundings. Additionally, visible images are not suitable for a surveillance system, since they potentially violate user privacy. This topic is therefore of broad interest in object surveillance and action recognition. However, since it is challenging to extract informative features from thermal images, implementing an object detector with high accuracy remains difficult. In recent work, convolutional neural networks (CNNs) outperform conventional techniques, and a variety of CNN-based object detectors have been proposed. The representative networks are single-shot detectors that consist of one CNN and infer locations and classes simultaneously (e.g., SSD and YOLOv2). Although the primary advantage of this type is that detection and classification can be trained jointly, the resulting computation time and area requirements can cause problems for FPGA implementation. Also, for networks designed for three-channel RGB images, false positives are a problem; a more reliable object detector is required. This project demonstrates an FPGA implementation of a reliable YOLOv2-based object detector that meets high-accuracy and real-time processing requirements with high energy efficiency. We explore the best preprocessing among conventional methods for YOLOv2 to extract more informative features.
In addition, two well-known model-compression techniques, quantization and weight pruning, are applied to our model without significant accuracy degradation, so the reliable model can be implemented on an FPGA.
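Magnitude-based weight pruning, one of the compression techniques mentioned, can be sketched as follows; the sparsity target and weights are illustrative:

```python
def prune(weights, sparsity=0.5):
    """Zero out the smallest-magnitude weights until `sparsity`
    (a fraction of all weights) are zero -- magnitude pruning."""
    n_zero = int(len(weights) * sparsity)
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    keep = set(order[n_zero:])            # indices that survive pruning
    return [w if i in keep else 0.0 for i, w in enumerate(weights)]

w = [0.9, -0.05, 0.4, 0.01]
print(prune(w, 0.5))  # [0.9, 0.0, 0.4, 0.0]
```

The zeroed weights can then be skipped entirely on the FPGA, saving both multipliers and memory bandwidth.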
(University of Auckland)
📅Oct 25, 2019
With the explosive interest in the utilization of Neural Networks (NN), several approaches have been developed to make them faster, more accurate, or more power efficient; one technique used to simplify inference models is the use of binary representations for weights, activations, inputs, and/or outputs. This competition entry presents a novel approach to training Binary Neural Networks (BNN) from scratch using neuroevolution as its base technique (gradient-descent free), executed on Intel FPGA platforms to achieve better results than general-purpose GPUs.
Traditional NNs use variants of gradient descent to train fixed topologies. As an extension of that optimization technique, BNN research has focused on applying such algorithms to discrete environments, with weights and/or activations represented by binary values (-1, 1). The authors have identified that the most frequent obstacle in the approaches taken by multiple BNN publications to date is the use of gradient descent, given that the procedure was originally designed to deal with continuous values, not with discrete spaces. Even though it has been shown that precision reduction (Float32 -> Float16 -> Int16) can train NNs at comparable precision, the problem resides in adapting a method designed for continuous contexts to a different set of values, which creates instabilities during training.
In order to tackle that problem, it is imperative to take a completely different approach to how BNNs are trained, which is the main proposition of this project: a new methodology to obtain neural networks that use binary values in weights, activations, and operations, and that is completely gradient free. In brief, the capabilities of this implementation are:
• Use weights and activations as unsigned short int values (16 bits)
• Use only logic operations (AND, XOR, OR...), with no need for Arithmetic Logic Units (ALUs)
• Calculate the distance between individuals with the Hamming distance
• Use evolutionary algorithms to drive the space search and network topology updates.
These substantial changes simplify the computing architecture needed to execute the algorithm, which maps natively onto the logic units of the FPGA. They also allow us to design processing elements that effectively adapt to the problem being solved while remaining power efficient in terms of the units needed for deployment, because agents with un-optimized structures are automatically disregarded.
The proposed algorithm, Binary SUNA (SUNA with binary extensions), will be used to solve standard reinforcement-learning challenges, which will be connected to an FPGA to solve them more efficiently, given that the architecture matches the evolved network at multiple stages, especially during training and inference. A comparison of the performance gains between CPU, GPU, and FPGA will be demonstrated.
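The Hamming-distance measure over 16-bit words reduces to an XOR followed by a population count, which is exactly the kind of pure-logic operation the bullet points above call for; a software sketch:

```python
def hamming16(a: int, b: int) -> int:
    """Hamming distance between two 16-bit words: XOR the words,
    then count the set bits -- only logic operations, no ALU needed."""
    return bin((a ^ b) & 0xFFFF).count("1")

print(hamming16(0b1010101010101010, 0b1010101010101000))  # 1
```

On the FPGA, the XOR maps to LUTs and the popcount to a small adder tree, so comparing two evolved individuals costs a handful of logic levels.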
1. Michaela Blott et al. 2018. FINN-R: An End-to-End Deep-Learning Framework for Fast Exploration of Quantized Neural Networks. ACM Transactions on Reconfigurable Technology and Systems.
2. Danilo Vargas et al. 2017. Spectrum-Diverse Neuroevolution with Unified Neural Models. IEEE Transactions on Neural Networks and Learning Systems, 28.
📅Oct 09, 2019
Face recognition in surveillance has become one of the most vital needs for the future, but the heavy computation required to process live streams makes the task difficult.
Our project therefore aims to reduce computation time through CNN acceleration and parallelism, which helps us achieve real-time face recognition.
It also performs face recognition on multiple live streams and uses an improved computation method to achieve low power.
(Now Why Would You Do That)
📅Oct 04, 2019
This project intends to create a working prototype of an add-on solution for motor vehicles (including motorcycles) to actively detect pedestrian movements and warn of potential collision hazards.
The solution will project information onto a heads-up display to minimise distraction to the driver.
👤Vivek Jangir S
(CMR University, Bengaluru, Karnataka, India)
📅Oct 08, 2019
Bionic Leg is a technological healthcare solution to tackle physical disability. The solution uses one of two approaches: machine learning or adaptive control systems. Real-time response and adaptability to change are the most significant features of any healthcare solution, and both can be addressed by the FPGA.
👤Shivam Kumar Mehra
(Guru Gobind Singh Indraprastha University)
📅Jul 07, 2019
According to the World Health Organization (WHO), the global population of visually impaired people was estimated at 285 million for the year 2010. This massive population has to deal with many environmental, social, and technological challenges in daily life. It is difficult for them to navigate outside the spaces they are accustomed to; they cannot easily participate in social activities; and blindness restricts their career options, which affects their finances and self-esteem. Blindness can make it difficult to use the internet for research, general purposes, or social media. Impaired vision affects individuals not only physically but also emotionally. Becoming familiar with the challenges that blindness creates can help sighted people better understand these problems and the importance of this project. Our team is taking a step forward to bridge the gap between the lifestyle of the visually impaired and a normal lifestyle.
The proposed idea is to create a wearable device that can guide a visually impaired individual in daily life. The device is mounted with a camera and actuators (a speaker) interfaced with an FPGA (OpenVINO Starter Kit); it takes input from the external environment as an image and generates meaningful output understandable by a visually impaired person. We are implementing an image-captioning algorithm using a Convolutional Neural Network (CNN). In the first stage, the project will generate output through the actuators that is understandable by a visually impaired individual. After successful implementation of the first stage, the project will be upgraded with voice output using an LSTM architecture and a text-to-speech generator.
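The captioning stage ultimately boils down to token-by-token decoding from the language model; a minimal greedy-decoding sketch, where the toy scoring table stands in for the trained LSTM head (all names and scores are hypothetical):

```python
def greedy_caption(next_token_scores, max_len=10, end="<end>"):
    """Greedy decoding: repeatedly pick the highest-scoring next token.
    `next_token_scores(prefix)` stands in for the trained LSTM head,
    returning {token: score} given the caption so far."""
    caption = []
    for _ in range(max_len):
        scores = next_token_scores(caption)
        word = max(scores, key=scores.get)
        if word == end:
            break
        caption.append(word)
    return " ".join(caption)

# Toy stand-in for the LSTM: a fixed table keyed by prefix length
TABLE = [{"a": 0.9, "<end>": 0.1},
         {"person": 0.8, "<end>": 0.2},
         {"<end>": 0.9, "walking": 0.1}]

print(greedy_caption(lambda prefix: TABLE[len(prefix)]))  # "a person"
```

In the real device the CNN encodes the camera frame, the LSTM produces the score distribution at each step, and the decoded caption feeds the text-to-speech stage.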