(James Cook University)
📅Jun 18, 2019
Deep Neural Networks (DNNs) have recently achieved remarkable performance in a myriad of applications, ranging from image recognition to language processing. Training such networks on Graphics Processing Units (GPUs) currently offers unmatched levels of performance; however, GPUs have large power requirements. With recent advancements in High Level Synthesis (HLS) techniques, new methods for accelerating deep networks using Field Programmable Gate Arrays (FPGAs) are emerging. FPGA-based DNNs present substantial advantages in energy efficiency over conventional CPU- and GPU-accelerated networks. Using the Intel FPGA Software Development Kit (SDK) for OpenCL development environment, networks described in the high-level OpenCL framework can be accelerated on heterogeneous platforms including CPUs, GPUs, and FPGAs. These networks, if properly customized on GPUs and FPGAs, can be ideal candidates for learning and inference in resource-constrained portable devices such as robots and Internet of Things (IoT) edge devices, where power is limited and performance is critical. Here, we propose a project using a novel FPGA-accelerated deterministically binarized DNN, tailored toward weed species classification for robotic weed control. We intend to train and benchmark our network using our publicly available weed species dataset, named DeepWeedsX, which includes close to 18,000 weed images. This project is a significant step toward enabling deep inference and learning on IoT edge devices and smart portable machines such as agricultural robots, which are the target application of this project.
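As a rough illustration of what deterministic binarization means in practice, the sketch below maps real-valued weights and activations to +1 or -1 by sign and then computes a dot product with XNOR and a bit count, which is the kind of operation that maps cheaply onto FPGA logic. The function names and the bit-packing layout are assumptions made for illustration; they are not taken from the project.

```python
def binarize(values):
    """Deterministic binarization: +1 for non-negative values, -1 otherwise."""
    return [1 if v >= 0 else -1 for v in values]


def pack_bits(signs):
    """Pack a +1/-1 vector into an integer, one bit per element (+1 -> bit set)."""
    word = 0
    for i, s in enumerate(signs):
        if s == 1:
            word |= 1 << i
    return word


def binary_dot(a_bits, b_bits, n):
    """Dot product of two packed +1/-1 vectors of length n via XNOR and popcount."""
    mask = (1 << n) - 1
    # XNOR marks positions where the two signs agree.
    matches = bin(~(a_bits ^ b_bits) & mask).count("1")
    # Each agreement contributes +1 and each disagreement -1.
    return 2 * matches - n


weights = binarize([0.7, -0.2, 0.0, -1.5])      # hypothetical layer weights
activations = binarize([-0.3, -0.9, 0.4, 2.0])  # hypothetical inputs
result = binary_dot(pack_bits(weights), pack_bits(activations), len(weights))
```

The packed form replaces multiply-accumulate operations with bitwise logic, which is one reason binarized networks are attractive for low-power hardware.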
(University of Allahabad, Prayagraj, India)
📅Jun 30, 2019
There has been a long history of studying altered states of consciousness (ASC) to better understand the phenomenological properties of conscious visual perception. An ASC can be defined as a qualitative alteration in the overall pattern of mental functioning such that the experiencer feels their consciousness is very different from normal. One of the qualitative properties of ASC is visual hallucination (Tart, C. T., 1972).
The Hallucination Machine (HM) is a combination of virtual reality and machine learning developed by Keisuke Suzuki and his team at the Sackler Centre for Consciousness Science, University of Sussex, United Kingdom. It can be used to isolate and simulate one specific aspect of psychedelic phenomenology, namely visual hallucination. The HM uses panoramic videos modified by the Deep Dream algorithm, presented through a virtual reality headset with head tracking, which allows the videos to be viewed in a naturalistic manner. The immersive nature of the paradigm, the close correspondence in representational levels between the layers of a Deep Convolutional Neural Network (DCNN) and the primate visual hierarchy, and the informal similarities between DCNNs and biological visual systems together suggest that the Hallucination Machine is capable of simulating biologically plausible and ecologically valid visual hallucinations (Suzuki et al., 2017).
Deep Dream is an algorithm developed by Mordvintsev, Tyka, et al. (2015) at Google. When an input image is fed into a neural network using the Deep Dream algorithm and the user chooses a layer, the network enhances whatever it detects at that layer. For example, if a higher-level layer is chosen, complex features or even whole objects tend to appear. So if a cloud in an image looks like a bird, the network will make it look more like a bird, and the degree of enhancement in the output image depends on the number of iterations computed.
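The core loop of this idea is gradient ascent on the input image. The toy sketch below stands in for a real network: the chosen "layer" is a single hypothetical linear feature detector, the image is a short list of pixel values, and the step size and gradient normalization are assumptions. It only shows how repeated iterations amplify whatever the layer already responds to; it is not the Deep Dream implementation itself.

```python
def layer_activation(image, feature):
    """Response of a toy one-filter 'layer': dot product of image and feature."""
    return sum(p * f for p, f in zip(image, feature))


def dream(image, feature, iterations, step=0.1):
    """Gradient ascent on the image to increase 0.5 * activation**2."""
    img = list(image)
    for _ in range(iterations):
        act = layer_activation(img, feature)
        # d/d(pixel_i) of 0.5 * act**2 is act * feature[i].
        grad = [act * f for f in feature]
        # Normalize so the step size does not depend on the gradient magnitude.
        scale = max(abs(g) for g in grad) or 1.0
        img = [p + step * g / scale for p, g in zip(img, grad)]
    return img


image = [0.3, 0.1, 0.2, 0.1]      # hypothetical pixel values
feature = [1.0, -1.0, 1.0, -1.0]  # hypothetical pattern the layer detects

before = layer_activation(image, feature)
after_5 = layer_activation(dream(image, feature, 5), feature)
after_20 = layer_activation(dream(image, feature, 20), feature)
```

In this toy setting the activation grows with the number of iterations, which mirrors the statement that the strength of the enhancement depends on how many iterations are run.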
Because psychedelic drugs, which are known to induce ASC, have physiological effects, the scientific community needs an alternative tool to study consciousness. The study by Suzuki et al. (2017) provides no information on whether the Hallucination Machine can be used to study the neural underpinnings of conscious perception in emotional visual processing. Whether top-down signalling or predictive processing theories of perception (Bayesian inference) play a role in the formation of perceptual content is still actively debated in the scientific community. There is also no clear answer as to whether emotional visual processing is a late or an early process.
To answer these questions, our team is developing a DCNN using the Deep Dream and Deep Dream Anim algorithms, which will be trained on a large dataset of emotional images prepared by Dr. Narayanan Srinivasan at CBCS, University of Allahabad, India. Test images will then be evaluated by varying the lower- and higher-level layers, the number of iterations, and other parameters. The questions above can be addressed based on an analysis of the results.
This will be exploratory research to decipher the science of conscious perception, which can be used to advance vision science and the technologies around it.
(International Institute of Information Technology Hyderabad)
📅Jul 04, 2019
The aim of the project is to detect major traffic rule violations on Indian roads using a MobileNet CNN. Traffic management and road safety are among the major issues in cities. In a country like India, with densely populated cities and high vehicle counts, manned traffic surveillance is cumbersome and time-consuming, and it requires a lot of manual effort. Police personnel cannot monitor traffic around the clock, so automated systems are needed that can detect violations of road rules and regulations for the safety of citizens. We have come up with a deep learning solution for common traffic rule violations seen in India, such as:
• People not wearing a helmet on bikes
• Three people riding on the same motorcycle (Triple riding)
• Identifying vehicles traveling in the wrong route
• Identifying signal jumps and speed violation
These are among the most important rules for an individual's safety. Our system takes a live camera feed as input, detects vehicles that violate traffic rules, automatically crops the number plate, and reports it to the concerned authorities.
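One of the listed checks, triple riding, can be sketched as a post-processing step on detector output. The example below assumes the detector returns bounding boxes as `(x1, y1, x2, y2)` tuples for motorcycles and persons; the overlap threshold and all function names are hypothetical and are not taken from the project.

```python
def intersection_area(a, b):
    """Area of overlap between two boxes given as (x1, y1, x2, y2)."""
    width = min(a[2], b[2]) - max(a[0], b[0])
    height = min(a[3], b[3]) - max(a[1], b[1])
    return max(0, width) * max(0, height)


def count_riders(bike_box, person_boxes, min_overlap=0.3):
    """Count persons whose box overlaps the bike box by at least min_overlap."""
    riders = 0
    for person in person_boxes:
        area = (person[2] - person[0]) * (person[3] - person[1])
        if area <= 0:
            continue  # skip degenerate boxes
        if intersection_area(person, bike_box) / area >= min_overlap:
            riders += 1
    return riders


def is_triple_riding(bike_box, person_boxes):
    """Flag a motorcycle carrying three or more riders."""
    return count_riders(bike_box, person_boxes) >= 3


bike = (100, 200, 300, 400)  # hypothetical motorcycle detection
people = [
    (110, 120, 170, 320),    # rider
    (170, 110, 230, 310),    # rider
    (230, 130, 290, 330),    # rider
    (500, 100, 560, 300),    # pedestrian far from the bike
]
violation = is_triple_riding(bike, people)
```

A real deployment would need to tune the overlap threshold and handle pedestrians standing next to a parked motorcycle, which this sketch does not address.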
Real-time video processing requires substantial computational capability. The OpenVINO Starter Kit is suited to high-performance computation and is designed for computer vision applications. It also supports the Intel FPGA OpenCL BSP, which lets developers design a system in a high-level programming language and gives them considerable flexibility. Hence we opted for the OpenVINO Starter Kit.
👤Sharjeel Riaz Ahmad
(National University of Sciences and Technology (NUST))
📅Oct 21, 2019
With recent advances in machine learning, it has become possible to train neural networks such as video2x and SRGAN that "realistically" and smartly upscale images or videos from low-detail, low-resolution input frames. However, the upscaling process is slow and computationally expensive, owing mainly to the inherent dissimilarities between the architectures of contemporary general-purpose computing hardware and those of the neural networks.
Our aim for this project is to develop a hardware accelerator for image-upscaling neural networks (specifically, Generative Adversarial Networks) that employs parallelism and leverages the specialized dataflow of these networks for fast, and possibly real-time, upscaling of images.
The hardware accelerator, once realized and deployed on the client side, may be used to realistically upscale low-resolution video streams and graphic textures, delivering the desired image quality and detail while saving crucial bandwidth.
📅Jul 03, 2019
A person who is visually impaired finds it very hard to navigate independently and needs to rely on someone else to reach a desired place safely. The idea is to build a smart belt that continuously provides audio feedback so that the person can navigate without hitting or tripping on obstacles. This is achieved using the Intel RealSense Depth Camera D415, which is used to approximate the distance of various elements in the scene. The D415 module is connected via USB to the Cortex-A9 processors. It also has an RGB camera whose output is used to identify different movable and non-movable objects in the scene, so that the audio feedback can be more specific about animals and non-movable objects.
This continuous video input is fed into a YOLO network for efficient detection of the elements in the scene. The project requires a reasonably high frame rate, hence FPGA offloading is required. The offload engine returns object detection predictions with coordinates to the Hard Processor System, which overlays the depth map on the identified objects to play out meaningful audio.
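The step of combining detections with the depth map can be sketched as follows. This is a minimal illustration, assuming the depth map is a 2D grid of distances in meters aligned with the RGB frame, that a value of 0 marks a pixel with no valid depth, and that detections arrive as a label plus an `(x1, y1, x2, y2)` box. The function names and the wording of the message are hypothetical.

```python
from statistics import median


def object_distance(depth_map, box):
    """Median of the valid depth values inside a bounding box, or None."""
    x1, y1, x2, y2 = box
    values = [
        depth_map[y][x]
        for y in range(y1, y2)
        for x in range(x1, x2)
        if depth_map[y][x] > 0  # assumption: 0 means no depth reading
    ]
    return median(values) if values else None


def describe(label, box, depth_map, frame_width):
    """Build a short message that a text-to-speech stage could read aloud."""
    distance = object_distance(depth_map, box)
    if distance is None:
        return None
    center_x = (box[0] + box[2]) / 2
    if center_x < frame_width / 3:
        side = "to the left"
    elif center_x > 2 * frame_width / 3:
        side = "to the right"
    else:
        side = "ahead"
    return f"{label} {distance:.1f} meters {side}"


depth = [  # hypothetical 6x4 depth map in meters
    [0.0, 2.0, 2.0, 5.0, 5.0, 5.0],
    [0.0, 2.0, 2.2, 5.0, 5.0, 5.0],
    [1.9, 2.0, 2.1, 5.0, 5.0, 5.0],
    [5.0, 5.0, 5.0, 5.0, 5.0, 5.0],
]
message = describe("chair", (0, 0, 3, 3), depth, frame_width=6)
```

The median is used here rather than the mean so that a few background pixels inside the box do not distort the reported distance.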