InnovateFPGA

Real-time analysis of medical diagnostics using AI is crucial in healthcare systems. Advanced sensors with deep learning networks need to analyze the diagnostics within set time constraints, so high-speed real-time systems are imperative. Generic computer architecture slows down this process. Using low-latency networks with FPGAs will decrease the analysis time by reducing idle cycles and improving resource utilization. Digital image processing (DIP) fares better on FPGAs.
Over time, FPGAs have increasingly been used for computationally intensive tasks, and image processing is one such task. To improve the performance of digital image processing systems, it is necessary to implement them in hardware instead of software. FPGAs are inherently good at parallel processing because of their architecture. Image processing tasks like feature detection and extraction are highly parallelizable, which makes FPGAs ideal candidates for this task.
We have seen how healthcare systems were overwhelmed during the pandemic. Early detection plays a crucial role during the onslaught of highly contagious diseases, and containment and early diagnosis can reduce the losses inflicted during a pandemic. Using FPGAs and cloud storage, the team can create a robust image detection system to detect the presence of the disease, and use the cloud to distribute relevant information to the concerned doctors. Image detection with CT, MRI, and X-ray scans helps in detecting the disease earlier rather than later. The same image detection algorithm can be used, with minimal changes, for similar respiratory diseases that have overlapping symptoms. Additionally, the cloud ensures that data from across the world is shared with the relevant specialists, removing barriers in healthcare.
The accelerometers and sensors will be used to gather general information about the body that is important for overall health: for example, temperature sensors to measure temperature, and accelerometers for gait and mobility analysis. Mobility and gait are important factors in the health and well-being of older patients, with fall risk assessment and balance evaluation being two examples. All the sensors will be used to monitor important aspects of the patient’s health.
This project is a step towards a broad spectrum well-being platform for patients from all walks of life, making healthcare more accessible among the masses.

Demo Video

[URL: https://www.youtube.com/watch?v=NSUCK6vdTn8]

Project Proposal

1. High-level project introduction and performance expectation

PURPOSE OF THE DESIGN

a) INTRODUCTION

COVID-19 is more contagious and deadly than the other coronaviruses. The influenza virus has a shorter median incubation period (time from infection to onset of symptoms) and a shorter serial interval (time between successive cases) than the COVID-19 virus. The COVID-19 virus has a serial interval of 5-6 days, while the influenza virus has a serial interval of 3 days. This makes containing and managing the infection more challenging. The world economy, health, and trade have all suffered as a result of this extremely contagious disease. In terms of morbidity and death, COVID-19 has surpassed its predecessors SARS-CoV and MERS-CoV. As a result, it is critical to accurately detect and diagnose the virus in order to limit its transmission.

Severe acute respiratory syndrome is abbreviated as SARS. A real-time reverse transcription–polymerase chain reaction (RT-PCR) assay was devised to quickly detect the SARS-associated coronavirus (SARS-CoV), and COVID-19 is detected using the same method. The high prevalence of COVID-19 and the poor treatment outcomes may be due to the significant RT-PCR error rate (30-35%), a failure to distinguish viral contamination from disease-bearing patients, or false negatives. When RT-PCR failed, a CT (Computed Tomography) scan of the lungs was crucial in detecting the disease. CT imaging, however, has its own limitations that must be addressed. The lack of specificity, and the similarity to lung lesions caused by other types of viral infection or by community-acquired pneumonia (CAP), could lead to COVID-19 being misdiagnosed. Robust methods such as machine learning were hypothesised to be able to address the technical bias of CT imaging and human error.

b) NEED FOR FPGA

Real-time analysis of medical diagnostics using AI is crucial in healthcare systems. Advanced sensors with deep learning networks need to analyze the diagnostics within set time constraints, so high-speed real-time systems are imperative. Generic computer architecture slows down this process. Using low-latency networks with FPGAs will decrease the analysis time by reducing idle cycles and improving resource utilization. Digital image processing (DIP) fares better on FPGAs.

Over time, FPGAs have increasingly been used for computationally intensive tasks, and image processing is one such task. To improve the performance of digital image processing systems, it is necessary to implement them in hardware instead of software. FPGAs are inherently good at parallel processing because of their architecture. Image processing tasks like feature detection and extraction are highly parallelizable, which makes FPGAs ideal candidates for this task. We have seen how healthcare systems were overwhelmed during the pandemic. Early detection plays a crucial role during the onslaught of highly contagious diseases, and containment and early diagnosis can reduce the losses inflicted during a pandemic. Using FPGAs and cloud storage, the team can create a robust image detection system to detect the presence of the disease, and use the cloud to distribute relevant information to the concerned doctors. Image detection with CT, MRI, and X-ray scans helps in detecting the disease earlier rather than later.

Sensors connected to the FPGA board are used to keep track of patients’ vitals if they have tested positive. Using the Intel FPGA and cloud storage, a robust image detection system is realised to detect the presence of the disease. The cloud is used to distribute relevant information to the concerned doctors and specialists.

c) LITERATURE SURVEY

Lin Li, Lixin Qin describe the design and evaluation of a 3D deep learning model to detect coronavirus disease (COVID-19) from chest CT scans. The paper discusses the similarities between the imaging characteristics of COVID-19 and those of community-acquired pneumonia caused by various other viruses. It concludes that a robust deep learning model can be developed to differentiate coronavirus disease 2019 (COVID-19) from community-acquired pneumonia (CAP) on chest CT scans.

Ran Yang, Xiang Li examine in depth the CT severity score (CT-SS), which assumes that the amount of lung opacification is a surrogate for COVID-19 burden. The paper concludes that the CT-SS could be used to evaluate the severity of pulmonary involvement quickly and objectively in patients with COVID-19.

RT-PCR has several limitations. The reaction creates copies of the target sequence exponentially only during the exponential phase of the PCR reaction. RT-PCR may reflect either viral replication or clearance in infected patients, and the detection of circulating tumor cells in blood can also be an impediment. Because of inhibitors, accumulation of pyrophosphate molecules, and self-annealing of the accumulating product, the PCR reaction eventually ceases to amplify the target sequence at an exponential rate and a “plateau effect” occurs, making end-point quantification of PCR products unreliable. CT imaging has a much higher sensitivity of ~80-98% but a comparable accuracy of about 70%. Machine learning was used to improve the accuracy of CT-based detection.

APPLICATION SCOPE AND TARGET USERS

This project is a step towards a broad spectrum well-being platform for patients from all walks of life, making healthcare more accessible among the masses.

WHY INTEL FPGA WAS USED

Generic computer architecture slows down the image classification process. Using low-latency networks with FPGAs will decrease the analysis time by reducing idle cycles and improving resource utilization. Digital image processing (DIP) fares better on FPGAs.

Over time, FPGAs have increasingly been used for computationally intensive tasks, and image processing is one such task. To improve the performance of digital image processing systems, it is necessary to implement them in hardware instead of software. FPGAs are inherently good at parallel processing because of their architecture. Image processing tasks like feature detection and extraction are highly parallelizable, which makes FPGAs ideal candidates for this task.

COMPUTATIONAL ADVANTAGE:

Hardware is important when it comes to implementing compute-intensive algorithms. The DE10-Nano gives the user the full flexibility, power and computing capability of the ARM Cortex-A9 processor. Major parts of the computational workload are offloaded to the FPGA, and sensors are integrated with the board to enhance overall system performance.

LEARNING CAPABILITY of the FPGA

Learning and gleaning patterns from the dataset is of great importance. Associative learning helps perform tasks faster.

EASE OF ACCESS

With the ongoing technical advancements in the hard processor system (HPS), understanding and working with FPGAs, and using them to build a sustainable future, is getting easier. We use VNC Viewer in our project for a seamless UI.

COST EFFECTIVE

The Intel Arria family and the Intel Cyclone family have many Digital Signal Processing blocks that help in computation and significantly reduce cost.

FLEXIBILITY

The scope of our idea is broad. It does not stop with CT images of the lungs: with a few tweaks to the program, it can also be used with MRI, CT and X-ray systems.

AVAILABILITY OF IP CORES and BUILDING BLOCK CORES

IP cores and Building Block cores make programming faster and easier.

The DE10-Nano board works like a well-oiled archery machine, targeting seamless computation with accuracy and efficiency.

2. Block Diagram

3. Expected sustainability results, projected resource savings

The CPU runs at roughly 1/12 to 1/7 the speed of the FPGA. GPUs are limited by their memory architecture, which restricts them to smaller image processing algorithms; they fail to deliver as the algorithms become more elaborate. A software implementation can process images at up to 25 frames per second (FPS), with a trade-off between accuracy and speed. FPGAs can work at high frame rates, around 100 FPS, usually in the same clock cycle. The scan window can be doubled or tripled depending on the gate capacity of the FPGA.

The backbone architecture of the algorithm is U-Net. Compared to the other networks, U-Net performs better in segmentation problems, with an accuracy of about 95%. The network has about 39.39 million parameters and a Dice coefficient of about 0.92. U-Net needs only a small dataset of annotated images to start with. All these points suggest that U-Net is an ideal backbone network architecture for biomedical segmentation.
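The Dice coefficient quoted above measures the overlap between a predicted segmentation mask and a reference mask: twice the number of pixels set in both, divided by the total number of pixels set in each. The following is a minimal sketch of that calculation, not the project's evaluation code; the function name `dice_coefficient` and the toy 3x3 masks are assumptions made for illustration.

```python
def dice_coefficient(pred_mask, true_mask):
    """Dice = 2 * |A & B| / (|A| + |B|) for two binary masks (nested lists of 0/1)."""
    pred = [p for row in pred_mask for p in row]
    true = [t for row in true_mask for t in row]
    if len(pred) != len(true):
        raise ValueError("masks must contain the same number of pixels")
    # Pixels marked as foreground in both masks.
    intersection = sum(1 for p, t in zip(pred, true) if p and t)
    total = sum(1 for p in pred if p) + sum(1 for t in true if t)
    if total == 0:
        return 1.0  # both masks empty: treated here as perfect agreement
    return 2.0 * intersection / total


# Hypothetical toy masks: the prediction misses one foreground pixel.
predicted = [[0, 1, 1],
             [0, 1, 0],
             [0, 0, 0]]
reference = [[0, 1, 1],
             [0, 1, 1],
             [0, 0, 0]]
score = dice_coefficient(predicted, reference)
print(round(score, 3))
```

A score of 1.0 means the two masks coincide exactly, and 0.0 means they share no foreground pixels.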

The Internet of Things makes it easier to deploy energy-efficient technologies.

The seventh of the UN General Assembly's Sustainable Development Goals (SDGs) is affordable and clean energy, to which FPGAs contribute greatly. With FPGAs, developers can ensure that the designs meet the timing and space requirements.

Conventional processors such as CPUs consume a large amount of power and cannot be optimised for a target application.

FPGAs occupy the middle ground, offering a trade-off between programmability and efficiency without sacrificing the throughput of the application.

This program is scalable, meaning it can be applied to similar settings without any major changes.

Lastly, it is accurate in terms of detecting errors and correcting them.

4. Design Introduction

PURPOSE OF THE DESIGN

Medical image processing is computationally intensive, and the field will improve considerably with hardware acceleration. Field Programmable Gate Arrays (FPGAs) are used to improve the computational abilities in this field. Using reconfigurable hardware in medical imaging is a relatively new concept, but its rarity does not diminish the advantages of using this technology in the field. There are a variety of reasons why FPGAs are chosen for this task.

FPGAs are reconfigurable, which means that they can be configured based on the needs of the hour. This is very unlike ASICs and ASSPs where the entire unit has to be scrapped for the slightest errors or modifications. Another advantage is that the FPGAs are flexible, and can be applied to different ranges and sensors. This implies that in the future, with the required changes, this algorithm can be used for security check-in in various places.

Software-based image processing differs from hardware image processing in many ways. The result may be the same, but the means to this end are entirely different. In the case of hardware, FPGAs and GPUs are used to perform the computation. Software-based image processing is slower than hardware-based image processing. While GPUs are a popular option, FPGAs are becoming increasingly popular in the field of image processing.

FPGAs have an inherent advantage, as the integrated circuit can be programmed to suit the situation. Reprogrammability and the ability to debug designs also give FPGAs an edge over other hardware options. FPGAs also have cost advantages and a shorter time to market than ASICs, as well as a shorter development cycle. Their most coveted property is parallel computing: FPGAs execute pipelining and parallelism with ease, which greatly increases computing throughput. Finite state machines are used to control interrupts effectively, so that the processing flow is simpler and more efficient.

IP cores have helped widen the audience for FPGA hardware, which was previously reserved for seasoned electronics engineers. Tools from vendors such as Altera and Xilinx have significantly reduced the complexity of working with FPGAs and have made them more accessible.

Having described why FPGAs are the pivotal point of this design, the team would like to elaborate on the purpose of this project. Although this project focuses on the lung CT scan algorithm, FPGAs can be used in virtually every domain of medicine. Given the advantages mentioned above, we can reduce costs and substantially improve device performance. Patient monitoring facilities (e.g. ventilation and life support) will benefit greatly from FPGAs. Genomic sequencing can use FPGAs to tackle rare, life-threatening genetic diseases. Other imaging modalities such as PET and MRI can use similar algorithms to process images. In the coming years, FPGAs will be used in every facet of this field. With the right implementation, the ambitious vision of making the best of healthcare more accessible to the common man can be realised.

5. Functional description and implementation

This project uses the Intel FPGA SDK for OpenCL, which takes advantage of heterogeneous programming.

The preprocessing of the lung CT scans is done by the CPU. The results are stored in the Azure cloud. The sensors' telemetry is also stored in Azure container registers to inform the further plan of action. In fact, we also have the option of using the Azure cloud to preprocess data with Azure Machine Learning paired with the FPGA. Edge computing can also be done using Azure.

The necessary data is stored in the cloud while the FPGA streams the data and trains and tests on the datasets to differentiate between pulmonary diseases. Meanwhile, the sensor feed gives an indication of the overall wellbeing of the person.

One-hot encoding is used to enumerate the different conditions and categorise them as out of danger, near critical, critical, etc.
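One-hot encoding represents each category as a vector with a single 1 in the position of that category and 0 everywhere else. The sketch below illustrates the idea for the condition labels named above; the ordering of the labels and the helper names `one_hot` and `decode` are assumptions for illustration, not taken from the project code.

```python
# Assumed label order; the project may use more categories or a different order.
CONDITIONS = ["out of danger", "near critical", "critical"]


def one_hot(condition, classes=CONDITIONS):
    """Return a list with a 1 at the index of `condition` and 0 elsewhere."""
    if condition not in classes:
        raise ValueError(f"unknown condition: {condition!r}")
    return [1 if c == condition else 0 for c in classes]


def decode(vector, classes=CONDITIONS):
    """Map a valid one-hot vector back to its condition label."""
    is_binary = all(v in (0, 1) for v in vector)
    if len(vector) != len(classes) or not is_binary or sum(vector) != 1:
        raise ValueError("not a valid one-hot vector for these classes")
    return classes[vector.index(1)]


print(one_hot("critical"))   # [0, 0, 1]
print(decode([0, 1, 0]))     # near critical
```

Because every category gets its own position, the encoding does not imply a numeric ordering or distance between the conditions.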

File transfer protocols are used to transfer files between the host system and the FPGA board.

After the algorithm runs, the patients' results are stored in the Azure cloud. This enables connectivity and ease of access, even in emergencies. This is especially important for viruses like SARS-CoV, where the oxygen saturation drops sharply near the tipping point.

Our design focuses on accessing and delivering life-saving data at crucial moments.

6. Performance metrics, performance to expectation

Various metrics have been introduced to compare the performance of different mechanisms. Some of these metrics help you compare the quality of service provided to users, while others help you monitor network resource usage.

THROUGHPUT OPTIMIZATION

Optimizing throughput requires moving as much data as possible in a given amount of time, so the data transfer rate should be as high as possible. Data vectorization and parallel compute units (CUs) were introduced into the system to increase the throughput of the convolution kernel. Input features and weights located at the same position in adjacent feature maps are grouped into one vectorized input. The size of the vectorized data is controlled by the vector_size design parameter. The vectorized data stream is fetched by a kernel and sent via OpenCL channels to multiple CUs in the convolution kernel. The number of parallel CUs used is controlled by another parameter, cu_no. By changing the values of the vector_size and/or cu_no parameters, the implemented project can provide scalable performance and hardware cost without the need to modify the kernel code.
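The effect of the two parameters can be illustrated with a small software model. This is a sketch under simplifying assumptions, not the OpenCL kernel itself: each output is a plain dot product, every CU is assumed to consume one feature vector and one weight vector per modelled cycle, and the function names `chunk` and `vectorized_outputs` are hypothetical. Only `vector_size` and `cu_no` come from the text.

```python
def chunk(values, vector_size):
    """Group a flat sequence into vectors of vector_size, zero-padding the last."""
    padded = list(values) + [0] * (-len(values) % vector_size)
    return [padded[i:i + vector_size] for i in range(0, len(padded), vector_size)]


def vectorized_outputs(features, filters, vector_size, cu_no):
    """Return (one output per filter, number of modelled cycles)."""
    if any(len(w) != len(features) for w in filters):
        raise ValueError("each filter must match the feature length")
    feature_vectors = chunk(features, vector_size)
    outputs = [0] * len(filters)
    cycles = 0
    # Filters are handled in groups of cu_no, one filter per compute unit.
    for start in range(0, len(filters), cu_no):
        group = range(start, min(start + cu_no, len(filters)))
        weight_vectors = {f: chunk(filters[f], vector_size) for f in group}
        for step, fvec in enumerate(feature_vectors):
            for f in group:  # in hardware these CUs would run in parallel
                wvec = weight_vectors[f][step]
                outputs[f] += sum(a * b for a, b in zip(fvec, wvec))
            cycles += 1
    return outputs, cycles


features = [1, 2, 3, 4, 5, 6, 7, 8]
filters = [[1] * 8, [0, 1] * 4, [2] * 8, list(range(8))]
reference = [sum(a * b for a, b in zip(features, w)) for w in filters]

slow_out, slow_cycles = vectorized_outputs(features, filters, vector_size=1, cu_no=1)
fast_out, fast_cycles = vectorized_outputs(features, filters, vector_size=4, cu_no=2)
print(slow_cycles, fast_cycles)  # prints 32 4
```

Both configurations produce the same outputs; only the modelled cycle count changes, which mirrors how the two parameters trade hardware cost for throughput without touching the kernel logic.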

BANDWIDTH OPTIMIZATION

Bandwidth optimization summarizes the overall improvement in bandwidth to and from the network. We can generate reports according to the time period, port and traffic direction we choose. To reduce the load on the external memory bandwidth, a sliding-window-based data buffering method is introduced. The stride S of the convolution window is usually smaller than the filter size K (S = 1 in most cases), so most of the data can be reused during convolution calculations. To exploit this data reuse, the kernel fetches a window of data that covers the region of ft_no convolution filters each time and caches the data in on-chip buffers. For successive convolution filtering operations, the feature map and weight data are loaded repeatedly from local memory, which avoids access to external memory. To demonstrate the effectiveness of this scheme, we profiled the memory bandwidth (DDR SDRAM) of the implementation using various ft_no (filters) values on the FPGA. The average bandwidth reduction achieved reached 50%.
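Why overlapping windows allow reuse can be seen by counting external memory reads in a simplified model. The sketch below is an idealised illustration, not the project's buffering scheme: it assumes a single feature map, an on-chip buffer large enough to hold every pixel once fetched, and a hypothetical function name `external_reads`. It does not attempt to reproduce the measured 50% figure.

```python
def external_reads(height, width, k, stride, use_buffer):
    """Count pixel reads from external memory for a k x k sliding window."""
    cached = set()  # pixels already held on chip (idealised, unbounded buffer)
    reads = 0
    for top in range(0, height - k + 1, stride):
        for left in range(0, width - k + 1, stride):
            for r in range(top, top + k):
                for c in range(left, left + k):
                    if use_buffer and (r, c) in cached:
                        continue  # served from the on-chip buffer
                    reads += 1
                    if use_buffer:
                        cached.add((r, c))
    return reads


# Overlapping windows (S < K): most pixels are reused.
naive = external_reads(8, 8, k=3, stride=1, use_buffer=False)
buffered = external_reads(8, 8, k=3, stride=1, use_buffer=True)
print(naive, buffered)  # prints 324 64

# Non-overlapping windows (S = K): nothing to reuse.
print(external_reads(8, 8, k=2, stride=2, use_buffer=False),
      external_reads(8, 8, k=2, stride=2, use_buffer=True))  # prints 64 64
```

A real design holds only a few rows or a small window on chip, so the saving is smaller than in this idealised count, but the dependence on S being smaller than K is the same.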

LATENCY OPTIMIZATION

Latency is a metric used to compare the performance of different machine learning models for a particular application. It refers to the time required to process one unit of data, assuming that only one data unit is processed at a time. Under that assumption, latency is inversely proportional to throughput, so latency falls as throughput rises. The lower the latency, the better.
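The inverse relationship can be made concrete with a short calculation. This sketch assumes, as stated above, that exactly one frame is in flight at a time; the function name `latency_ms` is hypothetical, and the two frame rates are the approximate software and FPGA figures quoted in section 3.

```python
def latency_ms(throughput_fps):
    """Per-frame latency in milliseconds when one frame is processed at a time."""
    if throughput_fps <= 0:
        raise ValueError("throughput must be positive")
    return 1000.0 / throughput_fps


software_latency = latency_ms(25)   # about 25 FPS in software
fpga_latency = latency_ms(100)      # about 100 FPS on the FPGA
print(software_latency, fpga_latency)  # prints 40.0 10.0
```

In a pipelined design several frames can be in flight at once, in which case throughput can rise without latency falling by the same factor.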

7. Sustainability results, resource savings achieved

With catastrophic changes in the environment, it is increasingly important to approach problems sustainably.

As mentioned in the previous sections, FPGAs are increasingly used in the field of medicine to ensure accuracy and accessibility.

Healthcare should be accessible to all and should not be a luxury. FPGAs ensure that we can achieve this sustainably.

The U-Net algorithm is used to reduce energy consumption sustainably. The sensors in our design are all photodiodes, which consume less energy than filament bulbs. They also respond much faster.

All of this suggests that our FPGA design is a sustainable alternative to ASIC- and CPU-based implementations.

8. Conclusion

This deep learning method could complement or replace other testing methods for the COVID-19 virus. The code allows less complex models to be created. Because this alternative is faster and more accurate, it makes it easier for radiologists to diagnose the patient. The model was developed with the many challenges of COVID-19 detection in mind. In the case of a viral infection, time is of the essence. As a result of this experiment, we obtained a reliable, resource-efficient OpenCL-based FPGA accelerator for deep convolutional neural networks (DCNNs). The total power consumption of the designed system was calculated. We implemented an improved, deeply pipelined kernel architecture with data reuse and task/feature mapping. By integrating this architecture into a Cyclone V SoC FPGA, we were able to achieve approximately four times the performance of the mobile GPU software accelerator. The board also consumes much less power than the GPU platform when running the U-Net DCNN.

FPGA for Healthcare and Wellness